Activity log for bug #1704310

Date Who What changed Old value New value Message
2017-07-14 07:01:38 Alice Liu bug added bug
2017-09-22 02:18:15 quanxian description
Old value: same description text as the new value below, with Target Kernel: 4.14 and Target Release: 18.04.
New value:
Description:
If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path.

--[ns-reconf.sh]--
function pmem_btt_dax_switch() {
    sector_size_list="512 520 528 4096 4104 4160 4224"
    for sector_size in $sector_size_list; do
        ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size
        ndctl create-namespace -f -e namespace${1}.0 --mode=raw
        ndctl create-namespace -f -e namespace${1}.0 --mode=dax
    done
}

for i in 0 1 2 3; do
    pmem_btt_dax_switch $i &
done
--[ns-reconf.sh]--

while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done

I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered:

static void klist_put(struct klist_node *n, bool kill)
{
        struct klist *k = knode_klist(n);
        void (*put)(struct klist_node *) = k->put; <----

This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas?

-Jeff

[ 117.728323] pmem0s: detected capacity change from 0 to 34093219840
[ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 117.867193] IP: klist_put+0x1b/0x70
[ 117.884172] PGD 0
[ 117.884172] P4D 0
[ 117.894325]
[ 117.912779] Oops: 0000 [#1] SMP
[ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea
[ 118.247253] sysfillrect sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
[ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28
[ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016
[ 118.398735] Workqueue: events_unbound async_run_entry_fn
[ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000
[ 118.452949] RIP: 0010:klist_put+0x1b/0x70
[ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246
[ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c
[ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000
[ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c
[ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 R12: ffff88106935cc00
[ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8
[ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000
[ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0
[ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 118.814621] Call Trace:
[ 118.825740] klist_del+0xe/0x10
[ 118.839764] device_del+0x11a/0x330
[ 118.855526] device_unregister+0x1a/0x60
[ 118.873549] nd_async_device_unregister+0x22/0x30
[ 118.895091] async_run_entry_fn+0x39/0x170
[ 118.916718] process_one_work+0x149/0x360
[ 118.937472] worker_thread+0x4d/0x3c0
[ 118.953919] kthread+0x109/0x140
[ 118.968400] ? rescuer_thread+0x380/0x380
[ 118.986373] ? kthread_park+0x60/0x60
[ 119.002425] ret_from_fork+0x2c/0x40
[ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff
[ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70
[ 119.128194] CR2: 0000000000000020
[ 119.143813] ---[ end trace 4fadffd9ed599da8 ]---
[ 119.169828] Kernel panic - not syncing: Fatal exception
[ 119.193524] Kernel Offset: disabled
[ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception

Target Kernel: 4.15
Target Release: 18.04
2018-05-01 08:42:17 quanxian description
Old value: description text unchanged from the 2017-09-22 entry, with Target Kernel: 4.15 and Target Release: 18.04.
New value: same description text, with Target Kernel: TBD and Target Release: 18.10.
2018-05-01 08:42:29 quanxian tags intel-kernel-18.10
2018-06-11 03:03:47 quanxian description
Old value: description text unchanged from the 2017-09-22 entry, with Target Kernel: TBD and Target Release: 18.10.
New value: same description text, with Target Kernel: 4.19 and Target Release: 18.10.
2018-07-25 16:06:36 Joseph Salisbury tags intel-kernel-18.10 intel-kernel-18.10 kernel-da-key
2018-07-25 16:07:21 Joseph Salisbury intel: status New Triaged
2018-07-25 16:07:24 Joseph Salisbury intel: importance Undecided Medium
2018-07-25 16:07:34 Joseph Salisbury intel: assignee Canonical Kernel Team (canonical-kernel-team)
2018-10-26 02:42:02 quanxian information type Proprietary Public
2018-10-26 02:42:12 quanxian bug task added ubuntu
2018-10-26 02:42:57 quanxian description Description: If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path. --[ns-reconf.sh]-- function pmem_btt_dax_switch() { sector_size_list="512 520 528 4096 4104 4160 4224" for sector_size in $sector_size_list; do ndctl create-namespace -f -e namespace$ {1}.0 --mode=sector -l $sector_size ndctl create-namespace -f -e namespace${1} .0 --mode=raw ndctl create-namespace -f -e namespace$ {1} .0 --mode=dax done } for i in 0 1 2 3; do pmem_btt_dax_switch $i & done --[ns-reconf.sh]--     while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered: static void klist_put(struct klist_node *n, bool kill) { struct klist *k = knode_klist; void (*put)(struct klist_node *) = k->put; <---- This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas? 
-Jeff [ 117.728323] pmem0s: detected capacity change from 0 to 34093219840 [ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 117.867193] IP: klist_put+0x1b/0x70 [ 117.884172] PGD 0 [ 117.884172] P4D 0 [ 117.894325] [ 117.912779] Oops: 0000 1 SMP [ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea [ 118.247253] sysfillrect sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28 [ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016 [ 118.398735] Workqueue: events_unbound async_run_entry_fn [ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000 [ 118.452949] RIP: 0010:klist_put+0x1b/0x70 [ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246 [ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c [ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000 [ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c [ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 
R12: ffff88106935cc00 [ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8 [ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000 [ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0 [ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 118.814621] Call Trace: [ 118.825740] klist_del+0xe/0x10 [ 118.839764] device_del+0x11a/0x330 [ 118.855526] device_unregister+0x1a/0x60 [ 118.873549] nd_async_device_unregister+0x22/0x30 [ 118.895091] async_run_entry_fn+0x39/0x170 [ 118.916718] process_one_work+0x149/0x360 [ 118.937472] worker_thread+0x4d/0x3c0 [ 118.953919] kthread+0x109/0x140 [ 118.968400] ? rescuer_thread+0x380/0x380 [ 118.986373] ? kthread_park+0x60/0x60 [ 119.002425] ret_from_fork+0x2c/0x40 [ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff [ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70 [ 119.128194] CR2: 0000000000000020 [ 119.143813] --[ end trace 4fadffd9ed599da8 ]-- [ 119.169828] Kernel panic - not syncing: Fatal exception [ 119.193524] Kernel Offset: disabled [ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception Target Kernel:4.19 Target Release: 18.10 Description: If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path. 
--[ns-reconf.sh]-- function pmem_btt_dax_switch() { sector_size_list="512 520 528 4096 4104 4160 4224" for sector_size in $sector_size_list; do ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size ndctl create-namespace -f -e namespace${1}.0 --mode=raw ndctl create-namespace -f -e namespace${1}.0 --mode=dax done } for i in 0 1 2 3; do pmem_btt_dax_switch $i & done --[ns-reconf.sh]-- while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered: static void klist_put(struct klist_node *n, bool kill) { struct klist *k = knode_klist(n); void (*put)(struct klist_node *) = k->put; <---- This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas? -Jeff [ 117.728323] pmem0s: detected capacity change from 0 to 34093219840 [ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 117.867193] IP: klist_put+0x1b/0x70 [ 117.884172] PGD 0 [ 117.884172] P4D 0 [ 117.894325] [ 117.912779] Oops: 0000 1 SMP [ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea [ 118.247253] sysfillrect
sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28 [ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016 [ 118.398735] Workqueue: events_unbound async_run_entry_fn [ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000 [ 118.452949] RIP: 0010:klist_put+0x1b/0x70 [ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246 [ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c [ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000 [ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c [ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 R12: ffff88106935cc00 [ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8 [ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000 [ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0 [ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 118.814621] Call Trace: [ 118.825740] klist_del+0xe/0x10 [ 118.839764] device_del+0x11a/0x330 [ 118.855526] device_unregister+0x1a/0x60 [ 118.873549] nd_async_device_unregister+0x22/0x30 [ 118.895091] async_run_entry_fn+0x39/0x170 [ 118.916718] process_one_work+0x149/0x360 [ 118.937472] worker_thread+0x4d/0x3c0 [ 118.953919] kthread+0x109/0x140 [ 118.968400] ? rescuer_thread+0x380/0x380 [ 118.986373] ? 
kthread_park+0x60/0x60 [ 119.002425] ret_from_fork+0x2c/0x40 [ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff [ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70 [ 119.128194] CR2: 0000000000000020 [ 119.143813] --[ end trace 4fadffd9ed599da8 ]-- [ 119.169828] Kernel panic - not syncing: Fatal exception [ 119.193524] Kernel Offset: disabled [ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception Target Kernel:4.19 Target Release: 19.04
2018-10-26 02:43:11 quanxian tags intel-kernel-18.10 kernel-da-key intel-kernel-19.04 kernel-da-key
2019-02-27 04:15:09 quanxian description Description: If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path. --[ns-reconf.sh]-- function pmem_btt_dax_switch() { sector_size_list="512 520 528 4096 4104 4160 4224" for sector_size in $sector_size_list; do ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size ndctl create-namespace -f -e namespace${1}.0 --mode=raw ndctl create-namespace -f -e namespace${1}.0 --mode=dax done } for i in 0 1 2 3; do pmem_btt_dax_switch $i & done --[ns-reconf.sh]-- while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered: static void klist_put(struct klist_node *n, bool kill) { struct klist *k = knode_klist(n); void (*put)(struct klist_node *) = k->put; <---- This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas?
-Jeff [ 117.728323] pmem0s: detected capacity change from 0 to 34093219840 [ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 117.867193] IP: klist_put+0x1b/0x70 [ 117.884172] PGD 0 [ 117.884172] P4D 0 [ 117.894325] [ 117.912779] Oops: 0000 1 SMP [ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea [ 118.247253] sysfillrect sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28 [ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016 [ 118.398735] Workqueue: events_unbound async_run_entry_fn [ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000 [ 118.452949] RIP: 0010:klist_put+0x1b/0x70 [ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246 [ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c [ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000 [ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c [ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 
R12: ffff88106935cc00 [ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8 [ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000 [ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0 [ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 118.814621] Call Trace: [ 118.825740] klist_del+0xe/0x10 [ 118.839764] device_del+0x11a/0x330 [ 118.855526] device_unregister+0x1a/0x60 [ 118.873549] nd_async_device_unregister+0x22/0x30 [ 118.895091] async_run_entry_fn+0x39/0x170 [ 118.916718] process_one_work+0x149/0x360 [ 118.937472] worker_thread+0x4d/0x3c0 [ 118.953919] kthread+0x109/0x140 [ 118.968400] ? rescuer_thread+0x380/0x380 [ 118.986373] ? kthread_park+0x60/0x60 [ 119.002425] ret_from_fork+0x2c/0x40 [ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff [ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70 [ 119.128194] CR2: 0000000000000020 [ 119.143813] --[ end trace 4fadffd9ed599da8 ]-- [ 119.169828] Kernel panic - not syncing: Fatal exception [ 119.193524] Kernel Offset: disabled [ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception Target Kernel:4.19 Target Release: 19.04 Description: If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path. 
--[ns-reconf.sh]-- function pmem_btt_dax_switch() { sector_size_list="512 520 528 4096 4104 4160 4224" for sector_size in $sector_size_list; do ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size ndctl create-namespace -f -e namespace${1}.0 --mode=raw ndctl create-namespace -f -e namespace${1}.0 --mode=dax done } for i in 0 1 2 3; do pmem_btt_dax_switch $i & done --[ns-reconf.sh]-- while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered: static void klist_put(struct klist_node *n, bool kill) { struct klist *k = knode_klist(n); void (*put)(struct klist_node *) = k->put; <---- This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas? -Jeff [ 117.728323] pmem0s: detected capacity change from 0 to 34093219840 [ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 117.867193] IP: klist_put+0x1b/0x70 [ 117.884172] PGD 0 [ 117.884172] P4D 0 [ 117.894325] [ 117.912779] Oops: 0000 1 SMP [ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea [ 118.247253] sysfillrect
sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28 [ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016 [ 118.398735] Workqueue: events_unbound async_run_entry_fn [ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000 [ 118.452949] RIP: 0010:klist_put+0x1b/0x70 [ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246 [ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c [ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000 [ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c [ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 R12: ffff88106935cc00 [ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8 [ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000 [ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0 [ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 118.814621] Call Trace: [ 118.825740] klist_del+0xe/0x10 [ 118.839764] device_del+0x11a/0x330 [ 118.855526] device_unregister+0x1a/0x60 [ 118.873549] nd_async_device_unregister+0x22/0x30 [ 118.895091] async_run_entry_fn+0x39/0x170 [ 118.916718] process_one_work+0x149/0x360 [ 118.937472] worker_thread+0x4d/0x3c0 [ 118.953919] kthread+0x109/0x140 [ 118.968400] ? rescuer_thread+0x380/0x380 [ 118.986373] ? 
kthread_park+0x60/0x60 [ 119.002425] ret_from_fork+0x2c/0x40 [ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff [ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70 [ 119.128194] CR2: 0000000000000020 [ 119.143813] --[ end trace 4fadffd9ed599da8 ]-- [ 119.169828] Kernel panic - not syncing: Fatal exception [ 119.193524] Kernel Offset: disabled [ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception Target Kernel:TBD Target Release: 19.04
2019-05-06 08:22:43 quanxian affects ubuntu linux (Ubuntu)
2019-05-06 08:23:04 quanxian tags intel-kernel-19.04 kernel-da-key intel-kernel-19.10 kernel-da-key
2019-05-06 08:23:15 quanxian description Description: If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path. --[ns-reconf.sh]-- function pmem_btt_dax_switch() { sector_size_list="512 520 528 4096 4104 4160 4224" for sector_size in $sector_size_list; do ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size ndctl create-namespace -f -e namespace${1}.0 --mode=raw ndctl create-namespace -f -e namespace${1}.0 --mode=dax done } for i in 0 1 2 3; do pmem_btt_dax_switch $i & done --[ns-reconf.sh]-- while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered: static void klist_put(struct klist_node *n, bool kill) { struct klist *k = knode_klist(n); void (*put)(struct klist_node *) = k->put; <---- This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas?
-Jeff [ 117.728323] pmem0s: detected capacity change from 0 to 34093219840 [ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 117.867193] IP: klist_put+0x1b/0x70 [ 117.884172] PGD 0 [ 117.884172] P4D 0 [ 117.894325] [ 117.912779] Oops: 0000 1 SMP [ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea [ 118.247253] sysfillrect sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28 [ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016 [ 118.398735] Workqueue: events_unbound async_run_entry_fn [ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000 [ 118.452949] RIP: 0010:klist_put+0x1b/0x70 [ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246 [ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c [ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000 [ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c [ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 
R12: ffff88106935cc00 [ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8 [ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000 [ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0 [ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 118.814621] Call Trace: [ 118.825740] klist_del+0xe/0x10 [ 118.839764] device_del+0x11a/0x330 [ 118.855526] device_unregister+0x1a/0x60 [ 118.873549] nd_async_device_unregister+0x22/0x30 [ 118.895091] async_run_entry_fn+0x39/0x170 [ 118.916718] process_one_work+0x149/0x360 [ 118.937472] worker_thread+0x4d/0x3c0 [ 118.953919] kthread+0x109/0x140 [ 118.968400] ? rescuer_thread+0x380/0x380 [ 118.986373] ? kthread_park+0x60/0x60 [ 119.002425] ret_from_fork+0x2c/0x40 [ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff [ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70 [ 119.128194] CR2: 0000000000000020 [ 119.143813] --[ end trace 4fadffd9ed599da8 ]-- [ 119.169828] Kernel panic - not syncing: Fatal exception [ 119.193524] Kernel Offset: disabled [ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception Target Kernel:TBD Target Release: 19.04 Description: If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path. 
--[ns-reconf.sh]-- function pmem_btt_dax_switch() { sector_size_list="512 520 528 4096 4104 4160 4224" for sector_size in $sector_size_list; do ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size ndctl create-namespace -f -e namespace${1}.0 --mode=raw ndctl create-namespace -f -e namespace${1}.0 --mode=dax done } for i in 0 1 2 3; do pmem_btt_dax_switch $i & done --[ns-reconf.sh]-- while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered: static void klist_put(struct klist_node *n, bool kill) { struct klist *k = knode_klist(n); void (*put)(struct klist_node *) = k->put; <---- This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas? -Jeff [ 117.728323] pmem0s: detected capacity change from 0 to 34093219840 [ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 117.867193] IP: klist_put+0x1b/0x70 [ 117.884172] PGD 0 [ 117.884172] P4D 0 [ 117.894325] [ 117.912779] Oops: 0000 1 SMP [ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea [ 118.247253] sysfillrect
sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28 [ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016 [ 118.398735] Workqueue: events_unbound async_run_entry_fn [ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000 [ 118.452949] RIP: 0010:klist_put+0x1b/0x70 [ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246 [ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c [ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000 [ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c [ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 R12: ffff88106935cc00 [ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8 [ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000 [ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0 [ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 118.814621] Call Trace: [ 118.825740] klist_del+0xe/0x10 [ 118.839764] device_del+0x11a/0x330 [ 118.855526] device_unregister+0x1a/0x60 [ 118.873549] nd_async_device_unregister+0x22/0x30 [ 118.895091] async_run_entry_fn+0x39/0x170 [ 118.916718] process_one_work+0x149/0x360 [ 118.937472] worker_thread+0x4d/0x3c0 [ 118.953919] kthread+0x109/0x140 [ 118.968400] ? rescuer_thread+0x380/0x380 [ 118.986373] ? 
kthread_park+0x60/0x60 [ 119.002425] ret_from_fork+0x2c/0x40 [ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff [ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70 [ 119.128194] CR2: 0000000000000020 [ 119.143813] --[ end trace 4fadffd9ed599da8 ]-- [ 119.169828] Kernel panic - not syncing: Fatal exception [ 119.193524] Kernel Offset: disabled [ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception Target Kernel:TBD Target Release: 19.10
2019-05-06 08:30:07 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2019-05-06 08:30:08 Ubuntu Kernel Bot tags intel-kernel-19.10 kernel-da-key eoan intel-kernel-19.10 kernel-da-key
2019-07-30 01:02:54 quanxian intel: status Triaged Fix Committed
2019-07-30 01:03:52 quanxian description Description: If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path. --[ns-reconf.sh]-- function pmem_btt_dax_switch() { sector_size_list="512 520 528 4096 4104 4160 4224" for sector_size in $sector_size_list; do ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size ndctl create-namespace -f -e namespace${1}.0 --mode=raw ndctl create-namespace -f -e namespace${1}.0 --mode=dax done } for i in 0 1 2 3; do pmem_btt_dax_switch $i & done --[ns-reconf.sh]-- while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered: static void klist_put(struct klist_node *n, bool kill) { struct klist *k = knode_klist(n); void (*put)(struct klist_node *) = k->put; <---- This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas?
-Jeff [ 117.728323] pmem0s: detected capacity change from 0 to 34093219840 [ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 117.867193] IP: klist_put+0x1b/0x70 [ 117.884172] PGD 0 [ 117.884172] P4D 0 [ 117.894325] [ 117.912779] Oops: 0000 1 SMP [ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea [ 118.247253] sysfillrect sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28 [ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016 [ 118.398735] Workqueue: events_unbound async_run_entry_fn [ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000 [ 118.452949] RIP: 0010:klist_put+0x1b/0x70 [ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246 [ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c [ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000 [ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c [ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 
R12: ffff88106935cc00 [ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8 [ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000 [ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0 [ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 118.814621] Call Trace: [ 118.825740] klist_del+0xe/0x10 [ 118.839764] device_del+0x11a/0x330 [ 118.855526] device_unregister+0x1a/0x60 [ 118.873549] nd_async_device_unregister+0x22/0x30 [ 118.895091] async_run_entry_fn+0x39/0x170 [ 118.916718] process_one_work+0x149/0x360 [ 118.937472] worker_thread+0x4d/0x3c0 [ 118.953919] kthread+0x109/0x140 [ 118.968400] ? rescuer_thread+0x380/0x380 [ 118.986373] ? kthread_park+0x60/0x60 [ 119.002425] ret_from_fork+0x2c/0x40 [ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff [ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70 [ 119.128194] CR2: 0000000000000020 [ 119.143813] --[ end trace 4fadffd9ed599da8 ]-- [ 119.169828] Kernel panic - not syncing: Fatal exception [ 119.193524] Kernel Offset: disabled [ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception Target Kernel:TBD Target Release: 19.10 Description: If I run the following script in a loop, and then ctrl-c it, the kernel hits a BUG in the device unregister path. 
--[ns-reconf.sh]-- function pmem_btt_dax_switch() { sector_size_list="512 520 528 4096 4104 4160 4224" for sector_size in $sector_size_list; do ndctl create-namespace -f -e namespace${1}.0 --mode=sector -l $sector_size ndctl create-namespace -f -e namespace${1}.0 --mode=raw ndctl create-namespace -f -e namespace${1}.0 --mode=dax done } for i in 0 1 2 3; do pmem_btt_dax_switch $i & done --[ns-reconf.sh]-- while true; do ./ns-reconfig.sh 0; ./ns-reconfig.sh 1; done I've tried three times and hit the bug every time, so it seems readily reproducible. Offset 0x20 is the put function pointer in struct klist. This is where the null pointer is triggered: static void klist_put(struct klist_node *n, bool kill) { struct klist *k = knode_klist(n); void (*put)(struct klist_node *) = k->put; <---- This is the tip of Linus' tree, commit be941bf2e6a32. Any ideas? -Jeff [ 117.728323] pmem0s: detected capacity change from 0 to 34093219840 [ 117.831496] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 117.867193] IP: klist_put+0x1b/0x70 [ 117.884172] PGD 0 [ 117.884172] P4D 0 [ 117.894325] [ 117.912779] Oops: 0000 1 SMP [ 117.926842] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd nd_pmem iTCO_wdt iTCO_vendor_support glue_helper dax_pmem hpilo device_dax lpc_ich hpwdt cryptd nd_btt pcspkr ipmi_si sg ioatdma i2c_i801 mfd_core shpchp ipmi_devintf dca wmi ipmi_msghandler nfit acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea [ 118.247253] sysfillrect
sysimgblt bnx2x fb_sys_fops ahci tg3 mdio ttm libahci ptp drm libata i2c_core pps_core hpsa libcrc32c crc32c_intel scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [ 118.326324] CPU: 45 PID: 1060 Comm: kworker/u145:2 Not tainted 4.12.0-rc2+ #28 [ 118.358984] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016 [ 118.398735] Workqueue: events_unbound async_run_entry_fn [ 118.426362] task: ffff880465ea8000 task.stack: ffffc9000683c000 [ 118.452949] RIP: 0010:klist_put+0x1b/0x70 [ 118.470906] RSP: 0018:ffffc9000683fd70 EFLAGS: 00010246 [ 118.494545] RAX: ffff880cfdbe8b40 RBX: 0000000000000000 RCX: 000000018022001c [ 118.526532] RDX: 000000018022001d RSI: 0000000000000001 RDI: 0000000000000000 [ 118.558720] RBP: ffffc9000683fd90 R08: ffff88106463a8e8 R09: 000000018022001c [ 118.590772] R10: 000000006463a701 R11: ffff88106463a8e8 R12: ffff88106935cc00 [ 118.622923] R13: ffff880cfdbe8b68 R14: 0000000000000001 R15: ffff88046f438ce8 [ 118.655122] FS: 0000000000000000(0000) GS:ffff88046fac0000(0000) knlGS:0000000000000000 [ 118.691517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 118.716914] CR2: 0000000000000020 CR3: 0000000001c09000 CR4: 00000000003406e0 [ 118.749349] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 118.781627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 118.814621] Call Trace: [ 118.825740] klist_del+0xe/0x10 [ 118.839764] device_del+0x11a/0x330 [ 118.855526] device_unregister+0x1a/0x60 [ 118.873549] nd_async_device_unregister+0x22/0x30 [ 118.895091] async_run_entry_fn+0x39/0x170 [ 118.916718] process_one_work+0x149/0x360 [ 118.937472] worker_thread+0x4d/0x3c0 [ 118.953919] kthread+0x109/0x140 [ 118.968400] ? rescuer_thread+0x380/0x380 [ 118.986373] ? 
kthread_park+0x60/0x60 [ 119.002425] ret_from_fork+0x2c/0x40 [ 119.018421] Code: e9 1a ff ff ff 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 89 f6 41 55 49 89 fd 41 54 53 48 8b 1f 48 83 e3 fe 48 89 df <4c> 8b 63 20 e8 7c f8 00 00 45 84 f6 75 31 4c 89 ef e8 af fe ff [ 119.103298] RIP: klist_put+0x1b/0x70 RSP: ffffc9000683fd70 [ 119.128194] CR2: 0000000000000020 [ 119.143813] --[ end trace 4fadffd9ed599da8 ]-- [ 119.169828] Kernel panic - not syncing: Fatal exception [ 119.193524] Kernel Offset: disabled [ 119.213732] ---[ end Kernel panic - not syncing: Fatal exception Target Kernel:5.3 Target Release: 19.10
2019-10-23 06:35:47 quanxian intel: status Fix Committed Fix Released