Activity log for bug #1635597

Date Who What changed Old value New value Message
2016-10-21 10:49:43 bugproxy bug added bug
2016-10-21 10:49:45 bugproxy tags architecture-ppc64le bugnameltc-146907 severity-high targetmilestone-inin---
2016-10-21 10:49:46 bugproxy attachment added kdump-config show output https://bugs.launchpad.net/bugs/1635597/+attachment/4764809/+files/file_146907.txt
2016-10-21 10:49:48 bugproxy attachment added /etc/default/kdump-tools https://bugs.launchpad.net/bugs/1635597/+attachment/4764810/+files/file_146907.txt
2016-10-21 10:49:50 bugproxy attachment added blkid output https://bugs.launchpad.net/bugs/1635597/+attachment/4764811/+files/file_146907.txt
2016-10-21 10:49:52 bugproxy attachment added kdump kernel boots fine after including necessary hardware handler modules https://bugs.launchpad.net/bugs/1635597/+attachment/4764812/+files/crashkernel_including-necessary-modules.log
2016-10-21 10:49:54 bugproxy ubuntu: assignee Taco Screen team (taco-screen-team)
2016-10-21 10:49:59 bugproxy affects ubuntu linux (Ubuntu)
2016-10-27 13:38:41 Tim Gardner bug task added makedumpfile (Ubuntu)
2016-10-27 13:39:01 Tim Gardner makedumpfile (Ubuntu): assignee Louis Bouchard (louis-bouchard)
2016-11-18 10:26:19 Louis Bouchard linux (Ubuntu): status New Confirmed
2016-11-18 10:26:23 Louis Bouchard makedumpfile (Ubuntu): status New Confirmed
2016-11-18 10:26:30 Louis Bouchard makedumpfile (Ubuntu): status Confirmed Triaged
2016-11-18 10:26:38 Louis Bouchard linux (Ubuntu): status Confirmed Invalid
2016-12-21 16:09:35 bugproxy attachment added /etc/default/kdump-tools https://bugs.launchpad.net/bugs/1635597/+attachment/4794852/+files/file_146907.txt
2016-12-21 16:09:37 bugproxy attachment added blkid output https://bugs.launchpad.net/bugs/1635597/+attachment/4794853/+files/file_146907.txt
2016-12-21 16:09:41 bugproxy attachment added kdump kernel boots fine after including necessary hardware handler modules https://bugs.launchpad.net/bugs/1635597/+attachment/4794854/+files/crashkernel_including-necessary-modules.log
2017-07-06 15:18:09 Louis Bouchard makedumpfile (Ubuntu): status Triaged In Progress
2017-07-06 15:18:22 Louis Bouchard nominated for series Ubuntu Xenial
2017-07-06 15:18:22 Louis Bouchard bug task added linux (Ubuntu Xenial)
2017-07-06 15:18:22 Louis Bouchard bug task added makedumpfile (Ubuntu Xenial)
2017-07-06 15:18:22 Louis Bouchard nominated for series Ubuntu Zesty
2017-07-06 15:18:22 Louis Bouchard bug task added linux (Ubuntu Zesty)
2017-07-06 15:18:22 Louis Bouchard bug task added makedumpfile (Ubuntu Zesty)
2017-07-06 15:18:22 Louis Bouchard nominated for series Ubuntu Trusty
2017-07-06 15:18:22 Louis Bouchard bug task added linux (Ubuntu Trusty)
2017-07-06 15:18:22 Louis Bouchard bug task added makedumpfile (Ubuntu Trusty)
2017-07-06 15:18:32 Louis Bouchard linux (Ubuntu Trusty): status New Invalid
2017-07-06 15:18:34 Louis Bouchard linux (Ubuntu Xenial): status New Invalid
2017-07-06 15:18:37 Louis Bouchard linux (Ubuntu Zesty): status New Invalid
2017-07-06 15:18:50 Louis Bouchard makedumpfile (Ubuntu Trusty): assignee Louis Bouchard (louis)
2017-07-06 15:18:52 Louis Bouchard makedumpfile (Ubuntu Xenial): assignee Louis Bouchard (louis)
2017-07-06 15:18:54 Louis Bouchard makedumpfile (Ubuntu Zesty): assignee Louis Bouchard (louis)
2017-07-06 16:00:10 Frank Heimes bug task added ubuntu-power-systems
2017-07-06 16:00:23 Frank Heimes ubuntu-power-systems: status New In Progress
2017-07-06 16:08:12 Launchpad Janitor makedumpfile (Ubuntu Trusty): status New Confirmed
2017-07-06 16:08:12 Launchpad Janitor makedumpfile (Ubuntu Xenial): status New Confirmed
2017-07-06 16:08:12 Launchpad Janitor makedumpfile (Ubuntu Zesty): status New Confirmed
2017-07-19 16:41:33 Manoj Iyer ubuntu-power-systems: importance Undecided High
2017-07-19 16:41:35 Manoj Iyer linux (Ubuntu): importance Undecided High
2017-07-19 16:41:37 Manoj Iyer linux (Ubuntu Trusty): importance Undecided High
2017-07-19 16:41:39 Manoj Iyer linux (Ubuntu Xenial): importance Undecided High
2017-07-19 16:41:41 Manoj Iyer linux (Ubuntu Zesty): importance Undecided High
2017-07-19 16:41:43 Manoj Iyer makedumpfile (Ubuntu): importance Undecided High
2017-07-19 16:41:45 Manoj Iyer makedumpfile (Ubuntu Trusty): importance Undecided High
2017-07-19 16:41:47 Manoj Iyer makedumpfile (Ubuntu Xenial): importance Undecided High
2017-07-19 16:41:49 Manoj Iyer makedumpfile (Ubuntu Zesty): importance Undecided High
2017-07-19 16:42:05 Manoj Iyer linux (Ubuntu): assignee Taco Screen team (taco-screen-team) Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
2017-07-19 16:42:25 Manoj Iyer ubuntu-power-systems: assignee Canonical Kernel Team (canonical-kernel-team)
2017-08-07 13:50:49 Manoj Iyer tags architecture-ppc64le bugnameltc-146907 severity-high targetmilestone-inin--- architecture-ppc64le bugnameltc-146907 severity-high targetmilestone-inin--- triage-a
2017-08-14 13:59:58 Manoj Iyer makedumpfile (Ubuntu): assignee Louis Bouchard (louis) David Britton (davidpbritton)
2017-08-14 14:00:13 Manoj Iyer makedumpfile (Ubuntu Trusty): assignee Louis Bouchard (louis) David Britton (davidpbritton)
2017-08-14 14:00:24 Manoj Iyer makedumpfile (Ubuntu Xenial): assignee Louis Bouchard (louis) David Britton (davidpbritton)
2017-08-14 14:00:35 Manoj Iyer makedumpfile (Ubuntu Zesty): assignee Louis Bouchard (louis) David Britton (davidpbritton)
2017-08-18 00:00:52 Launchpad Janitor makedumpfile (Ubuntu): status In Progress Fix Released
2017-08-22 15:30:50 Andrew Cloke linux (Ubuntu): status Invalid New
2017-08-22 15:30:54 Andrew Cloke linux (Ubuntu Trusty): status Invalid New
2017-08-22 15:30:57 Andrew Cloke linux (Ubuntu Xenial): status Invalid New
2017-08-22 15:31:00 Andrew Cloke linux (Ubuntu Zesty): status Invalid New
2017-08-23 12:20:22 Manoj Iyer makedumpfile (Ubuntu): assignee David Britton (davidpbritton) Canonical Kernel Team (canonical-kernel-team)
2017-08-23 12:21:03 Manoj Iyer linux (Ubuntu): assignee Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) Canonical Kernel Team (canonical-kernel-team)
2017-08-23 13:56:13 Manoj Iyer makedumpfile (Ubuntu Trusty): assignee David Britton (davidpbritton) Canonical Kernel Team (canonical-kernel-team)
2017-08-23 13:56:24 Manoj Iyer makedumpfile (Ubuntu Xenial): assignee David Britton (davidpbritton) Canonical Kernel Team (canonical-kernel-team)
2017-08-23 13:56:38 Manoj Iyer makedumpfile (Ubuntu Zesty): assignee David Britton (davidpbritton) Canonical Kernel Team (canonical-kernel-team)
2017-09-11 13:44:45 Andrew Cloke ubuntu-power-systems: status In Progress Incomplete
2017-09-11 13:45:25 Manoj Iyer tags architecture-ppc64le bugnameltc-146907 severity-high targetmilestone-inin--- triage-a architecture-ppc64le bugnameltc-146907 severity-high targetmilestone-inin--- triage-g
2017-09-18 14:29:32 Manoj Iyer ubuntu-power-systems: status Incomplete Confirmed
2017-09-18 14:30:05 Manoj Iyer tags architecture-ppc64le bugnameltc-146907 severity-high targetmilestone-inin--- triage-g architecture-ppc64le bugnameltc-146907 severity-high targetmilestone-inin--- triage-r
2017-09-29 13:59:53 bugproxy attachment added kdump boot log on talclp3 https://bugs.launchpad.net/bugs/1635597/+attachment/4958727/+files/kdump_validation-talclp3.log
2017-10-18 17:46:04 Thadeu Lima de Souza Cascardo makedumpfile (Ubuntu Xenial): assignee Canonical Kernel Team (canonical-kernel-team) Thadeu Lima de Souza Cascardo (cascardo)
2017-10-19 12:22:34 Thadeu Lima de Souza Cascardo description Problem Description ========================== On talclp1, I enabled kdump. But kdump failed and it drop to BusyBox. root@talclp1:~# echo c> /proc/sysrq-trigger [ 132.643690] sysrq: SysRq : Trigger a crash [ 132.643739] Unable to handle kernel paging request for data at address 0x00000000 [ 132.643745] Faulting instruction address: 0xc0000000005c28f4 [ 132.643749] Oops: Kernel access of bad area, sig: 11 [#1] [ 132.643753] SMP NR_CPUS=2048 NUMA pSeries [ 132.643758] Modules linked in: fuse ufs qnx4 hfsplus hfs minix ntfs msdos jfs rpadlpar_io rpaphp rpcsec_gss_krb5 nfsv4 dccp_diag cifs nfs dns_resolver dccp tcp_diag fscache udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc xfs libcrc32c pseries_rng rng_core ghash_generic gf128mul vmx_crypto sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache crc32c_generic btrfs xor raid6_pq dm_round_robin sr_mod sd_mod cdrom ses enclosure scsi_transport_sas ibmveth crc32c_vpmsum ipr scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath dm_mod [ 132.643819] CPU: 49 PID: 10174 Comm: bash Not tainted 4.8.0-15-generic #16-Ubuntu [ 132.643824] task: c000000111767080 task.stack: c0000000d82e0000 [ 132.643828] NIP: c0000000005c28f4 LR: c0000000005c39d8 CTR: c0000000005c28c0 [ 132.643832] REGS: c0000000d82e3990 TRAP: 0300 Not tainted (4.8.0-15-generic) [ 132.643836] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242422 XER: 00000001 [ 132.643848] CFAR: c0000000000087d0 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 GPR00: c0000000005c39d8 c0000000d82e3c10 c000000000f67b00 0000000000000063 GPR04: c00000011d04a9b8 c00000011d05f7e0 c00000047fb00000 0000000000015998 GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001 GPR12: c0000000005c28c0 c000000007b4b900 ffffffffffffffff 0000000022000000 GPR16: 0000000010170dc8 000001002b566368 0000000010140f58 00000000100c7570 GPR20: 
0000000000000000 000000001017dd58 0000000010153618 000000001017b608 GPR24: 00003ffffe87a294 0000000000000001 c000000000ebff60 0000000000000004 GPR28: c000000000ec0320 0000000000000063 c000000000e72a90 0000000000000000 [ 132.643906] NIP [c0000000005c28f4] sysrq_handle_crash+0x34/0x50 [ 132.643911] LR [c0000000005c39d8] __handle_sysrq+0xe8/0x280 [ 132.643914] Call Trace: [ 132.643917] [c0000000d82e3c10] [c000000000a245e8] 0xc000000000a245e8 (unreliable) [ 132.643923] [c0000000d82e3c30] [c0000000005c39d8] __handle_sysrq+0xe8/0x280 [ 132.643928] [c0000000d82e3cd0] [c0000000005c4188] write_sysrq_trigger+0x78/0xa0 [ 132.643935] [c0000000d82e3d00] [c0000000003ad770] proc_reg_write+0xb0/0x110 [ 132.643941] [c0000000d82e3d50] [c00000000030fc3c] __vfs_write+0x6c/0xe0 [ 132.643946] [c0000000d82e3d90] [c000000000311144] vfs_write+0xd4/0x240 [ 132.643950] [c0000000d82e3de0] [c000000000312e5c] SyS_write+0x6c/0x110 [ 132.643957] [c0000000d82e3e30] [c0000000000095e0] system_call+0x38/0x108 [ 132.643961] Instruction dump: [ 132.643963] 38425240 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 3949ba60 [ 132.643972] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6 [ 132.643981] ---[ end trace eed6bbcd2c3bdfdf ]--- [ 132.646105] [ 132.646176] Sending IPI to other CPUs [ 132.647490] IPI complete I'm in purgatory -> smp_release_cpus() spinning_secondaries = 104 <- smp_release_cpus() [ 2.011346] alg: hash: Test 1 failed for crc32c-vpmsum [ 2.729254] sd 0:2:0:0: [sda] Assuming drive cache: write through [ 2.731554] sd 1:2:5:0: [sdn] Assuming drive cache: write through [ 2.739087] sd 1:2:4:0: [sdm] Assuming drive cache: write through [ 2.739089] sd 1:2:6:0: [sdo] Assuming drive cache: write through [ 2.739110] sd 1:2:7:0: [sdp] Assuming drive cache: write through [ 2.739115] sd 1:2:0:0: [sdi] Assuming drive cache: write through [ 2.739122] sd 1:2:3:0: [sdl] Assuming drive cache: write through [ 2.739123] sd 1:2:2:0: [sdk] Assuming drive cache: write through [ 
2.739148] sd 1:2:1:0: [sdj] Assuming drive cache: write through [ 2.748938] sd 0:2:1:0: [sdb] Assuming drive cache: write through [ 2.748939] sd 0:2:7:0: [sdh] Assuming drive cache: write through [ 2.748940] sd 0:2:6:0: [sdg] Assuming drive cache: write through [ 2.748942] sd 0:2:2:0: [sdc] Assuming drive cache: write through [ 2.748958] sd 0:2:5:0: [sdf] Assuming drive cache: write through [ 2.748963] sd 0:2:4:0: [sde] Assuming drive cache: write through [ 2.748978] sd 0:2:3:0: [sdd] Assuming drive cache: write through [ 2.999087] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.119912] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.252513] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.343680] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.381234] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.419515] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.474587] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.482188] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.531439] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.552824] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.594489] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.619222] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.672208] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.680298] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.731718] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.761333] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.794955] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.819212] device-mapper: 
table: 254:1: multipath: error attaching hardware handler [ 3.871913] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.889439] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.922620] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.960707] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.002959] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.035611] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.054476] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.092241] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.099432] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.182358] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.182823] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.234767] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.333309] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.402827] device-mapper: table: 254:0: multipath: error attaching hardware handler Gave up waiting for root device. Common problems: - Boot args (cat /proc/cmdline) - Check rootdelay= (did the system wait long enough?) - Check root= (did the system wait for the right device?) - Missing modules (cat /proc/modules; ls /dev) ALERT! UUID=853769e5-1dc5-41be-a689-b430320d207f does not exist. Dropping to a shell! BusyBox v1.22.1 (Ubuntu 1:1.22.0-19ubuntu2) built-in shell (ash) Enter 'help' for a list of built-in commands. 
(initramfs) == Comment: #7 - Vaishnavi Bhat <vaish123@in.ibm.com> - 2016-10-07 05:37:53 == The blkid output does not show any device with UUID=853769e5-1dc5-41be-a689-b430320d207f which is the root device used in the kexec command line (from kdump-config show) kexec command: /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinux-4.8.0-15-generic root=UUID=853769e5-1dc5-41be-a689-b430320d207f ro xmon=on splash quiet irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz Hence the kdump kernel is failing to boot here. == Comment: #11 - Xue Sheng Li <lixuesh@cn.ibm.com> - 2016-10-17 01:54:56 == recreated with -24 kernel. root@talclp1:~# echo c > /proc/sysrq-trigger [ 72.655416] sysrq: SysRq : Trigger a crash [ 72.655458] Unable to handle kernel paging request for data at address 0x00000000 [ 72.655463] Faulting instruction address: 0xc00000000069d148 [ 72.655469] Oops: Kernel access of bad area, sig: 11 [#1] [ 72.655472] SMP NR_CPUS=2048 NUMA pSeries [ 72.655477] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_gss_krb5 nfsv4 nfs cifs fscache binfmt_misc xfs pseries_rng vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs xor raid6_pq dm_round_robin ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c crc32c_vpmsum scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath [ 72.655521] CPU: 25 PID: 9730 Comm: bash Not tainted 4.8.0-24-generic #26-Ubuntu [ 72.655525] task: c0000001d8451e00 task.stack: c0000001d8494000 [ 72.655529] NIP: c00000000069d148 LR: c00000000069e198 CTR: c00000000069d120 [ 72.655534] REGS: c0000001d84979f0 TRAP: 0300 Not tainted (4.8.0-24-generic) [ 72.655537] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242222 XER: 00000001 [ 72.655549] CFAR: c000000000008750 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 GPR00: c00000000069e198 c0000001d8497c70 c000000001476700 
0000000000000063 GPR04: c00000047e64aca0 c00000047e65fb40 c00000047df00000 0000000000015ed8 GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001 GPR12: c00000000069d120 c000000007b3e100 ffffffffffffffff 0000000022000000 GPR16: 0000000010170dc8 0000010036d36398 0000000010140f58 00000000100c7570 GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608 GPR24: 00003ffff5582464 0000000000000001 c00000000138e6a0 0000000000000004 GPR28: c00000000138ea60 0000000000000063 c000000001342590 0000000000000000 [ 72.655608] NIP [c00000000069d148] sysrq_handle_crash+0x28/0x30 [ 72.655613] LR [c00000000069e198] __handle_sysrq+0xe8/0x280 [ 72.655616] Call Trace: [ 72.655619] [c0000001d8497c70] [c00000000069e178] __handle_sysrq+0xc8/0x280 (unreliable) [ 72.655625] [c0000001d8497d10] [c00000000069e8ec] write_sysrq_trigger+0x6c/0x90 [ 72.655631] [c0000001d8497d40] [c0000000003a9568] proc_reg_write+0x88/0xd0 [ 72.655637] [c0000001d8497d70] [c00000000030c40c] __vfs_write+0x3c/0x70 [ 72.655642] [c0000001d8497d90] [c00000000030d674] vfs_write+0xd4/0x240 [ 72.655647] [c0000001d8497de0] [c00000000030f1c8] SyS_write+0x68/0x110 [ 72.655652] [c0000001d8497e30] [c000000000009584] system_call+0x38/0xec [ 72.655656] Instruction dump: [ 72.655658] 60000000 60000000 3c4c00de 384295e0 7c0802a6 60000000 3d22001a 3949c8e0 [ 72.655667] 39200001 912a0000 7c0004ac 39400000 <992a0000> 4e800020 3c4c00de 384295b0 [ 72.655677] ---[ end trace 43b490f085103bf5 ]--- [ 72.659366] [ 72.659429] Sending IPI to other CPUs [ 72.660740] IPI complete I'm in purgatory -> smp_release_cpus() spinning_secondaries = 104 <- smp_release_cpus() [ 1.699068] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change IPv4 checksum offload settings. 1 rc=4 [ 1.699093] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change IPv6 checksum offload settings. 1 rc=4 [ 1.699101] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change tso settings. 
1 rc=4 [ 2.657700] sd 0:2:1:0: [sdb] Assuming drive cache: write through [ 2.657701] sd 0:2:0:0: [sda] Assuming drive cache: write through [ 2.657781] sd 0:2:2:0: [sdc] Assuming drive cache: write through [ 2.660641] sd 0:2:7:0: [sdh] Assuming drive cache: write through [ 2.667731] sd 0:2:4:0: [sde] Assuming drive cache: write through [ 2.677685] sd 0:2:6:0: [sdg] Assuming drive cache: write through [ 2.677688] sd 0:2:5:0: [sdf] Assuming drive cache: write through [ 2.677708] sd 0:2:3:0: [sdd] Assuming drive cache: write through [ 2.697737] sd 1:2:6:0: [sdo] Assuming drive cache: write through [ 2.697743] sd 1:2:1:0: [sdj] Assuming drive cache: write through [ 2.697744] sd 1:2:4:0: [sdm] Assuming drive cache: write through [ 2.697747] sd 1:2:2:0: [sdk] Assuming drive cache: write through [ 2.697749] sd 1:2:3:0: [sdl] Assuming drive cache: write through [ 2.697753] sd 1:2:5:0: [sdn] Assuming drive cache: write through [ 2.699340] sd 1:2:7:0: [sdp] Assuming drive cache: write through [ 2.699360] sd 1:2:0:0: [sdi] Assuming drive cache: write through [ 3.350794] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.471468] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.540387] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.628523] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.657731] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.733416] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.752066] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.808884] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.838148] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.919247] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.950262] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 
3.997839] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.007810] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.082174] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.089411] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.162200] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.202441] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.252289] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.279870] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.311712] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.348150] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.402076] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.432069] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.487871] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.518282] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.573338] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.599280] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.632144] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.671142] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.713352] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.782117] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.890336] device-mapper: table: 252:0: multipath: error attaching hardware handler == Comment: #13 - Hari Krishna Bathini <hbathini@in.ibm.com> - 2016-10-19 16:26:57 == (In reply to comment #12) > Hi Hari, > > Can you please take a look at this issue and suggest what would be the next > step ? 
> We are facing this issue with -24 kernel as well. Can this be an issue with
> kdump kernel that has missing multipath modules or some other issue ?
>
Hi Vaishnavi,
Necessary hardware handler modules are missing in the kdump initrd. Here is the console log of kdump kernel that says the same:
--
Begin: Loading multipath hardware handlers ...
Failure: failed to load module scsi_dh_alua.
Failure: failed to load module scsi_dh_rdac.
Failure: failed to load module scsi_dh_emc.
--
After including these modules explicitly and rebuilding the initrd for kdump, I was able to get to the point where makedumpfile starts to capture the dump but fails with: "get_mem_map: Can't distinguish the memory type." which is already tracked with bug 146571
Thanks
Hari
PS1: To explicitly add modules to kdump initrd
1. List the necessary modules in /var/lib/kdump/initramfs-tools/modules file
2. mkinitramfs -d /var/lib/kdump/initramfs-tools -o /var/lib/kdump/initrd.img-$kver
3. systemctl restart kdump-tools.service
Mirroring this bug to Canonical for their input on whether to include the missing hardware modules in the kdump initrd or to proceed with the workaround.
[Impact]
When the device targeted for the kernel dump is under a multipath configuration, dumping will fail, possibly leaving the system stuck in the kdump kernel. The fix is to include the SCSI device handler modules needed for the multipath setup inside the initramfs image that is used by kdump. All such modules are included.
[Test Case]
Setting up kdump to target a multipath device on storage that requires such scsi_dh modules and then triggering a crash will demonstrate that kdump fails. After the fix, it works fine.
[Regression Potential]
If a bug is introduced, loading kdump might fail, and a crash dump will not be generated. A worse regression to consider is the system getting stuck in the kdump kernel and needing to be rebooted locally (with no crash file generated either).
But since this is what we are trying to fix, we don't expect other systems to break. One possibility is that the initramfs would be bigger and fail to load. This didn't happen on a small (less than 1GiB of RAM) x86 VM, though. Problem Description ========================== On talclp1, I enabled kdump. But kdump failed and it drop to BusyBox. root@talclp1:~# echo c> /proc/sysrq-trigger [ 132.643690] sysrq: SysRq : Trigger a crash [ 132.643739] Unable to handle kernel paging request for data at address 0x00000000 [ 132.643745] Faulting instruction address: 0xc0000000005c28f4 [ 132.643749] Oops: Kernel access of bad area, sig: 11 [#1] [ 132.643753] SMP NR_CPUS=2048 NUMA pSeries [ 132.643758] Modules linked in: fuse ufs qnx4 hfsplus hfs minix ntfs msdos jfs rpadlpar_io rpaphp rpcsec_gss_krb5 nfsv4 dccp_diag cifs nfs dns_resolver dccp tcp_diag fscache udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc xfs libcrc32c pseries_rng rng_core ghash_generic gf128mul vmx_crypto sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache crc32c_generic btrfs xor raid6_pq dm_round_robin sr_mod sd_mod cdrom ses enclosure scsi_transport_sas ibmveth crc32c_vpmsum ipr scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath dm_mod [ 132.643819] CPU: 49 PID: 10174 Comm: bash Not tainted 4.8.0-15-generic #16-Ubuntu [ 132.643824] task: c000000111767080 task.stack: c0000000d82e0000 [ 132.643828] NIP: c0000000005c28f4 LR: c0000000005c39d8 CTR: c0000000005c28c0 [ 132.643832] REGS: c0000000d82e3990 TRAP: 0300 Not tainted (4.8.0-15-generic) [ 132.643836] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242422 XER: 00000001 [ 132.643848] CFAR: c0000000000087d0 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 GPR00: c0000000005c39d8 c0000000d82e3c10 c000000000f67b00 0000000000000063 GPR04: c00000011d04a9b8 c00000011d05f7e0 c00000047fb00000 0000000000015998 GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001 GPR12: 
c0000000005c28c0 c000000007b4b900 ffffffffffffffff 0000000022000000 GPR16: 0000000010170dc8 000001002b566368 0000000010140f58 00000000100c7570 GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608 GPR24: 00003ffffe87a294 0000000000000001 c000000000ebff60 0000000000000004 GPR28: c000000000ec0320 0000000000000063 c000000000e72a90 0000000000000000 [ 132.643906] NIP [c0000000005c28f4] sysrq_handle_crash+0x34/0x50 [ 132.643911] LR [c0000000005c39d8] __handle_sysrq+0xe8/0x280 [ 132.643914] Call Trace: [ 132.643917] [c0000000d82e3c10] [c000000000a245e8] 0xc000000000a245e8 (unreliable) [ 132.643923] [c0000000d82e3c30] [c0000000005c39d8] __handle_sysrq+0xe8/0x280 [ 132.643928] [c0000000d82e3cd0] [c0000000005c4188] write_sysrq_trigger+0x78/0xa0 [ 132.643935] [c0000000d82e3d00] [c0000000003ad770] proc_reg_write+0xb0/0x110 [ 132.643941] [c0000000d82e3d50] [c00000000030fc3c] __vfs_write+0x6c/0xe0 [ 132.643946] [c0000000d82e3d90] [c000000000311144] vfs_write+0xd4/0x240 [ 132.643950] [c0000000d82e3de0] [c000000000312e5c] SyS_write+0x6c/0x110 [ 132.643957] [c0000000d82e3e30] [c0000000000095e0] system_call+0x38/0x108 [ 132.643961] Instruction dump: [ 132.643963] 38425240 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 3949ba60 [ 132.643972] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6 [ 132.643981] ---[ end trace eed6bbcd2c3bdfdf ]--- [ 132.646105] [ 132.646176] Sending IPI to other CPUs [ 132.647490] IPI complete I'm in purgatory  -> smp_release_cpus() spinning_secondaries = 104  <- smp_release_cpus() [ 2.011346] alg: hash: Test 1 failed for crc32c-vpmsum [ 2.729254] sd 0:2:0:0: [sda] Assuming drive cache: write through [ 2.731554] sd 1:2:5:0: [sdn] Assuming drive cache: write through [ 2.739087] sd 1:2:4:0: [sdm] Assuming drive cache: write through [ 2.739089] sd 1:2:6:0: [sdo] Assuming drive cache: write through [ 2.739110] sd 1:2:7:0: [sdp] Assuming drive cache: write through [ 2.739115] sd 1:2:0:0: [sdi] Assuming drive 
cache: write through [ 2.739122] sd 1:2:3:0: [sdl] Assuming drive cache: write through [ 2.739123] sd 1:2:2:0: [sdk] Assuming drive cache: write through [ 2.739148] sd 1:2:1:0: [sdj] Assuming drive cache: write through [ 2.748938] sd 0:2:1:0: [sdb] Assuming drive cache: write through [ 2.748939] sd 0:2:7:0: [sdh] Assuming drive cache: write through [ 2.748940] sd 0:2:6:0: [sdg] Assuming drive cache: write through [ 2.748942] sd 0:2:2:0: [sdc] Assuming drive cache: write through [ 2.748958] sd 0:2:5:0: [sdf] Assuming drive cache: write through [ 2.748963] sd 0:2:4:0: [sde] Assuming drive cache: write through [ 2.748978] sd 0:2:3:0: [sdd] Assuming drive cache: write through [ 2.999087] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.119912] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.252513] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.343680] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.381234] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.419515] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.474587] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.482188] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.531439] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.552824] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.594489] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.619222] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.672208] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.680298] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.731718] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.761333] device-mapper: table: 254:1: 
multipath: error attaching hardware handler [ 3.794955] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.819212] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.871913] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.889439] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.922620] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.960707] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.002959] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.035611] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.054476] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.092241] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.099432] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.182358] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.182823] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.234767] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.333309] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.402827] device-mapper: table: 254:0: multipath: error attaching hardware handler Gave up waiting for root device. Common problems:  - Boot args (cat /proc/cmdline)    - Check rootdelay= (did the system wait long enough?)    - Check root= (did the system wait for the right device?)  - Missing modules (cat /proc/modules; ls /dev) ALERT! UUID=853769e5-1dc5-41be-a689-b430320d207f does not exist. Dropping to a shell! BusyBox v1.22.1 (Ubuntu 1:1.22.0-19ubuntu2) built-in shell (ash) Enter 'help' for a list of built-in commands. 
(initramfs) == Comment: #7 - Vaishnavi Bhat <vaish123@in.ibm.com> - 2016-10-07 05:37:53 == The blkid output does not show any device with UUID=853769e5-1dc5-41be-a689-b430320d207f which is the root device used in the kexec command line (from kdump-config show) kexec command:   /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinux-4.8.0-15-generic root=UUID=853769e5-1dc5-41be-a689-b430320d207f ro xmon=on splash quiet irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz Hence the kdump kernel is failing to boot here. == Comment: #11 - Xue Sheng Li <lixuesh@cn.ibm.com> - 2016-10-17 01:54:56 == recreated with -24 kernel. root@talclp1:~# echo c > /proc/sysrq-trigger [ 72.655416] sysrq: SysRq : Trigger a crash [ 72.655458] Unable to handle kernel paging request for data at address 0x00000000 [ 72.655463] Faulting instruction address: 0xc00000000069d148 [ 72.655469] Oops: Kernel access of bad area, sig: 11 [#1] [ 72.655472] SMP NR_CPUS=2048 NUMA pSeries [ 72.655477] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_gss_krb5 nfsv4 nfs cifs fscache binfmt_misc xfs pseries_rng vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs xor raid6_pq dm_round_robin ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c crc32c_vpmsum scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath [ 72.655521] CPU: 25 PID: 9730 Comm: bash Not tainted 4.8.0-24-generic #26-Ubuntu [ 72.655525] task: c0000001d8451e00 task.stack: c0000001d8494000 [ 72.655529] NIP: c00000000069d148 LR: c00000000069e198 CTR: c00000000069d120 [ 72.655534] REGS: c0000001d84979f0 TRAP: 0300 Not tainted (4.8.0-24-generic) [ 72.655537] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242222 XER: 00000001 [ 72.655549] CFAR: c000000000008750 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 GPR00: c00000000069e198 c0000001d8497c70 c000000001476700 
0000000000000063 GPR04: c00000047e64aca0 c00000047e65fb40 c00000047df00000 0000000000015ed8 GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001 GPR12: c00000000069d120 c000000007b3e100 ffffffffffffffff 0000000022000000 GPR16: 0000000010170dc8 0000010036d36398 0000000010140f58 00000000100c7570 GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608 GPR24: 00003ffff5582464 0000000000000001 c00000000138e6a0 0000000000000004 GPR28: c00000000138ea60 0000000000000063 c000000001342590 0000000000000000 [ 72.655608] NIP [c00000000069d148] sysrq_handle_crash+0x28/0x30 [ 72.655613] LR [c00000000069e198] __handle_sysrq+0xe8/0x280 [ 72.655616] Call Trace: [ 72.655619] [c0000001d8497c70] [c00000000069e178] __handle_sysrq+0xc8/0x280 (unreliable) [ 72.655625] [c0000001d8497d10] [c00000000069e8ec] write_sysrq_trigger+0x6c/0x90 [ 72.655631] [c0000001d8497d40] [c0000000003a9568] proc_reg_write+0x88/0xd0 [ 72.655637] [c0000001d8497d70] [c00000000030c40c] __vfs_write+0x3c/0x70 [ 72.655642] [c0000001d8497d90] [c00000000030d674] vfs_write+0xd4/0x240 [ 72.655647] [c0000001d8497de0] [c00000000030f1c8] SyS_write+0x68/0x110 [ 72.655652] [c0000001d8497e30] [c000000000009584] system_call+0x38/0xec [ 72.655656] Instruction dump: [ 72.655658] 60000000 60000000 3c4c00de 384295e0 7c0802a6 60000000 3d22001a 3949c8e0 [ 72.655667] 39200001 912a0000 7c0004ac 39400000 <992a0000> 4e800020 3c4c00de 384295b0 [ 72.655677] ---[ end trace 43b490f085103bf5 ]--- [ 72.659366] [ 72.659429] Sending IPI to other CPUs [ 72.660740] IPI complete I'm in purgatory  -> smp_release_cpus() spinning_secondaries = 104  <- smp_release_cpus() [ 1.699068] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change IPv4 checksum offload settings. 1 rc=4 [ 1.699093] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change IPv6 checksum offload settings. 1 rc=4 [ 1.699101] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change tso settings. 
1 rc=4 [ 2.657700] sd 0:2:1:0: [sdb] Assuming drive cache: write through [ 2.657701] sd 0:2:0:0: [sda] Assuming drive cache: write through [ 2.657781] sd 0:2:2:0: [sdc] Assuming drive cache: write through [ 2.660641] sd 0:2:7:0: [sdh] Assuming drive cache: write through [ 2.667731] sd 0:2:4:0: [sde] Assuming drive cache: write through [ 2.677685] sd 0:2:6:0: [sdg] Assuming drive cache: write through [ 2.677688] sd 0:2:5:0: [sdf] Assuming drive cache: write through [ 2.677708] sd 0:2:3:0: [sdd] Assuming drive cache: write through [ 2.697737] sd 1:2:6:0: [sdo] Assuming drive cache: write through [ 2.697743] sd 1:2:1:0: [sdj] Assuming drive cache: write through [ 2.697744] sd 1:2:4:0: [sdm] Assuming drive cache: write through [ 2.697747] sd 1:2:2:0: [sdk] Assuming drive cache: write through [ 2.697749] sd 1:2:3:0: [sdl] Assuming drive cache: write through [ 2.697753] sd 1:2:5:0: [sdn] Assuming drive cache: write through [ 2.699340] sd 1:2:7:0: [sdp] Assuming drive cache: write through [ 2.699360] sd 1:2:0:0: [sdi] Assuming drive cache: write through [ 3.350794] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.471468] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.540387] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.628523] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.657731] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.733416] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.752066] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.808884] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.838148] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.919247] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.950262] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 
3.997839] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.007810] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.082174] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.089411] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.162200] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.202441] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.252289] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.279870] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.311712] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.348150] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.402076] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.432069] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.487871] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.518282] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.573338] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.599280] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.632144] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.671142] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.713352] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.782117] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.890336] device-mapper: table: 252:0: multipath: error attaching hardware handler == Comment: #13 - Hari Krishna Bathini <hbathini@in.ibm.com> - 2016-10-19 16:26:57 == (In reply to comment #12) > Hi Hari, > > Can you please take a look at this issue and suggest what would be the next > step ? 
> We are facing this issue with the -24 kernel as well. Can this be an issue with
> the kdump kernel missing multipath modules, or some other issue?
>

Hi Vaishnavi,

The necessary hardware handler modules are missing from the kdump initrd. The console log of the kdump kernel says the same:

--
Begin: Loading multipath hardware handlers ...
Failure: failed to load module scsi_dh_alua.
Failure: failed to load module scsi_dh_rdac.
Failure: failed to load module scsi_dh_emc.
--

After including these modules explicitly and rebuilding the initrd for kdump, I was able to get to the point where makedumpfile starts to capture the dump, but it fails with:

    "get_mem_map: Can't distinguish the memory type."

which is already tracked in bug 146571.

Thanks
Hari

PS1: To explicitly add modules to the kdump initrd:
      1. List the necessary modules in the /var/lib/kdump/initramfs-tools/modules file
      2. mkinitramfs -d /var/lib/kdump/initramfs-tools -o /var/lib/kdump/initrd.img-$kver
      3. systemctl restart kdump-tools.service

Mirroring this bug to Canonical for their input on whether to include the missing hardware modules in the kdump initrd or to proceed with the workaround.
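Step 1 of the PS1 workaround can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `add_kdump_modules` is hypothetical and not part of kdump-tools, and the module names are the three handlers the kdump console log reported as failing to load. Steps 2 and 3 are shown only as comments, copied as quoted in the comment above, with `$kver` left undefined as in the original (presumably the kernel version).

```shell
#!/bin/sh
# Hypothetical helper: append module names to an initramfs-tools "modules"
# list, skipping names that are already present (safe to run repeatedly).
add_kdump_modules() {
    modules_file=$1
    shift
    for mod in "$@"; do
        # -x: match the whole line, -F: treat the name as a literal string.
        if ! grep -qxF -- "$mod" "$modules_file" 2>/dev/null; then
            printf '%s\n' "$mod" >> "$modules_file"
        fi
    done
}

# Intended use, as root (not executed here):
#   add_kdump_modules /var/lib/kdump/initramfs-tools/modules \
#       scsi_dh_alua scsi_dh_rdac scsi_dh_emc
#   mkinitramfs -d /var/lib/kdump/initramfs-tools -o /var/lib/kdump/initrd.img-$kver
#   systemctl restart kdump-tools.service
```

The helper only edits the modules list; rebuilding the initrd and restarting the service are left as manual steps so that each can be checked before the next crash test.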
2017-10-19 12:41:00 Thadeu Lima de Souza Cascardo description
[Impact]
When the target device for the kernel dump is under a multipath configuration, dumping will fail, possibly leaving the system stuck in the kdump kernel. The fix is to include the SCSI device handlers needed for the multipath setup in the initramfs image used by kdump. All such modules are included.

[Test Case]
Setting up kdump to target a multipath device, using storage that requires such scsi_dh modules, and triggering a crash will demonstrate that kdump fails. After the fix, it works fine.

[Regression Potential]
If a bug is introduced, loading kdump might fail, and no crash file will be generated. A worse regression to consider is that the system gets stuck in the kdump kernel and needs to be rebooted locally (and the crash file is not generated either). But since this is what we are trying to fix, we don't expect other systems to break. One possibility is that the initramfs would be bigger and fail to load. This didn't happen on a small (less than 1GiB of RAM) x86 VM, though.

Problem Description
==========================
On talclp1, I enabled kdump, but kdump failed and dropped to BusyBox.
root@talclp1:~# echo c> /proc/sysrq-trigger [ 132.643690] sysrq: SysRq : Trigger a crash [ 132.643739] Unable to handle kernel paging request for data at address 0x00000000 [ 132.643745] Faulting instruction address: 0xc0000000005c28f4 [ 132.643749] Oops: Kernel access of bad area, sig: 11 [#1] [ 132.643753] SMP NR_CPUS=2048 NUMA pSeries [ 132.643758] Modules linked in: fuse ufs qnx4 hfsplus hfs minix ntfs msdos jfs rpadlpar_io rpaphp rpcsec_gss_krb5 nfsv4 dccp_diag cifs nfs dns_resolver dccp tcp_diag fscache udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc xfs libcrc32c pseries_rng rng_core ghash_generic gf128mul vmx_crypto sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache crc32c_generic btrfs xor raid6_pq dm_round_robin sr_mod sd_mod cdrom ses enclosure scsi_transport_sas ibmveth crc32c_vpmsum ipr scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath dm_mod [ 132.643819] CPU: 49 PID: 10174 Comm: bash Not tainted 4.8.0-15-generic #16-Ubuntu [ 132.643824] task: c000000111767080 task.stack: c0000000d82e0000 [ 132.643828] NIP: c0000000005c28f4 LR: c0000000005c39d8 CTR: c0000000005c28c0 [ 132.643832] REGS: c0000000d82e3990 TRAP: 0300 Not tainted (4.8.0-15-generic) [ 132.643836] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242422 XER: 00000001 [ 132.643848] CFAR: c0000000000087d0 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 GPR00: c0000000005c39d8 c0000000d82e3c10 c000000000f67b00 0000000000000063 GPR04: c00000011d04a9b8 c00000011d05f7e0 c00000047fb00000 0000000000015998 GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001 GPR12: c0000000005c28c0 c000000007b4b900 ffffffffffffffff 0000000022000000 GPR16: 0000000010170dc8 000001002b566368 0000000010140f58 00000000100c7570 GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608 GPR24: 00003ffffe87a294 0000000000000001 c000000000ebff60 0000000000000004 GPR28: c000000000ec0320 0000000000000063 
c000000000e72a90 0000000000000000 [ 132.643906] NIP [c0000000005c28f4] sysrq_handle_crash+0x34/0x50 [ 132.643911] LR [c0000000005c39d8] __handle_sysrq+0xe8/0x280 [ 132.643914] Call Trace: [ 132.643917] [c0000000d82e3c10] [c000000000a245e8] 0xc000000000a245e8 (unreliable) [ 132.643923] [c0000000d82e3c30] [c0000000005c39d8] __handle_sysrq+0xe8/0x280 [ 132.643928] [c0000000d82e3cd0] [c0000000005c4188] write_sysrq_trigger+0x78/0xa0 [ 132.643935] [c0000000d82e3d00] [c0000000003ad770] proc_reg_write+0xb0/0x110 [ 132.643941] [c0000000d82e3d50] [c00000000030fc3c] __vfs_write+0x6c/0xe0 [ 132.643946] [c0000000d82e3d90] [c000000000311144] vfs_write+0xd4/0x240 [ 132.643950] [c0000000d82e3de0] [c000000000312e5c] SyS_write+0x6c/0x110 [ 132.643957] [c0000000d82e3e30] [c0000000000095e0] system_call+0x38/0x108 [ 132.643961] Instruction dump: [ 132.643963] 38425240 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 3949ba60 [ 132.643972] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6 [ 132.643981] ---[ end trace eed6bbcd2c3bdfdf ]--- [ 132.646105] [ 132.646176] Sending IPI to other CPUs [ 132.647490] IPI complete I'm in purgatory  -> smp_release_cpus() spinning_secondaries = 104  <- smp_release_cpus() [ 2.011346] alg: hash: Test 1 failed for crc32c-vpmsum [ 2.729254] sd 0:2:0:0: [sda] Assuming drive cache: write through [ 2.731554] sd 1:2:5:0: [sdn] Assuming drive cache: write through [ 2.739087] sd 1:2:4:0: [sdm] Assuming drive cache: write through [ 2.739089] sd 1:2:6:0: [sdo] Assuming drive cache: write through [ 2.739110] sd 1:2:7:0: [sdp] Assuming drive cache: write through [ 2.739115] sd 1:2:0:0: [sdi] Assuming drive cache: write through [ 2.739122] sd 1:2:3:0: [sdl] Assuming drive cache: write through [ 2.739123] sd 1:2:2:0: [sdk] Assuming drive cache: write through [ 2.739148] sd 1:2:1:0: [sdj] Assuming drive cache: write through [ 2.748938] sd 0:2:1:0: [sdb] Assuming drive cache: write through [ 2.748939] sd 0:2:7:0: [sdh] Assuming drive cache: 
write through [ 2.748940] sd 0:2:6:0: [sdg] Assuming drive cache: write through [ 2.748942] sd 0:2:2:0: [sdc] Assuming drive cache: write through [ 2.748958] sd 0:2:5:0: [sdf] Assuming drive cache: write through [ 2.748963] sd 0:2:4:0: [sde] Assuming drive cache: write through [ 2.748978] sd 0:2:3:0: [sdd] Assuming drive cache: write through [ 2.999087] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.119912] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.252513] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.343680] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.381234] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.419515] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.474587] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.482188] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.531439] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.552824] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.594489] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.619222] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.672208] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.680298] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.731718] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.761333] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.794955] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.819212] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 3.871913] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.889439] device-mapper: table: 254:1: 
multipath: error attaching hardware handler [ 3.922620] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 3.960707] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.002959] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.035611] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.054476] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.092241] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.099432] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.182358] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.182823] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.234767] device-mapper: table: 254:1: multipath: error attaching hardware handler [ 4.333309] device-mapper: table: 254:0: multipath: error attaching hardware handler [ 4.402827] device-mapper: table: 254:0: multipath: error attaching hardware handler Gave up waiting for root device. Common problems:  - Boot args (cat /proc/cmdline)    - Check rootdelay= (did the system wait long enough?)    - Check root= (did the system wait for the right device?)  - Missing modules (cat /proc/modules; ls /dev) ALERT! UUID=853769e5-1dc5-41be-a689-b430320d207f does not exist. Dropping to a shell! BusyBox v1.22.1 (Ubuntu 1:1.22.0-19ubuntu2) built-in shell (ash) Enter 'help' for a list of built-in commands. 
(initramfs) == Comment: #7 - Vaishnavi Bhat <vaish123@in.ibm.com> - 2016-10-07 05:37:53 == The blkid output does not show any device with UUID=853769e5-1dc5-41be-a689-b430320d207f which is the root device used in the kexec command line (from kdump-config show) kexec command:   /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinux-4.8.0-15-generic root=UUID=853769e5-1dc5-41be-a689-b430320d207f ro xmon=on splash quiet irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz Hence the kdump kernel is failing to boot here. == Comment: #11 - Xue Sheng Li <lixuesh@cn.ibm.com> - 2016-10-17 01:54:56 == recreated with -24 kernel. root@talclp1:~# echo c > /proc/sysrq-trigger [ 72.655416] sysrq: SysRq : Trigger a crash [ 72.655458] Unable to handle kernel paging request for data at address 0x00000000 [ 72.655463] Faulting instruction address: 0xc00000000069d148 [ 72.655469] Oops: Kernel access of bad area, sig: 11 [#1] [ 72.655472] SMP NR_CPUS=2048 NUMA pSeries [ 72.655477] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_gss_krb5 nfsv4 nfs cifs fscache binfmt_misc xfs pseries_rng vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs xor raid6_pq dm_round_robin ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c crc32c_vpmsum scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath [ 72.655521] CPU: 25 PID: 9730 Comm: bash Not tainted 4.8.0-24-generic #26-Ubuntu [ 72.655525] task: c0000001d8451e00 task.stack: c0000001d8494000 [ 72.655529] NIP: c00000000069d148 LR: c00000000069e198 CTR: c00000000069d120 [ 72.655534] REGS: c0000001d84979f0 TRAP: 0300 Not tainted (4.8.0-24-generic) [ 72.655537] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242222 XER: 00000001 [ 72.655549] CFAR: c000000000008750 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 GPR00: c00000000069e198 c0000001d8497c70 c000000001476700 
0000000000000063 GPR04: c00000047e64aca0 c00000047e65fb40 c00000047df00000 0000000000015ed8 GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001 GPR12: c00000000069d120 c000000007b3e100 ffffffffffffffff 0000000022000000 GPR16: 0000000010170dc8 0000010036d36398 0000000010140f58 00000000100c7570 GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608 GPR24: 00003ffff5582464 0000000000000001 c00000000138e6a0 0000000000000004 GPR28: c00000000138ea60 0000000000000063 c000000001342590 0000000000000000 [ 72.655608] NIP [c00000000069d148] sysrq_handle_crash+0x28/0x30 [ 72.655613] LR [c00000000069e198] __handle_sysrq+0xe8/0x280 [ 72.655616] Call Trace: [ 72.655619] [c0000001d8497c70] [c00000000069e178] __handle_sysrq+0xc8/0x280 (unreliable) [ 72.655625] [c0000001d8497d10] [c00000000069e8ec] write_sysrq_trigger+0x6c/0x90 [ 72.655631] [c0000001d8497d40] [c0000000003a9568] proc_reg_write+0x88/0xd0 [ 72.655637] [c0000001d8497d70] [c00000000030c40c] __vfs_write+0x3c/0x70 [ 72.655642] [c0000001d8497d90] [c00000000030d674] vfs_write+0xd4/0x240 [ 72.655647] [c0000001d8497de0] [c00000000030f1c8] SyS_write+0x68/0x110 [ 72.655652] [c0000001d8497e30] [c000000000009584] system_call+0x38/0xec [ 72.655656] Instruction dump: [ 72.655658] 60000000 60000000 3c4c00de 384295e0 7c0802a6 60000000 3d22001a 3949c8e0 [ 72.655667] 39200001 912a0000 7c0004ac 39400000 <992a0000> 4e800020 3c4c00de 384295b0 [ 72.655677] ---[ end trace 43b490f085103bf5 ]--- [ 72.659366] [ 72.659429] Sending IPI to other CPUs [ 72.660740] IPI complete I'm in purgatory  -> smp_release_cpus() spinning_secondaries = 104  <- smp_release_cpus() [ 1.699068] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change IPv4 checksum offload settings. 1 rc=4 [ 1.699093] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change IPv6 checksum offload settings. 1 rc=4 [ 1.699101] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change tso settings. 
1 rc=4 [ 2.657700] sd 0:2:1:0: [sdb] Assuming drive cache: write through [ 2.657701] sd 0:2:0:0: [sda] Assuming drive cache: write through [ 2.657781] sd 0:2:2:0: [sdc] Assuming drive cache: write through [ 2.660641] sd 0:2:7:0: [sdh] Assuming drive cache: write through [ 2.667731] sd 0:2:4:0: [sde] Assuming drive cache: write through [ 2.677685] sd 0:2:6:0: [sdg] Assuming drive cache: write through [ 2.677688] sd 0:2:5:0: [sdf] Assuming drive cache: write through [ 2.677708] sd 0:2:3:0: [sdd] Assuming drive cache: write through [ 2.697737] sd 1:2:6:0: [sdo] Assuming drive cache: write through [ 2.697743] sd 1:2:1:0: [sdj] Assuming drive cache: write through [ 2.697744] sd 1:2:4:0: [sdm] Assuming drive cache: write through [ 2.697747] sd 1:2:2:0: [sdk] Assuming drive cache: write through [ 2.697749] sd 1:2:3:0: [sdl] Assuming drive cache: write through [ 2.697753] sd 1:2:5:0: [sdn] Assuming drive cache: write through [ 2.699340] sd 1:2:7:0: [sdp] Assuming drive cache: write through [ 2.699360] sd 1:2:0:0: [sdi] Assuming drive cache: write through [ 3.350794] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.471468] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.540387] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.628523] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.657731] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.733416] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.752066] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.808884] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.838148] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 3.919247] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 3.950262] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 
3.997839] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.007810] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.082174] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.089411] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.162200] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.202441] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.252289] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.279870] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.311712] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.348150] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.402076] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.432069] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.487871] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.518282] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.573338] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.599280] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.632144] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.671142] device-mapper: table: 252:1: multipath: error attaching hardware handler [ 4.713352] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.782117] device-mapper: table: 252:0: multipath: error attaching hardware handler [ 4.890336] device-mapper: table: 252:0: multipath: error attaching hardware handler == Comment: #13 - Hari Krishna Bathini <hbathini@in.ibm.com> - 2016-10-19 16:26:57 == (In reply to comment #12) > Hi Hari, > > Can you please take a look at this issue and suggest what would be the next > step ? 
> We are facing this issue with the -24 kernel as well. Can this be an issue with
> the kdump kernel missing multipath modules, or some other issue?
>

Hi Vaishnavi,

The necessary hardware handler modules are missing from the kdump initrd. The console log of the kdump kernel says the same:

--
Begin: Loading multipath hardware handlers ...
Failure: failed to load module scsi_dh_alua.
Failure: failed to load module scsi_dh_rdac.
Failure: failed to load module scsi_dh_emc.
--

After including these modules explicitly and rebuilding the initrd for kdump, I was able to get to the point where makedumpfile starts to capture the dump, but it fails with:

    "get_mem_map: Can't distinguish the memory type."

which is already tracked in bug 146571.

Thanks
Hari

PS1: To explicitly add modules to the kdump initrd:
      1. List the necessary modules in the /var/lib/kdump/initramfs-tools/modules file
      2. mkinitramfs -d /var/lib/kdump/initramfs-tools -o /var/lib/kdump/initrd.img-$kver
      3. systemctl restart kdump-tools.service

Mirroring this bug to Canonical for their input on whether to include the missing hardware modules in the kdump initrd or to proceed with the workaround.

[Impact]
When the target device for the kernel dump is under a multipath configuration, dumping will fail, possibly leaving the system stuck in the kdump kernel. The fix is to include the SCSI device handlers needed for the multipath setup in the initramfs image used by kdump. All modules currently loaded in the system are included.

[Test Case]
Setting up kdump to target a multipath device, using storage that requires such scsi_dh modules, and triggering a crash will demonstrate that kdump fails. After the fix, it works fine.

[Regression Potential]
If a bug is introduced, loading kdump might fail, and no crash file will be generated. A worse regression to consider is that the system gets stuck in the kdump kernel and needs to be rebooted locally (and the crash file is not generated either).
But since this is what we are trying to fix, we don't expect other systems to break. This didn't happen on a small (less than 1GiB of RAM) x86 VM, though. Problem Description ========================== On talclp1, I enabled kdump. But kdump failed and it drop to BusyBox. root@talclp1:~# echo c> /proc/sysrq-trigger [ 132.643690] sysrq: SysRq : Trigger a crash [ 132.643739] Unable to handle kernel paging request for data at address 0x00000000 [ 132.643745] Faulting instruction address: 0xc0000000005c28f4 [ 132.643749] Oops: Kernel access of bad area, sig: 11 [#1] [ 132.643753] SMP NR_CPUS=2048 NUMA pSeries [ 132.643758] Modules linked in: fuse ufs qnx4 hfsplus hfs minix ntfs msdos jfs rpadlpar_io rpaphp rpcsec_gss_krb5 nfsv4 dccp_diag cifs nfs dns_resolver dccp tcp_diag fscache udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc xfs libcrc32c pseries_rng rng_core ghash_generic gf128mul vmx_crypto sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache crc32c_generic btrfs xor raid6_pq dm_round_robin sr_mod sd_mod cdrom ses enclosure scsi_transport_sas ibmveth crc32c_vpmsum ipr scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath dm_mod [ 132.643819] CPU: 49 PID: 10174 Comm: bash Not tainted 4.8.0-15-generic #16-Ubuntu [ 132.643824] task: c000000111767080 task.stack: c0000000d82e0000 [ 132.643828] NIP: c0000000005c28f4 LR: c0000000005c39d8 CTR: c0000000005c28c0 [ 132.643832] REGS: c0000000d82e3990 TRAP: 0300 Not tainted (4.8.0-15-generic) [ 132.643836] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242422 XER: 00000001 [ 132.643848] CFAR: c0000000000087d0 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 GPR00: c0000000005c39d8 c0000000d82e3c10 c000000000f67b00 0000000000000063 GPR04: c00000011d04a9b8 c00000011d05f7e0 c00000047fb00000 0000000000015998 GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001 GPR12: c0000000005c28c0 c000000007b4b900 ffffffffffffffff 0000000022000000 
GPR16: 0000000010170dc8 000001002b566368 0000000010140f58 00000000100c7570
GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608
GPR24: 00003ffffe87a294 0000000000000001 c000000000ebff60 0000000000000004
GPR28: c000000000ec0320 0000000000000063 c000000000e72a90 0000000000000000
[ 132.643906] NIP [c0000000005c28f4] sysrq_handle_crash+0x34/0x50
[ 132.643911] LR [c0000000005c39d8] __handle_sysrq+0xe8/0x280
[ 132.643914] Call Trace:
[ 132.643917] [c0000000d82e3c10] [c000000000a245e8] 0xc000000000a245e8 (unreliable)
[ 132.643923] [c0000000d82e3c30] [c0000000005c39d8] __handle_sysrq+0xe8/0x280
[ 132.643928] [c0000000d82e3cd0] [c0000000005c4188] write_sysrq_trigger+0x78/0xa0
[ 132.643935] [c0000000d82e3d00] [c0000000003ad770] proc_reg_write+0xb0/0x110
[ 132.643941] [c0000000d82e3d50] [c00000000030fc3c] __vfs_write+0x6c/0xe0
[ 132.643946] [c0000000d82e3d90] [c000000000311144] vfs_write+0xd4/0x240
[ 132.643950] [c0000000d82e3de0] [c000000000312e5c] SyS_write+0x6c/0x110
[ 132.643957] [c0000000d82e3e30] [c0000000000095e0] system_call+0x38/0x108
[ 132.643961] Instruction dump:
[ 132.643963] 38425240 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 3949ba60
[ 132.643972] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[ 132.643981] ---[ end trace eed6bbcd2c3bdfdf ]---
[ 132.646105]
[ 132.646176] Sending IPI to other CPUs
[ 132.647490] IPI complete
I'm in purgatory
 -> smp_release_cpus()
spinning_secondaries = 104
 <- smp_release_cpus()
[ 2.011346] alg: hash: Test 1 failed for crc32c-vpmsum
[ 2.729254] sd 0:2:0:0: [sda] Assuming drive cache: write through
[ 2.731554] sd 1:2:5:0: [sdn] Assuming drive cache: write through
[ 2.739087] sd 1:2:4:0: [sdm] Assuming drive cache: write through
[ 2.739089] sd 1:2:6:0: [sdo] Assuming drive cache: write through
[ 2.739110] sd 1:2:7:0: [sdp] Assuming drive cache: write through
[ 2.739115] sd 1:2:0:0: [sdi] Assuming drive cache: write through
[ 2.739122] sd 1:2:3:0: [sdl] Assuming drive cache: write through
[ 2.739123] sd 1:2:2:0: [sdk] Assuming drive cache: write through
[ 2.739148] sd 1:2:1:0: [sdj] Assuming drive cache: write through
[ 2.748938] sd 0:2:1:0: [sdb] Assuming drive cache: write through
[ 2.748939] sd 0:2:7:0: [sdh] Assuming drive cache: write through
[ 2.748940] sd 0:2:6:0: [sdg] Assuming drive cache: write through
[ 2.748942] sd 0:2:2:0: [sdc] Assuming drive cache: write through
[ 2.748958] sd 0:2:5:0: [sdf] Assuming drive cache: write through
[ 2.748963] sd 0:2:4:0: [sde] Assuming drive cache: write through
[ 2.748978] sd 0:2:3:0: [sdd] Assuming drive cache: write through
[ 2.999087] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.119912] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.252513] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.343680] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.381234] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 3.419515] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.474587] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 3.482188] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.531439] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 3.552824] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.594489] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 3.619222] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.672208] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.680298] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 3.731718] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.761333] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 3.794955] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.819212] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 3.871913] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.889439] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 3.922620] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 3.960707] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 4.002959] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 4.035611] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 4.054476] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 4.092241] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 4.099432] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 4.182358] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 4.182823] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 4.234767] device-mapper: table: 254:1: multipath: error attaching hardware handler
[ 4.333309] device-mapper: table: 254:0: multipath: error attaching hardware handler
[ 4.402827] device-mapper: table: 254:0: multipath: error attaching hardware handler
Gave up waiting for root device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! UUID=853769e5-1dc5-41be-a689-b430320d207f does not exist. Dropping to a shell!
BusyBox v1.22.1 (Ubuntu 1:1.22.0-19ubuntu2) built-in shell (ash)
Enter 'help' for a list of built-in commands.
(initramfs)

== Comment: #7 - Vaishnavi Bhat <vaish123@in.ibm.com> - 2016-10-07 05:37:53 ==
The blkid output does not show any device with UUID=853769e5-1dc5-41be-a689-b430320d207f which is the root device used in the kexec command line (from kdump-config show)
kexec command:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinux-4.8.0-15-generic root=UUID=853769e5-1dc5-41be-a689-b430320d207f ro xmon=on splash quiet irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
Hence the kdump kernel is failing to boot here.

== Comment: #11 - Xue Sheng Li <lixuesh@cn.ibm.com> - 2016-10-17 01:54:56 ==
recreated with -24 kernel.
root@talclp1:~# echo c > /proc/sysrq-trigger
[ 72.655416] sysrq: SysRq : Trigger a crash
[ 72.655458] Unable to handle kernel paging request for data at address 0x00000000
[ 72.655463] Faulting instruction address: 0xc00000000069d148
[ 72.655469] Oops: Kernel access of bad area, sig: 11 [#1]
[ 72.655472] SMP NR_CPUS=2048 NUMA pSeries
[ 72.655477] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_gss_krb5 nfsv4 nfs cifs fscache binfmt_misc xfs pseries_rng vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs xor raid6_pq dm_round_robin ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c crc32c_vpmsum scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath
[ 72.655521] CPU: 25 PID: 9730 Comm: bash Not tainted 4.8.0-24-generic #26-Ubuntu
[ 72.655525] task: c0000001d8451e00 task.stack: c0000001d8494000
[ 72.655529] NIP: c00000000069d148 LR: c00000000069e198 CTR: c00000000069d120
[ 72.655534] REGS: c0000001d84979f0 TRAP: 0300 Not tainted (4.8.0-24-generic)
[ 72.655537] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242222 XER: 00000001
[ 72.655549] CFAR: c000000000008750 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c00000000069e198 c0000001d8497c70 c000000001476700 0000000000000063
GPR04: c00000047e64aca0 c00000047e65fb40 c00000047df00000 0000000000015ed8
GPR08: 0000000000000007 0000000000000001 0000000000000000 0000000000000001
GPR12: c00000000069d120 c000000007b3e100 ffffffffffffffff 0000000022000000
GPR16: 0000000010170dc8 0000010036d36398 0000000010140f58 00000000100c7570
GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608
GPR24: 00003ffff5582464 0000000000000001 c00000000138e6a0 0000000000000004
GPR28: c00000000138ea60 0000000000000063 c000000001342590 0000000000000000
[ 72.655608] NIP [c00000000069d148] sysrq_handle_crash+0x28/0x30
[ 72.655613] LR [c00000000069e198] __handle_sysrq+0xe8/0x280
[ 72.655616] Call Trace:
[ 72.655619] [c0000001d8497c70] [c00000000069e178] __handle_sysrq+0xc8/0x280 (unreliable)
[ 72.655625] [c0000001d8497d10] [c00000000069e8ec] write_sysrq_trigger+0x6c/0x90
[ 72.655631] [c0000001d8497d40] [c0000000003a9568] proc_reg_write+0x88/0xd0
[ 72.655637] [c0000001d8497d70] [c00000000030c40c] __vfs_write+0x3c/0x70
[ 72.655642] [c0000001d8497d90] [c00000000030d674] vfs_write+0xd4/0x240
[ 72.655647] [c0000001d8497de0] [c00000000030f1c8] SyS_write+0x68/0x110
[ 72.655652] [c0000001d8497e30] [c000000000009584] system_call+0x38/0xec
[ 72.655656] Instruction dump:
[ 72.655658] 60000000 60000000 3c4c00de 384295e0 7c0802a6 60000000 3d22001a 3949c8e0
[ 72.655667] 39200001 912a0000 7c0004ac 39400000 <992a0000> 4e800020 3c4c00de 384295b0
[ 72.655677] ---[ end trace 43b490f085103bf5 ]---
[ 72.659366]
[ 72.659429] Sending IPI to other CPUs
[ 72.660740] IPI complete
I'm in purgatory
 -> smp_release_cpus()
spinning_secondaries = 104
 <- smp_release_cpus()
[ 1.699068] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change IPv4 checksum offload settings. 1 rc=4
[ 1.699093] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change IPv6 checksum offload settings. 1 rc=4
[ 1.699101] ibmveth 30000002 (unnamed net_device) (uninitialized): unable to change tso settings. 1 rc=4
[ 2.657700] sd 0:2:1:0: [sdb] Assuming drive cache: write through
[ 2.657701] sd 0:2:0:0: [sda] Assuming drive cache: write through
[ 2.657781] sd 0:2:2:0: [sdc] Assuming drive cache: write through
[ 2.660641] sd 0:2:7:0: [sdh] Assuming drive cache: write through
[ 2.667731] sd 0:2:4:0: [sde] Assuming drive cache: write through
[ 2.677685] sd 0:2:6:0: [sdg] Assuming drive cache: write through
[ 2.677688] sd 0:2:5:0: [sdf] Assuming drive cache: write through
[ 2.677708] sd 0:2:3:0: [sdd] Assuming drive cache: write through
[ 2.697737] sd 1:2:6:0: [sdo] Assuming drive cache: write through
[ 2.697743] sd 1:2:1:0: [sdj] Assuming drive cache: write through
[ 2.697744] sd 1:2:4:0: [sdm] Assuming drive cache: write through
[ 2.697747] sd 1:2:2:0: [sdk] Assuming drive cache: write through
[ 2.697749] sd 1:2:3:0: [sdl] Assuming drive cache: write through
[ 2.697753] sd 1:2:5:0: [sdn] Assuming drive cache: write through
[ 2.699340] sd 1:2:7:0: [sdp] Assuming drive cache: write through
[ 2.699360] sd 1:2:0:0: [sdi] Assuming drive cache: write through
[ 3.350794] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 3.471468] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 3.540387] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 3.628523] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 3.657731] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 3.733416] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 3.752066] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 3.808884] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 3.838148] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 3.919247] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 3.950262] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 3.997839] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.007810] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.082174] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.089411] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.162200] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.202441] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.252289] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.279870] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.311712] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.348150] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.402076] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.432069] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.487871] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.518282] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.573338] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.599280] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.632144] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.671142] device-mapper: table: 252:1: multipath: error attaching hardware handler
[ 4.713352] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.782117] device-mapper: table: 252:0: multipath: error attaching hardware handler
[ 4.890336] device-mapper: table: 252:0: multipath: error attaching hardware handler

== Comment: #13 - Hari Krishna Bathini <hbathini@in.ibm.com> - 2016-10-19 16:26:57 ==
(In reply to comment #12)
> Hi Hari,
>
> Can you please take a look at this issue and suggest what would be the next
> step ?
> We are facing this issue with -24 kernel as well. Can this be an issue with
> kdump kernel that has missing multipath modules or some other issue ?
>

Hi Vaishnavi,

The necessary hardware handler modules are missing from the kdump initrd. The console log of the kdump kernel says the same:
--
Begin: Loading multipath hardware handlers ...
Failure: failed to load module scsi_dh_alua.
Failure: failed to load module scsi_dh_rdac.
Failure: failed to load module scsi_dh_emc.
--

After including these modules explicitly and rebuilding the initrd for kdump, I was able to get to a point where makedumpfile starts to capture the dump, but it fails with:
    "get_mem_map: Can't distinguish the memory type."
which is already tracked with bug 146571.

Thanks
Hari

PS1: To explicitly add modules to the kdump initrd:
      1. List the necessary modules in the /var/lib/kdump/initramfs-tools/modules file
      2. mkinitramfs -d /var/lib/kdump/initramfs-tools -o /var/lib/kdump/initrd.img-$kver
      3. systemctl restart kdump-tools.service

Mirroring this bug to Canonical for their input on whether to include the missing hardware handler modules in the kdump initrd or to proceed with the workaround.
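Step 1 of the workaround above (listing the necessary modules) can be partly automated. The following is a minimal sketch, assuming the kdump console log reports each missing handler in the "Failure: failed to load module <name>." form quoted in this report. The helper names missing_modules and modules_file_text are hypothetical, and the script only builds the text that would go into /var/lib/kdump/initramfs-tools/modules; it does not write the file, rebuild the initrd, or restart kdump-tools.

```python
import re
from typing import List

# Matches entries such as "Failure: failed to load module scsi_dh_alua."
# (line format taken from the kdump console log quoted in this report).
# The character class excludes ".", so the trailing period is not captured.
FAILURE_RE = re.compile(r"Failure: failed to load module\s+([A-Za-z0-9_\-]+)")


def missing_modules(console_log: str) -> List[str]:
    """Return the module names reported as failed, de-duplicated, in first-seen order."""
    seen: List[str] = []
    for name in FAILURE_RE.findall(console_log):
        if name not in seen:
            seen.append(name)
    return seen


def modules_file_text(modules: List[str]) -> str:
    """Format module names one per line, the layout used by initramfs-tools modules files."""
    return "".join(f"{name}\n" for name in modules)


if __name__ == "__main__":
    # Sample input copied from the console log excerpt in this report.
    sample = (
        "Begin: Loading multipath hardware handlers ... "
        "Failure: failed to load module scsi_dh_alua. "
        "Failure: failed to load module scsi_dh_rdac. "
        "Failure: failed to load module scsi_dh_emc."
    )
    print(modules_file_text(missing_modules(sample)), end="")
```

The printed lines would still need to be reviewed and appended to the modules file by hand before running steps 2 and 3.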
2017-10-19 12:45:12 Thadeu Lima de Souza Cascardo attachment added update for xenial https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1635597/+attachment/4975868/+files/makedumpfile.debdiff
2017-10-19 16:19:16 Ubuntu Foundations Team Bug Bot tags architecture-ppc64le bugnameltc-146907 severity-high targetmilestone-inin--- triage-r architecture-ppc64le bugnameltc-146907 patch severity-high targetmilestone-inin--- triage-r
2017-10-19 16:19:17 Ubuntu Foundations Team Bug Bot bug added subscriber Joseph Salisbury
2017-11-06 14:55:37 Andrew Cloke summary Ubuntu16.10:talclp1: Kdump failed with multipath disk Ubuntu:talclp1: Kdump failed with multipath disk
2017-11-28 17:06:07 Andrew Cloke tags architecture-ppc64le bugnameltc-146907 patch severity-high targetmilestone-inin--- triage-r architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-r
2017-12-14 21:37:28 Brian Murray makedumpfile (Ubuntu Xenial): status Confirmed Fix Committed
2017-12-14 21:37:32 Brian Murray bug added subscriber Ubuntu Stable Release Updates Team
2017-12-14 21:37:38 Brian Murray bug added subscriber SRU Verification
2017-12-14 21:37:43 Brian Murray tags architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-r architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-r verification-needed verification-needed-xenial
2017-12-18 14:16:50 Andrew Cloke tags architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-r verification-needed verification-needed-xenial architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-a verification-needed verification-needed-xenial
2018-02-26 15:19:25 Manoj Iyer ubuntu-power-systems: status Confirmed Incomplete
2018-03-05 15:44:40 Manoj Iyer makedumpfile (Ubuntu Zesty): status Confirmed Won't Fix
2018-03-05 15:44:43 Manoj Iyer linux (Ubuntu Zesty): status New Won't Fix
2018-03-05 15:45:34 Andrew Cloke tags architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-a verification-needed verification-needed-xenial architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-g verification-needed verification-needed-xenial
2018-03-15 12:01:05 bugproxy attachment added blkid output https://bugs.launchpad.net/bugs/1635597/+attachment/5080257/+files/file_146907.txt
2018-03-15 12:01:07 bugproxy attachment added kdump kernel boots fine after including necessary hardware handler modules https://bugs.launchpad.net/bugs/1635597/+attachment/5080258/+files/crashkernel_including-necessary-modules.log
2018-03-15 12:01:09 bugproxy attachment added kdump boot log on talclp3 https://bugs.launchpad.net/bugs/1635597/+attachment/5080259/+files/kdump_validation-talclp3.log
2018-03-15 12:01:10 bugproxy attachment added update for xenial https://bugs.launchpad.net/bugs/1635597/+attachment/5080260/+files/makedumpfile.debdiff
2018-03-19 13:56:54 Manoj Iyer tags architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-g verification-needed verification-needed-xenial architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-g verification-done verification-done-xenial
2018-03-19 13:57:21 Andrew Cloke ubuntu-power-systems: status Incomplete Fix Committed
2018-03-22 17:10:54 dann frazier bug added subscriber dann frazier
2018-03-26 09:17:46 Launchpad Janitor makedumpfile (Ubuntu Xenial): status Fix Committed Fix Released
2018-03-26 09:17:50 Łukasz Zemczak removed subscriber Ubuntu Stable Release Updates Team
2018-03-26 10:31:41 bugproxy tags architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin--- triage-g verification-done verification-done-xenial architecture-ppc64le bugnameltc-146907 patch ppc64el-kdump severity-high targetmilestone-inin16044 triage-g verification-done verification-done-xenial
2018-04-05 16:19:31 Manoj Iyer makedumpfile (Ubuntu Trusty): status Confirmed Won't Fix
2018-04-05 16:19:48 Manoj Iyer linux (Ubuntu Trusty): status New Won't Fix
2018-04-16 14:21:03 Manoj Iyer ubuntu-power-systems: status Fix Committed Fix Released