Comment 1 for bug 1461429

Louis Bouchard (louis) wrote : Re: ceph-osd early start interferes with kdump-tools during kernel dump

Here is an example of a captured session:

[ 399.597207] SysRq : Trigger a crash
[ 399.599050] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 399.600421] IP: [<ffffffff81457fc6>] sysrq_handle_crash+0x16/0x20
[ 399.600745] PGD 3758e067 PUD 3d616067 PMD 0
[ 399.600745] Oops: 0002 [#1] SMP
...
 * Starting enable remaining boot-time encrypted block devices[74G[ OK ]
Cloud-init v. 0.7.5 running 'init' at Wed, 03 Jun 2015 07:33:31 +0000. Up 8.83 seconds.
ci-info: ++++++++++++++++++++++++Net device info++++++++++++++++++++++++
ci-info: +--------+------+-----------+-------------+-------------------+
ci-info: | Device | Up | Address | Mask | Hw-Address |
ci-info: +--------+------+-----------+-------------+-------------------+
ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | . |
ci-info: | eth0 | True | 10.5.1.39 | 255.255.0.0 | fa:16:3e:09:27:0b |
ci-info: +--------+------+-----------+-------------+-------------------+
ci-info: +++++++++++++++++++++++++++++Route info+++++++++++++++++++++++++++++
ci-info: +-------+-------------+----------+-------------+-----------+-------+
ci-info: | Route | Destination | Gateway | Genmask | Interface | Flags |
ci-info: +-------+-------------+----------+-------------+-----------+-------+
ci-info: | 0 | 0.0.0.0 | 10.5.0.1 | 0.0.0.0 | eth0 | UG |
ci-info: | 1 | 10.5.0.0 | 0.0.0.0 | 255.255.0.0 | eth0 | U |
ci-info: +-------+-------------+----------+-------------+-----------+-------+
 * Starting Ceph OSD[74G[ OK ] <<<<<<<<<<<<<<<
 * Starting Signal sysvinit that local filesystems are mounted[74G[ OK ]
 * Starting configure network device security[74G[ OK ]
 * Starting flush early job output to logs[74G[ OK ]
...
 * Starting Ceph OSD (start all instances)[74G[ OK ]
 * Starting regular background program processing daemon[74G[ OK ]
 * Starting deferred execution scheduler[74G[ OK ]
Starting kdump-tools: * Starting Ceph MON (start all instances)[74G[ OK ]
 * Stopping save kernel messages[74G[ OK ]
 * Starting Ceph MON[74G[ OK ]
 * Starting automatic crash report generation[74G[ OK ]
 * Stopping CPU interrupts balancing daemon[74G[ OK ]
 * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201506030733/dump-incomplete
 * Stopping Ceph MON (start all instances)[74G[ OK ]
 * Starting Ceph monitor (all instances)[74G[ OK ]
The kernel version is not supported.
The created dumpfile may be incomplete.
cyclic buffer size has been changed: 32767 => 32640
Excluding unnecessary pages : [ 0.0 %] /
* Starting Create Ceph client.admin key when possible[74G[ OK ]
Excluding unnecessary pages : [100.0 %] |
Excluding unnecessary pages : [100.0 %] \
Excluding unnecessary pages : [100.0 %] -
Excluding unnecessary pages : [100.0 %] /
Excluding unnecessary pages : [100.0 %] |
 * Starting OpenSSH server[74G[ OK ]
Copying data : [ 9.4 %] \
Copying data : [ 25.1 %] -
Copying data : [ 47.2 %] /
Copying data : [ 59.8 %] |
Copying data : [ 70.5 %] \
Copying data : [ 93.6 %] -
* Stopping Read required files in advance (for other mountpoints)[74G[ OK ]
Excluding unnecessary pages : [100.0 %] /
Copying data : [100.0 %] |

The dumpfile is saved to /var/crash/201506030733/dump-incomplete.

makedumpfile Completed.
 * kdump-tools: saved vmcore in /var/crash/201506030733
 * running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201506030733/dmesg.201506030733
The kernel version is not supported.
The created dumpfile may be incomplete.

The dmesg log is saved to /var/crash/201506030733/dmesg.201506030733.

makedumpfile Completed.
 * kdump-tools: saved dmesg content in /var/crash/201506030733
Wed, 03 Jun 2015 07:33:49 +0000

We can clearly see the Ceph OSD start messages (marked with <<< above) appearing during the kernel dump.
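
To check other captured sessions for the same symptom, something like the following could help. It is only a rough sketch: the function name find_ceph_starts and the regular expression are my own, and it assumes the console capture uses the same upstart-style " * Starting ..." lines as the session above.

```python
import re

# Matches upstart-style job lines such as " * Starting Ceph OSD[74G[ OK ]".
# The job name is everything from "Ceph" up to the next "[" character.
CEPH_START = re.compile(r"\*\s+Starting\s+(Ceph\b[^\[]*)")


def find_ceph_starts(lines):
    """Return (line_number, job_name) for every Ceph start message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = CEPH_START.search(line)
        if match:
            hits.append((number, match.group(1).strip()))
    return hits


# A few lines copied from the captured session above.
sample = [
    " * Starting Ceph OSD[74G[ OK ]",
    " * Starting configure network device security[74G[ OK ]",
    "Starting kdump-tools: * Starting Ceph MON (start all instances)[74G[ OK ]",
    " * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201506030733/dump-incomplete",
    " * Stopping Ceph MON (start all instances)[74G[ OK ]",
]

for number, job in find_ceph_starts(sample):
    print(number, job)
```

Any hit in a console log taken while the crash kernel is running would point to the same interference.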