ceph-osd early start interferes with kdump-tools during kernel dump
Bug #1461429 reported by Louis Bouchard
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
ceph (Ubuntu) | Invalid | Medium | Unassigned | |
Trusty | Fix Released | Medium | Unassigned | |
Bug Description
When a kernel crash occurs on a system with kdump-tools configured and enabled, kexec triggers a reboot of the server, which starts kdump-tools to capture the kernel dump.
On systems running Ceph OSDs, the configured OSDs start even if kdump-tools is set up to start very early in the boot phase. Even replacing the kdump-tools SysV init script with an upstart job that runs before the runlevel signal is not sufficient.
The /etc/init/
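To illustrate the kind of guard a service could apply in this situation, here is a minimal Python sketch. It assumes that `/proc/vmcore` is present only in the capture kernel booted by kexec after a crash. The function names `in_kdump_capture_kernel` and `should_start_osd` are hypothetical, and this is not necessarily how the released fix works.

```python
import os
import sys


def in_kdump_capture_kernel(vmcore_path="/proc/vmcore"):
    """Return True if a crash dump appears to be waiting for capture.

    Assumption: /proc/vmcore is only exposed by the capture kernel,
    so its presence is used as the signal here.
    """
    return os.path.exists(vmcore_path)


def should_start_osd(vmcore_path="/proc/vmcore"):
    # Hypothetical policy: hold Ceph daemons back while a dump is pending.
    return not in_kdump_capture_kernel(vmcore_path)


if __name__ == "__main__":
    # Exit status 0 means "safe to start", 1 means "dump pending, skip start".
    sys.exit(0 if should_start_osd() else 1)
```

A job's pre-start step could call a check like this and refuse to start when the exit status is non-zero, which avoids depending on job ordering alone.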
Changed in ceph (Ubuntu): | |
status: | New → Confirmed |
importance: | Undecided → Medium |
assignee: | nobody → Louis Bouchard (louis-bouchard) |
summary: |
- ceph-osd early start intefere with kdump-tools during kernel dump
+ ceph-osd early start interferes with kdump-tools during kernel dump |
Changed in ceph (Ubuntu Trusty): | |
status: | Confirmed → Fix Released |
assignee: | Louis Bouchard (louis-bouchard) → nobody |
Here is an example of a captured session:
[ 399.597207] SysRq : Trigger a crash
[ 399.599050] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 399.600421] IP: [<ffffffff81457fc6>] sysrq_handle_crash+0x16/0x20
[ 399.600745] PGD 3758e067 PUD 3d616067 PMD 0
[ 399.600745] Oops: 0002 [#1] SMP
...
* Starting enable remaining boot-time encrypted block devices [ OK ]
Cloud-init v. 0.7.5 running 'init' at Wed, 03 Jun 2015 07:33:31 +0000. Up 8.83 seconds.
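ci-info: ++++++++++++++++++++++++Net device info++++++++++++++++++++++++
ci-info: +--------+------+-----------+-------------+-------------------+
ci-info: | Device |  Up  |  Address  |     Mask    |     Hw-Address    |
ci-info: +--------+------+-----------+-------------+-------------------+
ci-info: |   lo   | True | 127.0.0.1 |  255.0.0.0  |         .         |
ci-info: |  eth0  | True | 10.5.1.39 | 255.255.0.0 | fa:16:3e:09:27:0b |
ci-info: +--------+------+-----------+-------------+-------------------+
ci-info: +++++++++++++++++++++++++++++Route info+++++++++++++++++++++++++++++
ci-info: +-------+-------------+----------+-------------+-----------+-------+
ci-info: | Route | Destination | Gateway  |   Genmask   | Interface | Flags |
ci-info: +-------+-------------+----------+-------------+-----------+-------+
ci-info: |   0   |   0.0.0.0   | 10.5.0.1 |   0.0.0.0   |    eth0   |   UG  |
ci-info: |   1   |   10.5.0.0  | 0.0.0.0  | 255.255.0.0 |    eth0   |   U   |
ci-info: +-------+-------------+----------+-------------+-----------+-------+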
* Starting Ceph OSD [ OK ] <<<<<<<<<<<<<<<
* Starting Signal sysvinit that local filesystems are mounted [ OK ]
* Starting configure network device security [ OK ]
* Starting flush early job output to logs [ OK ]
...
* Starting Ceph OSD (start all instances) [ OK ]
* Starting regular background program processing daemon [ OK ]
* Starting deferred execution scheduler [ OK ]
Starting kdump-tools: * Starting Ceph MON (start all instances) [ OK ]
* Stopping save kernel messages [ OK ]
* Starting Ceph MON [ OK ]
* Starting automatic crash report generation [ OK ]
* Stopping CPU interrupts balancing daemon [ OK ]
* running makedumpfile -c -d 31 /proc/vmcore /var/crash/201506030733/dump-incomplete
* Stopping Ceph MON (start all instances) [ OK ]
* Starting Ceph monitor (all instances) [ OK ]
The kernel version is not supported.
The created dumpfile may be incomplete.
cyclic buffer size has been changed: 32767 => 32640
Excluding unnecessary pages : [ 0.0 %] /
* Starting Create Ceph client.admin key when possible [ OK ]
Excluding unnecessary pages : [100.0 %] |
Excluding unnecessary pages : [100.0 %] \
Excluding unnecessary pages : [100.0 %] -
Excluding unnecessary pages : [100.0 %] /
Excluding unnecessary pages : [100.0 %] |
* Starting OpenSSH server [ OK ]
Copying data : [ 9.4 %] \
Copying data : [ 25.1 %] -
Copying data : [ 47.2 %] /
Copying data : [...
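
To make the ordering problem in the session above easier to spot, the following Python sketch lists the jobs whose "* Starting ..." line appears before makedumpfile begins. The function name `started_before_dump` and the default marker string are assumptions chosen for this illustration; they are not part of kdump-tools or Ceph.

```python
import re

# Cursor-positioning residue such as "[74G" (optionally preceded by ESC).
_RESIDUE = re.compile(r"\x1b?\[\d+G")
_STARTING = re.compile(r"\* Starting (.+?)\s*\[ OK \]")


def started_before_dump(lines, dump_marker="running makedumpfile"):
    """Return job names started before the first line containing dump_marker.

    Returns None if the marker never appears in the log.
    """
    started = []
    for line in lines:
        if dump_marker in line:
            return started
        clean = _RESIDUE.sub(" ", line)
        match = _STARTING.search(clean)
        if match:
            started.append(match.group(1).strip())
    return None


if __name__ == "__main__":
    import sys

    jobs = started_before_dump(sys.stdin.read().splitlines())
    for job in jobs or []:
        if job.startswith("Ceph"):
            print("started before dump:", job)
```

Feeding the captured console output to the script on standard input prints every Ceph job that came up before the dump started, which is the interference this bug describes.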