EXT4-fs Kernel Panic when Delete/Reset Environment - remove nodes

Bug #1538587 reported by Daniel Smith
This bug affects 6 people
Affects              Status        Importance  Assigned to               Milestone
Fuel for OpenStack   Fix Released  High        Dmitry Guryanov
  7.0.x              Fix Released  High        Sergii Rizvan
  8.0.x              Fix Released  High        Fuel Python (Deprecated)
  Mitaka             Fix Released  High        Dmitry Guryanov

Bug Description

ISO USED: fuel-community-8.0-1399-2016-01-19_07-14-18.iso

Actions:

   - Log in to the Fuel master.
   - Create an environment with default settings.
   - Add a plugin to the environment (note: the issue is seen with and without plugins).
   - Add nodes: 1 controller, 2 computes.
   - Deploy.
   - Click Reset or Delete Environment.
   - Enter the environment name and press OK.

Result:

The nodes are not deleted.
Fuel UI hangs on Deleted.
The consoles of the slave nodes (computes/controllers) show an EXT4-fs kernel panic.
          - A manual reset of the blade is required to recover.

Daniel Smith (daniel-smith-v) wrote :
Artem Panchenko (apanchenko-8) wrote :

There is code that forces a reboot if the kernel panics during node removal:

https://github.com/openstack/fuel-astute/blob/7cea69bd021d6132cde473cdd25e5142fdddad95/mcagents/erase_node.rb#L110-L112

Looks like it doesn't work or it wasn't executed in your case.

Changed in fuel:
assignee: nobody → Fuel Python Team (fuel-python)
importance: Undecided → High
milestone: none → 9.0
Alexander Gordeev (a-gordeev) wrote :
Daniel Smith (daniel-smith-v) wrote :

Hey Alexander Gordeev.

I have let the nodes sit there for close to an hour, and the reset signal doesn't seem to get called.

Alexander Kislitsky (akislitsky) wrote :

@Daniel, could you please provide us with a diagnostic snapshot?

tags: added: area-python
Changed in fuel:
status: New → Confirmed
Changed in fuel:
status: Confirmed → Incomplete
Daniel Smith (daniel-smith-v) wrote :

Hello Alexander.

What do you need in terms of diagnostics?

Just "fuel diag", or a screenshot from the console when the panic occurs, or something else?

Send me the commands you want output from and I'll supply it.

Cheers,
Daniel

Michal Ptacek (michalx-ptacek) wrote :

attaching screenshot from stack dump

Alexander Kislitsky (akislitsky) wrote :

@Daniel, please generate a fuel-snapshot in the UI on the Support tab. I can't reproduce this issue, so without logs from your environment we can't find the cause.

Changed in fuel:
assignee: Fuel Python Team (fuel-python) → MOS Linux (mos-linux)
Daniel Smith (daniel-smith-v) wrote :

@Alexander - the snapshot is 339 MB.

Should I post it here?

Alexander Gordeev (a-gordeev) wrote :

@Michal Ptacek, well, the last message on your pic says "Rebooting in 10 seconds".

So, the expected reboot never occurred, right?

Daniel Smith (daniel-smith-v) wrote :

@alexander - Chiming in for Michal (since we are looking at the same lab) - that is correct, the blades sit there and hang.

Cheers,
D

Daniel Smith (daniel-smith-v) wrote :

@Alexander Gordeev

I have tried to post the snapshot a couple of times now, but the server is timing out (I suspect because the snapshot is so large). Can you provide another method to post it?

Aleksander Mogylchenko (amogylchenko) wrote :

Any public service will do - you might try Google Drive or Microsoft OneDrive.

Daniel Smith (daniel-smith-v) wrote :

@ Alexander Gordeev

hey there.

Here is the Google Drive link. Let me know if you have any issues fetching it.

https://drive.google.com/file/d/0B96eAbbIavDscUJGYV8tU3ZPYzA/view?usp=sharing

Cheers,
D

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
status: Incomplete → Confirmed
tags: added: area-mos
removed: area-python
tags: added: area-linux
removed: area-mos
Dmitry Teselkin (teselkin-d) wrote :

@Daniel, could you please try to reproduce the reboot after a panic manually?
To do this, please run the following commands in a console on one of the nodes that hang:

echo 1 > /proc/sys/kernel/sysrq

echo 1 > /proc/sys/kernel/panic_on_oops

echo 15 > /proc/sys/kernel/panic

echo c > /proc/sysrq-trigger

It should cause kernel panic, and then the node should be restarted in 15 seconds.
Please confirm it does.
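Before triggering a crash, it can help to check how the node is currently configured, without changing anything. The following is a minimal read-only sketch: `panic_reboot_status` is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against sample files. It relies on the documented meaning of `kernel.panic` (0 means no automatic reboot, a positive value is a delay in seconds, a negative value means an immediate reboot).

```shell
#!/bin/sh
# Report whether the kernel is configured to reboot by itself after a panic.
# The directory argument is an illustrative hook for trying the function on
# sample files; on a real node the default /proc/sys/kernel is what matters.
panic_reboot_status() {
    dir="${1:-/proc/sys/kernel}"
    timeout=$(cat "$dir/panic")
    oops=$(cat "$dir/panic_on_oops")
    if [ "$timeout" -eq 0 ]; then
        echo "panic=0: the kernel will hang after a panic instead of rebooting"
    elif [ "$timeout" -gt 0 ]; then
        echo "panic=$timeout: reboot $timeout seconds after a panic"
    else
        echo "panic=$timeout: reboot immediately after a panic"
    fi
    echo "panic_on_oops=$oops"
}
```

Calling `panic_reboot_status` with no arguments on a node reads the live values; nothing is written to `/proc`.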

Daniel Smith (daniel-smith-v) wrote :

Hey there.

When the panic occurs, it happens after the TTY is shut down, so I don't have a console on the node and can't do anything at the command line.

Daniel Smith (daniel-smith-v) wrote :

Sorry - perhaps I misunderstood.

Do you want me to take a "normally" running (rebooted) node and just run the commands, to prove that a manually created panic causes the reboot to occur?

I will try that now.

Dmitry Teselkin (teselkin-d) wrote :

Yes, run the commands on a normally running node, to check whether the issue is related to the state that caused the failure or to the node itself.

BTW, do you know that the BIOS is quite old on nodes 29 and 30? They run a version from 2013, but the latest version is from 11 Nov 2015.

Daniel Smith (daniel-smith-v) wrote :

Hey there.

Thanks! I will give it a try.

As for the BIOS, we are aware and would like to upgrade. However, with that chassis we can't do blades one at a time (we have to do the whole OA and all the blades at once). This means bringing everyone's labs down, so we are sort of stuck until our OPNFV release is out and I can upgrade them.

Are you thinking that it's a BIOS issue?

D

Daniel Smith (daniel-smith-v) wrote :

Just FYI (about the potential BIOS issue).

We have been using these blades for Fuel 6, 6.1 and 7.0 and didn't see this issue before starting with 8. That's not to say that a BIOS update isn't a good idea (and needed), just to point out that this really started to be seen on a regular basis with 8.

Cheers,
D

- and the screenshots are coming.

Daniel Smith (daniel-smith-v) wrote :

Hey there.

I didn't catch a screenshot, but I did see the following (I will redo it with a video this time and post the link).

After entering the last command, a panic was triggered. However, this was not an ext4 panic (as seen before).

I saw the "Rebooting in 15 seconds" message.
The node rebooted as expected.

So I'm wondering whether it's the EXT4-fs panic that causes the reboot to fail. Perhaps the "reboot" call is done by a script, and since the filesystem is no longer there, the script is no longer available?

I will post the video link shortly.

thanks everyone,
Daniel

Daniel Smith (daniel-smith-v) wrote :

Hey there.

Here is a video of the commands executed and the result.

https://drive.google.com/file/d/0B96eAbbIavDsWW1wZ2VqZ2Jnb1E/view?usp=sharing

Perhaps we can cause an EXT4-fs panic (what we see when delete/reset environment is sent from Fuel) and see whether we can reproduce the problem? (I'm trying to figure out a way to do that - my knowledge of signals isn't what it should be.)

Cheers,
Daniel

tags: added: move-to-mu
Ivan Suzdal (isuzdal) wrote :

I have a few questions:
1) Is this error reproduced each time you try to reset/delete the environment?
2) Is it reproduced only on certain hardware, or is other hardware also affected?
3) Could you provide the output of
'cat /proc/cmdline; sysctl kernel.sysrq kernel.panic kernel.panic_on_oops; uname -r'
on a node before you try to delete/reset the environment?

Changed in fuel:
assignee: MOS Linux (mos-linux) → Ivan Suzdal (isuzdal)
Alexei Sheplyakov (asheplyakov) wrote :

The node removal code in astute (https://github.com/openstack/fuel-astute/blob/master/mcagents/erase_node.rb#L107-L150) is all but crazy. It tries to wipe out the drive despite the root filesystem being backed by that very drive. Astute should be doing something smarter, say

a) Fetch the latest bootstrap kernel and initramfs from the master node
b) configure grub to boot those and reboot, or boot the bootstrap via kexec

Changed in fuel:
assignee: Ivan Suzdal (isuzdal) → nobody
Alexei Sheplyakov (asheplyakov) wrote :

The kernel correctly complains about filesystem metadata corruption (no, https://github.com/openstack/fuel-astute/blob/master/mcagents/erase_node.rb#L152-L160 is NOT enough, the kernel can't ignore all filesystem problems).
Please assign this bug to astute developers.

Roman Podoliaka (rpodolyaka) wrote :

Fuel team, please take a look at Alexey's comments ^ - looks like we need to fix astute here.

If this leaves nodes unusable after cluster reset that sounds Critical to me.

tags: added: promoted-to-critical
removed: move-to-mu
tags: added: area-library
removed: area-linux
Matthew Mosesohn (raytrac3r) wrote :

This bug is not in progress yet. It will not make 8.0 HCF tomorrow. Moving to 8.0-updates

tags: added: move-to-mu
removed: promoted-to-critical
Daniel Smith (daniel-smith-v) wrote :

@Ivan Suzdal

Answers to your questions below:

1) Is this error reproduced each time you try to reset/delete the environment?
       - Consistently on Gen8 BL460 (again, we know that a firmware update is out there)
                - Not seen in Fuel 6.0, 6.1, 7.0 (or for us ARNO, ARNO SR1, Brahmaputra-del)

       - Seen occasionally on Gen9 BL460 - Nov 2015 FM installed (newly commissioned chassis)

2) Is it reproduced only on certain hardware, or is other hardware also affected?
      - See above. I only have two sets of HW under my control, so I can't speak to more models than this.
3) Could you provide the output of
'cat /proc/cmdline; sysctl kernel.sysrq kernel.panic kernel.panic_on_oops; uname -r'
on a node before you try to delete/reset the environment?

       - The system is in use this morning. When the day is over and people are done, I will run the commands and send the output.

Cheers,
Daniel

Matthew Mosesohn (raytrac3r) wrote :

#25 points us to redirect to astute team. Moving to fuel-python

tags: added: area-python module-astute team-bugfix
removed: area-library
Alex Schultz (alex-schultz) wrote :

Here's a bit of the history behind errors=panic and this code. The reason for errors=panic is to prevent nodes from remaining online if a filesystem goes read-only. For a controller this is very bad, because mysql/rabbitmq may still try to function and cause issues. The erase node code has been in place for some time now, so perhaps this is an issue with a newer Ubuntu kernel? As part of the erase node code, we disable errors=panic, because what the erase node process does is corrupt the filesystem. So, regarding comment #25, it must ignore the corruption because that's what we are intentionally doing. Do you see something specifically wrong with that function, meaning it's not doing what it's supposed to be doing? Perhaps we're not properly disabling errors=panic for that hardware. Are we missing a device in that dev lookup?

@Daniel, can you provide a device listing for us?
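One way to see which filesystems would panic on an error is to look for the ext4 mount option, which is spelled `errors=panic`, in the mount table. The sketch below is only an illustration: `ext4_panic_mounts` is a hypothetical helper, and it assumes the option is visible in `/proc/mounts`. A policy set only in the superblock (for example with `tune2fs -e panic`) may not appear there, in which case `tune2fs -l <device>` reports it under "Errors behavior".

```shell
#!/bin/sh
# List ext4 mounts whose option string contains errors=panic.
# The file argument defaults to /proc/mounts; it is a parameter only so the
# function can be tried against a sample mount table.
ext4_panic_mounts() {
    mounts="${1:-/proc/mounts}"
    # Fields: device, mount point, fs type, comma-separated options, ...
    awk '$3 == "ext4" && index("," $4 ",", ",errors=panic,") { print $1, $2 }' "$mounts"
}
```

Each output line is a device and its mount point; an empty result means no mounted ext4 filesystem shows the option.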

Daniel Smith (daniel-smith-v) wrote :

@Alex Schultz

Do you want lspci output or something more?

Evgeniy L (rustyrobot) wrote :

So at the moment we have several options for what to do here:

1. Pass a link to a kernel/initramfs from Nailgun to Astute during node deletion. The node boots into it using kexec, and after that MCollective is started with the same config (timeouts/retries should be set appropriately) to erase the disks. Astute then restarts the node using MCollective.

2. Create a flag for the env which defines whether disks on the nodes in the env should be erased.

The 1st solution is very invasive for a bug fix and may lead to plenty of problems. Proper versioning would also have to be implemented (as far as I remember, we had different links to bootstrap for different release versions).
The 2nd is not a full solution but a workaround for users who have problems with panics, and it requires proper boot ordering in the BIOS.

Andrey Maximov (maximov)
tags: added: move-to-9.0
removed: move-to-mu
Dmitriy Novakovskiy (dnovakovskiy) wrote :

Please mark this issue (and possible workarounds) in RN (incl. info about blade models), then we can move it to mu/9.0

Dmitry Pyzhov (dpyzhov)
tags: added: release-notes
tags: added: 8.0 release-notes-done
removed: release-notes
Dmitry Pyzhov (dpyzhov)
tags: added: feature-reset-env
tags: removed: feature-reset-env
Dmitry Guryanov (dguryanov) wrote :

Here is a way to unmount the root fs - http://www.ivarch.com/blogs/oss/2007/01/resize-a-live-root-fs-a-howto.shtml

I haven't tried it yet, but it seems to work. So to fix the bug we can unmount all filesystems and then erase the first blocks.

Dmitry Guryanov (dguryanov) wrote :

I've managed to umount root fs on a live system

actions:
1. make some root fs on tmpfs (I copied files from live system)
2. stop all services and processes except ssh
3. move all mounts to a new root fs (with mount --move)
4. do pivot_root /newroot/ /newroot/oldroot
5. reexecute init with telinit u
6. restart ssh
7. close ssh session and login again
8. it's now possible to umount /oldroot

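The steps above can be sketched as a dry-run script that only prints the commands it would run. This is an untested outline, not the procedure actually used: the `/newroot` path, the list of mounts to move, the `cp -ax` copy and the `service ssh restart` call are all assumptions, and stopping services (step 2) is left out because it depends on the distribution.

```shell
#!/bin/sh
# Dry-run sketch: prints the command for each step instead of executing it.
# Change DRY_RUN to 0 only on a disposable node; the real commands replace
# the running root filesystem.
DRY_RUN=1
NEWROOT=/newroot   # hypothetical mount point for the tmpfs root

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

pivot_plan() {
    # 1. Build a root filesystem on tmpfs from the live system.
    run mkdir -p "$NEWROOT"
    run mount -t tmpfs tmpfs "$NEWROOT"
    run cp -ax / "$NEWROOT"
    run mkdir -p "$NEWROOT/oldroot"
    # 2. Stopping services and processes is distribution-specific (omitted).
    # 3. Move the remaining mounts into the new root.
    for m in proc sys dev run; do
        run mount --move "/$m" "$NEWROOT/$m"
    done
    # 4. Swap roots; the old root ends up under /oldroot.
    run pivot_root "$NEWROOT" "$NEWROOT/oldroot"
    # 5. Re-execute init so it no longer holds the old root open.
    run telinit u
    # 6-7. Restart ssh (service name is an assumption), then log in again.
    run service ssh restart
    # 8. The old root can now be unmounted.
    run umount /oldroot
}

pivot_plan
```

Printing the plan first makes it easy to review the order of operations before anything touches a live node.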
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/297669

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/297669
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=e770d4ec7d302e958ffae8db87e633e9d5e3db91
Submitter: Jenkins
Branch: master

commit e770d4ec7d302e958ffae8db87e633e9d5e3db91
Author: Dmitry Guryanov <email address hidden>
Date: Fri Mar 25 16:19:29 2016 +0300

    Erase only boot code in MBR during node delete

    Zeroing first 1Mb of a block device will almost always
    lead to serious system error, if there is a filesystem
    on this block device and this filesystem is mounted.

    This code had 2 points: to prevent system from booting
    from hard drive after reboot and to prevent anaconda
    (or other OS installer) to automatically recognize
    disk structure. But since we only use image-based provision
    this is not the case anymore.

    It's enough to clear first 446 bytes of MBR where the first
    stage of a bootloader resides. So we will also be able to
    check nodes disks and filesystems after deleting it from
    cluster for troubleshooting for example, because partition
    scheme will not be touched.

    https://wiki.archlinux.org/index.php/Master_Boot_Record

    Change-Id: I60f2b250d6d5997b4b471c27ed00d057ae73c30a
    Fixes-bug: #1538587
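The layout behind this change is simple arithmetic: a classic MBR sector is 512 bytes, made up of 446 bytes of boot code, a 64-byte partition table and a 2-byte signature. As a small illustration (run against a scratch file, never a real disk, and not taken from the astute code), the following shows that zeroing only the first 446 bytes leaves the remaining 66 bytes untouched.

```shell
#!/bin/sh
# Demonstrate on a scratch file that clearing the first 446 bytes of an
# MBR-sized sector leaves the partition table and signature bytes alone.
LC_ALL=C
export LC_ALL

img=$(mktemp)

# One 512-byte "sector" filled with 0xFF so every surviving byte is visible.
dd if=/dev/zero bs=512 count=1 2>/dev/null | tr '\000' '\377' > "$img"

# Zero only the boot code area; conv=notrunc keeps the rest of the file.
dd if=/dev/zero of="$img" bs=446 count=1 conv=notrunc 2>/dev/null

# Count the bytes that are still non-zero: 64 (table) + 2 (signature).
surviving=$(( $(tr -d '\000' < "$img" | wc -c) ))
echo "bytes left untouched: $surviving"   # prints: bytes left untouched: 66
```

Zeroing a full 1 MB instead would also wipe the partition table and, on typical layouts, the start of the first filesystem, which is what the commit message identifies as the source of the errors.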

Changed in fuel:
status: In Progress → Fix Committed
Alexander Gordeev (a-gordeev) wrote :

Oh no!

oflag=direct isn't working.

proof:

http://paste.openstack.org/show/495181/

I was still able to boot the node from the local disk. After I executed the dd command without oflag=direct, it finally stopped booting from the local disk.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/309537

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/310521

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/309537
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=9a547037efb6a5dab041a69e47bba786c76508ae
Submitter: Jenkins
Branch: master

commit 9a547037efb6a5dab041a69e47bba786c76508ae
Author: Dmitry Guryanov <email address hidden>
Date: Fri Apr 22 20:53:23 2016 +0300

    Don't use oflag=direct while erasing bootloader code

    We are erasing 446 bytes of bootloader code before rebooting
    node, but you can write only blocks with O_DIRECT, so let's
    write 446 bytes without oflag=direct and do sync after dd
    instead.

    Change-Id: I7b5f9749b170a9b7191991dd02cb867bba0e4c43
    Partial-Bug: #1573105
    Closes-Bug: #1538587
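A minimal sketch of the buffered write described in this commit message, run against a scratch file: `erase_boot_code` is a hypothetical helper name and the exact dd invocation is an assumption, not the command from astute. The point it illustrates is that 446 is not a multiple of a typical 512-byte logical block, so the write goes through the page cache and is flushed with `sync` afterwards.

```shell
#!/bin/sh
# 446 is not a multiple of a typical 512-byte logical block, which is why a
# direct (O_DIRECT) write of exactly this size is not possible.
echo "446 mod 512 = $((446 % 512))"   # prints: 446 mod 512 = 446

# Hypothetical helper: buffered write of 446 zero bytes to the target,
# followed by sync so the data is flushed from the page cache.
erase_boot_code() {
    dd if=/dev/zero of="$1" bs=446 count=1 conv=notrunc 2>/dev/null && sync
}

# Try it on a scratch file rather than a block device.
scratch=$(mktemp)
printf 'BOOT' > "$scratch"
erase_boot_code "$scratch"
```

On a real node the argument would be the boot disk, and the `sync` matters because the node is rebooted right after the write.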

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/mitaka)

Reviewed: https://review.openstack.org/310521
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=8d7aca3110ddb2af0e00d22a2ea4d1802cfd996e
Submitter: Jenkins
Branch: stable/mitaka

commit 8d7aca3110ddb2af0e00d22a2ea4d1802cfd996e
Author: Dmitry Guryanov <email address hidden>
Date: Fri Apr 22 20:53:23 2016 +0300

    Don't use oflag=direct while erasing bootloader code

    We are erasing 446 bytes of bootloader code before rebooting
    node, but you can write only blocks with O_DIRECT, so let's
    write 446 bytes without oflag=direct and do sync after dd
    instead.

    Change-Id: I7b5f9749b170a9b7191991dd02cb867bba0e4c43
    Partial-Bug: #1573105
    Closes-Bug: #1538587
    (cherry picked from commit d25a0004d893505b0d3dad8fb09ef9be5a2a40bd)

Maksym Strukov (unbelll) wrote :

Verified as fixed in 9.0-mos-452

Changed in fuel:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/426740

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/426764

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/433178

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/433180

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/7.0)

Reviewed: https://review.openstack.org/426764
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=77f1015b3ec2bb192ebd20a95dfbbdd00e4f29c5
Submitter: Jenkins
Branch: stable/7.0

commit 77f1015b3ec2bb192ebd20a95dfbbdd00e4f29c5
Author: Dmitry Guryanov <email address hidden>
Date: Fri Mar 25 16:19:29 2016 +0300

    Erase only boot code in MBR during node delete

    Zeroing first 1Mb of a block device will almost always
    lead to serious system error, if there is a filesystem
    on this block device and this filesystem is mounted.

    This code had 2 points: to prevent system from booting
    from hard drive after reboot and to prevent anaconda
    (or other OS installer) to automatically recognize
    disk structure. But since we only use image-based provision
    this is not the case anymore.

    It's enough to clear first 446 bytes of MBR where the first
    stage of a bootloader resides. So we will also be able to
    check nodes disks and filesystems after deleting it from
    cluster for troubleshooting for example, because partition
    scheme will not be touched.

    https://wiki.archlinux.org/index.php/Master_Boot_Record

    Change-Id: I60f2b250d6d5997b4b471c27ed00d057ae73c30a
    Fixes-bug: #1538587
    Fixes-bug: #1640014
    (cherry picked from commit e770d4ec7d302e958ffae8db87e633e9d5e3db91)

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/433180
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=016be055f6c5f01642e4bf714035250f424bfb90
Submitter: Jenkins
Branch: stable/7.0

commit 016be055f6c5f01642e4bf714035250f424bfb90
Author: Dmitry Guryanov <email address hidden>
Date: Fri Apr 22 20:53:23 2016 +0300

    Don't use oflag=direct while erasing bootloader code

    We are erasing 446 bytes of bootloader code before rebooting
    node, but you can write only blocks with O_DIRECT, so let's
    write 446 bytes without oflag=direct and do sync after dd
    instead.

    Change-Id: I7b5f9749b170a9b7191991dd02cb867bba0e4c43
    Partial-Bug: #1573105
    Closes-Bug: #1538587
    (cherry picked from commit d25a0004d893505b0d3dad8fb09ef9be5a2a40bd)

tags: added: on-verification
Ekaterina Shutova (eshutova) wrote :

Verified on MOS 7.0 + mu7 updates.

tags: removed: on-verification
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/8.0)

Reviewed: https://review.openstack.org/426740
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=9edd8a95767f5cf36e151aaf487405d7e66c4512
Submitter: Jenkins
Branch: stable/8.0

commit 9edd8a95767f5cf36e151aaf487405d7e66c4512
Author: Dmitry Guryanov <email address hidden>
Date: Fri Mar 25 16:19:29 2016 +0300

    Erase only boot code in MBR during node delete

    Zeroing first 1Mb of a block device will almost always
    lead to serious system error, if there is a filesystem
    on this block device and this filesystem is mounted.

    This code had 2 points: to prevent system from booting
    from hard drive after reboot and to prevent anaconda
    (or other OS installer) to automatically recognize
    disk structure. But since we only use image-based provision
    this is not the case anymore.

    It's enough to clear first 446 bytes of MBR where the first
    stage of a bootloader resides. So we will also be able to
    check nodes disks and filesystems after deleting it from
    cluster for troubleshooting for example, because partition
    scheme will not be touched.

    https://wiki.archlinux.org/index.php/Master_Boot_Record

    Change-Id: I60f2b250d6d5997b4b471c27ed00d057ae73c30a
    Fixes-bug: #1538587
    Fixes-bug: #1640014
    (cherry picked from commit e770d4ec7d302e958ffae8db87e633e9d5e3db91)

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/433178
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=a47336af9d724fd4df4cfa12cd19df18698e29ec
Submitter: Jenkins
Branch: stable/8.0

commit a47336af9d724fd4df4cfa12cd19df18698e29ec
Author: Dmitry Guryanov <email address hidden>
Date: Fri Apr 22 20:53:23 2016 +0300

    Don't use oflag=direct while erasing bootloader code

    We are erasing 446 bytes of bootloader code before rebooting
    node, but you can write only blocks with O_DIRECT, so let's
    write 446 bytes without oflag=direct and do sync after dd
    instead.

    Change-Id: I7b5f9749b170a9b7191991dd02cb867bba0e4c43
    Partial-Bug: #1573105
    Closes-Bug: #1538587
    (cherry picked from commit d25a0004d893505b0d3dad8fb09ef9be5a2a40bd)

Ekaterina Shutova (eshutova) wrote :

Verified on MOS 8.0 + mu4 updates.
