[fuel-agent] Bootstrap boot failed because of systemd-udevd timeout

Bug #1632592 reported by Wei Hui
This bug affects 1 person
Affects             Status         Importance  Assigned to  Milestone
Fuel for OpenStack  Fix Committed  High        Wei Hui
Mitaka              Fix Released   High        Wei Hui
Newton              Fix Committed  High        Wei Hui

Bug Description

My bootstrap is based on Ubuntu lts-trusty.
The bootstrap often fails to start on a Lenovo RD640 server, and the kernel complains about being stuck for too long.
The log shows many systemd-udevd timeout events.
This bootstrap works fine on other Dell and Inspur servers and fails only on the Lenovo ThinkServer RD640.

[Steps to reproduce]
Reboot a few times; the boot fails intermittently.

[Expected result]
The node boots normally and successfully.

[Actual result]
The boot fails and the node drops into the systemd emergency shell.

[Workaround]
I tried adjusting the systemd-udevd timeout, but it did not help.
One thing I noticed is that my monitor prints output very slowly, so I suspect the problem is related to kernel debug output. Printing the debug output keeps the CPUs so busy that they have no time left to execute systemd-udevd commands.
After I removed the "debug" kernel boot parameter, the bootstrap booted successfully.
https://github.com/openstack/fuel-agent/blob/master/contrib/fuel_bootstrap/fuel_bootstrap_cli/fuel_bootstrap/settings.yaml.sample#L6

So the question is whether it is necessary to keep this parameter, and under what circumstances we need it.
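
For anyone who wants to try the workaround on their own kernel parameter string, here is a minimal sketch in Python. The function name strip_kernel_params and the sample command line are hypothetical and used only for illustration; they are not taken from fuel-agent or from settings.yaml.sample. Only the "debug" parameter comes from this report, so it is the default; any other names to drop have to be passed in explicitly.

```python
def strip_kernel_params(cmdline, drop=("debug",)):
    """Return cmdline without the parameters whose names are in drop.

    A token's name is the part before the first '='; a bare flag such as
    'debug' is its own name. Tokens are split on whitespace, so quoted
    values that contain spaces are not handled by this sketch.
    """
    drop_names = set(drop)
    kept = [
        token
        for token in cmdline.split()
        if token.split("=", 1)[0] not in drop_names
    ]
    return " ".join(kept)


# Hypothetical command line, for illustration only.
sample = "console=tty0 panic=60 boot=live debug biosdevname=0"
print(strip_kernel_params(sample))
```

On the sample string this drops only the bare "debug" token and leaves every other parameter in its original order.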

Revision history for this message
Wei Hui (huiweics) wrote :
Wei Hui (huiweics)
description: updated
tags: added: area-library
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (master)

Fix proposed to branch: master
Review: https://review.openstack.org/386321

Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Wei Hui (huiweics)
status: Confirmed → In Progress
Revision history for this message
Vladimir Kozhukalov (kozhukalov) wrote :

Wei Hui, could you please cherry-pick this patch to stable/mitaka branch as well?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (master)

Reviewed: https://review.openstack.org/386321
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=babe668c21283a412fc558d40b4f87fd33b69260
Submitter: Jenkins
Branch: master

commit babe668c21283a412fc558d40b4f87fd33b69260
Author: Wei Hui <email address hidden>
Date: Fri Oct 14 11:24:51 2016 +0800

    Delete kernel debug related parameter for performance

    The deleted kernel parameters enable extremely heavy debugging.
    The boot process is rather slow on machines that have a
    complicated motherboard and lots of PCI devices (Lenovo RD640).
    The debug output costs so many resources that it eventually impacts
    normal functions: the CPUs are so busy dealing with debug
    info in kernel space that they do not have time to respond to udev
    events in user space, and the PCI devices fail to come up.

    Change-Id: I964c5fbc7f5b7d0ba6d1dd0166db3e4234179afa
    Closes-bug: #1632592

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/400165

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/400176

Revision history for this message
Wei Hui (huiweics) wrote :

@Vladimir Kozhukalov, the cherry-pick is done.

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 10.0 → 11.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (stable/mitaka)

Reviewed: https://review.openstack.org/400176
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=9eeba09d6c4ea20fc966032c9da9061890591319
Submitter: Jenkins
Branch: stable/mitaka

commit 9eeba09d6c4ea20fc966032c9da9061890591319
Author: Wei Hui <email address hidden>
Date: Mon Nov 21 18:41:45 2016 +0800

    Delete kernel debug related parameter for performance

    The deleted kernel parameters enable extremely heavy debugging.
    The boot process is rather slow on machines that have a
    complicated motherboard and lots of PCI devices (Lenovo RD640).
    The debug output costs so many resources that it eventually impacts
    normal functions: the CPUs are so busy dealing with debug
    info in kernel space that they do not have time to respond to udev
    events in user space, and the PCI devices fail to come up.

    Change-Id: I964c5fbc7f5b7d0ba6d1dd0166db3e4234179afa
    Closes-bug: #1632592
    (cherry picked from commit babe668c21283a412fc558d40b4f87fd33b69260)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (stable/newton)

Reviewed: https://review.openstack.org/400165
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=e950bee9ed70b1e69ee03f83cf2a9c565cdee5b3
Submitter: Jenkins
Branch: stable/newton

commit e950bee9ed70b1e69ee03f83cf2a9c565cdee5b3
Author: Wei Hui <email address hidden>
Date: Fri Oct 14 11:24:51 2016 +0800

    Delete kernel debug related parameter for performance

    The deleted kernel parameters enable extremely heavy debugging.
    The boot process is rather slow on machines that have a
    complicated motherboard and lots of PCI devices (Lenovo RD640).
    The debug output costs so many resources that it eventually impacts
    normal functions: the CPUs are so busy dealing with debug
    info in kernel space that they do not have time to respond to udev
    events in user space, and the PCI devices fail to come up.

    Change-Id: I964c5fbc7f5b7d0ba6d1dd0166db3e4234179afa
    Closes-bug: #1632592
    (cherry picked from commit babe668c21283a412fc558d40b4f87fd33b69260)

tags: added: on-verification
Revision history for this message
Sergey Novikov (snovikov) wrote :

Verified on 9.2 snapshot #792

tags: removed: on-verification
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/fuel-agent 11.0.0.0rc1

This issue was fixed in the openstack/fuel-agent 11.0.0.0rc1 release candidate.
