Bare-metal: CentOS provisioning hangs on DELL servers

Bug #1312671 reported by Aleksandr Didenko
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Fuel for OpenStack | Fix Released | Medium | Aleksandr Didenko |

Bug Description

{"build_id": "2014-04-25_01-00-26", "mirantis": "yes", "build_number": "139", "nailgun_sha": "6b156d7434d0a70efe288cd8064d0ca90fed0737", "production": "prod", "ostf_sha": "134765fcb5a07dce0cd1bb399b2290c988c3c63b", "fuelmain_sha": "9939570720729dce22e635b0b9c9465e9d80661f", "astute_sha": "6e8fa4cc12968d7b468fc590b2f06bb59bf74511", "release": "5.0", "fuellib_sha": "ebc4b365dd60811e918c280a79631369a7afb868"}

Steps to reproduce:

1. Create any CentOS environment in Fuel
2. Click "Deploy changes"

Result:
Nodes hang in the provisioning state until the deployment times out.

On IPMI console I can see the following:

1. Kernel load hangs on "Switching to clocksource tsc" for ~10 minutes (screen shot: pre-kernel-bug.png)
2. Kernel bug appears and the system hangs (screen shot: kernel-bug.png)

It's similar to the bug described here: http://bugs.centos.org/view.php?id=6492

Revision history for this message
Aleksandr Didenko (adidenko) wrote :

screen shot: pre-kernel-bug.png

Revision history for this message
Aleksandr Didenko (adidenko) wrote :

screen shot: kernel-bug.png

description: updated
Revision history for this message
Vladimir Kuklin (vkuklin) wrote :

moving as high priority to 5.1

tags: added: release-notes
Changed in fuel:
milestone: 5.0 → 5.1
Revision history for this message
Aleksandr Didenko (adidenko) wrote :

Adding the following kernel options fixes this issue:

ipmi_si.tryacpi=0 ipmi_si.trydmi=0 ipmi_si.trydefaults=0
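To confirm that a node actually booted with these options, its kernel command line can be checked. The following is only a minimal sketch: the function name has_ipmi_workaround is hypothetical and is not part of Fuel or cobbler.

```shell
#!/bin/sh
# Hypothetical helper (not part of Fuel or cobbler).
# Returns 0 if every ipmi_si workaround option appears as a separate
# word in the kernel command line passed as $1, and 1 otherwise.
has_ipmi_workaround() {
    cmdline=$1
    for opt in ipmi_si.tryacpi=0 ipmi_si.trydmi=0 ipmi_si.trydefaults=0; do
        # Pad with spaces so an option only matches as a whole word.
        case " $cmdline " in
            *" $opt "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}
```

On a booted node it could be called as: has_ipmi_workaround "$(cat /proc/cmdline)" && echo "workaround options present"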

But now I'm hitting another issue: the system hangs with the following messages on the console (screen shot: kernel_swapper_hang.png)

Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

Set lower priority because there is a workaround

Changed in fuel:
importance: Critical → Medium
status: New → Confirmed
Revision history for this message
Aleksandr Didenko (adidenko) wrote :

It's a CentOS 6.5 kernel bug: http://bugs.centos.org/view.php?id=6492

How to work around it in Fuel:

Just run the following commands on a Fuel master node:

cobbler profile edit --name centos-x86_64 --kopts="ipmi_si.tryacpi=0 ipmi_si.trydefaults=0 ipmi_si.trydmi=0" --in-place
cobbler profile edit --name centos-x86_64 --kopts-post="ipmi_si.tryacpi=0 ipmi_si.trydefaults=0 ipmi_si.trydmi=0" --in-place
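The two commands above differ only in the option flag, so they can be generated from one list of kernel options. This is a sketch under assumptions: the variable IPMI_KOPTS and the function print_workaround_cmds are hypothetical names, and the function only prints the commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Kernel options taken from the workaround above.
IPMI_KOPTS="ipmi_si.tryacpi=0 ipmi_si.trydefaults=0 ipmi_si.trydmi=0"

# Hypothetical helper: print the cobbler commands for the given profile
# (default: centos-x86_64). Nothing is executed here; the output is
# plain text that can be inspected first.
print_workaround_cmds() {
    profile=${1:-centos-x86_64}
    for flag in --kopts --kopts-post; do
        printf 'cobbler profile edit --name %s %s="%s" --in-place\n' \
            "$profile" "$flag" "$IPMI_KOPTS"
    done
}
```

On the Fuel master node, the printed commands can be reviewed with print_workaround_cmds and then executed with print_workaround_cmds | sh. Passing a different profile name as the first argument is only useful if such a profile exists in your cobbler setup.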

Changed in fuel:
status: Confirmed → Invalid
Revision history for this message
Mike Scherbakov (mihgen) wrote :

I believe this information has to be added to the user documentation. Also, we added a field in the Settings tab where you can provide kernel options.

Revision history for this message
Aleksandr Didenko (adidenko) wrote :

> Also, we added a field in settings tab where you can provide kernel options.

Unfortunately, it won't help us get through the provisioning stage, because the custom kernel options that we specify in the Fuel UI do not affect the default cobbler profile "centos-x86_64". So nodes still get stuck when they try to boot "centos-x86_64" from cobbler, and we still need to run the 'cobbler profile edit --name centos-x86_64 --kopts="ipmi_si.tryacpi=0 ipmi_si.trydefaults=0 ipmi_si.trydmi=0" --in-place' command to work around this problem.

Changed in fuel:
milestone: 5.1 → 5.0
status: Invalid → Triaged
milestone: 5.0 → 5.1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-docs (master)

Fix proposed to branch: master
Review: https://review.openstack.org/93882

Changed in fuel:
status: Triaged → In Progress
Revision history for this message
Meg McRoberts (dreidellhasa) wrote :

I added a brief note to the Known Issues in the 5.0 Release Notes but need clarification about what should be said. Should Alexander's workaround commands be included in the Release Notes or in the instructions for deploying OpenStack? And do they actually solve the problem?

Revision history for this message
Aleksandr Didenko (adidenko) wrote :
Changed in fuel:
status: In Progress → Fix Released
Revision history for this message
Meg McRoberts (dreidellhasa) wrote :

Listed as a "Known Issue" in the 5.0.1 Release Notes. I combined the two notes that were in the 5.0 Release Notes for this issue.

tags: added: scale
Revision history for this message
Leontii Istomin (listomin) wrote :

Reproduced the issue with 7.0-98.
2 of 203 nodes hung with the same messages (screenshot is attached).

[root@bootstrap ~]# uname -a
Linux bootstrap 3.10.55-1.el6.mos5.x86_64 #1 SMP Tue Jun 30 13:21:52 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@bootstrap ~]# cat /etc/issue
CentOS release 6.5 (Final)
Kernel \r on an \m

Configuration of the nodes:
http://paste.openstack.org/show/414459/
http://paste.openstack.org/show/414460/

After a restart, the nodes booted as usual, without the issue.

Revision history for this message
Leontii Istomin (listomin) wrote :

Please ignore comment #14. I created a separate bug for this case: https://bugs.launchpad.net/fuel/+bug/1485019

tags: removed: scale