Fuel for OpenStack

Some nodes don't boot over pxe after cluster deletion

Bug #1319869 reported by Andrey Sledzinskiy on 2014-05-15

This bug affects 2 people

Affects	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Fix Released	High	Dmitry Ilyin	Fuel for OpenStack 5.0

Bug Description

reproduced on {"build_id": "2014-05-14_01-10-31", "mirantis": "yes", "build_number": "203", "ostf_sha": "ef970b442437072bdfa4ea99a7b2971215b2de18", "nailgun_sha": "155acff248aed9a8295d03d58346daa4851d49b4", "production": "docker", "api": "1.0", "fuelmain_sha": "9df792985f8984063979f16dc94b4df24ef40c2d", "astute_sha": "80e60b66e3cb4e3e61b22c61c4acfa127ba1bf7e", "release": "5.0", "fuellib_sha": "89ccab7ee76980e38c4d9a5fbcdf7df87e35d61f"}

Steps:
1. Create a cluster with the following settings: Ubuntu, HA, KVM, Neutron GRE, Ceph for images
2. Add 3 controllers, 1 compute, 1 cinder, 3 ceph nodes
3. Deploy the cluster
4. After successful deployment, delete the cluster

Actual result: some nodes failed to boot over PXE with error http://ipxe.org/err/040ee119 (see the attached screenshot).

Snapshot is attached

Revision history for this message

Andrey Sledzinskiy (asledzinskiy) wrote on 2014-05-15:

#1

Afterdeletionerrorpng (916.3 KiB, application/octet-stream)

Andrey Sledzinskiy (asledzinskiy) wrote on 2014-05-15:

#2

fuel-snapshot-2014-05-15_14-40-35.tgz (70.3 MiB, application/x-tar)

Nastya Urlapova (aurlapova) on 2014-05-15

Changed in fuel:
assignee:	nobody → Fuel Library Team (fuel-library)

Vladimir Kuklin (vkuklin) wrote on 2014-05-15:

#3

This looks related to https://bugs.launchpad.net/fuel/+bug/1317213, as Andrey's environment is likely to have dhcrelay performance problems. Maybe we could work around this by introducing a fuzzy (randomized) delay into the node reboot task in Nailgun.
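A minimal sketch of the fuzzy-delay idea, not Nailgun's actual reboot task: each node gets a random delay so that the DHCP requests do not all arrive at once. The function names, the `reboot` callable and the 30-second ceiling are assumptions for illustration only.

```python
import random
import time


def jittered_delays(node_ids, max_delay=30.0, rng=None):
    """Assign each node a random delay in [0, max_delay] seconds."""
    rng = rng or random.Random()
    return {node_id: rng.uniform(0.0, max_delay) for node_id in node_ids}


def reboot_with_jitter(node_ids, reboot, max_delay=30.0, rng=None,
                       sleep=time.sleep):
    """Reboot nodes one by one, in order of their random delay.

    `reboot` is a hypothetical callable that reboots a single node.
    Returns the order in which the nodes were rebooted.
    """
    delays = jittered_delays(node_ids, max_delay, rng)
    elapsed = 0.0
    order = []
    for node_id, delay in sorted(delays.items(), key=lambda item: item[1]):
        # Sleep only for the time remaining until this node's slot.
        sleep(delay - elapsed)
        elapsed = delay
        reboot(node_id)
        order.append(node_id)
    return order
```

The total wait is bounded by `max_delay`, so the ceiling trades deletion time against load on the DHCP server and relay.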

Changed in fuel:
assignee:	Fuel Library Team (fuel-library) → Fuel Python Team (fuel-python)

Mike Scherbakov (mihgen) on 2014-05-16

Changed in fuel:
assignee:	Fuel Python Team (fuel-python) → Dmitry Ilyin (idv1985)

Dmitry Ilyin (idv1985) wrote on 2014-05-16:

#4

On KVM I have never had any problem booting 5 or more nodes simultaneously, either before docker+dhcrelay was introduced or with it, and I have never had problems with DHCP server performance.

On VBOX, booting more than one node hangs every time, and it did so even before docker.

Perhaps it also has something to do with the versions of KVM and iPXE on our systems.

Dmitry Ilyin (idv1985) wrote on 2014-05-16:

#5

It's also possible that host records were not deleted from Cobbler when the environment was deleted. If you can reproduce this, please check in the Cobbler web UI that these nodes have had their host profile records removed, so that they would receive the blue menu.
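The same check could be scripted instead of done in the web UI. The sketch below assumes that Cobbler's XML-RPC API is reachable on the master node and that system records are named after the nodes; the endpoint URL and node names in the usage note are placeholders, not values taken from this environment.

```python
import xmlrpc.client


def leftover_systems(cobbler_system_names, deleted_node_names):
    """Return names of deleted nodes that still have a Cobbler system record."""
    existing = set(cobbler_system_names)
    return sorted(name for name in deleted_node_names if name in existing)


def fetch_system_names(api_url):
    """Fetch system record names over Cobbler's XML-RPC API.

    Assumption: `api_url` points at the Cobbler API endpoint and the
    `get_systems` call is available without authentication.
    """
    server = xmlrpc.client.ServerProxy(api_url)
    return [system["name"] for system in server.get_systems()]
```

For example, `leftover_systems(fetch_system_names("http://<master-ip>/cobbler_api"), ["node-1", "node-2"])` would list the nodes whose records survived the cluster deletion. An empty list means the records were removed as expected.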

Dmitry Ilyin (idv1985) wrote on 2014-05-16:

#6

We found that this is most likely caused by starting the slave nodes when either the DHCP server on the master node or dhcrelay is not ready yet. We decided to insert a dhcpcheck to determine that the master node is ready before starting the slaves.
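The readiness gate described above can be sketched as a polling loop. This is an illustration under assumptions, not the committed fix: `check` is a hypothetical callable that returns True once a DHCP probe against the master node succeeds, and the timeout and interval values are arbitrary.

```python
import time


def wait_until_ready(check, timeout=120.0, interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll `check` until it returns True or `timeout` seconds pass.

    Returns True if the service became ready, False on timeout.
    `clock` and `sleep` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The slaves would be started only when `wait_until_ready(...)` returns True. A False result signals that the DHCP server or relay never came up and that booting the nodes would likely fail again.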

Bogdan Dobrelya (bogdando) on 2014-05-19

Changed in fuel:
status:	New → Triaged

Vladimir Kuklin (vkuklin) wrote on 2014-05-20:

#7

https://review.openstack.org/#/c/93937/

Vladimir Kuklin (vkuklin) wrote on 2014-05-20:

#8

https://review.openstack.org/#/c/94232

Changed in fuel:
status:	Triaged → Fix Committed

Egor Kotko (ykotko) wrote on 2014-05-22:

#9

Verified on:
{"build_id": "2014-05-20_01-10-31", "mirantis": "yes", "build_number": "213", "ostf_sha": "353f918197ec53a00127fd28b9151f248a2a2d30", "nailgun_sha": "ab7f7dfddadfe0e08a39693c6d33aa0250f20142", "production": "docker", "api": "1.0", "fuelmain_sha": "68c62519bc788fd8ff27e4576a6cdf7e7fac14c0", "astute_sha": "a3432e6e31ffd6f1c56386b2eb54afeacb74750b", "release": "5.0", "fuellib_sha": "3d92142a5643af82596f0450e39282550a45e5db"}

Changed in fuel:
status:	Fix Committed → Fix Released
