[ubuntu_bootstrap] There is a chance to get slaves bootstrapped with broken kernel parameters
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Fuel for OpenStack | Fix Released | Low | Aleksey Zvyagintsev | | |
Bug Description
When I deploy Fuel on bare metal, I provision the master node via PXE/NFS and don't access it over IPMI each time. To check whether it is ready, I log in to the master via SSH and run the `fuel nodes` and `docker ps` commands. If they work as expected, I boot the slave nodes. Today I hit the following issue:
1. slave nodes are bootstrapped and discovered in nailgun
2. after I created a new environment and added those nodes, I started the network check and it failed
3. if I select all nodes and try to configure interfaces, I can't save the changes; the GUI prints an error
I found the root cause of the networking issues: some nodes have the admin/PXE interface named 'eno1', while others use the old 'eth0' naming. I checked the kernel parameters and found that 'net.ifnames=1' is missing:
initrd=
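To spot affected nodes quickly, one could test a node's kernel command line for the parameter. This is only a minimal sketch: the function name `has_ifnames` is hypothetical and not part of Fuel, and it assumes the command line is passed in as a single string.

```shell
# Report whether a kernel command line contains net.ifnames=1.
# has_ifnames is a hypothetical helper, not a Fuel or cobbler command.
has_ifnames() {
    # Pad with spaces so the parameter matches only as a whole word.
    case " $1 " in
        *" net.ifnames=1 "*) echo "present" ;;
        *) echo "missing" ;;
    esac
}
```

On a bootstrapped slave it could be called as `has_ifnames "$(cat /proc/cmdline)"`.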
Then I checked the cobbler profiles and confirmed that 'ubuntu_bootstrap' has net.ifnames=1 in its kernel options. I rebooted the slaves, and all of them now have predictable interface names (enoX). However, the node interfaces weren't updated in Nailgun, so I had to remove them manually from the DB.
After a discussion with @azvyagintsev, we found that I had booted the slaves too early the first time: cobbler already had a profile for ubuntu_bootstrap, but it wasn't configured properly yet. It sounds like I broke the rules and did something wrong, but Fuel let me do that :). I ended up with slave nodes that aren't marked as 'error' but can't be added to an environment and deployed. So IMHO this issue leads to bad UX.
description: | updated |
Changed in fuel: | |
status: | New → Confirmed |
tags: | added: area-library tech-debt |
Changed in fuel: | |
assignee: | nobody → Fuel Library Team (fuel-library) |
Changed in fuel: | |
milestone: | 9.0 → 8.0 |
tags: | added: area-docs |
tags: | removed: area-library tech-debt |
Changed in fuel: | |
status: | Fix Committed → Fix Released |
This issue is related to a restriction of the current architecture.
The problem appears only with the following flow:
1) The deprecated CentOS bootstrap is active
(in this bug it looks like Artem booted the hardware before the fuel-master installation had completed, while ubuntu-bootstrap had not been activated yet)
2) The user boots a node and assigns it to a Fuel OpenStack env, or starts/resets a deployment (any case where Fuel stores the node and creates a cobbler system)
2.1)
cobbler system report --name default |grep -i Profile
Profile : bootstrap
3) Fuel (cobbler) creates the system with the CentOS bootstrap profile:
cobbler system report --name node-1
...
Profile : bootstrap (centos-bootstrap)
...
4) Then the user changes the active bootstrap, and astute changes the cobbler default profile to 'ubuntu_bootstrap'
cobbler system report --name default |grep -i Profile
Profile : ubuntu_bootstrap
But the stored system still uses the old bootstrap.
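The mismatch described in this flow can be detected by comparing the `Profile` line of the default system with that of a stored node. The sketch below assumes the report prints a `Profile : <name>` line, as in the output quoted above; `profile_of` is a hypothetical helper name.

```shell
# Extract the profile name from `cobbler system report` output on stdin.
# profile_of is a hypothetical helper; it assumes a "Profile : <name>" line.
profile_of() {
    awk -F' *: *' '/^Profile/ { print $2; exit }'
}

# Example comparison (commented out because it needs a Fuel master node):
# default=$(cobbler system report --name default | profile_of)
# node=$(cobbler system report --name node-1 | profile_of)
# [ "$default" = "$node" ] || echo "node-1 still uses profile: $node"
```

If the two values differ, the node was stored with the old bootstrap profile.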
Possible solutions:
1) Remove the node from the DB, reboot it, and let it be re-discovered
2) Manually update the profile:
cobbler system edit --name node-1 --profile=
[root@nailgun log]# cobbler system report --name node-1 |grep Profile
Profile : ubuntu_bootstrap
Both cases will be covered in the documentation.
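When several nodes are affected, the manual profile update could be planned in bulk. The following is a rough dry-run sketch, not an official procedure: `plan_profile_fix` is a hypothetical helper that only prints `cobbler system edit` commands for review, and it assumes the caller supplies "name profile" pairs on stdin, one per line.

```shell
# Print (do not run) an edit command for each system whose profile
# differs from the target profile given as $1.
# Input on stdin: one "name profile" pair per line (assumed format).
plan_profile_fix() {
    target="$1"
    while read -r name profile; do
        # Skip blank lines.
        [ -n "$name" ] || continue
        if [ "$profile" != "$target" ]; then
            echo "cobbler system edit --name $name --profile=$target"
        fi
    done
}
```

For example, `printf 'node-1 bootstrap\n' | plan_profile_fix ubuntu_bootstrap` prints the edit command for node-1, which can then be inspected and run by hand.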