supervisorctl restarts docker containers
Bug #1319076 reported by Sergii Golovatiuk
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Fix Released | Critical | Matthew Mosesohn |
5.0.x | Fix Released | Critical | Matthew Mosesohn |
Bug Description
When an environment is deployed with sh "utils/jenkins/system_tests.sh" using --group=setup, it causes a side effect on the master node: it completely breaks the cobbler container, and supervisorctl tries to restart it constantly.
How to reproduce
Set the group to "setup" and deploy the environment.
Run docker ps -a on the master node.
The cobbler container will be either stopped or running for only 1-2 minutes.
Also, if you VNC into any slave node, you will see that it cannot get an IP address and reports "no boot media found".
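For example, on the master node you can check both the container and how supervisord sees it (a sketch; "docker-cobbler" is the supervisord program name used in the fix commands further down):
# inspect the cobbler container state
docker ps -a | grep cobbler
# see whether supervisord keeps trying to restart it
supervisorctl status docker-cobbler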
Changed in fuel:
assignee: nobody → Sergii Golovatiuk (sgolovatiuk)
assignee: Sergii Golovatiuk (sgolovatiuk) → Alexander Didenko (adidenko)
importance: Undecided → High
Changed in fuel:
milestone: none → 5.0
description: updated
Changed in fuel:
importance: Medium → High
assignee: Matthew Mosesohn (raytrac3r) → Alexander Didenko (adidenko)
status: Triaged → In Progress
Changed in fuel:
importance: High → Critical
no longer affects: fuel/5.1.x
Changed in fuel:
milestone: 5.0 → 5.1
This issue occurs only when we run system tests with --group=setup, like this:
sh "utils/jenkins/system_tests.sh" -t test -w $(pwd) -j "fuelweb_test" -i "$ISO_PATH" -o --group=setup
After starting the Fuel virtual server we get a broken cobbler container, which leads to "Unable to find boot device" on the OpenStack virtual nodes.
We can see the following in /var/log/docker-cobbler.log:
dnsmasq-dhcp: read /etc/ethers - 0 addresses
Traceback (most recent call last):
  File "/usr/bin/cobblerd", line 76, in main
    api = cobbler_api.BootAPI(is_cobblerd=True)
  File "/usr/lib/python2.6/site-packages/cobbler/api.py", line 130, in __init__
    self.deserialize()
  File "/usr/lib/python2.6/site-packages/cobbler/api.py", line 898, in deserialize
    return self._config.deserialize()
  File "/usr/lib/python2.6/site-packages/cobbler/config.py", line 266, in deserialize
    raise CX("serializer: error loading collection %s. Check /etc/cobbler/modules.conf" % item.collection_type())
CX: 'serializer: error loading collection profile. Check /etc/cobbler/modules.conf'
But we can't reproduce it if we deploy Fuel in a normal (manual) way.
This may be related to the snapshot/revert mechanism we use in our system tests. For example: the nailgun container is ready and "curl http://127.0.0.1:8000/api/version" works fine, so the system test scripts see "Fuel node deployment complete!" and snapshot the Fuel VM. But the cobbler container may still be in the middle of something when we catch it, so after a revert it gets completely messed up. The following commands fix this:
supervisorctl stop docker-cobbler
docker ps -a | grep cobbler | awk '{print $1}' | xargs docker rm -f
supervisorctl start docker-cobbler
So, maybe we should use "dockerctl check" in /usr/local/sbin/bootstrap_admin_node.sh instead of "curl http://127.0.0.1:8000/api/version" in order to determine whether Fuel node deployment is complete?
Please also note that currently "dockerctl check" returns exit code 0 even if some containers are in the "ERROR" state, so we should either make it exit 1, or grep the output of "dockerctl check", for example as sketched below.
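A minimal sketch of the grep approach, assuming "dockerctl check" prints a line containing "ERROR" for each unhealthy container (the exact output format is an assumption):
# treat any container reported in ERROR state as a deployment failure
if dockerctl check 2>&1 | grep -q "ERROR"; then
  echo "Some containers are not healthy yet" >&2
  exit 1
fi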