Deploy new compute node caused whole cluster Failed with Err status in fuel web
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Fuel for OpenStack | Invalid | High | JohnsonYi | | |
Bug Description
Deployed environment:
MOS 8.0
3 controllers
2 compute+Cinder with LVM
1 ironic
This environment has been in the ready state for weeks.
What I did was deploy a new compute node (compute+
Logs:
2016-03-11 02:22:20 +0000 Puppet (debug): Executing '/usr/bin/
2016-03-11 02:22:20 +0000 Puppet (debug): Executing '/usr/bin/apt-get -q -y -o DPkg::Options:
2016-03-11 02:22:20 +0000 Puppet (err): Execution of '/usr/bin/apt-get -q -y -o DPkg::Options:
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
python-nova : Depends: websockify (>= 0.6.1) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
/usr/lib/
/usr/lib/
/usr/lib/
/usr/lib/
/usr/lib/
/etc/puppet/
root@node-9:~# apt-get -q -y -o DPkg::Options:
Reading package lists...
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
python-nova : Depends: websockify (>= 0.6.1) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
root@node-9:~# apt-cache policy websockify
websockify:
Installed: (none)
Candidate: 0.6.1+dfsg1-
Version table:
0.
1000 http://
0.
500 http://
0.5.1+dfsg1-3 0
500 http://
root@node-9:~# dpkg -l | grep websockify
root@node-9:~#
I use a local Ubuntu and MOS repo, so there is no connectivity issue.
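For triage, the unmet dependency can be pulled out of the apt output programmatically. The following is only a minimal sketch: the function name `find_unmet_dependencies` and the regular expression are illustrative assumptions, not part of Fuel or apt, and the pattern only covers the single-line "Depends:" form shown in the log above.

```python
import re

# Hypothetical helper, not part of Fuel or apt.
# Matches lines such as:
#   "python-nova : Depends: websockify (>= 0.6.1) but it is not going to be installed"
UNMET_RE = re.compile(
    r"^\s*(?P<package>\S+)\s*:\s*Depends:\s*(?P<dependency>\S+)"
    r"(?:\s*\((?P<constraint>[^)]*)\))?"
)


def find_unmet_dependencies(apt_output):
    """Return (package, dependency, version constraint) tuples found in apt output."""
    results = []
    for line in apt_output.splitlines():
        match = UNMET_RE.match(line)
        if match:
            results.append(
                (match.group("package"),
                 match.group("dependency"),
                 match.group("constraint"))
            )
    return results


# Sample text copied from the log in this report.
SAMPLE = """\
The following packages have unmet dependencies:
python-nova : Depends: websockify (>= 0.6.1) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
"""

if __name__ == "__main__":
    print(find_unmet_dependencies(SAMPLE))
```

Each dependency it reports (here websockify) can then be checked with `apt-cache policy`, as done above.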
Updates:
2016-05-17
Today I reproduced this issue by disabling python-numpy, a dependency of websockify, in the local Ubuntu repo:
root@node-3:~# apt-cache policy python-numpy
python-numpy:
Installed: 1:1.8.2-0ubuntu0.1
Candidate: 1:1.8.2-0ubuntu0.1
Version table:
*** 1:1.8.2-0ubuntu0.1 0
500 http://
100 /var/lib/
1:
500 http://
mv python-
mv python-
Then I deployed a new compute node.
Installing python-nova failed as expected:
Get:90 http://
Get:91 http://
Fetched 12.3 MB in 0s (26.7 MB/s)
E: Failed to fetch http://
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
/usr/lib/
/usr/lib/
/usr/lib/
/usr/lib/
The new compute node was marked as "Err". All controllers and the other nodes then ran through their Puppet jobs and finished without problems, but a little later every node was marked as "Err", exactly as described in this bug report.
I then restored the python-numpy package and deployed again. The controllers ran through the Puppet scripts again and again, stuck at 11% (OpenStack was not redeployed); almost an hour and a half later, all nodes returned to the ready status.
So the OpenStack environment was not actually redeployed, and VM creation kept working. I may have misunderstood why the controllers went into the "Deploy" status. Robustness may still be a problem, though: deploying a new node can affect the whole environment, so a potential risk remains.
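One way to reduce that risk would be a pre-flight check on the new node before the deployment starts. The sketch below is an assumption about how such a check could look, not something Fuel provides: `can_install` is a hypothetical helper that runs `apt-get` in simulation mode and reports whether the dependency resolution succeeds.

```python
import subprocess


def can_install(packages, run=subprocess.run):
    """Simulate an apt-get install and return (ok, output).

    Hypothetical helper: 'run' is injectable so the check can be
    exercised without apt being present.
    """
    # --simulate only resolves dependencies from the package index;
    # nothing is downloaded or installed.
    cmd = ["apt-get", "-q", "-y", "--simulate", "install"] + list(packages)
    result = run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode == 0, result.stdout


if __name__ == "__main__":
    ok, output = can_install(["python-nova"])
    if not ok:
        print("python-nova is not installable on this node:")
        print(output)
```

A simulation like this would catch the unresolvable websockify dependency from the original logs. It would not catch the 2016-05-17 reproduction, where the index still listed the package and only the archive download failed.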
Changed in fuel: | |
assignee: | nobody → Sachin Yede (yede-sachin45) |
Changed in fuel: | |
importance: | Undecided → High |
Changed in fuel: | |
status: | Incomplete → Confirmed |
Changed in fuel: | |
assignee: | Sachin Yede (yede-sachin45) → nobody |
Changed in fuel: | |
assignee: | nobody → MOS Linux (mos-linux) |
Changed in fuel: | |
assignee: | MOS Maintenance (mos-maintenance) → Rodion Tikunov (rtikunov) |
Please attach a diagnostic snapshot.