nova-* services are marked as down after MySQL / RabbitMQ failures
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Mirantis OpenStack | Fix Released | Critical | Roman Podoliaka |
7.0.x | Invalid | Critical | Roman Podoliaka |
8.0.x | Fix Released | Critical | Roman Podoliaka |
Bug Description
Upstream bugs:
https:/
=======
Fuel version info (8.0 build #169): http://
One of the two nova-compute services in the cloud was marked as down after MySQL was terminated on the controllers, one by one:
Check that required services are running failure
fuel_health.
File "/usr/lib/
u'XXX' not in output, 'Step 2 failed: Some nova services '
File "/usr/lib/
self.
File "/usr/lib/
raise self.failureExc
AssertionError: Step 2 failed: Some nova services have not been started.. Please refer to OpenStack logs for more details.
Binary Host Zone Status State Updated_At
nova-cert node-2.
nova-consoleauth node-2.
nova-scheduler node-2.
nova-conductor node-2.
nova-cert node-3.
nova-consoleauth node-3.
nova-scheduler node-3.
nova-conductor node-3.
nova-cert node-4.
nova-consoleauth node-4.
nova-scheduler node-4.
nova-conductor node-4.
nova-compute node-1.
nova-compute node-5.
Steps to reproduce:
1. Terminate MySQL on a controller node
2. Wait while it is being restarted
3. Verify that it has been restarted
4. Repeat on each remaining controller (kill the mysql daemon, wait until it is recovered by Pacemaker, go to the next controller, kill the mysql daemon, and so on)
5. Run OSTF
Expected result: all health checks pass
Actual result: the test 'Check that required services are running' fails
Here is a part of nova-compute log on node-5:
2015-11-17 06:29:18.560 8431 ERROR oslo.service.
unication packet', system error: 0") [SQL: u'SELECT 1']
http://
The service remained marked as down (for up to one hour) until I restarted nova-compute on node-5.
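For context, a service is typically reported as down when its last heartbeat (the Updated_At column above) is older than a configured threshold. The sketch below only illustrates that idea. The function name `is_up`, the 60-second threshold, and the "now" timestamp are assumptions made for the example; this is not Nova's actual code.

```python
from datetime import datetime, timedelta

# Assumed threshold for illustration; the real value comes from the service's configuration.
SERVICE_DOWN_TIME = timedelta(seconds=60)


def is_up(updated_at, now, down_time=SERVICE_DOWN_TIME):
    """Return True if the last heartbeat is no older than down_time."""
    return abs(now - updated_at) <= down_time


# Hypothetical point in time at which the health check runs.
now = datetime(2015, 11, 17, 7, 0, 0)

# A heartbeat written 30 seconds ago: the service counts as up.
print(is_up(datetime(2015, 11, 17, 6, 59, 30), now))  # True

# A heartbeat that stopped at the time of the logged error: the service counts as down.
print(is_up(datetime(2015, 11, 17, 6, 29, 18), now))  # False
```

If the periodic heartbeat stops being written, a check of this kind keeps reporting the service as down until something (such as a restart) resumes the updates.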
tags: | added: area-mos |
tags: | added: swarm-blocker |
Changed in fuel: | |
status: | Incomplete → Confirmed |
description: | updated |
summary: |
- nova-compute is marked as down after MySQL failover
+ nova-* services are marked as down after MySQL failover
Changed in fuel: | |
assignee: | MOS Nova (mos-nova) → Roman Podoliaka (rpodolyaka) |
description: | updated |
summary: |
- nova-* services are marked as down after MySQL failover
+ nova-* services are marked as down after MySQL / RabbitMQ failures
tags: | added: on-verification |
no longer affects: | fuel |
Changed in mos: | |
status: | Fix Released → New |
Artem, have you checked Galera status?
2015-11-17 06:29:18.560 8431 ERROR oslo.service.loopingcall RemoteError: Remote error: DBConnectionError (_mysql_exceptions.OperationalError) (2013, "Lost connection to MySQL
unication packet', system error: 0") [SQL: u'SELECT 1']
means that oslo.db performed its pessimistic connection check (a SELECT 1 at the beginning of each new transaction) and found that the connection had been invalidated. Error 2013 returned by Galera means that the node is currently not available to clients (most likely it is in the middle of a cluster re-assembly).
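The idea behind a pessimistic connection check can be sketched roughly as follows. This is a minimal illustration, not oslo.db's implementation: the standard-library `sqlite3` module stands in for MySQL so that the example is self-contained, and `get_live_connection` and `connect` are hypothetical names.

```python
import sqlite3


def get_live_connection(conn, connect):
    """Ping conn with SELECT 1; if the ping fails, return a fresh connection."""
    try:
        conn.execute("SELECT 1").fetchone()
        return conn
    except sqlite3.Error:
        # The connection is unusable: discard it and open a new one.
        return connect()


def connect():
    # Hypothetical connection factory; a real service would connect to its database server.
    return sqlite3.connect(":memory:")


stale = connect()
stale.close()  # simulate a connection invalidated by a database failure

conn = get_live_connection(stale, connect)
print(conn is stale)                         # False
print(conn.execute("SELECT 1").fetchone())   # (1,)
```

In this sketch a failed ping is handled by reconnecting once. If the database is still unavailable, `connect()` itself raises, which corresponds to the situation in the log above, where the error propagated to the caller.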