[SRU] Using SSL with rabbitmq prevents communication between nova-compute and conductor after latest nova updates
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Invalid | Undecided | Unassigned |
oslo.messaging | Invalid | Undecided | Unassigned |
python-amqp (Ubuntu) | Fix Released | High | Edward Hope-Morley |
Trusty | Fix Released | High | Edward Hope-Morley |
Bug Description
[Impact]
The current oslo.messaging and python-amqp result in repeated connection timeouts in the amqp transport layer (SSLError) and thus excessive reconnect attempts. This is a known issue that was fixed in python-amqp 1.4.4.
[Test Case]
Deploy OpenStack using the current Trusty versions plus this version of python-amqp, with rabbitmq configured to allow SSL connections only. Once up and running, check the following:
- number of rabbitmq connections: with a single nova-compute, conductor etc. I see approx. 20 connections, whereas previously I saw well over 100 and rising.
sudo rabbitmqctl list_connections
- check that messages are being consumed from the openstack queues
sudo rabbitmqctl list_queues -p openstack consumers messages name
- also check e.g. the nova-compute and nova-conductor logs and verify that the errors mentioned below no longer appear
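The queue check above can be automated roughly as follows. This is only a sketch: the function names are hypothetical, and it assumes that rabbitmqctl prints one tab-separated row per queue in the column order requested on the command line (consumers, messages, name), possibly surrounded by banner lines such as "Listing queues ..." that should be ignored.

```python
def parse_queue_listing(text):
    """Parse `rabbitmqctl list_queues -p openstack consumers messages name` output.

    Assumption: rows are tab-separated as consumers, messages, name.
    Anything that does not look like such a row (banners, blank lines) is skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # banner, footer or blank line
        consumers, messages, name = parts
        if not (consumers.isdigit() and messages.isdigit()):
            continue  # not a data row
        rows.append((name, int(consumers), int(messages)))
    return rows


def unconsumed_queues(rows):
    """Return names of queues that hold messages but have no consumer attached."""
    return [name for name, consumers, messages in rows
            if consumers == 0 and messages > 0]
```

Feeding the captured command output to parse_queue_listing() and then to unconsumed_queues() should give an empty list on a healthy deployment; any queue name it returns is worth checking against the nova-compute and nova-conductor logs.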
[Regression Potential]
None.
[Other Info]
None.
---- ---- ----- ----
On the latest update of the Ubuntu OpenStack packages, it was discovered that the nova-compute/
When this problem occurs, the compute node cannot connect to the controller, and this message is constantly displayed:
WARNING nova.conductor.api [req-4022395c-
Investigation revealed that having rabbitmq configured with SSL was the root cause of this problem. This seems to have been introduced with the current version of the nova packages. Rabbitmq was not updated as part of this distribution update, but the messaging library (python-
Versions installed:
Openstack version: Icehouse
Ubuntu 14.04.2 LTS
nova-conductor 1:2014.
nova-compute 1:2014.
rabbitmq-server 3.2.4-1
openssl:
Related branches
- Ubuntu branches: Pending requested
- Diff: 82 lines (+64/-0), 3 files modified:
debian/changelog (+7/-0)
debian/patches/dont-disconnect-transport-on-ssl-read-timeout.patch (+56/-0)
debian/patches/series (+1/-0)
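As the patch name suggests, the idea is to stop treating an SSL read timeout as a dead connection. The following is a minimal sketch of that technique, not the actual patch: read_frame_safely and its read_frame argument are hypothetical names, and it assumes the Python 2.7 behaviour where a read timeout on an SSL socket surfaces as an ssl.SSLError whose message contains "timed out" rather than as socket.timeout.

```python
import socket
import ssl


def read_frame_safely(read_frame):
    """Call read_frame() and re-raise SSL read timeouts as socket.timeout.

    read_frame is a hypothetical callable standing in for the transport's
    blocking read. Callers that already handle socket.timeout as "nothing
    to read yet" then keep the connection open instead of reconnecting.
    """
    try:
        return read_frame()
    except ssl.SSLError as exc:
        if "timed out" in str(exc):
            # A read timeout is recoverable: report it as an ordinary timeout.
            raise socket.timeout(str(exc))
        # Any other SSL error is still treated as a real failure.
        raise
```

With this kind of translation in place, only genuine SSL failures propagate as SSLError, which is what should bring the connection count back down to a stable level.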
Changed in oslo.messaging:
status: Confirmed → Invalid
Changed in nova:
status: New → Invalid
Changed in python-oslo.messaging (Ubuntu):
status: New → Confirmed
tags: added: sts
description: updated
summary:
- Using SSL with rabbitmq prevents communication between nova-compute and conductor after latest nova updates
+ [SRU] Using SSL with rabbitmq prevents communication between nova-compute and conductor after latest nova updates
Changed in python-amqp (Ubuntu Trusty):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Edward Hope-Morley (hopem)
description: updated
Changed in python-amqp (Ubuntu):
status: In Progress → Fix Released
Upgraded nova-compute to 1:2014.1.5-0ubuntu1
The logs show more details:
2015-07-08 11:32:29.066 13437 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on amqp.wedgecnd.internal:5672
2015-07-08 11:32:38.062 13437 WARNING nova.conductor.api [req-9f555cd9-0e37-4114-919f-f3e5e164d724 None None] Timed out waiting for nova-conductor. Is it running? Or did this service start before nova-conductor?
2015-07-08 11:32:38.069 13437 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to consume message from queue: <AMQPError: unknown error>
2015-07-08 11:32:38.069 13437 TRACE oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2015-07-08 11:32:38.069 13437 TRACE oslo.messaging._drivers.impl_rabbit   File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 624, in ensure
2015-07-08 11:32:38.069 13437 TRACE oslo.messaging._drivers.impl_rabbit     return method(*args, **kwargs)
2015-07-08 11:32:38.069 13437 TRACE oslo.messaging._drivers.impl_rabbit   File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 704, in _consume
2015-07-08 11:32:38.069 13437 TRACE oslo.messaging._drivers.impl_rabbit     raise self.connection.recoverable_connection_errors[0]
2015-07-08 11:32:38.069 13437 TRACE oslo.messaging._drivers.impl_rabbit RecoverableConnectionError: <AMQPError: unknown error>
2015-07-08 11:32:38.069 13437 TRACE oslo.messaging._drivers.impl_rabbit
2015-07-08 11:32:38.069 13437 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on amqp.wedgecnd.internal:5672
2015-07-08 11:32:38.069 13437 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...