I've checked the problem environment while it was still alive. The problem is caused by /usr/bin/neutron-l3-agent - it got stuck:

2014-07-07 12:27:11.870 22389 INFO neutron.agent.l3_agent [req-da7de6b8-11bb-436c-a4a6-039980babbe2 None] L3 agent started
2014-07-07 12:27:13.402 22389 ERROR neutron.openstack.common.rpc.common [req-da7de6b8-11bb-436c-a4a6-039980babbe2 None] Failed to consume message from queue: (0, 0): (541) INTERNAL_ERROR
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common Traceback (most recent call last):
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 594, in ensure
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     return method(*args, **kwargs)
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 672, in _consume
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     queues_tail.consume(nowait=False)
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 194, in consume
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     self.queue.consume(*args, callback=_callback, **options)
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/entity.py", line 611, in consume
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     nowait=nowait)
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 1787, in basic_consume
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     (60, 21), # Channel.basic_consume_ok
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 67, in wait
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     self.channel_id, allowed_methods)
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/amqp/connection.py", line 270, in _wait_method
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     self.wait()
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 69, in wait
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     return self.dispatch_method(method_sig, args, content)
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 87, in dispatch_method
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     return amqp_method(self, args)
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/dist-packages/amqp/connection.py", line 526, in _close
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common     (class_id, method_id), ConnectionError)
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common InternalError: (0, 0): (541) INTERNAL_ERROR
2014-07-07 12:27:13.402 22389 TRACE neutron.openstack.common.rpc.common
2014-07-07 12:27:13.410 22389 INFO neutron.openstack.common.rpc.common [req-da7de6b8-11bb-436c-a4a6-039980babbe2 None] Reconnecting to AMQP server on 192.168.0.2:5673
2014-07-07 12:27:13.410 22389 INFO neutron.openstack.common.rpc.common [req-da7de6b8-11bb-436c-a4a6-039980babbe2 None] Delaying reconnect for 5.0 seconds...
2014-07-07 12:27:18.542 22389 WARNING neutron.openstack.common.loopingcall [req-da7de6b8-11bb-436c-a4a6-039980babbe2 None] task run outlasted interval by 4.557386 sec

Because of this, the agent was not able to assign floating IPs properly (no firewall rules were applied in the router namespace, etc.) - see the iptables check further below. CRM did not detect this problem, since we only check for the process and its PID in the OCF script for the l3-agent. As soon as I killed the neutron-l3-agent process, CRM restarted it and floating IPs started to work (all the needed iptables rules were applied).

Maybe a better neutron-l3-agent Pacemaker monitor action (in the OCF script) can help here? A rough sketch of such a check is included below. I've tried to simulate this problem with "killall -STOP neutron-l3-agent" and it looks like "neutron agent-list" is able to detect the problem:

+--------------------------------------+--------------------+--------+-------+----------------+
| id                                   | agent_type         | host   | alive | admin_state_up |
+--------------------------------------+--------------------+--------+-------+----------------+
| 310e9051-abd2-400d-b7ff-6a8d8fc8f7c0 | Open vSwitch agent | node-2 | :-)   | True           |
| 3a53462c-2cc2-4323-a5be-95ed9703192f | L3 agent           | node-2 | xxx   | True           |
| 6170cfcf-9749-42cb-9af9-a3cbb5d1f30d | Metadata agent     | node-3 | :-)   | True           |
| 6be1684d-61e0-4a81-924b-eca0b1c46d1d | Open vSwitch agent | node-4 | :-)   | True           |
| 72853490-b201-4c89-a0d0-61c90e19789b | Open vSwitch agent | node-3 | :-)   | True           |
| 80ee1dc9-2cc3-4d29-8380-e641ed0c76ba | Metadata agent     | node-1 | :-)   | True           |
| 81712e3f-df75-487c-9af2-d9933e200cbf | DHCP agent         | node-3 | :-)   | True           |
| b0ab3dd5-db8f-47ea-aa0f-8b10871bd311 | Open vSwitch agent | node-1 | :-)   | True           |
| e9367b33-b07a-4fec-b03c-0b818fc050b0 | Metadata agent     | node-2 | :-)   | True           |
+--------------------------------------+--------------------+--------+-------+----------------+
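As for the monitor action, here is only a rough sketch of what an extended check could look like, building on the fact that agent-list detects the stuck agent. The pid file path, the /root/openrc location and the function name are assumptions for illustration, not what the current OCF script uses:

#!/bin/sh
# Sketch of an extended monitor action for the neutron-l3-agent OCF script.
# Assumes the standard resource-agents shell functions; the pid file path,
# /root/openrc and the function name below are illustrative only.
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

PID_FILE="/var/run/neutron/neutron-l3-agent.pid"   # assumed path

neutron_l3_agent_monitor() {
    # 1) Existing style of check: is the process with the recorded PID alive?
    pid=$(cat "$PID_FILE" 2>/dev/null)
    if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        return $OCF_NOT_RUNNING
    fi

    # 2) Extra check: does the neutron server still see the L3 agent on this
    #    host as alive (the ":-)" in the "alive" column of `neutron agent-list`)?
    #    A SIGSTOP'ed or RPC-stuck agent passes check 1 but would fail here.
    #    Adjust hostname handling if agents register with FQDNs.
    . /root/openrc
    if ! neutron agent-list 2>/dev/null | \
         awk -v host="$(hostname)" '/L3 agent/ && $0 ~ host' | grep -q ':-)'; then
        ocf_log err "neutron-l3-agent process is up but not alive on the neutron server"
        return $OCF_ERR_GENERIC
    fi

    return $OCF_SUCCESS
}

The trade-off is that the monitor then depends on the neutron server API being reachable from the node, and it only trips after the server has stopped receiving the agent's state reports (agent_down_time), so it is slower than a plain process check.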
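For reference, one way to see whether the floating IP rules actually made it into the router namespaces is something like the following (just an illustration, not copied from the environment above):

# Walk over the qrouter namespaces on the node and show the NAT rules the
# L3 agent is supposed to maintain for floating IPs; on the stuck agent
# these DNAT/SNAT entries were missing.
for ns in $(ip netns list | awk '/^qrouter-/ {print $1}'); do
    echo "== $ns =="
    ip netns exec "$ns" iptables -t nat -S | grep -E 'DNAT|SNAT'
done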