Hi Dazhao,
The reason binding failed in your most recent neutron server log is not that the L2 agent appears dead, as it did before. Instead, it's because the L2 agent does not have a mapping for the physical_network 'physnet1'. See the log:
2013-11-05 23:51:55.750 16413 DEBUG neutron.plugins.ml2.managers [-] NT-19C6DA8 Attempting to bind port 51904944-ab70-4cd4-893a-7a78b056523b on host wanghliu-10-7-0-153.sce.cn.ibm.com bind_port /usr/lib/python2.6/site-packages/neutron/plugins/ml2/managers.py:440
2013-11-05 23:51:55.751 16413 DEBUG neutron.plugins.ml2.drivers.mech_agent [-] NT-53163AF Attempting to bind port 51904944-ab70-4cd4-893a-7a78b056523b on network 2b5d5943-f201-4dda-922a-679dcb4aa503 bind_port /usr/lib/python2.6/site-packages/neutron/plugins/ml2/drivers/mech_agent.py:57
2013-11-05 23:51:55.754 16413 DEBUG neutron.plugins.ml2.drivers.mech_agent [-] NT-C9C8378 Checking agent: {'binary': u'neutron-openvswitch-agent', 'description': None, 'admin_state_up': True, 'heartbeat_timestamp': datetime.datetime(2013, 11, 6, 5, 51, 52), 'alive': True, 'topic': u'N/A', 'host': u'wanghliu-10-7-0-153.sce.cn.ibm.com', 'agent_type': u'Open vSwitch agent', 'created_at': datetime.datetime(2013, 11, 4, 5, 29, 54), 'started_at': datetime.datetime(2013, 11, 6, 5, 48, 56), 'id': u'b7045650-d7e4-43cb-b10d-f725877c73e9', 'configurations': {u'tunnel_types': [], u'tunneling_ip': u'', u'bridge_mappings': {}, u'l2_population': False, u'devices': 0}} bind_port /usr/lib/python2.6/site-packages/neutron/plugins/ml2/drivers/mech_agent.py:59
<<<Note that bridge_mappings in the line above is empty ({}), and that 'alive' is True.>>>
2013-11-05 23:51:55.755 16413 DEBUG neutron.plugins.ml2.drivers.mech_openvswitch [-] NT-9EF951A Checking segment: {'segmentation_id': 1001L, 'physical_network': u'physnet1', 'id': u'b04b63b4-80db-4a22-80db-463ff4262897', 'network_type': u'vlan'} for mappings: {} with tunnel_types: [] check_segment_for_agent /usr/lib/python2.6/site-packages/neutron/plugins/ml2/drivers/mech_openvswitch.py:48
<<<Note also that network_type is 'vlan' and physical_network is 'physnet1' in the line above, and that this is the only network segment available to bind.>>>
2013-11-05 23:51:55.755 16413 WARNING neutron.plugins.ml2.managers [-] NT-C2C97DA Failed to bind port 51904944-ab70-4cd4-893a-7a78b056523b on host wanghliu-10-7-0-153.sce.cn.ibm.com
The port cannot be bound because the L2 agent on the node does not have connectivity to the needed physical network. Most likely, bridge_mappings needs to be properly configured for the L2 agent on the compute node.
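To make the failure concrete, here is a minimal Python sketch of the kind of check the "Checking segment" log line reports. It is a simplified illustration, not the actual neutron source: the function name segment_bindable and the bridge name 'br-eth1' are hypothetical, and the dictionaries keep only the fields from the log that matter here.

```python
def segment_bindable(segment, agent):
    """Return True if the agent reports connectivity for the segment.

    Simplified sketch of a mechanism-driver segment check; the name and
    structure are assumptions for illustration, not neutron's API.
    """
    config = agent.get('configurations', {})
    mappings = config.get('bridge_mappings', {})
    tunnel_types = config.get('tunnel_types', [])
    network_type = segment['network_type']

    if network_type == 'local':
        return True
    if network_type in tunnel_types:
        return True
    if network_type in ('flat', 'vlan'):
        # flat/vlan segments need a bridge mapped to their physical network
        return segment['physical_network'] in mappings
    return False


# The only segment available for the port, as shown in the log.
segment = {'segmentation_id': 1001,
           'physical_network': 'physnet1',
           'network_type': 'vlan'}

# Agent state as logged: alive, but with no bridge mappings.
agent_as_logged = {'alive': True,
                   'configurations': {'tunnel_types': [],
                                      'bridge_mappings': {}}}

# Agent state after mapping physnet1 to a bridge ('br-eth1' is hypothetical).
agent_with_mapping = {'alive': True,
                      'configurations': {'tunnel_types': [],
                                         'bridge_mappings': {'physnet1': 'br-eth1'}}}

print(segment_bindable(segment, agent_as_logged))
print(segment_bindable(segment, agent_with_mapping))
```

With the agent state from the log, the check fails even though the agent is alive. Once the agent reports a mapping for 'physnet1', the same segment becomes bindable.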
If so, I think Armando's tuning did address the issue, although I agree with him that a better liveness algorithm really is needed.