[ovs] Binding failed for port

Bug #1763146 reported by Annie Melen
This bug affects 1 person
Affects: neutron
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hello!

I already had a successful deployment on Pike (3 control nodes + several compute nodes):
 - ML2 plugin - openvswitch
 - default tunnel type - vxlan
 - dvr is enabled

Instances started, ran, and migrated without errors.
Everything was fine until I upgraded to Queens... I used the same neutron configuration files and encountered the following errors:

...
2018-04-11 12:21:19.128 36774 INFO neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_bridge [-] Bridge br-int has datapath-ID 00005618b7026f46
2018-04-11 12:21:23.460 36774 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-ac08fa6c-7257-4296-aede-67a051568440 - - - - -] Mapping physical network external to bridge br0
2018-04-11 12:21:23.948 36774 INFO neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_bridge [req-ac08fa6c-7257-4296-aede-67a051568440 - - - - -] Bridge br0 has datapath-ID 0000ec0d9a7abceb
2018-04-11 12:21:24.206 36774 INFO neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ovs_bridge [req-ac08fa6c-7257-4296-aede-67a051568440 - - - - -] Bridge br-tun has datapath-ID 0000d2588392e746
2018-04-11 12:21:24.220 36774 INFO neutron.agent.agent_extensions_manager [req-ac08fa6c-7257-4296-aede-67a051568440 - - - - -] Initializing agent extension 'qos'
2018-04-11 12:21:24.323 36774 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_dvr_neutron_agent [req-ac08fa6c-7257-4296-aede-67a051568440 - - - - -] L2 Agent operating in DVR Mode with MAC FA-16-3F-7C-00-B2
2018-04-11 12:21:24.375 36774 INFO neutron.common.ipv6_utils [req-ac08fa6c-7257-4296-aede-67a051568440 - - - - -] IPv6 not present or configured not to bind to new interfaces on this system. Please ensure IPv6 is enabled and /proc/sys/net/ipv6/conf/default/disable_ipv6 is set to 0 to enable IPv6.
2018-04-11 12:21:25.275 36774 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-1c05ff93-84cb-45ed-937c-95f048b553f1 - - - - -] Agent initialized successfully, now running...
...
2018-04-11 21:24:10.248 3953 INFO neutron.agent.common.ovs_lib [req-64bf5e4c-32a9-4936-93cd-2658095b2d35 - - - - -] Port b8b42046-14ba-4b43-a24c-3a0a1b350aea not present in bridge br-int
2018-04-11 21:24:10.249 3953 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-64bf5e4c-32a9-4936-93cd-2658095b2d35 - - - - -] port_unbound(): net_uuid None not managed by VLAN manager
2018-04-11 21:24:10.249 3953 INFO neutron.agent.securitygroups_rpc [req-64bf5e4c-32a9-4936-93cd-2658095b2d35 - - - - -] Remove device filter for ['b8b42046-14ba-4b43-a24c-3a0a1b350aea']
...

It happens every time I try to launch an instance. Whether the network is internal or external (VLAN provider), the result is the same.

So my question is: what am I doing wrong? Maybe I missed something in the Queens config samples, or hit a serious bug, or what?..

Pike Environment
------------------
Ubuntu 16.04.4 LTS, 4.4.0-119-generic
Neutron 11.0.3-0ubuntu1.1~cloud0
openvswitch-switch 2.8.1

Queens Environment
------------------
Ubuntu 16.04.4 LTS, 4.4.0-119-generic
Neutron 12.0.0-0ubuntu2~cloud0
openvswitch-switch 2.9.0

Revision history for this message
Slawek Kaplonski (slaweq) wrote :

Can you check what errors you have in the neutron-server logs for that port?

Revision history for this message
Miguel Lavalle (minsel) wrote :

Some questions:

1) How are you launching the instance? CLI command? If yes, can you share it?

2) The agent is giving you the port's id. What is the status of that port? Can you do a port show with that id?

Revision history for this message
Annie Melen (anniemelen) wrote :

Slawek, Miguel,
first of all, thanks for your help!

I'm trying to create an instance in InternalNet:

root@control01:~# openstack network show InternalNet
+---------------------------+--------------------------------------+
| Field | Value |
+---------------------------+--------------------------------------+
| admin_state_up | UP |
| availability_zone_hints | |
| availability_zones | nova |
| created_at | 2018-04-11T16:30:18Z |
| description | |
| dns_domain | None |
| id | fdde9f33-b0fc-4b81-b870-ae4467d97f3d |
| ipv4_address_scope | None |
| ipv6_address_scope | None |
| is_default | None |
| is_vlan_transparent | None |
| mtu | 1450 |
| name | InternalNet |
| port_security_enabled | True |
| project_id | 87294dc43b3e47e2971f1061588c9417 |
| provider:network_type | vxlan |
| provider:physical_network | None |
| provider:segmentation_id | 65585 |
| qos_policy_id | None |
| revision_number | 3 |
| router:external | Internal |
| segments | None |
| shared | True |
| status | ACTIVE |
| subnets | 8912cd5c-dff7-4a32-93be-cd985d4f1ff7 |
| tags | |
| updated_at | 2018-04-11T16:30:24Z |
+---------------------------+--------------------------------------+
root@control01:~# openstack server create --flavor Main-1-1 --image Cirros-0.4.0-x86_64 --network InternalNet imdyinghere
+-------------------------------------+------------------------------------------------------------+
| Field | Value |
+-------------------------------------+------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | ...


Revision history for this message
Annie Melen (anniemelen) wrote :

I inspected the neutron-server log before attaching it, and now I'm really confused by this error:

 - 2018-04-12 10:41:23.046 17106 DEBUG neutron.db.l3_dvrscheduler_db [req-5c074790-6feb-487c-b4fd-7ca9243ed8e2 b3a9a7b917cb4972a4ec4a8d4dafe49c 87294dc43b3e47e2971f1061588c9417 - default default] No DVR routers for this DVR port 12a91c8a-b2ca-458a-b762-b695b07e2475 on host compute01-api get_dvr_routers_to_remove /usr/lib/python2.7/dist-packages/neutron/db/l3_dvrscheduler_db.py:191
2018-04-12 10:41:23.083 17106 WARNING neutron.plugins.ml2.drivers.l2pop.mech_driver [req-5c074790-6feb-487c-b4fd-7ca9243ed8e2 b3a9a7b917cb4972a4ec4a8d4dafe49c 87294dc43b3e47e2971f1061588c9417 - default default] Unable to retrieve active L2 agent on host compute01-api

because

root@control01:~# openstack network agent show 4e8a0855-cb6e-42e0-99d1-17916a0190ae
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| admin_state_up | UP ...

Revision history for this message
Miguel Lavalle (minsel) wrote :

Looking at your neutron server log, I can see that it attempts to bind the port to host "compute01-api":

2018-04-12 10:41:22.149 17106 DEBUG neutron.plugins.ml2.managers [req-9534cbca-5037-44ea-9ee2-83c76fbff74d 488fbd0f54b54ad797117db585b64bf8 7843b86bf7e34853be5b1cde8b4e1ad4 - default default] Attempting to bind port 12a91c8a-b2ca-458a-b762-b695b07e2475 on host compute01-api for vnic_type normal with profile bind_port /usr/lib/python2.7/dist-packages/neutron/plugins/ml2/managers.py:745

The result of executing "openstack network agent show 4e8a0855-cb6e-42e0-99d1-17916a0190ae" shows:

| host | compute01.demo.infra.cloud.loc

So in the neutron server log you can see:

2018-04-12 10:41:22.153 17106 DEBUG neutron.plugins.ml2.drivers.mech_agent [req-9534cbca-5037-44ea-9ee2-83c76fbff74d 488fbd0f54b54ad797117db585b64bf8 7843b86bf7e34853be5b1cde8b4e1ad4 - default default] Port 12a91c8a-b2ca-458a-b762-b695b07e2475 on network fdde9f33-b0fc-4b81-b870-ae4467d97f3d not bound, no agent of type Open vSwitch agent registered on host compute01-api bind_port /usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/mech_agent.py:101
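To illustrate what that last log line means (this is a simplified sketch, not the actual Neutron ML2 code): when binding a port, neutron-server looks for a live agent of the right type registered under exactly the host name Nova passed in. The `find_agent` helper and the agent list below are hypothetical, but they reproduce the failure mode from the log:

```python
# Illustrative sketch of the ML2 host-matching step (not real Neutron code).
# Agents are matched by (agent_type, host); a string mismatch means no agent
# is found and the port stays unbound.

def find_agent(agents, agent_type, host):
    """Return the first agent matching both type and host, else None."""
    for agent in agents:
        if agent["agent_type"] == agent_type and agent["host"] == host:
            return agent
    return None

# The OVS agent registered under the system host name (its default):
registered_agents = [
    {"agent_type": "Open vSwitch agent",
     "host": "compute01.demo.infra.cloud.loc"},
]

# Nova requested binding on "compute01-api", so the lookup finds nothing:
print(find_agent(registered_agents, "Open vSwitch agent", "compute01-api"))
# prints None
```

With the agent's registered FQDN instead of "compute01-api", the same lookup succeeds, which is why the two host strings must agree.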

Revision history for this message
Annie Melen (anniemelen) wrote :

This makes sense, but

1) root@control01:~# cat /etc/hosts
127.0.0.1 localhost
172.31.206.19 control-vip.demo.infra.cloud.loc control-vip
172.31.200.20 control01.demo.infra.cloud.loc control01
172.31.200.21 control02.demo.infra.cloud.loc control02
172.31.200.25 compute01.demo.infra.cloud.loc compute01
172.31.200.26 compute02.demo.infra.cloud.loc compute02
172.31.200.30 store01.demo.infra.cloud.loc store01
172.31.200.31 store02.demo.infra.cloud.loc store02
172.31.206.20 control01-api.demo.infra.cloud.loc control01-api
172.31.206.21 control02-api.demo.infra.cloud.loc control02-api
172.31.206.25 compute01-api.demo.infra.cloud.loc compute01-api
172.31.206.26 compute02-api.demo.infra.cloud.loc compute02-api

2) root@control01:~# cat /etc/neutron/neutron.conf | egrep ^[^#]
[DEFAULT]
bind_host = 172.31.206.20
bind_port = 9696
...

3) root@control01:~# netstat -ltupn | grep 9696
tcp 0 0 172.31.206.20:9696 0.0.0.0:* LISTEN 11416/python2
tcp 0 0 172.31.206.19:9696 0.0.0.0:* LISTEN 19814/haproxy

4) root@control01:~# hostname -f
control01.demo.infra.cloud.loc

I don't understand why all agents are registered with the FQDN <host>.demo.infra.cloud.loc rather than <host>-api.demo.infra.cloud.loc...

All consumers according to the neutron-server log:
    DHCP <email address hidden>
    DHCP <email address hidden>
    L3 <email address hidden>
    L3 <email address hidden>
    L3 <email address hidden>
    L3 <email address hidden>
    Metadata <email address hidden>
    Metadata <email address hidden>
    Metadata <email address hidden>
    Metadata <email address hidden>
    Metering <email address hidden>
    Metering <email address hidden>
    Metering <email address hidden>
    Open vSwitch <email address hidden>
    Open vSwitch <email address hidden>
    Open vSwitch <email address hidden>
    Open vSwitch <email address hidden>

And one more thing, take a look at nova:

root@control01:~# cat /etc/nova/nova.conf | egrep ^[^#]
[DEFAULT]
my_ip = 172.31.206.20
host = control01-api
osapi_compute_listen = control01-api
osapi_compute_listen_port = 8774
...

root@control01:~# netstat -ltupn | grep 8774
tcp 0 0 172.31.206.20:8774 0.0.0.0:* LISTEN 4650/python
tcp 0 0 172.31.206.19:8774 0.0.0.0:* LISTEN 19814/haproxy

root@control01:~# openstack compute service list
+----+------------------+---------------+----------+----------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+----+------------------+---------------+----------+----------+-------+----------------------------+
| 2 | nova-conductor | control01-api | internal | enabled | up | 2018-04-12T23:38:26.000000 |
| 7 | nova-consoleauth | control01-api | internal | enabled | up | 2018-04-12T23:38:25.000000 |
| 9 | nova-scheduler | contr...


Revision history for this message
Miguel Lavalle (minsel) wrote :

The OVS agent on the compute node where you are trying to bind the port reports its state here: https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L316, with the state built here: https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L261. Note that this state includes the host, which the agent got from configuration here: https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L159. That configuration option is defined here: https://github.com/openstack/neutron/blob/master/neutron/conf/common.py#L94. Note that its default value is whatever is returned by https://github.com/openstack/neutron-lib/blob/master/neutron_lib/utils/net.py#L25. Here is the description of socket.gethostname: https://docs.python.org/2/library/socket.html#socket.gethostname.

Either you are specifying a host name for the agent, or you are using the default (explained just above), and that value doesn't match the host name Nova uses when it asks Neutron to bind the port. So you clearly have a configuration issue; it is not that Neutron Queens cannot bind a port. Marking this bug as invalid.
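The explanation above can be condensed into a short sketch. The `effective_host` helper is hypothetical, but it mirrors the documented behaviour: both Nova and the Neutron agents fall back to `socket.gethostname()` when no explicit `host` is configured, so a one-sided override breaks the match:

```python
# Sketch of where each side's host string comes from, assuming the
# defaults described above (not actual Nova/Neutron code).
import socket

def effective_host(configured_host=None):
    """Host name a service reports: the configured value if set,
    otherwise the system host name (the default on both sides)."""
    return configured_host if configured_host else socket.gethostname()

neutron_agent_host = effective_host()            # e.g. compute01.demo.infra.cloud.loc
nova_host = effective_host("compute01-api")      # explicit override in nova.conf
# These differ, so Neutron finds no OVS agent on the host Nova names,
# and the port binding fails.
```

Leaving `host` unset on both sides (or setting it to the same value on both) restores the match.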

Changed in neutron:
status: New → Invalid
Revision history for this message
Annie Melen (anniemelen) wrote :

Miguel,
thanks for your help and patience! You're absolutely right, it's my configuration mistake. I removed 'host' from nova.conf, and now it's finally working.
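For anyone hitting the same symptom: the fix, in config terms, was dropping the `host` override so both services default to the system host name. This fragment is reconstructed from the values quoted earlier in this report, not copied from the actual file:

```ini
# /etc/nova/nova.conf
# before (the override that caused the host mismatch):
# [DEFAULT]
# host = control01-api

# after: leave 'host' unset so Nova and the Neutron agents both default
# to the system host name, e.g. control01.demo.infra.cloud.loc
[DEFAULT]
my_ip = 172.31.206.20
```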
