Restarting the OVS service loses the DHCP port in the DHCP namespace

Bug #1709779 reported by jinke
This bug affects 1 person
Affects: neutron
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

1. I installed Ocata on two hosts. All agents work well, as shown below:
[root@controller openstack]# openstack network agent list
+--------------------------------------+--------------------+------------+-------------------+-------+-------+---------------------------+
| ID | Agent Type | Host | Availability Zone | Alive | State | Binary |
+--------------------------------------+--------------------+------------+-------------------+-------+-------+---------------------------+
| 1296f653-7e28-47dc-b0c7-73e9fabb695f | Metadata agent | controller | None | True | UP | neutron-metadata-agent |
| 47bd5b59-feb7-47a6-864e-0cf7ed90ab8e | Open vSwitch agent | compute | None | True | UP | neutron-openvswitch-agent |
| 9d8f5a9d-2fd4-4c6f-b6d6-1730843738e3 | DHCP agent | controller | nova | True | UP | neutron-dhcp-agent |
| c420da8e-7028-4589-bd2f-9d25756e08f2 | Open vSwitch agent | controller | None | True | UP | neutron-openvswitch-agent |
| f79bf249-874b-422a-9d21-949786fbf367 | L3 agent | controller | nova | True | UP | neutron-l3-agent |
+--------------------------------------+--------------------+------------+-------------------+-------+-------+---------------------------+
[root@controller openstack]# openstack compute service list
+----+------------------+------------+----------+---------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+----+------------------+------------+----------+---------+-------+----------------------------+
| 1 | nova-consoleauth | controller | internal | enabled | up | 2017-08-10T05:47:04.000000 |
| 3 | nova-conductor | controller | internal | enabled | up | 2017-08-10T05:47:04.000000 |
| 7 | nova-scheduler | controller | internal | enabled | up | 2017-08-10T05:47:05.000000 |
| 10 | nova-compute | controller | nova | enabled | up | 2017-08-10T05:47:06.000000 |
| 11 | nova-compute | compute | nova | enabled | up | 2017-08-10T05:47:09.000000 |
+----+------------------+------------+----------+---------+-------+----------------------------+

2. I created a tenant network in VLAN mode:
[root@controller openstack]# ip netns
qdhcp-006b70a9-9c44-40e9-b3a1-3334a472dda6
[root@controller openstack]# ip netns exec qdhcp-006b70a9-9c44-40e9-b3a1-3334a472dda6 ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1 (Local Loopback)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

tapbfe934a3-9d: flags=323<UP,BROADCAST,RUNNING,PROMISC> mtu 1500
        inet 1.2.3.4 netmask 255.255.255.0 broadcast 1.2.3.255
        inet6 fe80::f816:3eff:feed:ea19 prefixlen 64 scopeid 0x20<link>
        ether fa:16:3e:ed:ea:19 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 5 bytes 438 (438.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

3. When the OVS service is restarted, port tapbfe934a3-9d disappears from the DHCP namespace:
[root@controller openstack]# systemctl restart openvswitch
[root@controller openstack]# ip netns exec qdhcp-006b70a9-9c44-40e9-b3a1-3334a472dda6 ifconfig -a
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1 (Local Loopback)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

4. When the DHCP agent is restarted, the port appears again:
[root@controller openstack]# systemctl restart neutron-dhcp-agent.service
[root@controller openstack]# ip netns exec qdhcp-006b70a9-9c44-40e9-b3a1-3334a472dda6 ifconfig -a
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1 (Local Loopback)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

tapbfe934a3-9d: flags=323<UP,BROADCAST,RUNNING,PROMISC> mtu 1500
        inet 1.2.3.4 netmask 255.255.255.0 broadcast 1.2.3.255
        inet6 fe80::f816:3eff:feed:ea19 prefixlen 64 scopeid 0x20<link>
        ether fa:16:3e:ed:ea:19 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 5 bytes 438 (438.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

As described above, I want to know whether this is normal.
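
The before/after comparison in the steps above can be scripted. The following is a minimal sketch, not part of the original report: `list_ifaces` and `ns_has_iface` are hypothetical helper names, and the parsing assumes the net-tools `ifconfig` format shown in this description, where each interface block starts at column 0 with `name: flags=`.

```shell
# Hypothetical helper (not from the original report): print the interface
# names found in `ifconfig` output read from stdin.
# Assumes header lines look like "tapbfe934a3-9d: flags=323<...> mtu 1500".
list_ifaces() {
    awk '/^[^ \t]/ && $2 ~ /^flags=/ { sub(/:$/, "", $1); print $1 }'
}

# Return 0 if interface $2 is present in network namespace $1.
# Must be run as root, like the commands in the session above.
ns_has_iface() {
    ip netns exec "$1" ifconfig -a | list_ifaces | grep -qxF -- "$2"
}
```

Running `ns_has_iface qdhcp-006b70a9-9c44-40e9-b3a1-3334a472dda6 tapbfe934a3-9d || echo missing` before and after `systemctl restart openvswitch` would give a quick yes/no answer instead of comparing the full `ifconfig` output by eye.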

Tags: ovs
Boden R (boden) wrote :

I've tried reproducing this with a 2-node devstack installation on Ubuntu 16.04.1; based on the info given, it's the best I could do to duplicate the failing environment. However, I haven't been able to reproduce this issue with either a single-node (all-in-one) or a 2-node setup.

Below is the output from a 2 node env where I'm trying to reproduce:
-->
stack@ocata:~/devstack$ openstack network agent list
+----------------------+--------------------+--------------+-------------------+-------+-------+-------------------------+
| ID | Agent Type | Host | Availability Zone | Alive | State | Binary |
+----------------------+--------------------+--------------+-------------------+-------+-------+-------------------------+
| 0048116c-4405-48f2-8 | Open vSwitch agent | ocata | None | True | UP | neutron-openvswitch- |
| 288-a886b7770114 | | | | | | agent |
| 1fc1db67-57fc-41fb- | DHCP agent | ocata | nova | True | UP | neutron-dhcp-agent |
| 84e7-4ed100593d0b | | | | | | |
| 7c65e0bd-cf09-4ab4 | Metadata agent | ocata | None | True | UP | neutron-metadata-agent |
| -9e3f-3eb34b1d4244 | | | | | | |
| 8908f435-a2c8-4866 | Open vSwitch agent | ocatastackn1 | None | True | UP | neutron-openvswitch- |
| -b2ca-d695428cf59a | | | | | | agent |
| 9fe9a8a9-3caa-4a8e- | L3 agent | ocata | nova | True | UP | neutron-l3-agent |
| 8ccb-c325726117d1 | | | | | | |
+----------------------+--------------------+--------------+-------------------+-------+-------+-------------------------+
stack@ocata:~/devstack$ ip netns list
qdhcp-5d63c728-ce5e-4c34-b804-ef76532278f6
qrouter-41227234-0ee7-420b-98e4-198152600904
stack@ocata:~/devstack$ sudo ip netns exec qdhcp-5d63c728-ce5e-4c34-b804-ef76532278f6 ifconfig
lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

tap69d583e5-1d Link encap:Ethernet HWaddr fa:16:3e:2d:b7:02
          inet addr:10.0.0.2 Bcast:10.0.0.63 Mask:255.255.255.192
          inet6 addr: fe80::f816:3eff:fe2d:b702/64 Scope:Link
          inet6 addr: fd95:54d:b5e0:0:f816:3eff:fe2d:b702/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
          RX packets:81 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          ...

Changed in neutron:
status: New → Incomplete
Boden R (boden)
tags: added: ovs
jinke (ke.king) wrote :

I tried again, and it reproduces easily.
Log output from dhcp-agent and ovs-vswitchd while restarting the OVS service:
[root@controller openvswitch-2.6.1]# tailf /var/log/neutron/dhcp-agent.log
2017-08-11 03:07:16.703 17827 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connection closed by peer
2017-08-11 03:07:17.703 17827 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connecting...
2017-08-11 03:07:17.704 17827 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connected
2017-08-11 03:07:17.706 17827 WARNING neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: send error: Connection refused
2017-08-11 03:07:17.807 17827 WARNING neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connection dropped (Connection refused)
2017-08-11 03:07:17.807 17827 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: waiting 2 seconds before reconnect
2017-08-11 03:07:19.807 17827 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connecting...
2017-08-11 03:07:19.808 17827 INFO neutron.agent.ovsdb.native.vlog [-] tcp:127.0.0.1:6640: connected

[root@controller ~]# tailf /var/log/openvswitch/ovs-vswitchd.log
2017-08-11T07:07:15.487Z|00320|rconn|INFO|br-vlan<->tcp:127.0.0.1:6633: connection closed by peer
2017-08-11T07:07:15.487Z|00321|rconn|INFO|br-int<->tcp:127.0.0.1:6633: connection closed by peer
2017-08-11T07:07:15.487Z|00322|rconn|INFO|br-ex<->tcp:127.0.0.1:6633: connection closed by peer
2017-08-11T07:07:17.632Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
2017-08-11T07:07:17.636Z|00002|ovs_numa|INFO|Discovered 12 CPU cores on NUMA node 0
2017-08-11T07:07:17.636Z|00003|ovs_numa|INFO|Discovered 12 CPU cores on NUMA node 1
2017-08-11T07:07:17.636Z|00004|ovs_numa|INFO|Discovered 2 NUMA nodes and 24 CPU cores
2017-08-11T07:07:17.636Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2017-08-11T07:07:17.636Z|00006|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2017-08-11T07:07:17.640Z|00007|dpdk|ERR|DPDK not supported in this copy of Open vSwitch.
2017-08-11T07:07:17.646Z|00008|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports recirculation
2017-08-11T07:07:17.646Z|00009|ofproto_dpif|INFO|netdev@ovs-netdev: MPLS label stack length probed as 3
2017-08-11T07:07:17.646Z|00010|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports truncate action
2017-08-11T07:07:17.646Z|00011|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports unique flow ids
2017-08-11T07:07:17.646Z|00012|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_state
2017-08-11T07:07:17.646Z|00013|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_zone
2017-08-11T07:07:17.646Z|00014|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_mark
2017-08-11T07:07:17.646Z|00015|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports ct_label
2017-08-11T07:07:17.646Z|00016|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath does not support ct_state_nat
2017-08-11T07:07:17.725Z|00017|bridge|INFO|bridge br-int: added interface qg-1c51a194-ca on port 4
2017-08-11T07:07:17.725Z|00018|bridge|INFO|bridge br-int: added interface int-br-vlan on port 5
2017-08-11T07:07:17.725Z|00019|...
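
To line up the agent's reconnect attempts with the ovs-vswitchd restart, the OVSDB connection messages can be pulled out of the dhcp-agent log on their own. This is a minimal sketch, assuming the log line layout shown above (date, time, PID, level, logger name, `[-]`, message); `ovsdb_events` is a hypothetical helper name, not something from neutron.

```shell
# Hypothetical helper: read dhcp-agent log lines on stdin and print
# "time level message" for entries from the OVSDB vlog logger only.
ovsdb_events() {
    awk '$5 == "neutron.agent.ovsdb.native.vlog" {
        msg = $0
        sub(/^.*\[-\] /, "", msg)   # drop everything up to and including "[-] "
        print $2, $4, msg
    }'
}
```

For example, `ovsdb_events < /var/log/neutron/dhcp-agent.log` would reduce the excerpt above to the connection open, close, and retry messages with their timestamps and levels.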

Boden R (boden) wrote :

Moving back to 'new'. I can't reproduce, but suspect I'm using a different network setup/topo which may be why I can't see the problem.

I'm going to ask someone who's more familiar with this functionality to take a peek; maybe there's something obvious.

Changed in neutron:
status: Incomplete → New
jinke (ke.king) wrote :

Hello, Boden:
  I installed Ocata using devstack today, and I can't reproduce this issue there either. So I think my configuration may be incorrect. I will double-check to figure out what's wrong.

Moving it to 'invalid'

Changed in neutron:
status: New → Invalid
Boden R (boden) wrote :

Thank you for retrying and updating the bug with the latest findings/results.
