Activity log for bug #1179223

Date Who What changed Old value New value Message
2013-05-12 14:24:57 gregmark bug added bug
2013-05-13 08:55:32 Jiajun Liu summary Retired GRE tunnels spersists in quantum db Retired GRE tunnels persists in quantum db
2013-05-13 08:56:10 Jiajun Liu quantum: assignee Jiajun Liu (ljjjustin)
2013-06-10 16:13:49 Mark McClain tags ovs
2013-07-19 07:03:46 Jiajun Liu neutron: assignee Jiajun Liu (ljjjustin)
2013-09-06 15:44:28 Darragh O'Reilly bug added subscriber Darragh O'Reilly
2013-11-22 20:53:08 Mark McClain neutron: assignee Kyle Mestery (mestery)
2013-11-22 20:53:12 Mark McClain neutron: status New Incomplete
2014-01-29 12:42:01 Eugene Nelen bug added subscriber Eugene Nelen
2014-03-12 22:47:22 Pengfei Zhang neutron: assignee Kyle Mestery (mestery) Pengfei Zhang (eaglezpf)
2014-04-23 14:53:40 Awan bug added subscriber Awan
2014-06-17 13:44:26 Miguel Angel Ajo neutron: status Incomplete Confirmed
2014-06-17 13:44:43 Miguel Angel Ajo summary Retired GRE tunnels persists in quantum db Retired GRE and VXLAN tunnels persists in neutron db
2014-06-17 13:45:19 Miguel Angel Ajo description (old and new descriptions below)

Old description:

This is Grizzly on Ubuntu 13.04 (1:2013.1-0ubuntu2). Setup is multi-node, with per-tenant routers and GRE tunneling.

SYMPTOM:

VMs are available on the external network for about 1-2 minutes, after which point the connection times out and cannot be re-established unless traffic is generated from the VM console. VMs with DHCP interface settings will periodically and temporarily come back online after requesting new leases.

When I attempt to ping from the external network, I can trace the traffic all the way to the tap interface on the compute node, where the VM responds to the ARP request sent by the tenant router (which is on the separate network node). However, this ARP reply never makes it back to the tenant router. It seems to die at the GRE terminus at bridge br-tun.

CAUSE:

* I have three NICs on my network node. The VM traffic goes out the 1st NIC on 192.168.239.99/24 to the other compute nodes, while management traffic goes out the 2nd NIC on 192.168.241.99. The 3rd NIC is external and has no IP.
* I have four GRE endpoints on the VM network, one at the network node (192.168.239.99) and three on compute nodes (192.168.239.{110,114,115}), all with IDs 2-5.
* I have a fifth GRE endpoint with id 1 to 192.168.241.99, the network node's management interface, on each of the compute nodes. This was the first tunnel created when I deployed the network node, because that is how I had set remote_ip in the OVS plugin ini. I corrected the setting later, but the 192.168.241.99 endpoint persists:

mysql> select * from ovs_tunnel_endpoints;
+-----------------+----+
| ip_address      | id |
+-----------------+----+
| 192.168.239.110 |  3 |
| 192.168.239.114 |  4 |
| 192.168.239.115 |  5 |
| 192.168.239.99  |  2 |
| 192.168.241.99  |  1 |  <======== HERE
+-----------------+----+
5 rows in set (0.00 sec)

* Thus, after plugin restarts or reboots, this endpoint is re-created every time.
* The effect is that traffic from the VM has two possible flows from which to make a routing/switching decision. I was unable to determine how this decision is made, but obviously this is not a working configuration. Traffic that originates from the VM always seems to use the correct flow initially, but traffic that originates from the network node is never returned via the right flow unless the connection has been active within the previous 1-2 minutes. In both cases, successful connections time out after 1-2 minutes of inactivity.

SOLUTION:

mysql> delete from ovs_tunnel_endpoints where id = 1;
Query OK, 1 row affected (0.00 sec)

mysql> select * from ovs_tunnel_endpoints;
+-----------------+----+
| ip_address      | id |
+-----------------+----+
| 192.168.239.110 |  3 |
| 192.168.239.114 |  4 |
| 192.168.239.115 |  5 |
| 192.168.239.99  |  2 |
+-----------------+----+
4 rows in set (0.00 sec)

* After doing that, I simply restarted the quantum OVS agents on the network and compute nodes. The old GRE tunnel is not re-created. Thereafter, VM network traffic to and from the external network proceeds without incident.
* Should these tables be cleaned up as well, I wonder:

mysql> select * from ovs_network_bindings;
+--------------------------------------+--------------+------------------+-----------------+
| network_id                           | network_type | physical_network | segmentation_id |
+--------------------------------------+--------------+------------------+-----------------+
| 4e8aacca-8b38-40ac-a628-18cac3168fe6 | gre          | NULL             |               2 |
| af224f3f-8de6-4e0d-b043-6bcd5cb014c5 | gre          | NULL             |               1 |
+--------------------------------------+--------------+------------------+-----------------+
2 rows in set (0.00 sec)

mysql> select * from ovs_tunnel_allocations where allocated != 0;
+-----------+-----------+
| tunnel_id | allocated |
+-----------+-----------+
|         1 |         1 |
|         2 |         1 |
+-----------+-----------+
2 rows in set (0.00 sec)

New description:

Identical to the old description except for the opening paragraph, which now reads: "Setup is multi-node, with per-tenant routers and gre or vxlan tunneling, ovs or ML2, both affected." (The Grizzly/Ubuntu-specific first sentence was dropped.)
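
A condensed sketch of the workaround described above, for reference. It assumes the monolithic OVS plugin schema shown in the report (the ovs_tunnel_endpoints table) and a stale endpoint with id 1; the database name, credentials, and agent service names vary by deployment and release, and they are assumptions here rather than part of the report.

# Workaround sketch (assumptions noted above): remove the stale tunnel
# endpoint row, then restart the OVS agents so the tunnel is not rebuilt.

# 1. Confirm the stale endpoint (192.168.241.99 in this report).
mysql -u root -p -e "SELECT * FROM ovs_tunnel_endpoints;" quantum

# 2. Delete the stale row by id (id = 1 in this report).
mysql -u root -p -e "DELETE FROM ovs_tunnel_endpoints WHERE id = 1;" quantum

# 3. Restart the OVS agent on the network node and on every compute node;
#    the exact service name depends on the release (quantum vs. neutron
#    packaging). The stale GRE tunnel should not be re-created.
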
2014-06-17 15:24:46 Eugene Nikanorov neutron: importance Undecided Medium
2014-06-25 15:17:26 Shiv Haris neutron: milestone juno-2
2014-07-22 13:07:05 Kyle Mestery neutron: milestone juno-2 juno-3
2014-08-22 13:24:59 Romil Gupta bug added subscriber Romil Gupta
2014-08-25 09:19:03 Romil Gupta neutron: assignee Pengfei Zhang (eaglezpf) Romil Gupta (romilg)
2014-09-02 16:12:54 Nobuto Murata bug added subscriber Nobuto MURATA
2014-09-03 00:41:27 Hua Zhang bug added subscriber Hua Zhang
2014-09-03 15:45:07 Thierry Carrez neutron: milestone juno-3 juno-rc1
2014-09-11 10:18:49 Phani Pawan bug added subscriber phanipawan
2014-09-12 07:13:44 OpenStack Infra neutron: status Confirmed In Progress
2014-09-17 14:20:09 Kyle Mestery neutron: milestone juno-rc1 kilo-1
2014-09-17 14:20:15 Kyle Mestery neutron: importance Medium High
2014-09-17 14:20:18 Kyle Mestery neutron: milestone kilo-1 juno-rc1
2014-09-27 22:52:15 Kyle Mestery neutron: milestone juno-rc1
2014-10-08 16:26:33 Robert Kukura neutron: milestone kilo-1
2014-10-25 15:30:29 Romil Gupta tags ovs ml2
2014-10-25 15:44:27 Romil Gupta tags ml2 ml2 ovs
2014-12-16 22:13:04 Kyle Mestery neutron: milestone kilo-1 kilo-2
2015-02-04 21:46:52 Kyle Mestery neutron: milestone kilo-2 kilo-3
2015-02-20 21:25:59 OpenStack Infra neutron: status In Progress Fix Committed
2015-03-19 16:27:33 Thierry Carrez neutron: status Fix Committed Fix Released
2015-04-30 09:43:57 Thierry Carrez neutron: milestone kilo-3 2015.1.0