Activity log for bug #1460164

Date Who What changed Old value New value Message
2015-05-29 17:56:43 James Troup bug added bug
2015-05-29 17:56:55 James Troup bug added subscriber The Canonical Sysadmins
2015-05-31 17:16:23 Nobuto Murata bug added subscriber Nobuto Murata
2015-06-01 10:26:07 Launchpad Janitor neutron (Ubuntu): status New Confirmed
2015-06-01 10:43:20 William Grant bug added subscriber William Grant
2015-07-24 11:40:06 James Page neutron (Ubuntu): status Confirmed Triaged
2015-07-24 11:40:08 James Page neutron (Ubuntu): importance Undecided High
2015-12-17 18:20:25 JuanJo Ciarlante tags canonical-bootstack
2015-12-17 21:24:48 mahmoh bug added subscriber M.Morana
2015-12-18 14:27:12 James Page summary upgrade of openvswitch-switch can sometimes break neutron-plugin-openvswitch-agent restart of openvswitch-switch causes instance network down when l2population enabled
2015-12-18 14:49:24 James Page bug task added neutron
2015-12-18 15:05:01 OpenStack Infra neutron: status New In Progress
2015-12-18 15:05:01 OpenStack Infra neutron: assignee James Page (james-page)
2015-12-20 20:21:49 Edward Hope-Morley tags canonical-bootstack canonical-bootstack sts
2016-01-21 11:38:20 Miguel Angel Ajo tags canonical-bootstack sts canonical-bootstack kilo-backport-potential liberty-backport-potential sts
2016-01-26 09:47:18 Rossella Sblendido tags canonical-bootstack kilo-backport-potential liberty-backport-potential sts canonical-bootstack kilo-backport-potential l2-pop liberty-backport-potential sts
2016-01-26 14:16:32 OpenStack Infra neutron: status In Progress Fix Released
2016-01-31 04:41:06 OpenStack Infra tags canonical-bootstack kilo-backport-potential l2-pop liberty-backport-potential sts canonical-bootstack in-stable-liberty kilo-backport-potential l2-pop liberty-backport-potential sts
2016-02-11 14:38:18 James Page nominated for series Ubuntu Xenial
2016-02-11 14:38:18 James Page bug task added neutron (Ubuntu Xenial)
2016-02-11 14:38:18 James Page nominated for series Ubuntu Trusty
2016-02-11 14:38:18 James Page bug task added neutron (Ubuntu Trusty)
2016-02-11 14:38:18 James Page nominated for series Ubuntu Wily
2016-02-11 14:38:18 James Page bug task added neutron (Ubuntu Wily)
2016-02-11 14:38:28 James Page neutron (Ubuntu Xenial): status Triaged Fix Released
2016-02-11 14:38:33 James Page neutron (Ubuntu Wily): importance Undecided High
2016-02-11 14:41:01 James Page neutron (Ubuntu Wily): status New In Progress
2016-02-11 14:41:06 James Page neutron (Ubuntu Wily): assignee James Page (james-page)
2016-02-11 17:07:45 James Page bug added subscriber Ubuntu Stable Release Updates Team
2016-02-11 19:51:21 Launchpad Janitor branch linked lp:~ubuntu-server-dev/neutron/icehouse
2016-02-11 19:56:07 James Page description
Old value:
On 2015-05-28, our Landscape auto-upgraded packages on two of our OpenStack clouds. On both clouds, but only on some compute nodes, the upgrade of openvswitch-switch and corresponding downtime of ovs-vswitchd appears to have triggered some sort of race condition within neutron-plugin-openvswitch-agent leaving it in a broken state; any new instances come up with non-functional network but pre-existing instances appear unaffected. Restarting n-p-ovs-agent on the affected compute nodes is sufficient to work around the problem.
The packages Landscape upgraded (from /var/log/apt/history.log):
Start-Date: 2015-05-28 14:23:07
Upgrade: nova-compute-libvirt:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), libsystemd-login0:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), nova-compute-kvm:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), systemd-services:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), isc-dhcp-common:amd64 (4.2.4-7ubuntu12.1, 4.2.4-7ubuntu12.2), nova-common:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), python-nova:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), libsystemd-daemon0:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), grub-common:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), libpam-systemd:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), udev:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), grub2-common:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), openvswitch-switch:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2), libudev1:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), isc-dhcp-client:amd64 (4.2.4-7ubuntu12.1, 4.2.4-7ubuntu12.2), python-eventlet:amd64 (0.13.0-1ubuntu2, 0.13.0-1ubuntu2.1), python-novaclient:amd64 (2.17.0-0ubuntu1.1, 2.17.0-0ubuntu1.2), grub-pc-bin:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), grub-pc:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), nova-compute:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), openvswitch-common:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2)
End-Date: 2015-05-28 14:24:47
From /var/log/neutron/openvswitch-agent.log:
2015-05-28 14:24:18.336 47866 ERROR neutron.agent.linux.ovsdb_monitor [-] Error received from ovsdb monitor: ovsdb-client: unix:/var/run/openvswitch/db.sock: receive failed (End of file)
Looking at a stuck instance, all the right tunnels and bridges and whatnot appear to be there:
root@vector:~# ip l l | grep c-3b
460002: qbr7ed8b59c-3b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
460003: qvo7ed8b59c-3b: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT group default qlen 1000
460004: qvb7ed8b59c-3b: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr7ed8b59c-3b state UP mode DEFAULT group default qlen 1000
460005: tap7ed8b59c-3b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr7ed8b59c-3b state UNKNOWN mode DEFAULT group default qlen 500
root@vector:~# ovs-vsctl list-ports br-int | grep c-3b
qvo7ed8b59c-3b
root@vector:~#
But I can't ping the unit from within the qrouter-${id} namespace on the neutron gateway. If I tcpdump the {q,t}*c-3b interfaces, I don't see any traffic.

New value:
[Impact]
Restarts of openvswitch (typically on upgrade) result in loss of tunnel connectivity when the l2population driver is in use. This results in loss of access to all instances on the affected compute hosts.
[Test Case]
Deploy cloud with ml2/ovs/l2population enabled
boot instances
restart ovs; instance connectivity will be lost until the neutron-openvswitch-agent is restarted on the compute hosts.
[Regression Potential]
Minimal - in multiple stable branches upstream.
[Original Bug Report]
On 2015-05-28, our Landscape auto-upgraded packages on two of our OpenStack clouds. On both clouds, but only on some compute nodes, the upgrade of openvswitch-switch and corresponding downtime of ovs-vswitchd appears to have triggered some sort of race condition within neutron-plugin-openvswitch-agent leaving it in a broken state; any new instances come up with non-functional network but pre-existing instances appear unaffected. Restarting n-p-ovs-agent on the affected compute nodes is sufficient to work around the problem.
The packages Landscape upgraded (from /var/log/apt/history.log):
Start-Date: 2015-05-28 14:23:07
Upgrade: nova-compute-libvirt:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), libsystemd-login0:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), nova-compute-kvm:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), systemd-services:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), isc-dhcp-common:amd64 (4.2.4-7ubuntu12.1, 4.2.4-7ubuntu12.2), nova-common:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), python-nova:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), libsystemd-daemon0:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), grub-common:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), libpam-systemd:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), udev:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), grub2-common:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), openvswitch-switch:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2), libudev1:amd64 (204-5ubuntu20.11, 204-5ubuntu20.12), isc-dhcp-client:amd64 (4.2.4-7ubuntu12.1, 4.2.4-7ubuntu12.2), python-eventlet:amd64 (0.13.0-1ubuntu2, 0.13.0-1ubuntu2.1), python-novaclient:amd64 (2.17.0-0ubuntu1.1, 2.17.0-0ubuntu1.2), grub-pc-bin:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), grub-pc:amd64 (2.02~beta2-9ubuntu1.1, 2.02~beta2-9ubuntu1.2), nova-compute:amd64 (2014.1.4-0ubuntu2, 2014.1.4-0ubuntu2.1), openvswitch-common:amd64 (2.0.2-0ubuntu0.14.04.1, 2.0.2-0ubuntu0.14.04.2)
End-Date: 2015-05-28 14:24:47
From /var/log/neutron/openvswitch-agent.log:
2015-05-28 14:24:18.336 47866 ERROR neutron.agent.linux.ovsdb_monitor [-] Error received from ovsdb monitor: ovsdb-client: unix:/var/run/openvswitch/db.sock: receive failed (End of file)
Looking at a stuck instance, all the right tunnels and bridges and whatnot appear to be there:
root@vector:~# ip l l | grep c-3b
460002: qbr7ed8b59c-3b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
460003: qvo7ed8b59c-3b: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT group default qlen 1000
460004: qvb7ed8b59c-3b: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr7ed8b59c-3b state UP mode DEFAULT group default qlen 1000
460005: tap7ed8b59c-3b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master qbr7ed8b59c-3b state UNKNOWN mode DEFAULT group default qlen 500
root@vector:~# ovs-vsctl list-ports br-int | grep c-3b
qvo7ed8b59c-3b
root@vector:~#
But I can't ping the unit from within the qrouter-${id} namespace on the neutron gateway. If I tcpdump the {q,t}*c-3b interfaces, I don't see any traffic.
2016-02-11 19:56:13 James Page neutron (Ubuntu Trusty): status New In Progress
2016-02-11 19:56:19 James Page neutron (Ubuntu Trusty): assignee James Page (james-page)
2016-02-11 19:56:32 James Page bug task added cloud-archive
2016-02-11 19:56:49 James Page nominated for series cloud-archive/juno
2016-02-11 19:56:49 James Page bug task added cloud-archive/juno
2016-02-11 19:56:49 James Page nominated for series cloud-archive/kilo
2016-02-11 19:56:49 James Page bug task added cloud-archive/kilo
2016-02-11 19:56:59 James Page cloud-archive/kilo: importance Undecided Medium
2016-02-11 19:57:03 James Page cloud-archive/juno: importance Undecided Medium
2016-02-11 19:57:15 James Page neutron (Ubuntu Trusty): importance Undecided High
2016-02-15 09:50:37 OpenStack Infra cloud-archive/kilo: status New In Progress
2016-02-15 09:50:37 OpenStack Infra cloud-archive/kilo: assignee Ihar Hrachyshka (ihar-hrachyshka)
2016-02-17 16:07:14 Chris J Arges neutron (Ubuntu Trusty): status In Progress Fix Committed
2016-02-17 16:07:22 Chris J Arges bug added subscriber SRU Verification
2016-02-17 16:07:33 Chris J Arges tags canonical-bootstack in-stable-liberty kilo-backport-potential l2-pop liberty-backport-potential sts canonical-bootstack in-stable-liberty kilo-backport-potential l2-pop liberty-backport-potential sts verification-needed
2016-02-22 08:47:49 James Page nominated for series cloud-archive/icehouse
2016-02-22 08:47:49 James Page bug task added cloud-archive/icehouse
2016-02-22 08:47:57 James Page cloud-archive/icehouse: status New Fix Committed
2016-02-22 18:41:01 Chris J Arges neutron (Ubuntu Wily): status In Progress Fix Committed
2016-03-02 18:39:54 Corey Bryant tags canonical-bootstack in-stable-liberty kilo-backport-potential l2-pop liberty-backport-potential sts verification-needed canonical-bootstack in-stable-liberty kilo-backport-potential l2-pop liberty-backport-potential sts verification-done
2016-03-03 19:14:39 Brian Murray removed subscriber Ubuntu Stable Release Updates Team
2016-03-03 19:14:35 Launchpad Janitor neutron (Ubuntu Trusty): status Fix Committed Fix Released
2016-03-03 19:15:10 Launchpad Janitor neutron (Ubuntu Wily): status Fix Committed Fix Released
2016-03-03 21:06:03 Corey Bryant cloud-archive/icehouse: status Fix Committed Fix Released
2016-03-03 22:43:51 Ihar Hrachyshka cloud-archive/kilo: assignee Ihar Hrachyshka (ihar-hrachyshka)
2016-03-18 17:19:54 Launchpad Janitor branch linked lp:~ubuntu-server-dev/neutron/kilo
2016-03-18 17:24:37 James Page cloud-archive/kilo: assignee James Page (james-page)
2016-03-30 09:22:25 James Page cloud-archive/kilo: status In Progress Fix Released
2016-03-30 09:29:36 James Page cloud-archive: status In Progress Invalid
2016-04-12 14:15:26 OpenStack Infra tags canonical-bootstack in-stable-liberty kilo-backport-potential l2-pop liberty-backport-potential sts verification-done canonical-bootstack in-stable-kilo in-stable-liberty kilo-backport-potential l2-pop liberty-backport-potential sts verification-done
2016-05-09 11:44:07 Dave Walker nominated for series neutron/kilo
2016-05-09 11:44:08 Dave Walker bug task added neutron/kilo
2016-10-07 16:34:15 Ihar Hrachyshka tags canonical-bootstack in-stable-kilo in-stable-liberty kilo-backport-potential l2-pop liberty-backport-potential sts verification-done canonical-bootstack in-stable-kilo in-stable-liberty l2-pop sts verification-done
2016-12-01 22:36:15 Corey Bryant cloud-archive/juno: status New Won't Fix