Network partition - lost connectivity - ovs staleness?

Bug #1882519 reported by David O Neill on 2020-06-08
This bug affects 1 person
Affects: openvswitch (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Short description
====================================
We see intermittent recurrences of what appears to be OVS going dead.
Clients lose connectivity to the networks (VXLAN, GRE) with no identified root cause and no visible sign of a problem: "the lights are on but no one is home."

Steps to reproduce
===================
Unknown.

Steps to resolve
=================
Reboot the VM:
`openstack server reboot <id>`

Newly created VMs are not affected and work immediately.

Sosreports
===========
<email address hidden>
sosreport-..............-00282744-2020-06-08-typrlwu.tar.xz
sosreport-...........-00282744-2020-06-08-qdvfplm.tar.xz

openvswitch
===========
dpkg --list | grep -i openv | awk '{print $2 " " $3 }'
neutron-openvswitch-agent 2:12.1.0-0ubuntu1~cloud0
neutron-plugin-openvswitch-agent 2:12.1.0-0ubuntu1~cloud0
openvswitch-common 2.9.5-0ubuntu0.18.04.1~cloud0
openvswitch-switch 2.9.5-0ubuntu0.18.04.1~cloud0
python-openvswitch 2.9.5-0ubuntu0.18.04.1~cloud0

qemu
====
dpkg --list | grep -i qemu | awk '{print $2 " " $3 }'
ipxe-qemu 1.0.0+git-20180124.fbe8c52d-0ubuntu2.2~cloud0
ipxe-qemu-256k-compat-efi-roms 1.0.0+git-20150424.a25a16d-0ubuntu2~cloud0
qemu 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-block-extra:amd64 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-kvm 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-slof 20151103+dfsg-1ubuntu1.1
qemu-system 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-system-arm 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-system-common 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-system-mips 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-system-misc 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-system-ppc 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-system-s390x 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-system-sparc 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-system-x86 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-user 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-user-binfmt 1:2.11+dfsg-1ubuntu7.23~cloud0
qemu-utils 1:2.11+dfsg-1ubuntu7.23~cloud0

Details
=======
openstack server show f0c5b261-73ed-43cc-bd5d-82d181680d87
+-------------------------------------+-----------------------------------------------+
| Field                               | Value                                         |
+-------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                        |
| OS-EXT-AZ:availability_zone         | hidden to protect customer                    |
| OS-EXT-SRV-ATTR:host                | hidden to protect customer                    |
| OS-EXT-SRV-ATTR:hypervisor_hostname | hidden to protect customer                    |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000b9fc                             |
| OS-EXT-STS:power_state              | Running                                       |
| OS-EXT-STS:task_state               | None                                          |
| OS-EXT-STS:vm_state                 | active                                        |
| OS-SRV-USG:launched_at              | 2020-01-28T13:16:46.000000                    |
| OS-SRV-USG:terminated_at            | None                                          |
| accessIPv4                          |                                               |
| accessIPv6                          |                                               |
| addresses                           | DCS1-BO-SERVICES=172.16.15.22, xxxxxxxxxxxxxx |

VM interface 172.16.15.22
===============================
virsh domiflist instance-0000b9fc
Interface       Type    Source  Model   MAC
-------------------------------------------------------
tapba67f7c2-bc  bridge  br-int  virtio  fa:16:3e:b5:56:26
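
To tie the instance to its port on br-int, the tap device can be pulled out of the `virsh domiflist` output by MAC address. The following is only a minimal sketch: the helper name `tap_for_mac` is hypothetical, and it assumes the five-column layout shown above (interface name first, MAC last).

```shell
#!/bin/sh
# Hypothetical helper: print the interface name whose MAC matches $1.
# Reads `virsh domiflist <domain>` output on stdin and assumes five
# whitespace-separated columns: Interface Type Source Model MAC.
tap_for_mac() {
    # Header and separator lines never match a MAC, so they are skipped.
    awk -v mac="$1" 'NF == 5 && tolower($5) == tolower(mac) { print $1 }'
}

# Possible use on the hypervisor (not run here):
#   tap=$(virsh domiflist instance-0000b9fc | tap_for_mac fa:16:3e:b5:56:26)
#   ovs-vsctl port-to-br "$tap"   # prints the bridge the port is attached to
```

If `ovs-vsctl port-to-br` still reports br-int for the tap device while traffic is dead, that would suggest the port is attached and the problem lies elsewhere in the datapath.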

OVS stat info
=============
ICMP, DNS, and DHCP flows are visible.
/snap/bin/ovs-stat -p results --host dcs1-clp-nod23 --tree
https://pastebin.canonical.com/p/NMZSRCqdPh/

Dead ICMP
=========
ping 172.16.15.22
PING 172.16.15.22 (172.16.15.22) 56(84) bytes of data.
^C
--- 172.16.15.22 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1031ms
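
Since the failure is intermittent, it may help to check reachability in a scriptable way. The sketch below extracts the loss percentage from the ping summary line; the helper name `ping_loss` is hypothetical, and it assumes the "N% packet loss" wording shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the packet-loss percentage found in ping
# output read on stdin, based on the "N% packet loss" summary line.
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Possible use (not run here):
#   loss=$(ping -c 2 -W 1 172.16.15.22 | ping_loss)
#   [ "$loss" = "100" ] && echo "172.16.15.22 unreachable"
```

Run periodically against the affected addresses, a check like this could help pin down when connectivity is lost relative to entries in the OVS and neutron agent logs.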
