Comment 0 for bug 1978820

Itai Levy (etlvnvda) wrote :

Platform: OpenStack Yoga, Ubuntu 22.04 Jammy, Kernel 5.15.0-37-generic

A Charmed OpenStack deployment with HW Offload on the Jammy series looks fine until the Vault initialization phase. After Vault is initialized, all DB-related apps end up in a blocked/error state with "Failed to connect to MYSQL".
Connectivity testing between DB containers located on different nodes shows unexplained sporadic packet loss, which prevents proper communication between the DB-related apps.

This happens when all of the following conditions are met:
1. The control plane (oam, internal spaces) is configured as VLAN interfaces on the same OVS bridge that is used for the data plane (over a high-speed NIC with HW Offload capabilities).
2. OVS is set with HW offload=true (done by the OVN charms after Vault initialization).
3. The NIC is not yet set to "switchdev" mode (the netplan file is created by the OVN charms after Vault initialization, but it takes effect only after the node is rebooted).
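
Conditions 2 and 3 can be checked on a node with a small helper. This is only a sketch: the function name is_affected and the PCI address in the usage comment are placeholders, and it assumes the usual output of "ovs-vsctl get" (a quoted value) and "devlink dev eswitch show" (a line containing "mode legacy" or "mode switchdev").

```shell
# is_affected HW_OFFLOAD_VALUE ESWITCH_LINE   (hypothetical helper)
# Returns 0 when OVS HW offload is enabled while the NIC eswitch is NOT in
# switchdev mode, i.e. the combination described in conditions 2 and 3.
is_affected() {
    # ovs-vsctl prints string values with surrounding double quotes; strip them.
    hw_offload=$(printf '%s' "$1" | tr -d '"')
    case "$2" in
        *"mode switchdev"*) return 1 ;;   # NIC already in switchdev mode
    esac
    [ "$hw_offload" = "true" ]
}

# Usage on a node (0000:03:00.0 is a placeholder PCI address):
#   is_affected "$(ovs-vsctl get Open_vSwitch . other_config:hw-offload 2>/dev/null)" \
#               "$(devlink dev eswitch show pci/0000:03:00.0)" \
#       && echo "HW offload is on but the NIC is not in switchdev mode"
```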

The root cause is the following missing kernel patch:
https://patchwork<email address hidden>/

To reproduce:
Deploy Charmed OpenStack with HW offload while using the control plane on the high-speed NIC OVS bridge. Before initializing Vault, log in to one of the InnoDB instances and ping the other two instances: all pings succeed. Then manually enable OVS HW Offload, and ping becomes inconsistent.
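
To put a number on "inconsistent", the loss percentage can be pulled out of the ping summary before and after enabling offload. This is a rough sketch: loss_pct is a hypothetical helper, the peer address is a placeholder, and the commands in the comments assume the Ubuntu service name openvswitch-switch.

```shell
# loss_pct (hypothetical helper): read iputils ping output on stdin and print
# the packet-loss percentage taken from the summary line.
loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Usage (10.0.0.12 is a placeholder for another InnoDB instance):
#   ping -c 100 -q 10.0.0.12 | loss_pct        # baseline, before offload
#
# On the host, enable OVS HW offload manually and restart OVS:
#   ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
#   systemctl restart openvswitch-switch
#
#   ping -c 100 -q 10.0.0.12 | loss_pct        # repeat and compare
```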

Workaround:
After the deployment bring-up phase and BEFORE initializing Vault, log in to the nodes and manually create /etc/netplan/150-charm-ovn.yaml (example below). Then reboot the nodes one after another. Once the nodes recover, proceed with Vault initialization to complete the deployment.

#root@node3:/home/ubuntu# cat /etc/netplan/150-charm-ovn.yaml
###############################################################################
# [ WARNING ]
# Configuration file maintained by Juju. Local changes may be overwritten.
# Config managed by ovn-chassis charm
###############################################################################
network:
  version: 2
  ethernets:
    ens1f0:
      virtual-function-count: 8
      embedded-switch-mode: switchdev
      delay-virtual-functions-rebind: true

    ens1f1:
      virtual-function-count: 8
      embedded-switch-mode: switchdev
      delay-virtual-functions-rebind: true
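
After each reboot, it may be worth confirming that the netplan settings were applied before moving on to the next node. The sketch below compares the VF count exposed in sysfs with the expected value. vf_count_ok is a hypothetical helper, and the optional third argument (an alternative sysfs root) exists only to make it testable off-node.

```shell
# vf_count_ok IFACE EXPECTED [SYSFS_ROOT]   (hypothetical helper)
# Returns 0 when the interface reports exactly EXPECTED virtual functions.
vf_count_ok() {
    root="${3:-/sys}"
    # sriov_numvfs holds the number of VFs currently enabled on the PF.
    actual=$(cat "$root/class/net/$1/device/sriov_numvfs" 2>/dev/null) || return 1
    [ "$actual" = "$2" ]
}

# Usage on a rebooted node, matching the example file above:
#   for nic in ens1f0 ens1f1; do
#       vf_count_ok "$nic" 8 || echo "$nic: VF count does not match"
#   done
```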