Activity log for bug #1980527

Date Who What changed Old value New value Message
2022-07-01 14:07:07 Stefan Lupsa bug added bug
2022-07-01 14:07:07 Stefan Lupsa attachment added System, Kernel and package versions https://bugs.launchpad.net/bugs/1980527/+attachment/5600857/+files/package-versions.txt
2022-07-01 14:08:05 Stefan Lupsa summary 5.13.0.44 openvswith kerenel module meter implementation breaks on ovs vswitchd restart 5.13.0.44 openvswitch kernel module meter implementation breaks on ovs vswitchd restart
2022-07-01 14:09:47 Launchpad Janitor linux (Ubuntu): status New Confirmed
2022-07-01 14:11:21 Stefan Lupsa description Behavior: OVN Kolla Openstack deployment with docker containers. With 300 VM deployed on a compute node with some synthetic network load between the machines caused by pings and netperf, when the opvnswitch_vswitchd is restarted the meter implementation breaks for the opvnswitch kernel module and the flows can't be added back. This does not happen with the default 5.4.0.121.122 kernel, only on the linux-image-generic-hwe-20.04 5.13.0.44.49~20.04.28. It also does not happen until sufficient machines / load is present on the system (with 50 machines the behavior is not present). Before restart: (openvswitch-vswitchd)# ovs-ofctl meter-features br-int -O OpenFlow15 OFPST_METER_FEATURES reply (OF1.5) (xid=0x2): max_meter:200000 max_bands:1 max_color:0 band_types: drop capabilities: kbps pktps burst stats After restart: (openvswitch-vswitchd)# ovs-ofctl meter-features br-int -O OpenFlow15 OFPST_METER_FEATURES reply (OF1.5) (xid=0x2): max_meter:0 max_bands:0 max_color:0 band_types: capabilities: Following logs are shown in openvswitch-vswitchd and ovn-controller logs preventing logs from being added back: ovs-vswitchd.log --- 2022-06-29T12:43:49.188Z|00006|dpif(handler1)|WARN|system@ovs-system: failed to put[create] (Invalid argument) ufid:73695f68-6778-4980-bb74-29b528036b57 recirc_id(0),dp_hash(0),skb_priority(0),in_port(303),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),eth(src=5c:45:27:f9:9a:02,dst=01:00:0c:cc:cc:cd),eth_type(0x8100),vlan(vid=214,pcp=7),encap() 2022-06-29T12:43:49.204Z|00036|dpif_netlink|INFO|dpif_netlink_meter_transact OVS_METER_CMD_SET failed 2022-06-29T12:43:49.204Z|00037|dpif_netlink|INFO|dpif_netlink_meter_transact OVS_METER_CMD_SET failed 2022-06-29T12:43:49.204Z|00038|dpif_netlink|INFO|dpif_netlink_meter_transact get failed 2022-06-29T12:43:49.204Z|00039|dpif_netlink|INFO|The kernel module has a broken meter implementation. 
2022-06-29T12:43:49.208Z|00040|dpif|WARN|system@ovs-system: failed to query port patch-br-int-to-provnet-90238e1e-fbf5-45e4-bc6c-6110823d58ed: Invalid argument 2022-06-29T12:43:49.209Z|00041|dpif|WARN|system@ovs-system: failed to query port patch-br-int-to-provnet-b0a964fc-13a9-48ce-9a1b-7cfe8fc2b979: Invalid argument [...] 2022-06-29T12:44:56.778Z|00585|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00586|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00587|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00588|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00589|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00590|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00591|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00592|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00593|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00594|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message ovn-controller.log --- 2022-06-29T12:44:56.785Z|00165|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x388f): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x388f): ADD meter=1565 kbps stats bands= type=drop rate=250000 2022-06-29T12:44:56.785Z|00166|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x3890): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x3890): ADD 
meter=3427 kbps stats bands= type=drop rate=250000 2022-06-29T12:44:56.785Z|00167|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x3891): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x3891): ADD meter=3766 kbps stats bands= type=drop rate=250000 2022-06-29T12:44:56.785Z|00168|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x3892): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x3892): ADD meter=1225 kbps stats bands= type=drop rate=250000 2022-06-29T12:44:56.785Z|00169|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x3893): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x3893): ADD meter=2853 kbps stats bands= type=drop rate=250000 The OVS bridges also cannot be added to the system manually but OVS adds them to the database even though the operation fails: ovs-vswitchd.log --- 2022-06-29T12:12:51.874Z|00071|bridge|INFO|bridge br-int: added interface br-int on port 65534 2022-06-29T12:12:51.877Z|00072|bridge|INFO|bridge br-int: using datapath ID 000026586eff73c0 2022-06-29T12:12:51.877Z|00073|connmgr|INFO|br-int: added service controller "punix:/var/run/openvswitch/br-int.mgmt" 2022-06-29T12:13:16.157Z|00074|bridge|INFO|bridge br-int: deleted interface br-int on port 65534 2022-06-29T12:13:31.074Z|00075|netlink_socket|INFO|netlink dump request error (No such file or directory) 2022-06-29T12:13:31.074Z|00076|dpif|WARN|failed to enumerate system datapaths: No such file or directory 2022-06-29T12:13:31.074Z|00077|dpif|WARN|failed to create datapath ovs-system: No such file or directory 2022-06-29T12:13:31.074Z|00078|ofproto_dpif|ERR|failed to open datapath of type system: No such file or directory 2022-06-29T12:13:31.074Z|00079|ofproto|ERR|failed to open datapath br-int: No such file or directory 2022-06-29T12:13:31.074Z|00080|bridge|ERR|failed to create bridge br-int: No such file or directory 2022-06-29T12:15:32.189Z|00081|netlink_socket|INFO|netlink dump request error (No such file or directory) 
2022-06-29T12:15:32.189Z|00082|dpif|WARN|failed to enumerate system datapaths: No such file or directory 2022-06-29T12:15:32.189Z|00083|dpif|WARN|failed to create datapath ovs-system: No such file or directory 2022-06-29T12:15:32.189Z|00084|ofproto_dpif|ERR|failed to open datapath of type system: No such file or directory 2022-06-29T12:15:32.189Z|00085|ofproto|ERR|failed to open datapath br-int: No such file or directory 2022-06-29T12:15:32.189Z|00086|bridge|ERR|failed to create bridge br-int: No such file or directory 2022-06-29T12:18:43.488Z|00087|netlink_socket|INFO|netlink dump request error (No such file or directory) 2022-06-29T12:18:43.488Z|00088|dpif|WARN|failed to enumerate system datapaths: No such file or directory 2022-06-29T12:18:43.488Z|00089|dpif|WARN|failed to create datapath ovs-system: No such file or directory 2022-06-29T12:18:43.488Z|00090|ofproto_dpif|ERR|failed to open datapath of type system: No such file or directory Mitigation steps: The only way currently to bring it back to operation, that we've found, without rebooting the host is to reload the kernel module. To to this we have to back up the ovsdb conf.db, start all the containers and delete the bridges so that we can reload the openvswitch module. Then restore the conf.db and start back the containers sequentially. Openvswitch reports compatibility for 2.16.x between kernel versions 3.16 to 5.8 Ref: https://docs.openvswitch.org/en/latest/faq/releases/ Behavior: OVN Kolla Openstack deployment with docker containers. With 300 VM deployed on a compute node with some synthetic network load between the machines caused by pings and netperf, when the opvnswitch_vswitchd is restarted the meter implementation breaks for the opvnswitch kernel module and the flows can't be added back. This does not happen with the default 5.4.0.121.122 kernel, only on the linux-image-generic-hwe-20.04 5.13.0.44.49~20.04.28. 
It also does not happen until sufficient machines / load is present on the system (with 50 machines the behavior is not present). Before restart: (openvswitch-vswitchd)# ovs-ofctl meter-features br-int -O OpenFlow15 OFPST_METER_FEATURES reply (OF1.5) (xid=0x2): max_meter:200000 max_bands:1 max_color:0 band_types: drop capabilities: kbps pktps burst stats After restart: (openvswitch-vswitchd)# ovs-ofctl meter-features br-int -O OpenFlow15 OFPST_METER_FEATURES reply (OF1.5) (xid=0x2): max_meter:0 max_bands:0 max_color:0 band_types: capabilities: Following logs are shown in openvswitch-vswitchd and ovn-controller logs preventing logs from being added back: ovs-vswitchd.log --- 2022-06-29T12:43:49.188Z|00006|dpif(handler1)|WARN|system@ovs-system: failed to put[create] (Invalid argument) ufid:73695f68-6778-4980-bb74-29b528036b57 recirc_id(0),dp_hash(0),skb_priority(0),in_port(303),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),eth(src=5c:45:27:f9:9a:02,dst=01:00:0c:cc:cc:cd),eth_type(0x8100),vlan(vid=214,pcp=7),encap() 2022-06-29T12:43:49.204Z|00036|dpif_netlink|INFO|dpif_netlink_meter_transact OVS_METER_CMD_SET failed 2022-06-29T12:43:49.204Z|00037|dpif_netlink|INFO|dpif_netlink_meter_transact OVS_METER_CMD_SET failed 2022-06-29T12:43:49.204Z|00038|dpif_netlink|INFO|dpif_netlink_meter_transact get failed 2022-06-29T12:43:49.204Z|00039|dpif_netlink|INFO|The kernel module has a broken meter implementation. 2022-06-29T12:43:49.208Z|00040|dpif|WARN|system@ovs-system: failed to query port patch-br-int-to-provnet-90238e1e-fbf5-45e4-bc6c-6110823d58ed: Invalid argument 2022-06-29T12:43:49.209Z|00041|dpif|WARN|system@ovs-system: failed to query port patch-br-int-to-provnet-b0a964fc-13a9-48ce-9a1b-7cfe8fc2b979: Invalid argument [...] 
2022-06-29T12:44:56.778Z|00585|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00586|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00587|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00588|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00589|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00590|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00591|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00592|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00593|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message 2022-06-29T12:44:56.778Z|00594|connmgr|INFO|br-int<->unix#3: sending OFPMMFC_INVALID_METER error reply to OFPT_METER_MOD message ovn-controller.log --- 2022-06-29T12:44:56.785Z|00165|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x388f): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x388f): ADD meter=1565 kbps stats bands= type=drop rate=250000 2022-06-29T12:44:56.785Z|00166|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x3890): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x3890): ADD meter=3427 kbps stats bands= type=drop rate=250000 2022-06-29T12:44:56.785Z|00167|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x3891): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x3891): ADD meter=3766 kbps stats bands= type=drop rate=250000 2022-06-29T12:44:56.785Z|00168|ofctrl|INFO|OpenFlow error: 
OFPT_ERROR (OF1.5) (xid=0x3892): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x3892): ADD meter=1225 kbps stats bands= type=drop rate=250000 2022-06-29T12:44:56.785Z|00169|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) (xid=0x3893): OFPMMFC_INVALID_METER OFPT_METER_MOD (OF1.5) (xid=0x3893): ADD meter=2853 kbps stats bands= type=drop rate=250000 The OVS bridges also cannot be added to the system manually but OVS adds them to the database even though the operation fails: ovs-vswitchd.log --- 2022-06-29T12:12:51.874Z|00071|bridge|INFO|bridge br-int: added interface br-int on port 65534 2022-06-29T12:12:51.877Z|00072|bridge|INFO|bridge br-int: using datapath ID 000026586eff73c0 2022-06-29T12:12:51.877Z|00073|connmgr|INFO|br-int: added service controller "punix:/var/run/openvswitch/br-int.mgmt" 2022-06-29T12:13:16.157Z|00074|bridge|INFO|bridge br-int: deleted interface br-int on port 65534 2022-06-29T12:13:31.074Z|00075|netlink_socket|INFO|netlink dump request error (No such file or directory) 2022-06-29T12:13:31.074Z|00076|dpif|WARN|failed to enumerate system datapaths: No such file or directory 2022-06-29T12:13:31.074Z|00077|dpif|WARN|failed to create datapath ovs-system: No such file or directory 2022-06-29T12:13:31.074Z|00078|ofproto_dpif|ERR|failed to open datapath of type system: No such file or directory 2022-06-29T12:13:31.074Z|00079|ofproto|ERR|failed to open datapath br-int: No such file or directory 2022-06-29T12:13:31.074Z|00080|bridge|ERR|failed to create bridge br-int: No such file or directory 2022-06-29T12:15:32.189Z|00081|netlink_socket|INFO|netlink dump request error (No such file or directory) 2022-06-29T12:15:32.189Z|00082|dpif|WARN|failed to enumerate system datapaths: No such file or directory 2022-06-29T12:15:32.189Z|00083|dpif|WARN|failed to create datapath ovs-system: No such file or directory 2022-06-29T12:15:32.189Z|00084|ofproto_dpif|ERR|failed to open datapath of type system: No such file or directory 
2022-06-29T12:15:32.189Z|00085|ofproto|ERR|failed to open datapath br-int: No such file or directory 2022-06-29T12:15:32.189Z|00086|bridge|ERR|failed to create bridge br-int: No such file or directory 2022-06-29T12:18:43.488Z|00087|netlink_socket|INFO|netlink dump request error (No such file or directory) 2022-06-29T12:18:43.488Z|00088|dpif|WARN|failed to enumerate system datapaths: No such file or directory 2022-06-29T12:18:43.488Z|00089|dpif|WARN|failed to create datapath ovs-system: No such file or directory 2022-06-29T12:18:43.488Z|00090|ofproto_dpif|ERR|failed to open datapath of type system: No such file or directory Mitigation steps: The only way currently to bring it back to operation, that we've found, without rebooting the host is to reload the kernel module. To do this we have to back up the ovsdb conf.db, start all the containers and delete the bridges so that we can reload the openvswitch module. Then restore the conf.db and start back the containers sequentially. Openvswitch reports compatibility for 2.16.x between kernel versions 3.16 to 5.8 Ref: https://docs.openvswitch.org/en/latest/faq/releases/
2022-07-01 14:17:50 Chris Valean bug added subscriber Chris Valean
2022-07-01 14:31:44 Rafael Coman bug added subscriber Rafael Coman
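
The description recorded above distinguishes the healthy and broken states by the `ovs-ofctl meter-features br-int -O OpenFlow15` reply: `max_meter:200000` before the restart, `max_meter:0` after it. A minimal sketch of how that check could be automated follows. The function names `parse_meter_features` and `meters_look_broken` are hypothetical, and treating `max_meter` equal to 0 as "broken" is an assumption drawn only from the two replies quoted in the description, not from OVS documentation.

```python
import re

# Reply strings copied from the bug description (before and after the vswitchd restart).
BEFORE = ("OFPST_METER_FEATURES reply (OF1.5) (xid=0x2): max_meter:200000 "
          "max_bands:1 max_color:0 band_types: drop "
          "capabilities: kbps pktps burst stats")
AFTER = ("OFPST_METER_FEATURES reply (OF1.5) (xid=0x2): max_meter:0 "
         "max_bands:0 max_color:0 band_types: capabilities:")


def parse_meter_features(reply: str) -> dict:
    """Extract the integer max_* fields from a meter-features reply."""
    return {name: int(value)
            for name, value in re.findall(r"(max_\w+):(\d+)", reply)}


def meters_look_broken(reply: str) -> bool:
    """Assumption: a reply with max_meter of 0 (or missing) means meters are unusable."""
    return parse_meter_features(reply).get("max_meter", 0) == 0


if __name__ == "__main__":
    for label, reply in (("before restart", BEFORE), ("after restart", AFTER)):
        state = "broken" if meters_look_broken(reply) else "ok"
        print(f"{label}: {parse_meter_features(reply)} -> {state}")
```

In practice the reply text would come from running the `ovs-ofctl` command inside the vswitchd container after each restart; how that command is invoked depends on the deployment and is not shown here.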