I have successfully run 375+ migration iterations of the same testcase on the original lab that previously exhibited the failure (ip 20-27, BUILD_ID="20200115T023003Z").
However, I have compared the original ovs-vswitchd.log with the current "working" logs.
The original ovs-vswitchd.log reported NUMA-related warnings and errors, as shown below.
2019-12-04T19:36:13.494Z|00835|dpdk|INFO|VHOST_CONFIG: /var/run/openvswitch/vhu0ef6b42f-79: reconnecting...
2019-12-04T19:36:13.494Z|00836|dpif_netdev|WARN|There's no available (non-isolated) pmd thread on numa node 1. Queue 0 on port 'eth0' will be assigned to the pmd on core 1 (numa node 0). Expect reduced performance.
2019-12-04T19:36:13.494Z|00837|dpif_netdev|WARN|There's no available (non-isolated) pmd thread on numa node 1. Queue 1 on port 'eth0' will be assigned to the pmd on core 2 (numa node 0). Expect reduced performance.
2019-12-04T19:36:13.494Z|00838|dpif_netdev|INFO|Core 2 on numa node 0 assigned port 'vhuff279836-40' rx queue 0 (measured processing cycles 0).
2019-12-04T19:36:13.494Z|00839|dpif_netdev|INFO|Core 1 on numa node 0 assigned port 'vhu0ef6b42f-79' rx queue 0 (measured processing cycles 0).
...
2019-12-04T19:37:06.305Z|00889|bridge|INFO|bridge br-int: deleted interface vhuff279836-40 on port 14
2019-12-04T19:37:06.305Z|00890|dpif_netdev|WARN|There's no available (non-isolated) pmd thread on numa node 1. Queue 0 on port 'eth0' will be assigned to the pmd on core 1 (numa node 0). Expect reduced performance.
2019-12-04T19:37:06.305Z|00891|dpif_netdev|WARN|There's no available (non-isolated) pmd thread on numa node 1. Queue 1 on port 'eth0' will be assigned to the pmd on core 2 (numa node 0). Expect reduced performance.
2019-12-04T19:37:06.305Z|00892|dpif_netdev|INFO|Core 2 on numa node 0 assigned port 'vhu0ef6b42f-79' rx queue 0 (measured processing cycles 264156).
2019-12-04T19:37:06.343Z|00893|netdev_linux|WARN|Dropped 51 log messages in last 41 seconds (most recently, 34 seconds ago) due to excessive rate
2019-12-04T19:37:06.343Z|00894|netdev_linux|WARN|ethtool command ETHTOOL_GDRVINFO on network device tap7964d501-39 failed: No such device
2019-12-04T19:37:06.375Z|00895|netdev_linux|WARN|ethtool command ETHTOOL_GDRVINFO on network device tapdcbc1bc9-06 failed: No such device
2019-12-04T19:37:06.407Z|00896|netdev_linux|WARN|ethtool command ETHTOOL_GDRVINFO on network device tap05f66e28-9b failed: No such device
2019-12-04T19:37:06.508Z|00897|bridge|INFO|bridge br-int: deleted interface vhu0ef6b42f-79 on port 15
2019-12-04T19:37:06.510Z|00898|dpif_netdev|WARN|There's no available (non-isolated) pmd thread on numa node 1. Queue 0 on port 'eth0' will be assigned to the pmd on core 1 (numa node 0). Expect reduced performance.
2019-12-04T19:37:06.510Z|00899|dpif_netdev|WARN|There's no available (non-isolated) pmd thread on numa node 1. Queue 1 on port 'eth0' will be assigned to the pmd on core 2 (numa node 0). Expect reduced performance.
2019-12-04T19:37:08.746Z|00900|connmgr|INFO|br-int<->unix#3129: 1 flow_mods in the last 0 s (1 deletes)
2019-12-04T19:37:09.431Z|00901|connmgr|INFO|br-int<->unix#3132: 1 flow_mods in the last 0 s (1 deletes)
2019-12-04T19:37:09.868Z|00902|connmgr|INFO|br-int<->unix#3137: 1 flow_mods in the last 0 s (1 deletes)
2019-12-04T19:37:10.492Z|00903|connmgr|INFO|br-int<->unix#3140: 2 flow_mods in the last 0 s (2 deletes)
2019-12-04T19:37:37.817Z|00904|connmgr|INFO|br-phy0<->tcp:127.0.0.1:6633: 1 flow_mods 27 s ago (1 deletes)
2019-12-04T19:37:37.818Z|00905|connmgr|INFO|br-int<->tcp:127.0.0.1:6633: 3 flow_mods in the 4 s starting 31 s ago (3 deletes)
2019-12-04T19:50:32.212Z|00017|dpif_netdev(revalidator1076)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=90:e2:ba:60:c8:20,dst=01:00:5e:00:00:16),eth_type(0x8100),vlan(vid=11,pcp=0),encap(eth_type(0x0800),ipv4(src=192.168.59.55,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no))
2019-12-04T19:50:32.212Z|00018|dpif(revalidator1076)|WARN|netdev@ovs-netdev: failed to put[modify] (Invalid argument) ufid:a3ffe794-dffc-4f95-8595-5832e3bcd273 skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(3),packet_type(ns=0,id=0),eth(src=90:e2:ba:60:c8:20,dst=01:00:5e:00:00:16),eth_type(0x8100),vlan(vid=11,pcp=0/0x0),encap(eth_type(0x0800),ipv4(src=192.168.59.55/0.0.0.0,dst=224.0.0.22/0.0.0.0,proto=2/0,tos=0xc0/0,ttl=1/0,frag=no)), actions:userspace(pid=0,slow_path(match))
2019-12-04T19:50:46.751Z|00019|dpif_netdev(revalidator1076)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=90:e2:ba:60:c9:20,dst=01:00:5e:00:00:16),eth_type(0x8100),vlan(vid=11,pcp=0),encap(eth_type(0x0800),ipv4(src=192.168.59.54,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no))
2019-12-04T19:50:46.751Z|00020|dpif(revalidator1076)|WARN|netdev@ovs-netdev: failed to put[modify] (Invalid argument) ufid:ee243e26-98e5-4eaf-897c-88eac56e9b83 skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(3),packet_type(ns=0,id=0),eth(src=90:e2:ba:60:c9:20,dst=01:00:5e:00:00:16),eth_type(0x8100),vlan(vid=11,pcp=0/0x0),encap(eth_type(0x0800),ipv4(src=192.168.59.54/0.0.0.0,dst=224.0.0.22/0.0.0.0,proto=2/0,tos=0xc0/0,ttl=1/0,frag=no)), actions:userspace(pid=0,slow_path(match))
2019-12-04T19:51:18.803Z|00001|dpif_netdev(revalidator1085)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(3),packet_type(ns=0,id=0),eth(src=90:e2:ba:60:c9:20,dst=01:00:5e:00:00:16),eth_type(0x8100),vlan(vid=11,pcp=0),encap(eth_type(0x0800),ipv4(src=192.168.59.54,dst=224.0.0.22,proto=2,tos=0xc0,ttl=1,frag=no))
2019-12-04T19:51:18.803Z|00002|dpif(revalidator1085)|WARN|netdev@ovs-netdev: failed to put[modify] (Invalid argument) ufid:ee243e26-98e5-4eaf-897c-88eac56e9b83 skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(3),packet_type(ns=0,id=0),eth(src=90:e2:ba:60:c9:20,dst=01:00:5e:00:00:16),eth_type(0x8100),vlan(vid=11,pcp=0/0x0),encap(eth_type(0x0800),ipv4(src=192.168.59.54/0.0.0.0,dst=224.0.0.22/0.0.0.0,proto=2/0,tos=0xc0/0,ttl=1/0,frag=no)), actions:userspace(pid=0,slow_path(match))
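To make the comparison between the original and the "working" logs repeatable, the WARN and ERR entries can be counted per module. The following is only a minimal sketch: it assumes each log line has the pipe-separated layout seen above (timestamp|sequence|module|level|message), and the helper names parse_line and summarize are hypothetical, not part of any OVS tooling.

```python
from collections import Counter


def parse_line(line):
    """Split one ovs-vswitchd.log line into its five pipe-separated fields.

    Returns None for lines that do not match the layout (for example "...").
    """
    parts = line.rstrip("\n").split("|", 4)  # the message itself may contain "|"
    if len(parts) != 5:
        return None
    timestamp, seq, module, level, message = parts
    # Drop the thread suffix, e.g. "dpif_netdev(revalidator1076)" -> "dpif_netdev".
    module = module.split("(", 1)[0]
    return timestamp, seq, module, level, message


def summarize(lines, levels=("WARN", "ERR")):
    """Count log entries per (module, level) for the given levels."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None and record[3] in levels:
            counts[(record[2], record[3])] += 1
    return counts
```

Running summarize() over both files (for example, summarize(open(path)) with whatever paths the two logs were saved under) and diffing the two counters should show which modules stopped reporting WARN or ERR in the working runs.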