(b) Regarding the kernel, I found some oddities too [ output of dmesg | grep -v "audit\|apparmor" ]:
GOOD: <unrelated intermixed entries, such as ext4's, suppressed>
[ 1627.732924] hv_netvsc 00224840-7fbf-0022-4840-7fbf00224840 eth0: VF slot 1 added
[ 1627.733637] hv_pci fb9ea909-d0dd-41b6-a1c2-98b1233e987d: PCI VMBus probing: Using version 0x10002
[ 1627.742469] hv_pci fb9ea909-d0dd-41b6-a1c2-98b1233e987d: PCI host bridge to bus d0dd:00
[ 1627.742472] pci_bus d0dd:00: root bus resource [mem 0xfe0000000-0xfe00fffff window]
[ 1627.743208] pci d0dd:00:02.0: [15b3:1016] type 00 class 0x020000
[ 1627.747302] pci d0dd:00:02.0: reg 0x10: [mem 0xfe0000000-0xfe00fffff 64bit pref]
[ 1627.825728] pci d0dd:00:02.0: enabling Extended Tags
[ 1627.830106] pci d0dd:00:02.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown x0 link at d0dd:00:02.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[ 1627.834774] pci d0dd:00:02.0: BAR 0: assigned [mem 0xfe0000000-0xfe00fffff 64bit pref]
[ 1627.926227] mlx5_core d0dd:00:02.0: firmware version: 14.25.8362
[ 1627.936526] mlx5_core d0dd:00:02.0: handle_hca_cap:526:(pid 619): log_max_qp value in current profile is 18, changing it to HCA capability limit (12)
[ 1628.131452] mlx5_core d0dd:00:02.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 1628.256107] hv_netvsc 00224840-7fbf-0022-4840-7fbf00224840 eth0: VF registering: eth1
[ 1628.256147] mlx5_core d0dd:00:02.0 eth1: joined to eth0
[ 1628.257059] mlx5_core d0dd:00:02.0 eth1: Disabling LRO, not supported in legacy RQ
[ 1628.266117] mlx5_core d0dd:00:02.0 eth1: Disabling LRO, not supported in legacy RQ
[ 1628.266803] mlx5_core d0dd:00:02.0 enP53469s1np0: renamed from eth1
[ 1628.305588] mlx5_core d0dd:00:02.0 enP53469s1np0: Disabling LRO, not supported in legacy RQ
[ 1628.444056] mlx5_core d0dd:00:02.0 enP53469s1np0: Link up
[ 1628.445592] hv_netvsc 00224840-7fbf-0022-4840-7fbf00224840 eth0: Data path switched to VF: enP53469s1np0
BAD:
[ 5.211059] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: VF slot 1 added
[ 5.211618] hv_pci e94e34d7-6e53-4ab5-95d1-4328102a7c87: PCI VMBus probing: Using version 0x10002
[ 5.220736] hv_pci e94e34d7-6e53-4ab5-95d1-4328102a7c87: PCI host bridge to bus 6e53:00
[ 5.220739] pci_bus 6e53:00: root bus resource [mem 0xfe0000000-0xfe00fffff window]
[ 5.221465] pci 6e53:00:02.0: [15b3:1016] type 00 class 0x020000
[ 5.239844] pci 6e53:00:02.0: reg 0x10: [mem 0xfe0000000-0xfe00fffff 64bit pref]
[ 5.317841] pci 6e53:00:02.0: enabling Extended Tags
[ 5.322168] pci 6e53:00:02.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown x0 link at 6e53:00:02.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[ 5.325766] pci 6e53:00:02.0: BAR 0: assigned [mem 0xfe0000000-0xfe00fffff 64bit pref]
[ 5.412415] mlx5_core 6e53:00:02.0: firmware version: 14.25.8102
[ 5.424153] mlx5_core 6e53:00:02.0: handle_hca_cap:526:(pid 7): log_max_qp value in current profile is 18, changing it to HCA capability limit (12)
[ 5.613614] mlx5_core 6e53:00:02.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 5.727686] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: VF registering: eth1
[ 5.727714] mlx5_core 6e53:00:02.0 eth1: joined to eth0
[ 5.728473] mlx5_core 6e53:00:02.0 eth1: Disabling LRO, not supported in legacy RQ
[ 5.740989] mlx5_core 6e53:00:02.0 eth1: Disabling LRO, not supported in legacy RQ
[ 5.741670] mlx5_core 6e53:00:02.0 enP28243s1np0: renamed from eth1
[ 5.785980] mlx5_core 6e53:00:02.0 enP28243s1np0: Disabling LRO, not supported in legacy RQ
[ 8.790213] mlx5_core 6e53:00:02.0 enP28243s1np0: Link up
[ 8.791599] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: Data path switched to VF: enP28243s1np0
[ 8.792382] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 9.759956] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: Data path switched from VF: enP28243s1np0
[ 10.388005] mlx5_core 6e53:00:02.0 enP28243s1np0: Link up
[ 10.389894] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: Data path switched to VF: enP28243s1np0
[...]
[79404.014695] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: Data path switched from VF: enP28243s1np0
[79404.015275] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: VF unregistering: enP28243s1np0
[79404.988951] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: VF slot 1 removed
[79536.467267] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: VF slot 1 added
[79536.467725] hv_pci e94e34d7-6e53-4ab5-95d1-4328102a7c87: PCI VMBus probing: Using version 0x10002
[79536.477073] hv_pci e94e34d7-6e53-4ab5-95d1-4328102a7c87: PCI host bridge to bus 6e53:00
[79536.477075] pci_bus 6e53:00: root bus resource [mem 0xfe0000000-0xfe00fffff window]
[79536.477816] pci 6e53:00:02.0: [15b3:1016] type 00 class 0x020000
[79536.491917] pci 6e53:00:02.0: reg 0x10: [mem 0xfe0000000-0xfe00fffff 64bit pref]
[79536.883663] pci 6e53:00:02.0: enabling Extended Tags
[79536.889425] pci 6e53:00:02.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown x0 link at 6e53:00:02.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[79536.893546] pci 6e53:00:02.0: BAR 0: assigned [mem 0xfe0000000-0xfe00fffff 64bit pref]
[79536.899810] mlx5_core 6e53:00:02.0: firmware version: 14.25.8102
[79537.097715] mlx5_core 6e53:00:02.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[79537.245178] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: VF registering: eth1
[79537.245230] mlx5_core 6e53:00:02.0 eth1: joined to eth0
[79537.246141] mlx5_core 6e53:00:02.0 eth1: Disabling LRO, not supported in legacy RQ
[79537.251960] mlx5_core 6e53:00:02.0 eth1: Disabling LRO, not supported in legacy RQ
[79537.253846] mlx5_core 6e53:00:02.0 enP28243s1np0: renamed from eth1
[79537.287953] mlx5_core 6e53:00:02.0 enP28243s1np0: Disabling LRO, not supported in legacy RQ
[79537.429864] mlx5_core 6e53:00:02.0 enP28243s1np0: Link up
Three interesting observations:
(1) The first "Link up" message came > 3 seconds after the last interface-setup message in the BAD case (5.786s -> 8.790s), whereas it took < 150 ms in the GOOD case (1628.306s -> 1628.444s).
(2) In the BAD case we have extra "data path switch" events:
eth0: Data path switched to VF: enP28243s1np0
eth0: Data path switched from VF: enP28243s1np0
eth0: Data path switched to VF: enP28243s1np0
This is triggered in the Hyper-V netvsc driver, in the function netvsc_vf_changed() - see the comment in that function:
/* Change the data path when VF UP/DOWN/CHANGE are detected.
*
* Typically a UP or DOWN event is followed by a CHANGE event, so
* net_device_ctx->data_path_is_vf is used to cache the current data path
* to avoid the duplicate call of netvsc_switch_datapath() and the duplicate
 * message.
 */
The event itself is delivered through a notifier_call mechanism.
(3) We have a hotplug-like event long after the initial network setup:
[...]
[79404.988951] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: VF slot 1 removed
[79536.467267] hv_netvsc 000d3ad6-1871-000d-3ad6-1871000d3ad6 eth0: VF slot 1 added
[...]
This is handled in netvsc_send_vf(), which seems to be another type of packet processing (I need to study it more; it appears related to so-called inband messages, a raw packet communication channel).
Items (2) and (3) point to events at the hypervisor level, in my understanding.
Also, I'm curious whether we could try to reproduce this in an image running the Hirsute 5.11 kernel and in Groovy (running its native 5.8 kernel) - so far the issue seems to have been observed in Hirsute images running kernel 5.8.
Thanks!