Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Jammy | Fix Committed | Undecided | Unassigned |
Bug Description
I use a non-QEMU VMM, cloud-hypervisor. It looks like a patch was applied that introduced a bug. About two weeks later (going by the commit dates below), a second patch was written to fix that bug. The second patch was not applied in Ubuntu's release, although it is present in Greg KH's 5.15 branch.
As a result, the kernel does not boot.
Cumulative diff:
```
> git diff Ubuntu-5.15.0-86.96 Ubuntu-5.15.0-89.99 drivers/
diff --git a/drivers/
index 0351f86494f1.
--- a/drivers/
+++ b/drivers/
@@ -3319,6 +3319,8 @@ static int virtnet_
}
}
+ _virtnet_
+
/* serialize netdev register + virtio_
@@ -3339,8 +3341,6 @@ static int virtnet_
}
- virtnet_
-
/* Assume link up if device can't report link status,
```
Blamed Commit:
```
commit 5e0545ef5682562
Author: Jason Wang <email address hidden>
Date: Tue Jul 25 03:20:49 2023 -0400
virtio-net: fix race between set queues and probe
BugLink: https:/
commit 25266128fe16d56
A race were found where set_channels could be called after registering
but before virtnet_
moving the virtnet_
it, use _virtnet_
not even registered at that time.
Cc: <email address hidden>
Fixes: a220871be66f ("virtio-net: correctly enable multiqueue")
Signed-off-by: Jason Wang <email address hidden>
Acked-by: Michael S. Tsirkin <email address hidden>
Reviewed-by: Xuan Zhuo <email address hidden>
Link: https://<email address hidden>
Signed-off-by: Jakub Kicinski <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
Signed-off-by: Stefan Bader <email address hidden>
```
Investigation into Greg KH's 5.15 branch shows the (unapplied?) followup as:
```
commit 431db3f48c28646
Author: Jason Wang <email address hidden>
Date: Wed Aug 9 23:12:56 2023 -0400
virtio-net: set queues after driver_ok
commit 51b813176f098ff
Commit 25266128fe16 ("virtio-net: fix race between set queues and
probe") tries to fix the race between set queues and probe by calling
_virtnet_
spec. Fixing this by setting queues after virtio_
Note that rtnl needs to be held for userspace requests to change the
number of queues. So we are serialized in this way.
Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe")
Reported-by: Dragos Tatulea <email address hidden>
Acked-by: Michael S. Tsirkin <email address hidden>
Signed-off-by: Jason Wang <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
```
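The ordering problem described in the follow-up commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `ToyDevice`, `send_command`, `probe`, and the `max_polls` bound are hypothetical names introduced here. It assumes a strict device that does not process control-queue commands until the `DRIVER_OK` status bit is set, and a driver that polls until its command is acknowledged. Under those assumptions, sending the set-queues command before `DRIVER_OK` never completes, which may be what the soft lockup in the boot trace reflects.

```python
from dataclasses import dataclass

DRIVER_OK = 0x04  # virtio device status bit: driver is ready to drive the device


@dataclass
class ToyDevice:
    """Hypothetical strict device: ignores control commands until DRIVER_OK."""
    status: int = 0
    queue_pairs: int = 1

    def handle_ctrl_set_queues(self, pairs: int) -> bool:
        # Assumption: a strict device does not consume buffers before
        # DRIVER_OK, so the command is never acknowledged.
        if not (self.status & DRIVER_OK):
            return False
        self.queue_pairs = pairs
        return True


def send_command(dev: ToyDevice, pairs: int, max_polls: int = 1000):
    """Poll for an acknowledgement.

    Returns the number of polls used, or None if the device never answered.
    max_polls exists only so this sketch terminates; a driver that
    busy-waits without a bound would spin forever instead.
    """
    for polls in range(1, max_polls + 1):
        if dev.handle_ctrl_set_queues(pairs):
            return polls
    return None


def probe(dev: ToyDevice, pairs: int, set_queues_first: bool):
    """Model the two probe orderings discussed in the commit messages."""
    if set_queues_first:
        acked = send_command(dev, pairs)  # command sent before DRIVER_OK
        dev.status |= DRIVER_OK
    else:
        dev.status |= DRIVER_OK           # device is marked ready first
        acked = send_command(dev, pairs)
    return acked
```

In this model, `probe(ToyDevice(), 4, set_queues_first=True)` returns `None` because the command is never acknowledged, while the ordering with `set_queues_first=False` is acknowledged on the first poll. A VMM that tolerates early control commands would hide the problem, which is consistent with the bug surfacing only on some hypervisors.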
Boot stack trace:
```
[ 28.129660] watchdog: BUG: soft lockup - CPU#1 stuck for 26s!
[systemd-udevd:165]
[ 28.130265] Modules linked in: crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel crypto_simd cryptd virtio_net(+)
net_failover virtio_rng failover virtio_blk
[ 28.131396] CPU: 1 PID: 165 Comm: systemd-udevd Not tainted
5.15.0-89-generic https:/
[ 28.131997] Hardware name: Cloud Hypervisor cloud-hypervisor, BIOS 0
[ 28.132479] RIP: 0010:virtnet_
[ 28.132951] Code: 0b 83 c1 d8 85 c0 0f 88 d2 6e 00 00 48 8b 7b 08 e8
6a 72 c1 d8 84 c0 75 11 eb 56 48 8b 7b 08 e8 6b 5e c1 d8 84 c0 75 17 f3
90 <48> 8b 7b 08 48 8d b5 6c ff ff ff e8 f5 71 c1 d8 48 85 c0 74 dc 48
[ 28.134326] RSP: 0018:ffff9b0c40
[ 28.134720] RAX: 0000000000000000 RBX: ffff89dfc0d13980 RCX:
0000000000000a20
[ 28.135252] RDX: 0000000000000000 RSI: ffff9b0c4064f9bc RDI:
ffff89dfc7cc00c0
[ 28.135787] RBP: ffff9b0c4064fa50 R08: 0000000000000001 R09:
0000000000000003
[ 28.136316] R10: 0000000000000003 R11: 0000000000000002 R12:
ffff9b0c4064f9e0
[ 28.136851] R13: 0000000000000002 R14: 0000000000000004 R15:
ffff89dfc0c49400
[ 28.137381] FS: 00007feeba10e8c
knlGS:000000000
[ 28.137981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 28.138408] CR2: 00007feeba90a0f8 CR3: 0000000100258000 CR4:
0000000000350ee0
[ 28.138940] Call Trace:
[ 28.139129] <IRQ>
[ 28.139291] ? show_trace_
[ 28.139627] ? show_trace_
[ 28.139957] ? _virtnet_
[ 28.140369] ? show_regs.
[ 28.140672] ? show_regs.
[ 28.140950] ? watchdog_
[ 28.141273] ? lockup_
[ 28.141657] ? __hrtimer_
[ 28.142011] ? clockevents_
[ 28.142377] ? hrtimer_
[ 28.142698] ? __sysvec_
[ 28.143084] ? sysvec_
[ 28.143460] </IRQ>
```
summary:
- Ubuntu-5.15.0-89.99 breaks virtio-net spec compatibility
+ Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
description: updated
Changed in linux (Ubuntu):
status: New → Fix Released
Changed in linux (Ubuntu Jammy):
status: New → Fix Committed
It looks like the -proposed kernel has picked up that missing patch:
~/c/linux ((Ubuntu-5.15.0-91.101))> git log --grep='virtio-net: set queues after driver_ok' b2bc2fae32323c0a1510c7656
commit c6c83b9055f44bc
Author: Jason Wang <email address hidden>
Date: Wed Aug 9 23:12:56 2023 -0400
virtio-net: set queues after driver_ok
BugLink: https://bugs.launchpad.net/bugs/2038486
commit 51b813176f098ff61bd2833f627f5319ead098a5 upstream.
Commit 25266128fe16 ("virtio-net: fix race between set queues and
probe") tries to fix the race between set queues and probe by calling
_virtnet_set_queues() before DRIVER_OK is set. This violates virtio
spec. Fixing this by setting queues after virtio_device_ready().
Note that rtnl needs to be held for userspace requests to change the
number of queues. So we are serialized in this way.
Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe")
Reported-by: Dragos Tatulea <email address hidden>
Acked-by: Michael S. Tsirkin <email address hidden>
Signed-off-by: Jason Wang <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
Signed-off-by: Stefan Bader <email address hidden>
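The situation in this report (a backported patch whose follow-up fix was not backported) can be detected mechanically from commit messages. The sketch below is one possible approach, not an existing tool: `upstream_id`, `same_commit`, and `missing_followups` are hypothetical helpers. It assumes backports carry a `commit <sha> upstream.` line and that fixes carry a `Fixes: <sha>` tag, as the messages quoted above do.

```python
import re

# "commit <sha> upstream." line that stable and distro backports usually carry.
UPSTREAM_RE = re.compile(r"^\s*commit ([0-9a-f]{12,40}) upstream\.?\s*$", re.M)
# "Fixes: <sha> (...)" tag naming the commit that introduced the bug.
FIXES_RE = re.compile(r"^\s*Fixes:\s*([0-9a-f]{12,40})\b", re.M)


def upstream_id(message):
    """Return the upstream sha a backport message refers to, or None."""
    m = UPSTREAM_RE.search(message)
    return m.group(1) if m else None


def same_commit(a, b):
    """Compare shas that may be abbreviated to different lengths."""
    return a.startswith(b) or b.startswith(a)


def missing_followups(distro_messages, stable_messages):
    """List (fix_sha, fixed_sha) pairs where the distro tree has the
    fixed commit but not the follow-up fix that the stable tree carries."""
    backported = [u for u in map(upstream_id, distro_messages) if u]
    missing = []
    for msg in stable_messages:
        fix_id = upstream_id(msg)
        if fix_id is None:
            continue
        if any(same_commit(fix_id, b) for b in backported):
            continue  # the fix is already in the distro tree
        for target in FIXES_RE.findall(msg):
            if any(same_commit(target, b) for b in backported):
                missing.append((fix_id, target))
    return missing
```

The message lists could be gathered with something like `git log --format=%B%x00 <range> -- <path>` and split on the NUL byte. The ranges and path depend on the trees being compared and are left unspecified here.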