[SRU] bcache deadlock during read IO in writeback mode
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Invalid | Undecided | Unassigned |
Focal | Invalid | Medium | Unassigned |
Jammy | Fix Released | Medium | Unassigned |
Bug Description
SRU Justification:
[Impact]
When random read I/O is started with a test like:
fio --name=read_iops --directory=
or random read/write I/O with a test like:
fio --filename=
the following trace is seen in the kernel log:
[ 4473.699902] INFO: task bcache_
[ 4474.050921] Not tainted 5.15.50-
[ 4474.350883] "echo 0 > /proc/sys/
[ 4474.731391] task:bcache_
[ 4474.731408] Call Trace:
[ 4474.731411] <TASK>
[ 4474.731413] __schedule+
[ 4474.731433] schedule+0x4e/0xb0
[ 4474.731436] rwsem_down_
[ 4474.731441] down_write+
[ 4474.731446] bch_writeback_
[ 4474.731471] ? read_dirty_
[ 4474.731487] kthread+0x12a/0x150
[ 4474.731491] ? set_kthread_
[ 4474.731494] ret_from_
[ 4474.731499] </TASK>
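A minimal sketch for spotting this symptom on a running system, assuming the hung-task report begins with the `INFO: task bcache_` prefix seen in the trace above. The function name `bcache_hung_tasks` is hypothetical; it only filters kernel log text supplied on standard input.

```shell
#!/bin/sh
# Hypothetical helper: print kernel log lines that report a blocked bcache task.
# Reads log text on stdin, so it works with dmesg output or a saved log file.
bcache_hung_tasks() {
    grep -F 'INFO: task bcache_'
}

# Example usage (reading the kernel ring buffer may require root):
#   dmesg | bcache_hung_tasks
```

Matching on the fixed prefix rather than the full message keeps the filter independent of the exact task name and timeout value, which are cut off in the log excerpt above.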
The bug exists up to kernel 5.15.50-
The reproducer is pasted below:
# uname -a
Linux bronzor 5.15.50-
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sdd 8:48 0 279.4G 0 disk
└─sdd1 8:49 0 60G 0 part
└─bcache0 252:0 0 60G 0 disk /home/ubuntu/
nvme0n1 259:0 0 372.6G 0 disk
└─nvme0n1p1 259:2 0 15G 0 part
└─bcache0 252:0 0 60G 0 disk /home/ubuntu/
fio --name=read_iops --directory=
read_iops: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
fio-3.28
Starting 1 process
read_iops: Laying out IO file (1 file / 12288MiB)
The test stalls after a few minutes, and this trace then appears in the kernel log:
[ 4473.699902] INFO: task bcache_
[ 4474.050921] Not tainted 5.15.50-
[ 4474.350883] "echo 0 > /proc/sys/
[ 4474.731391] task:bcache_
[ 4474.731408] Call Trace:
[ 4474.731411] <TASK>
[ 4474.731413] __schedule+
[ 4474.731433] schedule+0x4e/0xb0
[ 4474.731436] rwsem_down_
[ 4474.731441] down_write+
[ 4474.731446] bch_writeback_
[ 4474.731471] ? read_dirty_
[ 4474.731487] kthread+0x12a/0x150
[ 4474.731491] ? set_kthread_
[ 4474.731494] ret_from_
[ 4474.731499] </TASK>
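Because the hang is reported for writeback mode, it may help to confirm the cache mode before running the reproducer. The sketch below assumes the standard bcache sysfs attribute `cache_mode`, which lists the available modes with the active one in square brackets, and assumes the device is `bcache0` as in the block device listing above. The function name `active_cache_mode` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the active mode from the contents of a bcache
# cache_mode attribute, e.g. "writethrough [writeback] writearound none".
# Prints nothing if no bracketed mode is found.
active_cache_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Example usage (path assumes the bcache0 device from the reproducer):
#   active_cache_mode "$(cat /sys/block/bcache0/bcache/cache_mode)"
```

If the printed mode is not `writeback`, the setup differs from the one described in this report and the reproducer may not trigger the hang.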
[Fix]
These 3 fixes are needed for the SRU.
dea3560e5f31965
dc60301fb408e06
I have built these fixes into kernel 5.15.0-39-generic (jammy) and tested to verify the problem is fixed.
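One way to check whether a kernel tree already carries these fixes is to search its commit log for the commit IDs listed above, on the assumption that backported commits cite the original ID in their message. This is an illustrative sketch only: the function name `missing_fixes` is hypothetical, and it expects the commit log text on standard input.

```shell
#!/bin/sh
# Hypothetical helper: read commit log text on stdin and print each commit ID
# passed as an argument that is NOT mentioned anywhere in that text.
missing_fixes() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                      # buffer the log so it can be searched per ID
    for id in "$@"; do
        grep -qF -- "$id" "$tmp" || printf '%s\n' "$id"
    done
    rm -f "$tmp"
}

# Example usage, run inside a kernel git tree (IDs are those listed above):
#   git log | missing_fixes dea3560e5f31965 dc60301fb408e06
```

An empty result suggests every listed ID is referenced somewhere in the log; any ID that is printed would need a closer look, since a backport may have been applied without citing the original commit.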
[Regression Potential]
Regression potential should be minimal. I have not seen any potential drawbacks or harmful effects of this fix in my testing.
Changed in linux (Ubuntu):
milestone: none → jammy-updates
milestone: jammy-updates → none
description: updated (5 times)
Changed in linux (Ubuntu Jammy):
importance: Undecided → Medium
Changed in linux (Ubuntu Focal):
importance: Undecided → Medium
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1980925
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.