[SRU] IO gets stuck on 5.19 kernel

Bug #2022318 reported by gerald.yang
This bug affects 2 people
Affects            Status        Importance   Assigned to   Milestone
linux (Ubuntu)     In Progress   High         gerald.yang
  Kinetic          Won't Fix     High         gerald.yang

Bug Description

[ Impact ]

When running fio against an NVMe device on an AWS test instance with the 5.19 kernel, IOs get stuck and fio never finishes

fio command:
sudo fio --name=read_iops_test --filename=/dev/nvme1n1 --filesize=50G --time_based --ramp_time=2s --runtime=1m --ioengine=libaio --direct=1 --verify=0 --randrepeat=0 --bs=16K --iodepth=256 --rw=randread
read_iops_test: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=256
fio-3.28
Starting 1 process
Jobs: 1 (f=0): [/(1)][-.-%][eta 01m:02s]

IOs get completely stuck; after a while the kernel log shows:
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.230970] INFO: task fio:2545 blocked for more than 120 seconds.
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.232878] Not tainted 5.19.0-43-generic #44~22.04.1-Ubuntu
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.234738] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237053] task:fio state:D stack: 0 pid: 2545 ppid: 2495 flags:0x00000002
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237057] Call Trace:
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237058] <TASK>
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237061] __schedule+0x257/0x5d0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237066] schedule+0x68/0x110
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237068] io_schedule+0x46/0x80
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237069] blk_mq_get_tag+0x117/0x300
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237072] ? destroy_sched_domains_rcu+0x40/0x40
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237076] __blk_mq_alloc_requests+0xc4/0x1e0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237079] blk_mq_get_new_requests+0xf6/0x1a0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237080] blk_mq_submit_bio+0x1eb/0x440
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237082] __submit_bio+0x109/0x1a0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237085] submit_bio_noacct_nocheck+0xc2/0x120
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237087] submit_bio_noacct+0x209/0x590
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237088] submit_bio+0x40/0xf0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237090] __blkdev_direct_IO_async+0x146/0x1f0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237092] blkdev_direct_IO.part.0+0x40/0xa0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237093] blkdev_read_iter+0x9f/0x1a0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237094] aio_read+0xec/0x1d0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237097] ? __io_submit_one.constprop.0+0x113/0x200
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237099] __io_submit_one.constprop.0+0x113/0x200
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237100] ? __io_submit_one.constprop.0+0x113/0x200
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237101] io_submit_one+0xe8/0x3d0
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237103] __x64_sys_io_submit+0x84/0x190
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237104] ? do_syscall_64+0x69/0x90
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237106] ? do_syscall_64+0x69/0x90
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237107] do_syscall_64+0x59/0x90
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237108] ? syscall_exit_to_user_mode+0x2a/0x50
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237110] ? do_syscall_64+0x69/0x90
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237111] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237113] RIP: 0033:0x7f44f351ea3d
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237116] RSP: 002b:00007fff1dcfe558 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237117] RAX: ffffffffffffffda RBX: 00007f44f2272b68 RCX: 00007f44f351ea3d
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237118] RDX: 000056315d9ad828 RSI: 0000000000000001 RDI: 00007f44f224f000
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237119] RBP: 00007f44f224f000 R08: 00007f44e9430000 R09: 00000000000002d8
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237120] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237120] R13: 0000000000000000 R14: 000056315d9ad828 R15: 000056315d9e1830
Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237122] </TASK>

This issue cannot be reproduced on the 5.15 and 6.2 kernels

From the call trace, the task was blocked for more than 120 seconds waiting for previously submitted IOs to complete and free their tags, so that new IO requests could obtain tags
In fact, not all of the previous IOs were stuck; at least some of them should have completed, but the waiters were never woken up

This issue is fixed by the upstream commit below, which has been merged in kernel 6.1:
commit 4acb83417cadfdcbe64215f9d0ddcf3132af808e
Author: Keith Busch <email address hidden>
Date: Fri Sep 9 11:40:22 2022 -0700

    sbitmap: fix batched wait_cnt accounting

    Batched completions can clear multiple bits, but we're only decrementing
    the wait_cnt by one each time. This can cause waiters to never be woken,
    stalling IO. Use the batched count instead.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=215679
    Signed-off-by: Keith Busch <email address hidden>
    Link: https://<email address hidden>
    Signed-off-by: Jens Axboe <email address hidden>
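
To make the accounting problem concrete, here is a small userspace C model of the wait counter. This is only an illustration of the arithmetic described in the commit message, not the kernel sbitmap code; the wake_batch value of 8 and the batch size of 4 are made-up numbers:

#include <stdio.h>

#define WAKE_BATCH 8    /* waiters are woken each time wait_cnt reaches 0 */

/* Simulate "events" completion events, each decrementing wait_cnt by
 * "decrement_per_event", and count how many wakeups get issued. */
static int wakeups(int events, int decrement_per_event)
{
    int wait_cnt = WAKE_BATCH, woken = 0;

    for (int i = 0; i < events; i++) {
        wait_cnt -= decrement_per_event;
        while (wait_cnt <= 0) {         /* threshold hit: wake waiters */
            woken++;
            wait_cnt += WAKE_BATCH;     /* re-arm for the next wakeup  */
        }
    }
    return woken;
}

int main(void)
{
    int events = 100, batch = 4;        /* 100 batched completions, 4 tags each */

    /* Old accounting: each batched completion subtracts only 1,
     * no matter how many tags it actually freed. */
    printf("old accounting: %d wakeups for %d freed tags\n",
           wakeups(events, 1), events * batch);

    /* Fixed accounting: subtract the batched count, as the commit does. */
    printf("new accounting: %d wakeups for %d freed tags\n",
           wakeups(events, batch), events * batch);
    return 0;
}

In this model the old accounting issues only 12 wakeups for 400 freed tags instead of 50, which is why tasks sleeping in blk_mq_get_tag() can remain blocked even though tags have long been returned to the pool.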

This commit cannot be cherry-picked cleanly, so we also need to SRU all of its dependencies and one further fix, listed below:

30514bd2dd4e sbitmap: fix lockup while swapping
4acb83417cad sbitmap: fix batched wait_cnt accounting
c35227d4e8cb sbitmap: Use atomic_long_try_cmpxchg in __sbitmap_queue_get_batch
48c033314f37 sbitmap: Avoid leaving waitqueue in invalid state in __sbq_wake_up()
ddbfc34fcf5d sbitmap: remove unnecessary code in __sbitmap_queue_get_batch
040b83fcecfb sbitmap: fix possible io hung due to lost wakeup

[ Test Plan ]

This can be reproduced simply by launching an instance on AWS EC2 and running the above fio command on an NVMe device for a few hours to make sure IOs don't get stuck.
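
Purely as an illustration of the same I/O pattern (the fio command above is the actual test), a minimal libaio reproducer performing O_DIRECT 16K random reads at queue depth 256 could look like the sketch below; the device path /dev/nvme1n1 and the 50G span are assumptions taken from the fio command:

/* aio_randread.c - keep 256 O_DIRECT 16K random reads in flight on an
 * NVMe device, roughly mirroring the fio job above.
 * Build: gcc -O2 -o aio_randread aio_randread.c -laio   (needs libaio-dev)
 * Run as root; Ctrl-C to stop.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BS     (16 * 1024)          /* bs=16K       */
#define QDEPTH 256                  /* iodepth=256  */
#define SPAN   (50ULL << 30)        /* filesize=50G */

int main(void)
{
    const char *dev = "/dev/nvme1n1";        /* assumed test device */
    int fd = open(dev, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    int ret = io_setup(QDEPTH, &ctx);
    if (ret < 0) { fprintf(stderr, "io_setup failed: %d\n", ret); return 1; }

    srand(time(NULL));

    struct iocb cbs[QDEPTH], *cbp[QDEPTH];
    for (int i = 0; i < QDEPTH; i++) {
        void *buf;
        if (posix_memalign(&buf, 4096, BS)) { fprintf(stderr, "posix_memalign failed\n"); return 1; }
        long long off = (long long)(rand() % (SPAN / BS)) * BS;
        io_prep_pread(&cbs[i], fd, buf, BS, off);
        cbp[i] = &cbs[i];
    }

    ret = io_submit(ctx, QDEPTH, cbp);
    if (ret != QDEPTH) { fprintf(stderr, "io_submit failed: %d\n", ret); return 1; }

    /* Re-issue each read as it completes so QDEPTH requests stay in flight. */
    struct io_event ev[QDEPTH];
    unsigned long long done = 0;
    for (;;) {
        int n = io_getevents(ctx, 1, QDEPTH, ev, NULL);
        if (n < 0) { fprintf(stderr, "io_getevents failed: %d\n", n); return 1; }
        for (int i = 0; i < n; i++) {
            struct iocb *cb = ev[i].obj;
            long long off = (long long)(rand() % (SPAN / BS)) * BS;
            io_prep_pread(cb, fd, cb->u.c.buf, BS, off);
            ret = io_submit(ctx, 1, &cb);
            if (ret != 1) { fprintf(stderr, "io_submit failed: %d\n", ret); return 1; }
        }
        unsigned long long prev = done;
        done += n;
        if (done / 1000000 != prev / 1000000)   /* rough progress marker */
            printf("completed %llu reads\n", done);
    }
}

On an affected 5.19 kernel the progress output would be expected to stop advancing after a while, with the process sitting in D state as in the hung-task trace above; on a fixed kernel it keeps going for hours.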

I've built a test kernel with the above commits on top of the 5.19.0-44 generic kernel here:
https://launchpad.net/~gerald-yang-tw/+archive/ubuntu/361041-generic

With this test kernel, fio has been running for a few hours without any issues

[ Where problems could occur ]

sbitmap is mainly used by blk-mq in the block layer, by SCSI drivers, and by the Fungible ethernet driver
If any issue occurs in sbitmap, the symptom would be hung IO, or packets getting stuck in the Fungible driver

Changed in linux (Ubuntu):
status: New → In Progress
importance: Undecided → High
assignee: nobody → gerald.yang (gerald-yang-tw)
description: updated
Changed in linux (Ubuntu Kinetic):
status: New → In Progress
importance: Undecided → High
assignee: nobody → gerald.yang (gerald-yang-tw)
description: updated
description: updated
tags: added: sts sts-sru-needed
description: updated
Stefan Bader (smb)
Changed in linux (Ubuntu Kinetic):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.19.0-47.49 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux verification-needed-kinetic
Utkarsh Gupta (utkarsh) wrote :

Ubuntu 22.10 (Kinetic Kudu) has reached end of life, so this bug will not be fixed for that specific release.

Changed in linux (Ubuntu Kinetic):
status: Fix Committed → Won't Fix