nvme: avoid cqe corruption
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Undecided | Kamal Mostafa |
Xenial | Fix Released | Undecided | Unassigned |
Bug Description
To address a customer-reported NVMe issue on instance types (notably c5 and m5) that expose EBS volumes as NVMe devices, the following commit from mainline v4.6 should be backported to Xenial:
d783e0bd02e700e
dmesg sample:
[Wed Aug 15 01:11:21 2018] nvme 0000:00:1f.0: I/O 8 QID 1 timeout, aborting
[Wed Aug 15 01:11:21 2018] nvme 0000:00:1f.0: I/O 9 QID 1 timeout, aborting
[Wed Aug 15 01:11:21 2018] nvme 0000:00:1f.0: I/O 21 QID 2 timeout, aborting
[Wed Aug 15 01:11:32 2018] nvme 0000:00:1f.0: I/O 10 QID 1 timeout, aborting
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: I/O 8 QID 1 timeout, reset controller
[Wed Aug 15 01:11:51 2018] nvme nvme1: Abort status: 0x2
[Wed Aug 15 01:11:51 2018] nvme nvme1: Abort status: 0x2
[Wed Aug 15 01:11:51 2018] nvme nvme1: Abort status: 0x2
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 21 QID 2
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: completing aborted command with status: 0007
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 83887751
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 83887751
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 22 QID 2
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 83887767
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 83887767
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 23 QID 2
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 83887769
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 83887769
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 8 QID 1
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 9 QID 1
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: completing aborted command with status: 0007
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 41943136
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 10 QID 1
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: completing aborted command with status: 0007
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 6976
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 22 QID 1
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 23 QID 1
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 24 QID 1
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 25 QID 1
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: Cancelling I/O 2 QID 0
[Wed Aug 15 01:11:51 2018] nvme nvme1: Abort status: 0x7
[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: completing aborted command with status: fffffffc
[Wed Aug 15 01:11:51 2018] blk_update_request: I/O error, dev nvme1n1, sector 96
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): metadata I/O error: block 0x5000687 ("xlog_iodone") error 5 numblks 64
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): xfs_do_
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): xfs_log_force: error -5 returned.
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): Log I/O Error Detected. Shutting down filesystem
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): Please umount the filesystem and rectify the problem(s)
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 872, lost async page write
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): xfs_imap_to_bp: xfs_trans_
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): xfs_iunlink_remove: xfs_imap_to_bp returned error -5.
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 873, lost async page write
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 874, lost async page write
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 875, lost async page write
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 876, lost async page write
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 877, lost async page write
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 878, lost async page write
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 879, lost async page write
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 880, lost async page write
[Wed Aug 15 01:11:51 2018] Buffer I/O error on dev nvme1n1, logical block 881, lost async page write
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): metadata I/O error: block 0x5000697 ("xlog_iodone") error 5 numblks 64
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): xfs_log_force: error -5 returned.
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): xfs_do_
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): metadata I/O error: block 0x5000699 ("xlog_iodone") error 5 numblks 64
[Wed Aug 15 01:11:51 2018] XFS (nvme1n1): xfs_do_
[Wed Aug 15 01:12:20 2018] XFS (nvme1n1): xfs_log_force: error -5 returned.
[Wed Aug 15 01:12:22 2018] nvme 0000:00:1f.0: I/O 22 QID 1 timeout, aborting
[Wed Aug 15 01:12:22 2018] nvme 0000:00:1f.0: I/O 23 QID 1 timeout, aborting
[Wed Aug 15 01:12:22 2018] nvme 0000:00:1f.0: I/O 24 QID 1 timeout, aborting
[Wed Aug 15 01:12:22 2018] nvme 0000:00:1f.0: I/O 25 QID 1 timeout, aborting
[Wed Aug 15 01:12:22 2018] nvme 0000:00:1f.0: I/O 24 QID 2 timeout, aborting
[Wed Aug 15 01:12:22 2018] nvme nvme1: Abort status: 0x2
[Wed Aug 15 01:12:22 2018] nvme nvme1: Abort status: 0x2
[Wed Aug 15 01:12:22 2018] nvme nvme1: Abort status: 0x2
[Wed Aug 15 01:12:22 2018] nvme nvme1: Abort status: 0x2
[Wed Aug 15 01:12:22 2018] nvme nvme1: Abort status: 0x2
[Wed Aug 15 01:12:50 2018] XFS (nvme1n1): xfs_log_force: error -5 returned.
[Wed Aug 15 01:12:52 2018] nvme 0000:00:1f.0: I/O 22 QID 1 timeout, reset controller
[Wed Aug 15 01:13:21 2018] XFS (nvme1n1): xfs_log_force: error -5 returned.
[Wed Aug 15 01:13:51 2018] XFS (nvme1n1): xfs_log_force: error -5 returned.
[Wed Aug 15 01:14:21 2018] XFS (nvme1n1): xfs_log_force: error -5 returned.
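When triaging a log like the one above, it can help to tally the NVMe timeout events per queue before and after installing a candidate kernel. The following is a minimal sketch, not part of the original report: the function name `summarize_timeouts` and the regular expression are illustrative assumptions, written only against the message format visible in this dmesg sample.

```python
import re
from collections import Counter

# Assumed message shape, taken from the dmesg sample in this report:
#   nvme 0000:00:1f.0: I/O 8 QID 1 timeout, aborting
#   nvme 0000:00:1f.0: I/O 8 QID 1 timeout, reset controller
TIMEOUT_RE = re.compile(
    r"nvme (?P<pci>[0-9a-f:.]+): I/O (?P<tag>\d+) QID (?P<qid>\d+) "
    r"timeout, (?P<action>aborting|reset controller)"
)


def summarize_timeouts(lines):
    """Count NVMe timeout events per (PCI address, queue ID, action)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            key = (match.group("pci"), int(match.group("qid")), match.group("action"))
            counts[key] += 1
    return counts


# A few lines copied from the sample above; the last one is not a timeout
# message and is ignored by the pattern.
sample = [
    "[Wed Aug 15 01:11:21 2018] nvme 0000:00:1f.0: I/O 8 QID 1 timeout, aborting",
    "[Wed Aug 15 01:11:21 2018] nvme 0000:00:1f.0: I/O 9 QID 1 timeout, aborting",
    "[Wed Aug 15 01:11:21 2018] nvme 0000:00:1f.0: I/O 21 QID 2 timeout, aborting",
    "[Wed Aug 15 01:11:51 2018] nvme 0000:00:1f.0: I/O 8 QID 1 timeout, reset controller",
    "[Wed Aug 15 01:11:51 2018] nvme nvme1: Abort status: 0x2",
]

for (pci, qid, action), n in sorted(summarize_timeouts(sample).items()):
    print(f"{pci} QID {qid}: {n} x {action}")
```

To use it on a real system, the saved output of `dmesg -T` could be read line by line and passed to `summarize_timeouts` in place of `sample`.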
Changed in linux (Ubuntu Xenial):
status: New → In Progress
tags: added: kernel-da-key xenial
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: cscc
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!