I have a system where the error count for the 0x2002 event in the error log increases by 2 every time I reboot. Its NVMe device uses v1.2. Another system has a 1.4 device and doesn't show the error. Both run the same Ubuntu release (mantic) and the same kernel.
I'll use this to troubleshoot. In the end, it looks like we have two things going on:
- something issuing wrong/invalid commands (kernel/libnvme?)
- smartd being overzealous and treating those as critical errors (the linked patch should fix that).