Tegra "mmc0: Timeout waiting for hardware interrupt"
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | High | Kamal Mostafa |
Hirsute | Fix Released | High | Kamal Mostafa |
Bug Description
On the NVIDIA (Tegra) Jetson AGX Xavier device running Hirsute with any v5.11 kernel (5.11.0-13.14), the system intermittently logs the following "mmc0: Timeout" error and then hangs. This generally occurs within 24 hours of uptime:
[25610.386873] mmc0: Timeout waiting for hardware interrupt.
[25610.392426] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[25610.399095] mmc0: sdhci: Sys addr: 0x00000000 | Version: 0x00000505
[25610.405635] mmc0: sdhci: Blk size: 0x00007200 | Blk cnt: 0x00000000
[25610.412184] mmc0: sdhci: Argument: 0x00000000 | Trn mode: 0x00000013
[25610.418725] mmc0: sdhci: Present: 0x01fb02f6 | Host ctl: 0x00000031
[25610.425252] mmc0: sdhci: Power: 0x00000001 | Blk gap: 0x00000000
[25610.431896] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x0000000f
[25610.438441] mmc0: sdhci: Timeout: 0x0000000e | Int stat: 0x00000000
[25610.444993] mmc0: sdhci: Int enab: 0x03ff000b | Sig enab: 0x03fc000b
[25610.451533] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
[25610.458074] mmc0: sdhci: Caps: 0x3f6cd08c | Caps_1: 0x18002f73
[25610.464717] mmc0: sdhci: Cmd: 0x0000083a | Max curr: 0x00000000
[25610.471268] mmc0: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x04806288
[25610.477811] mmc0: sdhci: Resp[2]: 0x314a8000 | Resp[3]: 0x00000240
[25610.484361] mmc0: sdhci: Host ctl2: 0x0000308b
[25610.489132] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000eec00200
[25610.496365] mmc0: sdhci: =======
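A few of the dumped registers can be decoded by hand to see which transfer was in flight when the timeout fired. The sketch below is illustrative only: it assumes the standard SDHCI bit layouts for the Command, Transfer Mode and Block Size registers, the function names are made up for this example, and the values are copied from the dump above.

```python
# Hypothetical helpers for reading the dump above; not part of any kernel tool.
# Bit positions assume the standard SDHCI register layouts.

CMD = 0x0000083A       # "Cmd" field from the dump
TRN_MODE = 0x00000013  # "Trn mode" field from the dump
BLK_SIZE = 0x00007200  # "Blk size" field from the dump

RESPONSE_TYPES = {0: "none", 1: "136-bit", 2: "48-bit", 3: "48-bit with busy"}


def decode_command(reg):
    """Split the Command register into command index and flag bits."""
    return {
        "index": (reg >> 8) & 0x3F,          # bits 13:8
        "data_present": bool(reg & 0x20),    # bit 5
        "index_check": bool(reg & 0x10),     # bit 4
        "crc_check": bool(reg & 0x08),       # bit 3
        "response": RESPONSE_TYPES[reg & 0x03],  # bits 1:0
    }


def decode_transfer_mode(reg):
    """Split the Transfer Mode register into its flag bits."""
    return {
        "dma": bool(reg & 0x01),
        "block_count": bool(reg & 0x02),
        "auto_cmd12": bool(reg & 0x04),
        "auto_cmd23": bool(reg & 0x08),
        "read": bool(reg & 0x10),
        "multi_block": bool(reg & 0x20),
    }


def decode_block_size(reg):
    """Split the Block Size register into block length and SDMA boundary."""
    return {"block_bytes": reg & 0x0FFF, "sdma_boundary": (reg >> 12) & 0x7}


print(decode_command(CMD))
print(decode_transfer_mode(TRN_MODE))
print(decode_block_size(BLK_SIZE))
```

Under those assumptions the dump describes command index 8 with data present, as a single-block DMA read of 512 bytes. In the eMMC command set, CMD8 is SEND_EXT_CSD. Note that the register may simply hold the last command issued, so this is a hint rather than proof of which request stalled.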
The error and hang can be induced on demand by running this command line twice:
$ sudo cat /sys/kernel/
The first access of 'ext_csd' yields "mmc0: ADMA error: 0x02000000" and a dump, then the second access yields the timeout and hang.
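For repeated testing, the two back-to-back reads can be wrapped in a small helper. This is a minimal sketch, not the reporter's method: `read_twice` is a hypothetical name, and the path is left as a parameter because the full path is not given in this report. It must be run as root, and on an affected kernel the second read is expected to hang the system.

```python
def read_twice(path):
    """Read `path` two times in a row and return the byte count of each read.

    `path` is supplied by the caller; this report does not give the full
    location of the 'ext_csd' file, so none is assumed here.
    """
    sizes = []
    for _ in range(2):
        with open(path, "rb") as f:
            sizes.append(len(f.read()))
    return sizes
```

On a fixed kernel both reads should return normally; the kernel log can then be checked for the absence of the "ADMA error" and "Timeout" messages.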
All of this can be reproduced with the mainline kernel as well.
This patch from NVIDIA fixes the timeout when it is forced with the reproduction steps above, and appears to fix the intermittent occurrence as well. Per the author, the patch will be sent upstream shortly.