Cavium ThunderX CN88XX Oops at smi_send.isra.4+0x80/0x158 [ipmi_msghandler]

Bug #1857073 reported by Sean Feole
This bug affects 1 person
Affects          Status    Importance   Assigned to   Milestone
linux (Ubuntu)   Invalid   Undecided    Unassigned
  Bionic         Invalid   Undecided    Unassigned

Bug Description

Series: Bionic
Kernel: 4.15.0-74.84 linux-generic

The following crash was observed while testing the proposed kernel for the 2019.12.02 SRU Cycle.
This kernel was built to include fixes for the following bugs:

  * [Regression] Bionic kernel 4.15.0-71.80 can not boot on ThunderX
    (LP: #1853326)
    - Revert "arm64: Use firmware to detect CPUs that are not affected by
      Spectre-v2"
    - Revert "arm64: Get rid of __smccc_workaround_1_hvc_*"

  * [Regression] Bionic kernel 4.15.0-71.80 can not boot on ThunderX2 and
    Kunpeng920 (LP: #1852723)
    - SAUCE: arm64: capabilities: Move setup_boot_cpu_capabilities() call to
      correct place
The following crash appears to be a new bug, unrelated to the bugs listed above.

system hostname: wright

Possible cause: wright's crash may be caused by faulty error handling in the IPMI driver. Note this line in its dmesg: [ 52.150201] ipmi_ssif 0-0012: Unable to get the device id: -5
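
The -5 in that dmesg line is a negated Linux errno value. As a quick check, assuming a Linux host (where errno 5 is EIO), the Python standard library can translate it:

```python
import errno
import os

# Kernel drivers report failures as negated errno values; the ipmi_ssif
# messages in this report all carry -5.
kernel_return = -5

name = errno.errorcode[-kernel_return]   # symbolic name for errno 5
text = os.strerror(-kernel_return)       # the platform's description string

print(f"{kernel_return} -> {name}: {text}")
```

So the SSIF probe failed with an I/O error while talking to the BMC, shortly before the Oops in ipmi_msghandler.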

[ OK ] Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
[ OK ] FounBYZ-011FA0 efi.
         Mounting /boot/efi...
[ OK ] Mounted /boot/efi.
[ OK ] Reached target Local File Systems.
         Starting Set console font and keymap...
         Starting Create Volatile Files and Directories...
         Starting ebtables ruleset management...
         Starting AppArmor initialization...
         Starting Tell Plymouth To Write Out Runtime Data...
[ OK ] Started Set console font and keymap.
[ OK ] Started Tell Plymouth To Write Out Runtime Data.
[ OK ] Started Create Volatile Files and Directories.
         Starting Network Time Synchronization...
         Starting Update UTMP about System Boot/Shutdown...
[ OK ] Started Update UTMP about System Boot/Shutdown.
[ OK ] Started ebtables ruleset management.
[ OK ] Started Network Time Synchronization.
[ OK ] Reached target System Time Synchronized.
[ OK ] Started AppArmor initialization.
[ 50.689136] cloud-init[1246]: Cloud-init v. 19.3-41-gc4735dd3-0ubuntu1~18.04.1 running 'init-local' at Thu, 19 Dec 20 50.28 seconds.
[ 50.712486] cloud-init[1246]: 2019-12-19 22:40:37,893 - handlers.py[WARNING]: failed posting event: start: init-local/check-cache: attempting to read from cache [trust]
[ 50.736307] cloud-init[1246]: 2019-12-19 22:40:37,941 - handlers.py[WARNING]: failed posting event: finish: init-local/check-cache: SUCCESS: restored from cache: DataSourceMAAS [http://10.229.32.21:5248/MAAS/metadata/]
[ 51.244224] cloud-init[1246]: 2019-12-19 22:40:38,450 - handlers.py[WARNING]: failed posting event: finish: init-local: SUCCESS: searching for local datasources
[ OK ] Started Initial cloud-init job (pre-networking).
[ OK ] Reached target Network (Pre).
         Starting Network Service...
[ OK ] Started Network Service.
         Starting Wait for Network to be Configured...
         Starting Network Name Resolution...
[ 52.150201] ipmi_ssif 0-0012: Unable to get the device id: -5
[ 52.300309] Unable to handle kernel read fromvirtual address 00000018
[ 52.311284] Mem abort info:
[ 52.316895] ESR = 0x96000004
[ 52.322622] Exception class = DABT (current EL), IL = 32 bits
[ 52.331061] SET = 0, FnV = 0
[ 52.336639] EA = 0, S1PTW = 0
[ 52.342311] Data abort info:
[ 52.347731] ISV = 0, ISS = 0x00000004
[ 52.354131] CM = 0, WnR = 0
[ 52.359739] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000044052f71
[ 52.368909] [0000000000000018] *pgd=0000000000000000
[ 52.376522] Internal error: Oops: 96000004 [#1] SMP
[ 52.384039] Modules linked in: nls_iso8859_1 thunderx_edac thunderx_zip cavium_rng_vf shpchp cavium_rng gpio_keys uio_pdrv_genirq uio ipmi_ssif(+) ipmi_devintf ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nicvf nicpf ast i2c_algo_bit aes_ce_blk ttm aes_ce_cipher crc32_ce drm_kms_helper crct10dif_ce ghash_ce syscopyarea sha2_ce sysfillrect sysimgblt sha256_arm64 fb_sys_fops thunder_bgx sha1_ce drm ahci libahci thunder_xcv i2c_thunderx mdio_thunder thunderx_mmc mdio_cavium aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[ 52.473094] Process kworker/87:1 (pid: 674, stack limit = 0x000000004907a88f)
[ 52.483503] CPU: 87 PID: 674 Comm: kworker/87:1 Not tainted 4.15.0-74-generic #84-Ubuntu
[ 52.494893] Hardware name: Cavium ThunderX CRB/To be filled by O.E.M., BIOS 5.11 12/12/2012
[ 52.506587] Workqueue: events redo_bmc_reg [ipmi_msghandler]
[ 52.515614] pstate: 80400005 (Nzcv daif +PAN -UAO)
[ 52.523744] pc : smi_send.isra.4+0x80/0x158 [ipmi_msghandler]
[ 52.532818] lr : smi_send.isra.4+0x150/0x158 [ipmi_msghandler]
[ 52.541937] sp : ffff00001298bb10
[ 52.548500] x29: ffff00001298bb10 x28: 0000000000000020
[ 52.557000] x27: 0000000000000002 x26: 0000000000000000
[ 52.565407] x25: ffff00001298bc40 x24: ffff00001298bc38
[ 52.573731] x23: 0000000000000000 x22: 0000000000000018
[ 52.582025] x21: 0000000000000000 x20: ffff810fb5b08800
[ 52.590335] x19: ffff800fb6c80000 x18: ffffffffffffffff
[ 52.598616] x17: 0000000000000005 x16: 0000000000000000
[ 52.606884] x15: ffff000009588c08 x14: ffff810f27fd0587
[ 52.615138] x13: ffff810f27fd0586 x12: 0000000000000030
[ 52.623371] x11: 0101010101010101 x10: ffffff7f7fff7f7f
[ 52.631598] x9 : fefe800e26fbfeff x8 : ffff810fb5b08800
[ 52.639803] x7 : 0000000000001138 x6 : 000000000000125c
[ 52.647983] x5 : 000000000000017c x4 : ffff810fbcb30340
[ 52.656149] x3 : 0000000000000000 x2 : 0000000000000000
[ 52.664299] x1 : ffff810fb5b08800 x0 : ffff810fb2136800
[ 52.672460] Call trace:
[ 52.677793] smi_send.isra.4+0x80/0x158 [ipmi_msghandler]
[ 52.686044] i_ipmi_request+0x2ac/0x980 [ipmi_msghandler]
[ 52.694281] send_channel_info_cmd+0xac/0xd8 [ipmi_msghandler]
[ 52.702963] __scan_channels.isra.20+0x84/0x180 [ipmi_msghandler]
[ 52.711913] __bmc_get_device_id+0x424/0x8c8 [ipmi_msghandler]
[ 52.720570] redo_bmc_reg+0x6c/0x70 [ipmi_msghandler]
[ 52.728417] process_one_work+0x1e0/0x420
[ 52.735219] worker_thread+0x4c/0x478
[ 52.741656] kthread+0x134/0x138
[ 52.747630] ret_from_fork+0x10/0x18
[ 52.753956] Code: f908aa74 b4ffff74 f9424e60 aa1403e1 (f94002c2)
[ 52.762872] ---[ end trace 18050a754f82cbc5 ]---
[ 52.777797] ipmi_ssif: Unable to register device: error -5
[ 52.786224] ipmi_ssif 0-0012: Unable to start IPMI SSIF: -5
[ 52.794672] ipmi_ssif: probe of 0-0012 failed with error -5
[ 53.086337] IPv6: ADDRCONF(NETDEV_UP): enP6p1s0f1: link is not ready
[ OK ] Started Network Name Resolution.
[ OK ] Reached target Host and Network Name Lookups.
[ OK ] Reached target Network.
[ 54.498360] IPv6: ADDRCONF(NETDEV_UP): enP2p1s0f2: link is not ready
[ 55.910307] IPv6: ADDRCONF(NETDEV_UP): enP2p1s0f1: link is not ready
[ 56.033835] thunder-nicvf 0002:01:00.1 enP2p1s0f1: Link is Up 10000 Mbps Full duplex
[ 56.044491] IPv6: ADDRCONF(NETDEV_CHANGE): enP2p1s0f1: link becomes ready
[FAILED] Failed to start Wait for Network to be Configured.
See 'systemctl status systemd-networkd-wait-online.service' for details.
         Starting Initial cloud-init job (metadata service crawler)...
[ 175.031966] cloud-init[1664]: Cloud-init v. 19.3-41-gc4735dd3-0ubuntu1~18.04.1 running 'init' at Thu, 19 Dec 2019 22:42:40 +0000. Up 173.86 seconds.
[ 175.052480] cloud-init[1664]: [ 175.058517] Unable to handle kernel paging request at virtual address 1605289951d00
ci-info: +++++++++++++++++++++++++++++++++++++++[ 175.070761] Mem abort info:
+Net device info++++++++++++++++++++++++++++++++[ 175.077563] ESR = 0x86000004
+++++++++
[ 175.084674] Exception class = IABT (current EL), IL = 32 bits
[ 175.092889] SET = 0, FnV = 0
[ 175.098177] EA = 0, S1PTW = 0
[ 175.103497] [0001605289951d00] address between user and kernel address ranges
[ 175.112843] Internal error: Oops: 86000004 [#2] SMP
[ 175.119914] Modules linked in: nls_iso8859_1 thunderx_edac thunderx_zip cavium_rng_vf shpchp cavium_rng gpio_keys uio_pdrv_genirq uio ipmi_ssif ipmi_devintf ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nicvf nicpf ast i2c_algo_bit aes_ce_blk ttm aes_ce_cipher crc32_ce drm_kms_helper crct10dif_ce ghash_ce syscopyarea sha2_ce sysfillrect sysimgblt sha256_arm64 fb_sys_fops thunder_bgx sha1_ce drm ahci libahci thunder_xcv i2c_thunderx mdio_thunder thunderx_mmc mdio_cavium aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[ 175.205622] Process cloud-init (pid: 1664, stack limit = 0x00000000d0c323fa)
[ 175.205628] CPU: 7 PID: 1664 Comm: cloud-init Tainted: G D 4.15.0-74-generic #84-Ubuntu
[ 175.132227] cloud-init[1664]: ci-info: +------------+-------+-[ 175.227701] Hardware name: Cavium ThunderX CRB/To be filled by O.E.M., BIOS 5.11 12/12/2012
-----------------------------+-------------+--------+-----------[ 175.241753] pstate: 20400005 (nzCv daif +PAN -UAO)
--------+
[ 175.252088] pc : 0xb701605289951d00
[ 175.258552] lr : 0xb701605289951d00
[ 175.264983] sp : ffff00001e3abc70
[ 175.271225] x29: ffff810fb0526860 x28: 0000000000000040
[ 175.279499] x27: 0000000000000000 x26: ffff00001e3abdd0
[ 175.279506] x25: ffff810fb0526860 x24: ffff00001e3abda8
[ 175.272184] cloud-init[1664]: ci-info: | Device | Up | [ 175.296048] x23: ffff810fb05266e8 x22: ffff800fb1017300
          Address | Mask | Scope | Hw-Add[ 175.307009] x21: 0000000000000000 x20: ffff000008245ec4
ress |
[ 175.317852] x19: ffff00001e3abc80 x18: 0000ffffa03baa70
[ 175.326091] x17: 0000000000000000 x16: 0000000000000000
[ 175.334334] x15: 0000000000000000 x14: 7461646174656d2f
[ 175.342546] x13: 5341414d2f383432 x12: 353a31322e32332e
[ 175.350744] x11: 3932322e30312f2f x10: 0000000000000001
[ 175.358931] x9 : 0000000000000228 x8 : 0000000000000f0b
[ 175.367107] x7 : 0000000000000000 x6 : 000000000000001d
[ 175.356207] cloud-init[1664]: ci-info: +------------+-------+-[ 175.375239] x5 : 000000000000085f x4 : 0000800fb5acb000
-----------------------------+-------------+--------+-----------[ 175.386201] x3 : ffff00001e3abbb0 x2 : b733605289951d00
--------+
[ 175.397063] x1 : 0000000000000000 x0 : 0000000000002f0b
[ 175.405203] Call trace:
[ 175.410441] 0xb701605289951d00
[ 175.410448] Code: bad PC value
[ 175.400183] cloud-init[1664]: ci-info: | enp2p[ 175.422141] ---[ end trace 18050a754f82cbc6 ]---
1s0f1 | True | 10.229.48.12 | 255.255.0.0 | global | 8a:60:bf:62:54:ca |
[ 175.440275] cloud-init[1664]: ci-info: | enp2p1s0f1 | True | fe80::8860:bfff:fe62:54ca/64 | . | link | 8a:60:bf:62:54:ca |
[ 175.464194] cloud-init[1664]: ci-info: | enp2p1s0f2 | False | . | . | . | 8a:60:bf:62:54:cb |
[FAILED] Failed to start Initial cloud-init job (metadata service crawler).
See 'systemctl status cloud-init.service' for details.[ 175.484220]
cloud-init[1664]: ci-info: | enp6p1s0f1 | False | . | . | . | 8a:60:bf:62:54:cc |
[ OK ] Reached target Cloud-config availability.
[ 175.528600] cloud-init[1664]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
[ OK ] Reached target Network is Online.
[ 175.544297] cloud-init[1664]: ci-info: | lo | True | ::1/128 | . | host | . |
[ OK ] Reached target Remote File Systems (Pre).
[ 175.588214] [cloud-init OK [1664]: ] ci-info: +------------+-------+------------------------------+-------------+--------+-------------------+Reached target Remote File Systems.

[ 175.695153] cloud-init[1664]: ci-info: ++++++++++++++++++++++++++++Route IPv4 info++++++++++++++++++++++++++++
         Starting Availability of block devices...
[ 175.740233] cloud-init[1664]: ci-info: +-------+-------------+------------+-------------+------------+-------+
[ OK ] Reached target System Initialization.
[ 175.764287] cloud-init[1664]: ci-info: | Route | Destination | Gateway | Genmask | Interface | Flags |
[ 175.800219] cloud-init[1664]: ci-info: +-------+-------------+------------+-------------+------------+-------+
         Starting LXD - unix socket.
[ 175.824213] cloud-init[1664]: ci-info: | 0 | 0.0.0.0 | 10.229.0.1 | 0.0.0.0 | enP2p1s0f1 | UG |
[ OK ] Started Message of the Day.
[ 175.848192] cloud-init[1664]: ci-info: | 1 | 10.229.0.0 | 0.0.0.0 | 255.255.0.0 | enP2p1s0f1 | U |
[ OK ] Listening on UUID daemon activation socket.
[ 175.880204] cloud-init[1664]: ci-info: +-------+-------------+------------+-------------+------------+-------+
[ OK ] Listening on D-Bus System Message Bus Socket.
[ 175.912203] cloud-init[1664]: ci-info: +++++++++++++++++++Route IPv6 info++++++++++++++++++++
[ OK ] Started Discard unused blocks once a week.
[ 175.944223] cloud-init[1664]: ci-info: +-------+-------------+---------+------------+-------+
[ OK ] Started ACPI Events Check.
[ 175.976207] cloud-init[1664]: ci-info: | Route | Destination | Gateway | Interface | Flags |
[ OK ] Reached target Paths.
[ OK ] Listening on ACPID Listen Socket.
         Starting Socket activation for snappy daemon.
[ OK ] Listening on Open-iSCSI iscsid Socket.
[ OK ] Started Daily Cleanup of Temporary Directories.
[ OK ] Reached target Timers.
[ OK ] Started Availability of block devices.
[ OK ] Listening on LXD - unix socket.
[ OK ] Listening on Socket activation for snappy daemon.
[ OK ] Reached target Sockets.
[ OK ] Reached target Basic System.
[ OK ] Started irqbalance daemon.
         Starting System Logging Service...
         Starting Snappy daemon...
         Starting Login Service...
[ OK ] Started Regular background program processing daemon.
[ OK ] Started D-Bus System Message Bus.
         Starting Accounts Service...
         Starting Permit User Sessions...
[ OK ] Started Deferred execution scheduler.
         Starting LSB: automatic crash report generation...
         Starting Dispatcher daemon for systemd-networkd...
         Starting Pollinate to seed the pseudo random number generator...
         Starting LSB: Record successful boot for GRUB...
[ OK ] Started FUSE filesystem for LXC.
[ OK ] Started Set the CPU Frequency Scaling governor.
         Starting /etc/rc.local Compatibility...
         Starting LXD - container startup/shutdown...
[ OK ] Started System Logging Service.
[ OK ] Started Permit User Sessions.
[ OK ] Started Login Service.
[ OK ] Started Unattended Upgrades Shutdown.
[ 177.831877] new mount options do not match the existing superblock, will be ignored
[ OK ] Started LSB: automatic crash report generation.
[ OK ] Started LSB: Record successful boot for GRUB.
         Starting Authorization Manager...
[ OK ] Started Dispatcher daemon for systemd-networkd.
[ 179.680787] cloud-init[1664]: ci-info: +-------+-------------+---------+------------+-------+
[ 179.696652] cloud-init[1664]: ci-info: | 1 | fe80::/64 | :: | enP2p1s0f1 | U |
[ 179.712246] cloud-init[1664]: ci-info: | 3 | local | :: | enP2p1s0f1 | U |
[ 179.732289] cloud-init[1664]: ci-info: | 4 | ff00::/8 | :: | enP2p1s0f1 | U |
[ 179.748246] cloud-init[1664]: ci-info: +-------+-------------+---------+------------+-------+
[ OK ] Started /etc/rc.local Compatibility.
[ OK ] Started Authorization Manager.
[ OK ] Started Accounts Service.
         Starting Terminate Plymouth Boot Screen...
         Starting Hold until boot process finishes up...
[ OK ] Started Terminate Plymouth Boot Screen.
[ OK ] Started Hold until boot process finishes up.
         Starting Set console scheme...
[ OK ] Started Serial Getty on ttyAMA0.
[ OK ] Started Set console scheme.
[ OK ] Created slice system-getty.slice.
[ OK ] Started Getty on tty1.
[ OK ] Reached target Login Prompts.
[ OK ] Started Snappy daemon.
         Starting Wait until snapd is fully seeded...
[ OK ] Started LXD - container startup/shutdown.
[ OK ] Started Wait until snapd is fully seeded.
         Starting Apply the settings specified in cloud-config...
[ 183.560383] cloud-init[2299]: Cloud-init v. 19.3-41-gc4735dd3-0ubuntu1~18.04.1 running 'modules:config' at Thu, 19 Dec 2019 22:42:48 +0000. Up 182.46 seconds.

Ubuntu 18.04.3 LTS wright-kernel ttyAMA0

wright-kernel login: [ 194.049005] cloud-init[2390]: Cloud-init v. 19.3-41-gc4735dd3-0ubuntu1~18.04.1 running 'modules:final' at Thu, 19 Dec 2019 22:42:59 +0000. Up 192.89 seconds.
[ 194.049938] cloud-init[2390]: Cloud-init v. 19.3-41-gc4735dd3-0ubuntu1~18.04.1 finished at Thu, 19 Dec 2019 22:43:00 +0000. Datasource DataSourceMAAS [http://10.229.32.21:5248/MAAS/metadata/]. Up 193.94 seconds
[ 232.761444] ------------[ cut here ]------------
[ 232.766175] kernel BUG at /build/linux-pWET3k/linux-4.15.0/fs/buffer.c:1240!
[ 232.773309] Internal error: Oops - BUG: 0 [#3] SMP
[ 232.778157] Modules linked in: nls_iso8859_1 thunderx_edac thunderx_zip cavium_rng_vf shpchp cavium_rng gpio_keys uio_pdrv_genirq uio ipmi_ssif ipmi_devintf ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nicvf nicpf ast i2c_algo_bit aes_ce_blk ttm aes_ce_cipher crc32_ce drm_kms_helper crct10dif_ce ghash_ce syscopyarea sha2_ce sysfillrect sysimgblt sha256_arm64 fb_sys_fops thunder_bgx sha1_ce drm ahci libahci thunder_xcv i2c_thunderx mdio_thunder thunderx_mmc mdio_cavium aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[ 232.846614] Process landscape-sysin (pid: 2485, stack limit = 0x00000000f967c462)
[ 232.854187] CPU: 81 PID: 2485 Comm: landscape-sysin Tainted: G D 4.15.0-74-generic #84-Ubuntu
[ 232.864037] Hardware name: Cavium ThunderX CRB/To be filled by O.E.M., BIOS 5.11 12/12/2012
[ 232.872484] pstate: 20400085 (nzCv daIf +PAN -UAO)
[ 232.877348] pc : __find_get_block+0x2e8/0x398
[ 232.881754] lr : __getblk_gfp+0x3c/0x2a8
[ 232.885717] sp : ffff000023eeb6f0
[ 232.889067] x29: ffff000023eeb6f0 x28: 0000000000000000
[ 232.894440] x27: 0000000000000000 x26: 0000000000000000
[ 232.899813] x25: 0000000000000001 x24: 0000000000000000
[ 232.905185] x23: 0000000000000008 x22: ffff810fb7b13b80
[ 232.910558] x21: ffff810fb7b13b80 x20: 0000000000782020
[ 232.915930] x19: 0000000000001000 x18: 0000000000000001
[ 232.921302] x17: 0000000000000000 x16: 0000000000000000
[ 232.926675] x15: 0000000000000000 x14: 0000000000000021
[ 232.932047] x13: 79702e6e6f697373 x12: 6572706d6f635f2f
[ 232.937420] x11: 362e336e6f687479 x10: ffff00000972d000
[ 232.942793] x9 : 0000000000000000 x8 : ffff810fa61889c0
[ 232.948165] x7 : ffff810fa61889e0 x6 : 0000000000000000
[ 232.953538] x5 : 0000000000000005 x4 : 0000000000000020
[ 232.958911] x3 : 0000000000000008 x2 : 0000000000001000
[ 232.964283] x1 : 0000000000782020 x0 : 0000000000000080
[ 232.969657] Call trace:
[ 232.972133] __find_get_block+0x2e8/0x398
[ 232.976186] __getblk_gfp+0x3c/0x2a8
[ 232.979806] ext4_getblk+0xcc/0x1b0
[ 232.983332] ext4_bread_batch+0x78/0x1c8
[ 232.987302] ext4_find_entry+0x2d4/0x598
[ 232.991268] ext4_lookup+0xac/0x278
[ 232.994796] lookup_slow+0xac/0x190
[ 232.998323] walk_component+0x228/0x340
[ 233.002200] link_path_walk+0x2f4/0x568
[ 233.006076] path_lookupat+0x64/0x210
[ 233.009779] filename_lookup+0xa0/0x170
[ 233.013656] user_path_at_empty+0x58/0x70
[ 233.020251] vfs_statx+0x98/0x118
[ 233.026103] SyS_newfstatat+0x58/0x98
[ 233.032301] el0_svc_naked+0x30/0x34
[ 233.038368] Code: 17ffffe7 a90363b7 a9046bb9 f9002bbb (d4210000)
[ 233.046962] ---[ end trace 18050a754f82cbc7 ]---
[ 236.161175] Unable to handle kernel paging request at virtual address ffff000023eeba60
[ 236.171761] Mem abort info:
[ 236.176925] ESR = 0x96000007
[ 236.182247] Exception class = DABT (current EL), IL = 32 bits
[ 236.190393] SET = 0, FnV = 0
[ 236.195567] EA = 0, S1PTW = 0
[ 236.200727] Data abort info:
[ 236.205524] ISV = 0, ISS = 0x00000007
[ 236.211240] CM = 0, WnR = 0
[ 236.216016] swapper pgtable: 4k pages, 48-bit VAs, pgd = 00000000481b7080
[ 236.224621] [ffff000023eeba60] *pgd=0000010fffffe803, *pud=0000010fffffd803, *pmd=0000000fb3ecb003, *pte=0000000000000000
[ 236.237483] Internal error: Oops: 96000007 [#4] SMP
[ 236.244250] Modules linked in: nls_iso8859_1 thunderx_edac thunderx_zip cavium_rng_vf shpchp cavium_rng gpio_keys uio_pdrv_genirq uio ipmi_ssif ipmi_devintf ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nicvf nicpf ast i2c_algo_bit aes_ce_blk ttm aes_ce_cipher crc32_ce drm_kms_helper crct10dif_ce ghash_ce syscopyarea sha2_ce sysfillrect sysimgblt sha256_arm64 fb_sys_fops thunder_bgx sha1_ce drm ahci libahci thunder_xcv i2c_thunderx mdio_thunder thunderx_mmc mdio_cavium aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[ 236.328158] Process landscape-sysin (pid: 2595, stack limit = 0x0000000022fb039e)
[ 236.338278] CPU: 87 PID: 2595 Comm: landscape-sysin Tainted: G D 4.15.0-74-generic #84-Ubuntu
[ 236.350758] Hardware name: Cavium ThunderX CRB/To be filled by O.E.M., BIOS 5.11 12/12/2012
[ 236.361917] pstate: 60400085 (nZCv daIf +PAN -UAO)
[ 236.369530] pc : _raw_spin_lock_irqsave+0x24/0x60
[ 236.377055] lr : add_wait_queue+0x38/0x68
[ 236.383870] sp : ffff000027da38e0
[ 236.389976] x29: ffff000027da38e0 x28: ffff810fa61889c0
[ 236.398109] x27: ffff000009588000 x26: ffff800fbc5b5740
[ 236.406252] x25: ffff0000097f0058 x24: ffff800fbc5b5740
[ 236.414753] x23: ffff0000097f05b0 x22: ffff000027da3c70
[ 236.423787] x21: ffff000023eeba60 x20: ffff000027da39d0
[ 236.432564] x19: ffff000023eeba60 x18: 0000000000000001
[ 236.441098] x17: 0000000000000000 x16: 0000000100000000
[ 236.449205] x15: 0000000000000000 x14: 0000000000000022
[ 236.457289] x13: 79702e6567617373 x12: 656d2f6c69616d65
[ 236.465391] x11: 2f362e336e6f6874 x10: ffffffff82658629
[ 236.473503] x9 : 6d652f362e62696c x8 : ffff810fa60a80c0
[ 236.481635] x7 : ffff810fa60a8780 x6 : ffff810fa60a80fd
[ 236.489762] x5 : 0000000099822a54 x4 : ffff810fa6188a70
[ 236.497891] x3 : 0000000000000001 x2 : 0000000000000000
[ 236.506023] x1 : 0000000000000000 x0 : 0000000000000080
[ 236.514140] Call trace:
[ 236.519372] _raw_spin_lock_irqsave+0x24/0x60
[ 236.526539] add_wait_queue+0x38/0x68
[ 236.533016] d_alloc_parallel+0x3d4/0x4c8
[ 236.539848] lookup_slow+0x80/0x190
[ 236.546128] walk_component+0x228/0x340
[ 236.552732] link_path_walk+0x2f4/0x568
[ 236.559281] path_lookupat+0x64/0x210
[ 236.565609] filename_lookup+0xa0/0x170
[ 236.572121] user_path_at_empty+0x58/0x70
[ 236.578795] vfs_statx+0x98/0x118
[ 236.584740] SyS_newfstatat+0x58/0x98
[ 236.591048] el0_svc_naked+0x30/0x34
[ 236.597273] Code: d503201f d53b4220 d50342df f9800271 (885ffe61)
[ 236.606070] ---[ end trace 18050a754f82cbc8 ]---
[ 250.251712] ------------[ cut here ]------------
[ 250.259050] kernel BUG at /build/linux-pWET3k/linux-4.15.0/fs/buffer.c:1240!
[ 250.268770] Internal error: Oops - BUG: 0 [#5] SMP
[ 250.276166] Modules linked in: nls_iso8859_1 thunderx_edac thunderx_zip cavium_rng_vf shpchp cavium_rng gpio_keys uio_pdrv_genirq uio ipmi_ssif ipmi_devintf ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nicvf nicpf ast i2c_algo_bit aes_ce_blk ttm aes_ce_cipher crc32_ce drm_kms_helper crct10dif_ce ghash_ce syscopyarea sha2_ce sysfillrect sysimgblt sha256_arm64 fb_sys_fops thunder_bgx sha1_ce drm ahci libahci thunder_xcv i2c_thunderx mdio_thunder thunderx_mmc mdio_cavium aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[ 250.362487] Process sudo (pid: 2924, stack limit = 0x0000000063566212)
[ 250.371632] CPU: 28 PID: 2924 Comm: sudo Tainted: G D 4.15.0-74-generic #84-Ubuntu
[ 250.383047] Hardware name: Cavium ThunderX CRB/To be filled by O.E.M., BIOS 5.11 12/12/2012
[ 250.394016] pstate: 20400085 (nzCv daIf +PAN -UAO)
[ 250.401388] pc : __find_get_block+0x2e8/0x398
[ 250.408319] lr : __getblk_gfp+0x3c/0x2a8
[ 250.414801] sp : ffff00001ad3b810
[ 250.420668] x29: ffff00001ad3b810 x28: 0000000000000000
[ 250.428560] x27: 0000000000000000 x26: 0000000000000000
[ 250.436444] x25: 0000000000000001 x24: 0000000000000000
[ 250.444324] x23: 0000000000000008 x22: ffff810fb7b13b80
[ 250.452206] x21: ffff810fb7b13b80 x20: 0000000000782043
[ 250.460108] x19: 0000000000001000 x18: 0000ffffa7e5fa70
[ 250.468027] x17: 0000000000000000 x16: 0000000000000000
[ 250.475943] x15: 0000000000000008 x14: 0000000000000002
[ 250.483857] x13: 000000000000270f x12: 0101010101010101
[ 250.491758] x11: 0101010101010101 x10: ffff00000972d000
[ 250.499657] x9 : 0000000000000000 x8 : ffff800f9d63a480
[ 250.507552] x7 : ffff800f9d63a4a0 x6 : 0000000000000000
[ 250.515455] x5 : 0000000000000008 x4 : 0000000000000020
[ 250.523354] x3 : 0000000000000008 x2 : 0000000000001000
[ 250.531232] x1 : 0000000000782043 x0 : 0000000000000080
[ 250.539115] Call trace:
[ 250.544100] __find_get_block+0x2e8/0x398
[ 250.550658] __getblk_gfp+0x3c/0x2a8
[ 250.556776] ext4_getblk+0xcc/0x1b0
[ 250.562797] ext4_bread_batch+0x78/0x1c8
[ 250.569255] ext4_find_entry+0x2d4/0x598
[ 250.575704] ext4_lookup+0xac/0x278
[ 250.581718] lookup_open+0x2cc/0x610
[ 250.587813] do_last+0x720/0x8d8
[ 250.593553] path_openat+0xa8/0x310
[ 250.599550] do_filp_open+0x88/0x108
[ 250.605630] do_sys_open+0x1b0/0x2e8
[ 250.611703] SyS_openat+0x3c/0x50
[ 250.617513] el0_svc_naked+0x30/0x34
[ 250.623557] Code: 17ffffe7 a90363b7 a9046bb9 f9002bbb (d4210000)
[ 250.632148] ---[ end trace 18050a754f82cbc9 ]---
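
The ESR values and the parenthesised instruction words on the Code: lines above can be decoded by hand. The following is a minimal sketch, not a full disassembler: the function names decode_esr and decode_insn are made up for this illustration, the bit layouts are taken from the ARMv8-A architecture, and only the encodings that occur in this log are handled.

```python
# Exception classes (ESR_ELx.EC) that appear in this log.
EXCEPTION_CLASSES = {
    0x21: "IABT (current EL)",  # instruction abort, same exception level
    0x25: "DABT (current EL)",  # data abort, same exception level
}

# Upper four bits of the 6-bit fault status code, for the common MMU faults.
FAULT_KINDS = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr):
    """Split an ESR value into (exception class, IL bits, fault kind, level)."""
    ec = (esr >> 26) & 0x3F
    il_bits = 32 if (esr >> 25) & 1 else 16
    fsc = esr & 0x3F
    kind = FAULT_KINDS.get(fsc >> 2, "other")
    level = fsc & 0b11  # only meaningful for the fault kinds listed above
    return EXCEPTION_CLASSES.get(ec, hex(ec)), il_bits, kind, level


def decode_insn(word):
    """Decode the three instruction encodings found in the Code: lines."""
    # Register number 31 (sp/xzr) is not special-cased in this sketch.
    rt, rn = word & 0x1F, (word >> 5) & 0x1F
    if word & 0xFFC00000 == 0xF9400000:      # LDR Xt, [Xn, #imm12 * 8]
        return f"ldr x{rt}, [x{rn}, #{((word >> 10) & 0xFFF) * 8}]"
    if word & 0xFFFFFC00 == 0x885FFC00:      # LDAXR Wt, [Xn]
        return f"ldaxr w{rt}, [x{rn}]"
    if word & 0xFFE0001F == 0xD4200000:      # BRK #imm16
        return f"brk #{(word >> 5) & 0xFFFF:#x}"
    return "unrecognised"


for esr in (0x96000004, 0x86000004, 0x96000007):
    print(f"{esr:#010x}", decode_esr(esr))
for word in (0xF94002C2, 0xD4210000, 0x885FFE61):
    print(f"{word:08x}", decode_insn(word))
```

With the values from the log, f94002c2 decodes to a load through x22. In the first register dump x22 holds 0x18, which matches the reported fault address and is consistent with reading a field at offset 0x18 from a NULL pointer in smi_send. d4210000 decodes to brk #0x800, the trap arm64 Linux uses for BUG(), which fits the two "kernel BUG at ... fs/buffer.c:1240" reports. 885ffe61 is a load-acquire exclusive through x19, and x19 holds the faulting address ffff000023eeba60 in Oops #4.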

Sean Feole (sfeole)
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
summary: - Cavium ThunderX CN88XX Crashes : Internal error Ooops
+ Cavium ThunderX CN88XX Crashes : Internal error Ooops(Possibly IPMI
+ related)
Sean Feole (sfeole) wrote : Re: Cavium ThunderX CN88XX Crashes : Internal error Ooops(Possibly IPMI related)

Full console output for wright

Sean Feole (sfeole) wrote :

Second recorded crash on wright

dann frazier (dannf) wrote :

The description shows 2 Oops messages - one in IPMI, and one in ext4. I had marked this as a duplicate of bug 1857074 because the ext4 symptom is in both. But, per https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857074/comments/18 , the IPMI issue exists even with the fix for LP: #1857074. So let's keep this bug to track the IPMI issue.
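
To separate the IPMI Oops from the ext4 ones in a long console capture, the Oops headers and their pc lines can be pulled out with a short script. This is only a sketch: summarize_oopses is a hypothetical helper, the regular expressions are written against the line formats seen in this report, and SAMPLE holds four lines copied from the description.

```python
import re

# "Internal error: Oops: 96000004 [#1] SMP" or "Internal error: Oops - BUG: 0 [#3] SMP"
OOPS_RE = re.compile(r"Internal error: (?P<kind>.+?) \[#(?P<num>\d+)\]")
# "pc : smi_send.isra.4+0x80/0x158 [ipmi_msghandler]" (module part is optional)
PC_RE = re.compile(r"\bpc : (?P<pc>\S+)(?: \[(?P<module>\w+)\])?")

SAMPLE = """\
[ 52.376522] Internal error: Oops: 96000004 [#1] SMP
[ 52.523744] pc : smi_send.isra.4+0x80/0x158 [ipmi_msghandler]
[ 232.773309] Internal error: Oops - BUG: 0 [#3] SMP
[ 232.877348] pc : __find_get_block+0x2e8/0x398
"""


def summarize_oopses(log_text):
    """Return one dict per Oops header, with the first pc line that follows it."""
    results = []
    current = None
    for line in log_text.splitlines():
        header = OOPS_RE.search(line)
        if header:
            current = {"num": int(header["num"]), "kind": header["kind"],
                       "pc": None, "module": None}
            results.append(current)
            continue
        pc = PC_RE.search(line)
        if pc and current is not None and current["pc"] is None:
            current["pc"] = pc["pc"]
            current["module"] = pc["module"]  # None for built-in code
    return results


for oops in summarize_oopses(SAMPLE):
    where = oops["module"] or "core kernel"
    print(f"#{oops['num']}: {oops['kind']} at {oops['pc']} ({where})")
```

Run against the full console log, the same function would list every numbered Oops, so the ipmi_msghandler entry can be tracked here independently of the fs/buffer.c ones.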

dann frazier (dannf)
summary: - Cavium ThunderX CN88XX Crashes : Internal error Ooops(Possibly IPMI
- related)
+ Cavium ThunderX CN88XX Oops at smi_send.isra.4+0x80/0x158
+ [ipmi_msghandler]
Juerg Haefliger (juergh) wrote :

Sean, do you still see these failures with latest Bionic?

Andrew Cloke (andrew-cloke) wrote :

Additional data point: I successfully deployed bionic 4.15.0-91 to dawes, a single-socket ThunderX Gigabyte system.

ubuntu@dawes:~$ uname -a
Linux dawes 4.15.0-91-generic #92-Ubuntu SMP Fri Feb 28 11:10:16 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

Note, however, that this issue was reported on wright, which is a 2-socket ThunderX system.
Sean has the only other 2-socket ThunderX system (recht) reserved, so I'm not able to test on it.

Andrew Cloke (andrew-cloke) wrote :

I understand that there are no further f/w updates planned for these ThunderX boards. Marking as "won't fix".

Changed in linux (Ubuntu):
status: Confirmed → Invalid
Changed in linux (Ubuntu Bionic):
status: Confirmed → Invalid