[SRU Bionic][Cosmic] kernel panic in ipmi_ssif at msg_done_handler

Bug #1777716 reported by Manoj Iyer on 2018-06-19
This bug affects 1 person
Affects           Importance   Assigned to
linux (Ubuntu)    Critical     Canonical Kernel Team
  Bionic          Critical     Unassigned
  Cosmic          Critical     Canonical Kernel Team

Bug Description

[Impact]
Booting Bionic with i2c and ipmi_ssif enabled on Cavium ThunderX2 systems with a faulty BMC causes a kernel panic. The BMC does not return any data, and the driver then tries to print the value of data[2].

[ 484.728410] Unable to handle kernel NULL pointer dereference at virtual address 00000002
[ 484.736496] pgd = ffff0000094a2000
[ 484.739885] [00000002] *pgd=00000047fcffe003, *pud=00000047fcffd003, *pmd=0000000000000000
[ 484.748158] Internal error: Oops: 96000005 [#1] SMP
[...]
[ 485.101451] Call trace:
[...]
[ 485.188473] [<ffff000000a46e68>] msg_done_handler+0x668/0x700 [ipmi_ssif]
[ 485.195249] [<ffff000000a456b8>] ipmi_ssif_thread+0x110/0x128 [ipmi_ssif]
[ 485.202038] [<ffff0000080f1430>] kthread+0x108/0x138
[ 485.206994] [<ffff0000080838e0>] ret_from_fork+0x10/0x30
[ 485.212294] Code: aa1903e1 aa1803e0 b900227f 95fef6a5 (39400aa3)
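
The trace above can be cross-checked against the description. The word in parentheses on the "Code:" line is the faulting instruction, and decoding it shows a byte load at offset 2 from a base register, which is consistent with reading data[2] through a NULL pointer (fault address 00000002). The following is a minimal sketch, not part of the original report; the helper name decode_ldrb_imm is hypothetical, and it assumes the word is an AArch64 LDRB (immediate, unsigned offset) encoding.

```python
def decode_ldrb_imm(word: int):
    """Decode an AArch64 LDRB (immediate, unsigned offset) instruction word.

    Returns (rt, rn, offset). Hypothetical helper for illustration only;
    it handles this single encoding and rejects anything else.
    """
    # Bits 31..22 are fixed to 0011100101 for LDRB with an unsigned offset.
    if word & 0xFFC00000 != 0x39400000:
        raise ValueError("not an LDRB (unsigned offset) encoding")
    offset = (word >> 10) & 0xFFF  # imm12, in bytes for a byte load
    rn = (word >> 5) & 0x1F        # base register number
    rt = word & 0x1F               # destination register number
    return rt, rn, offset


# Faulting instruction taken from the "Code:" line of the oops.
rt, rn, offset = decode_ldrb_imm(0x39400AA3)
print(f"ldrb w{rt}, [x{rn}, #{offset}]")

# With a NULL base pointer, the accessed address is simply the offset.
fault_address = 0 + offset
print(f"fault address with NULL base: {fault_address:#010x}")
```

If the base register holds 0, the load touches address 0x2, which matches the "virtual address 00000002" line in the oops.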

[Test]
- System with faulty BMC
- Boot with i2c and ipmi_ssif enabled.

[Fix]
Fixed upstream with:

commit f002612b9d86613bc6fde0a444e0095225f6053e
Author: Kamlakant Patel <email address hidden>
Date: Tue Mar 13 16:32:27 2018 +0530

    ipmi_ssif: Fix kernel panic at msg_done_handler

[Regression Potential]
ipmi_ssif is only loaded on ARM64 systems, and this issue has been observed only on Cavium ThunderX2 systems with a faulty BMC. The fix does not impact other architectures or vendor systems. Regression potential is low.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1777716

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Manoj Iyer (manjo) on 2018-06-19
summary: - [SRU][Bionic] kernel panic in ipmi_ssif at msg_done_handler
+ [SRU Bionic][Cosmic] kernel panic in ipmi_ssif at msg_done_handler
Manoj Iyer (manjo) wrote :

Tested Bionic kernel on Cavium ThunderX2 with the patch.

ubuntu@starbuck:~$ uname -a
Linux starbuck 4.15.0-25-generic #26~lp1777716+build.1 SMP Tue Jun 19 19:25:21 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux
ubuntu@starbuck:~$ dmesg | grep ipmi_ssif
[ 45.648719] ipmi_ssif: probe of dmi-ipmi-ssif.0 failed with error -17
[ 45.648772] ipmi_ssif: probe of dmi-ipmi-ssif.1 failed with error -17
[ 45.684502] ipmi_ssif: Trying SPMI-specified SSIF interface at i2c address 0xe, adapter xlp9xx-i2c, slave address 0x0
[ 45.804171] ipmi_ssif 0-000e: Found new BMC (man_id: 0x000000, prod_id: 0x0202, dev_id: 0x20)
ubuntu@starbuck:~$ lsmod | grep ssif
ipmi_ssif 40960 0
ipmi_msghandler 61440 2 ipmi_ssif,ipmi_devintf
ubuntu@starbuck:~$
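
The manual check in the transcript above could be scripted along these lines. This is only a sketch, not something used in the original test: the function name check_ssif_log and the two match strings are assumptions chosen from the log lines quoted in this report, and the sample text is copied from the dmesg output above.

```python
def check_ssif_log(dmesg_text: str) -> dict:
    """Scan dmesg text for ipmi_ssif BMC detection and for oops markers.

    Hypothetical helper: the match strings are taken from the log lines
    quoted in this bug report, not from any official list.
    """
    lines = dmesg_text.splitlines()
    return {
        # Driver reported a BMC on the SSIF interface.
        "bmc_found": any(
            "ipmi_ssif" in line and "Found new BMC" in line for line in lines
        ),
        # Either the NULL dereference message or the faulting function appears.
        "oops_seen": any(
            "NULL pointer dereference" in line or "msg_done_handler" in line
            for line in lines
        ),
    }


sample = """\
[ 45.648719] ipmi_ssif: probe of dmi-ipmi-ssif.0 failed with error -17
[ 45.684502] ipmi_ssif: Trying SPMI-specified SSIF interface at i2c address 0xe, adapter xlp9xx-i2c, slave address 0x0
[ 45.804171] ipmi_ssif 0-000e: Found new BMC (man_id: 0x000000, prod_id: 0x0202, dev_id: 0x20)
"""

print(check_ssif_log(sample))
```

On a live system the text would come from the kernel log instead of a string literal; a result with the BMC found and no oops marker corresponds to the patched behaviour shown in the transcript.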

Manoj Iyer (manjo) on 2018-07-10
Changed in linux (Ubuntu):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
status: Incomplete → Triaged
Stefan Bader (smb) on 2018-07-17
Changed in linux (Ubuntu Bionic):
importance: Undecided → Critical
status: New → In Progress
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic

Verification already done by @manjo.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-29.31

---------------
linux (4.15.0-29.31) bionic; urgency=medium

  * linux: 4.15.0-29.31 -proposed tracker (LP: #1782173)

  * [SRU Bionic][Cosmic] kernel panic in ipmi_ssif at msg_done_handler
    (LP: #1777716)
    - ipmi_ssif: Fix kernel panic at msg_done_handler

  * Update to ocxl driver for 18.04.1 (LP: #1775786)
    - misc: ocxl: use put_device() instead of device_unregister()
    - powerpc: Add TIDR CPU feature for POWER9
    - powerpc: Use TIDR CPU feature to control TIDR allocation
    - powerpc: use task_pid_nr() for TID allocation
    - ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action
    - ocxl: Expose the thread_id needed for wait on POWER9
    - ocxl: Add an IOCTL so userspace knows what OCXL features are available
    - ocxl: Document new OCXL IOCTLs
    - ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait()

  * Critical upstream bugfix missing in Ubuntu 18.04 - frequent Xorg crash after
    suspend (LP: #1776887)
    - ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL

  * Hard LOCKUP observed on stressing Ubuntu 18 04 (LP: #1777194)
    - powerpc: use NMI IPI for smp_send_stop
    - powerpc: Fix smp_send_stop NMI IPI handling

  * IPL: ppc64_cpu --frequency hang with INFO: rcu_sched detected stalls on
    CPUs/tasks on w34 and wsbmc016 with 920.1714.20170330n (LP: #1773964)
    - rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops

  * [Regression] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383:
    comm stress-ng: bg 4705: bad block bitmap checksum (LP: #1781709)
    - SAUCE: Revert "UBUNTU: SAUCE: ext4: fix ext4_validate_inode_bitmap: comm
      stress-ng: Corrupt inode bitmap"
    - SAUCE: ext4: check for allocation block validity with block group locked

linux (4.15.0-28.30) bionic; urgency=medium

  * linux: 4.15.0-28.30 -proposed tracker (LP: #1781433)

  * Cannot set MTU higher than 1500 in Xen instance (LP: #1781413)
    - xen-netfront: Fix mismatched rtnl_unlock
    - xen-netfront: Update features after registering netdev

linux (4.15.0-27.29) bionic; urgency=medium

  * linux: 4.15.0-27.29 -proposed tracker (LP: #1781062)

  * [Regression] EXT4-fs error (device sda1): ext4_validate_inode_bitmap:99:
    comm stress-ng: Corrupt inode bitmap (LP: #1780137)
    - SAUCE: ext4: fix ext4_validate_inode_bitmap: comm stress-ng: Corrupt inode
      bitmap

linux (4.15.0-26.28) bionic; urgency=medium

  * linux: 4.15.0-26.28 -proposed tracker (LP: #1780112)

  * failure to boot with linux-image-4.15.0-24-generic (LP: #1779827) // Cloud-
    init causes potentially huge boot delays with 4.15 kernels (LP: #1780062)
    - random: Make getrandom() ready earlier

linux (4.15.0-25.27) bionic; urgency=medium

  * linux: 4.15.0-25.27 -proposed tracker (LP: #1779354)

  * hisi_sas_v3_hw: internal task abort: timeout and not done. (LP: #1777736)
    - scsi: hisi_sas: Update a couple of register settings for v3 hw

  * hisi_sas: Add missing PHY spinlock init (LP: #1777734)
    - scsi: hisi_sas: Add missing PHY spinlock init

  * hisi_sas: improve read performance by pre-allocating slot DMA buffers
    (LP: #1777727)
    - scsi: hisi_sas: use dma_zalloc_cohe...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Cosmic):
status: Triaged → Fix Released