Hotplugging a SATA disk into a SAS controller may cause crash

Bug #1768948 reported by dann frazier on 2018-05-03
This bug affects 1 person
Affects         Importance  Assigned to
linux (Ubuntu)  Undecided   Unassigned
Bionic          Undecided   Unassigned

Bug Description

[Impact]
Hotplugging a SATA disk into a SAS controller may trigger a NULL pointer dereference, leading to a crash:

[ 2366.923208] Unable to handle kernel NULL pointer dereference at virtual address 000007b8
...
[ 2368.766334] Call trace:
[ 2368.781712] [<ffffffc00065c3b0>] sas_find_dev_by_rphy+0x48/0x118
[ 2368.800394] [<ffffffc00065c4a8>] sas_target_alloc+0x28/0x98
[ 2368.817975] [<ffffffc00063e920>] scsi_alloc_target+0x248/0x308
[ 2368.835570] [<ffffffc000640080>] __scsi_add_device+0xb8/0x160
[ 2368.853034] [<ffffffc0006e52d8>] ata_scsi_scan_host+0x190/0x230
[ 2368.871614] [<ffffffc0006e54b0>] ata_scsi_hotplug+0xc8/0xe8
[ 2368.889152] [<ffffffc0000da75c>] process_one_work+0x164/0x438
[ 2368.908003] [<ffffffc0000dab74>] worker_thread+0x144/0x4b0
[ 2368.924613] [<ffffffc0000e0ffc>] kthread+0xfc/0x110

[Test Case]
Unplug a SATA disk from a SAS controller and insert a new SATA disk in its place.

[Fix]
The ATA_PFLAG_SCSI_HOTPLUG flag is what causes libsas to attempt to handle hot add/remove. However, for ATA devices on a SAS controller, this should be handled by libata. The solution is to not set this flag for ATA devices on a SAS controller.
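
For readers who want the shape of the check, here is a minimal userspace sketch of the decision implied by the patch title "ata: do not schedule hot plug if it is a sas host". It is not the kernel code. ATA_PFLAG_SCSI_HOTPLUG is the flag named above; ATA_FLAG_SAS_HOST is assumed to be the libata flag that marks a port on a SAS host; the bit values, struct port_model, should_schedule_ata_hotplug() and check_hotplug_policy() are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real ones are defined by libata. */
#define ATA_PFLAG_SCSI_HOTPLUG (1u << 0) /* a SCSI-level hotplug rescan was requested */
#define ATA_FLAG_SAS_HOST      (1u << 1) /* assumption: port sits behind a SAS host */

/* Hypothetical stand-in for struct ata_port, reduced to the two fields used here. */
struct port_model {
    unsigned int flags;  /* host-level flags */
    unsigned int pflags; /* port-level flags */
};

/* Return true only when the ATA hotplug work should be queued for this port. */
bool should_schedule_ata_hotplug(const struct port_model *ap)
{
    if (!(ap->pflags & ATA_PFLAG_SCSI_HOTPLUG))
        return false;                        /* nothing was requested */
    return !(ap->flags & ATA_FLAG_SAS_HOST); /* skip ports on a SAS host */
}

/* Small self-check covering the three interesting combinations. */
void check_hotplug_policy(void)
{
    struct port_model ahci = { .flags = 0, .pflags = ATA_PFLAG_SCSI_HOTPLUG };
    struct port_model sas  = { .flags = ATA_FLAG_SAS_HOST,
                               .pflags = ATA_PFLAG_SCSI_HOTPLUG };
    struct port_model idle = { .flags = 0, .pflags = 0 };

    assert(should_schedule_ata_hotplug(&ahci));  /* plain ATA port: schedule */
    assert(!should_schedule_ata_hotplug(&sas));  /* SAS host: do not schedule */
    assert(!should_schedule_ata_hotplug(&idle)); /* no request: do not schedule */
}
```

In this model the ata_scsi_hotplug worker seen in the call trace above would never be queued for a port on a SAS host, so the path into sas_target_alloc would not be taken from that worker.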

[Regression Risk]
The fix is a clean cherry-pick from upstream that is tagged for stable. No subsequent patches in linux-next have a "Fixes:" marker referencing this patch, suggesting no regressions have been found since its introduction.

dann frazier (dannf) on 2018-05-03
Changed in linux (Ubuntu):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
dann frazier (dannf) on 2018-05-08
description: updated
Stefan Bader (smb) on 2018-05-23
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
dann frazier (dannf) wrote :

Smoke tested on HiSilicon D05 & D06 systems (I don't have physical access to do the hot-pull/plug)

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-23.25

---------------
linux (4.15.0-23.25) bionic; urgency=medium

  * linux: 4.15.0-23.25 -proposed tracker (LP: #1772927)

  * arm64 SDEI support needs trampoline code for KPTI (LP: #1768630)
    - arm64: mmu: add the entry trampolines start/end section markers into
      sections.h
    - arm64: sdei: Add trampoline code for remapping the kernel

  * Some PCIe errors not surfaced through rasdaemon (LP: #1769730)
    - ACPI: APEI: handle PCIe AER errors in separate function
    - ACPI: APEI: call into AER handling regardless of severity

  * qla2xxx: Fix page fault at kmem_cache_alloc_node() (LP: #1770003)
    - scsi: qla2xxx: Fix session cleanup for N2N
    - scsi: qla2xxx: Remove unused argument from qlt_schedule_sess_for_deletion()
    - scsi: qla2xxx: Serialize session deletion by using work_lock
    - scsi: qla2xxx: Serialize session free in qlt_free_session_done
    - scsi: qla2xxx: Don't call dma_free_coherent with IRQ disabled.
    - scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout()
    - scsi: qla2xxx: Prevent relogin trigger from sending too many commands
    - scsi: qla2xxx: Fix double free bug after firmware timeout
    - scsi: qla2xxx: Fixup locking for session deletion

  * Several hisi_sas bug fixes (LP: #1768974)
    - scsi: hisi_sas: dt-bindings: add an property of signal attenuation
    - scsi: hisi_sas: support the property of signal attenuation for v2 hw
    - scsi: hisi_sas: fix the issue of link rate inconsistency
    - scsi: hisi_sas: fix the issue of setting linkrate register
    - scsi: hisi_sas: increase timer expire of internal abort task
    - scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req
    - scsi: hisi_sas: fix return value of hisi_sas_task_prep()
    - scsi: hisi_sas: Code cleanup and minor bug fixes

  * [bionic] machine stuck and bonding not working well when nvmet_rdma module
    is loaded (LP: #1764982)
    - nvmet-rdma: Don't flush system_wq by default during remove_one
    - nvme-rdma: Don't flush delete_wq by default during remove_one

  * Warnings/hang during error handling of SATA disks on SAS controller
    (LP: #1768971)
    - scsi: libsas: defer ata device eh commands to libata

  * Hotplugging a SATA disk into a SAS controller may cause crash (LP: #1768948)
    - ata: do not schedule hot plug if it is a sas host

  * ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU
    ATTEMPT TO RE-ENTER FIRMWARE! (LP: #1767927)
    - powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()
    - powerpc/64s: return more carefully from sreset NMI
    - powerpc/64s: sreset panic if there is no debugger or crash dump handlers

  * fsnotify: Fix fsnotify_mark_connector race (LP: #1765564)
    - fsnotify: Fix fsnotify_mark_connector race

  * Hang on network interface removal in Xen virtual machine (LP: #1771620)
    - xen-netfront: Fix hang on device removal

  * HiSilicon HNS NIC names are truncated in /proc/interrupts (LP: #1765977)
    - net: hns: Avoid action name truncation

  * Ubuntu 18.04 kernel crashed while in degraded mode (LP: #1770849)
    - SAUCE: powerpc/perf: Fix memory allocation for...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: In Progress → Fix Released