[AEP-BUG] Critical: concurrent invocations of ndctl can cause Linux panic

Bug #1834119 reported by quanxian
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
intel            Fix Released   Critical     Unassigned
linux (Ubuntu)   Incomplete     Undecided    Unassigned

Bug Description

Description:

Patch: the fix is in the libnvdimm pending tree.
Proposed fixes: https://lists.01.org/pipermail/linux-nvdimm/2019-June/021847.html
Pushed out to libnvdimm-pending: https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=libnvdimm-pending

Upstream Bug link: https://github.com/pmem/ndctl/issues/96

The problem is fairly easy to reproduce in as little as 10 minutes.
Run the following in parallel, for example in separate terminals.
In terminals #1, #3, #5, type:
    while true; do ndctl create-namespace -m devdax -s 48G; done
In terminals #2, #4, #6, type:
    while true; do ndctl destroy-namespace all -f; done
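
The parallel loops above can also be started from a single shell with background jobs instead of six terminals. The following is only a sketch under stated assumptions: `NDCTL`, `repeat`, and `stress` are hypothetical names introduced here, and the loops are bounded by an iteration count (the loops in the report run forever) so that the script terminates on a machine that survives the run. The ndctl arguments are the ones from the report.

```shell
#!/bin/bash
# Sketch only: NDCTL, repeat, and stress are hypothetical helper names.
# NDCTL can be overridden to point at a different ndctl binary.
NDCTL="${NDCTL:-ndctl}"

# Run a command `count` times. Failures are ignored, because create and
# destroy are expected to race and fail intermittently.
repeat() (
    count=$1; shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" >/dev/null 2>&1 || true
        i=$((i + 1))
    done
)

# Start `pairs` creator loops and `pairs` destroyer loops in the
# background, then wait for all of them to finish.
stress() (
    pairs=$1; count=$2
    p=0
    while [ "$p" -lt "$pairs" ]; do
        repeat "$count" "$NDCTL" create-namespace -m devdax -s 48G &
        repeat "$count" "$NDCTL" destroy-namespace all -f &
        p=$((p + 1))
    done
    wait
)
```

For example, `stress 3 1000` would correspond to the six-terminal setup, with each loop running 1000 iterations. This destroys all namespaces, so it should only be run on a test machine.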

Even a simple invocation will eventually lead to a panic, although it can take hours. For example:
In terminal #1, run this script:
    #!/bin/bash
    while /bin/true
    do
        ndctl destroy-namespace -f all
        date
        for R in $(ndctl list -R | jq -r ".[] | .dev")
        do
            for i in {1..10}
            do
                ndctl create-namespace -r $R -s 8g -m devdax
            done
        done
    done
In terminal #2, type:
    while /bin/true; do ndctl list; done

Running the terminal #1 script in 2 separate terminals at once, so that 2 separate processes destroy and create namespaces concurrently, usually results in a panic within an hour.

Target Kernel: 5.3
Target Release: 19.10

quanxian (quanxian-wang) wrote :

Kernel/User Patches

Dan Williams (6):
      drivers/base: Introduce kill_device()
      libnvdimm/bus: Prevent duplicate device_unregister() calls
      libnvdimm/region: Register badblocks before namespaces
      libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl()
      libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
      driver-core, libnvdimm: Let device subsystems add local lockdep coverage

Andreas Hasenack (ahasenack) wrote :

Is there a user space patch to be applied to the ndctl source here? I didn't see it, but maybe I missed it. If not, then this bug should be against the kernel ("linux" in launchpad), and not ndctl.

quanxian (quanxian-wang) wrote :

All of them are kernel patches. I have changed the bug to the kernel. Thanks for the reminder.

affects: ndctl (Ubuntu) → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1834119

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: eoan
quanxian (quanxian-wang) wrote :

87a30e1f05d7 driver-core, libnvdimm: Let device subsystems add local lockdep coverage
ca6bf264f6d8 libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
b70d31d054ee libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl()
6de5d06e657a libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant
700cd033a82d libnvdimm/region: Register badblocks before namespaces
8aac0e233891 libnvdimm/bus: Prevent duplicate device_unregister() calls

description: updated
Changed in intel:
status: New → Fix Committed
quanxian (quanxian-wang)
Changed in intel:
status: Fix Committed → Fix Released