qemu-nbd kthread becomes defunct on disconnect

Bug #700165 reported by Serge Hallyn on 2011-01-07
This bug affects 7 people
Affects          Status  Importance  Assigned to     Milestone
linux (Ubuntu)           High        Andy Whitcroft
  Natty                  High        Andy Whitcroft

Bug Description

When I do
   qemu-nbd -n --connect /dev/nbd0 tmpXYZ.qcow2
I'm able to subsequently mount /dev/nbd0p1 as I expect, but the process listing then includes:

root 2753 1 0 17:47 ? 00:00:00 qemu-nbd -n --connect /dev/nbd0 tmpEXX8vS.qcow2
root 2754 2753 0 17:47 ? 00:00:00 [qemu-nbd] <defunct>
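
To spot the leftover zombie without reading the whole listing, the `ps -ef` output can be filtered. This is only a minimal sketch: the helper name `list_defunct_qemu_nbd` is hypothetical, and it assumes the `ps -ef` column layout shown above (PID in the second column, `<defunct>` as the last field).

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` style output on stdin and print the PID
# of every defunct qemu-nbd entry. Matching on fields (rather than on the
# whole line) keeps the awk process itself out of the results.
list_defunct_qemu_nbd() {
    awk '$NF == "<defunct>" && $(NF-1) ~ /qemu-nbd/ { print $2 }'
}

# Typical use on a live system:
#   ps -ef | list_defunct_qemu_nbd
```

Run against the two lines above, this would print only the PID of the `[qemu-nbd] <defunct>` entry.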

When I then try to disconnect with
   qemu-nbd -d /dev/nbd0
that process hangs, and syslog shows it is hanging on a mutex in nbd_ioctl:

Jan 7 14:47:18 localhost kernel: [24459.139094] INFO: task nbd-client:29113 blocked for more than 120 seconds.
Jan 7 14:47:18 localhost kernel: [24459.139101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 7 14:47:18 localhost kernel: [24459.139106] nbd-client D 0000000000000005 0 29113 29112 0x00000000
Jan 7 14:47:18 localhost kernel: [24459.139115] ffff8801d1215d98 0000000000000082 ffff8801d1215fd8 ffff8801d1214000
Jan 7 14:47:18 localhost kernel: [24459.139123] 0000000000013a80 ffff8801b1cd47e0 ffff8801d1215fd8 0000000000013a80
Jan 7 14:47:18 localhost kernel: [24459.139131] ffff88021f5996c0 ffff8801b1cd4440 ffffea00068737f8 ffffffffa001a020
Jan 7 14:47:18 localhost kernel: [24459.139139] Call Trace:
Jan 7 14:47:18 localhost kernel: [24459.139176] [<ffffffff815b96c7>] __mutex_lock_slowpath+0xf7/0x180
Jan 7 14:47:18 localhost kernel: [24459.139189] [<ffffffff81132e63>] ? handle_mm_fault+0x1c3/0x410
Jan 7 14:47:18 localhost kernel: [24459.139198] [<ffffffff815b911b>] mutex_lock+0x2b/0x50
Jan 7 14:47:18 localhost kernel: [24459.139211] [<ffffffffa001821c>] nbd_ioctl+0x6c/0x1c0 [nbd]
Jan 7 14:47:18 localhost kernel: [24459.139219] [<ffffffff812cb870>] blkdev_ioctl+0x230/0x730
Jan 7 14:47:18 localhost kernel: [24459.139227] [<ffffffff811966e1>] block_ioctl+0x41/0x50
Jan 7 14:47:18 localhost kernel: [24459.139234] [<ffffffff81175b43>] do_vfs_ioctl+0x93/0x370
Jan 7 14:47:18 localhost kernel: [24459.139242] [<ffffffff81164ee6>] ? vfs_write+0x126/0x190
Jan 7 14:47:18 localhost kernel: [24459.139248] [<ffffffff81175ea1>] sys_ioctl+0x81/0xa0
Jan 7 14:47:18 localhost kernel: [24459.139255] [<ffffffff8100c0c2>] system_call_fastpath+0x16/0x1b

This does NOT happen with lucid userspace on a lucid kernel (with recent updates, but not completely up to date).

This does happen in plain natty.

It does happen with the latest mainline kernel from the kernel-ppa.

It almost happens in a lucid userspace chroot on a natty kernel. The `qemu-nbd -d` process does not hang in that case, but the defunct qemu-nbd kthread stays around, as does the user qemu-nbd connect process.

Marc Deslauriers (mdeslaur) wrote :

Confirmed. I'm seeing this also on natty.

Changed in linux (Ubuntu):
status: New → Confirmed
Soren Hansen (soren) wrote :

I stumbled on this last night, too.

I've identified the problem and reported it to lkml: https://lkml.org/lkml/2011/1/26/131

Stay tuned :)

tags: added: iso-testing
Kate Stewart (kate.stewart) wrote :

Reflecting the importance and the assignee from Bug #711951.

Changed in linux (Ubuntu Natty):
importance: Undecided → High
assignee: nobody → Andy Whitcroft (apw)
Andy Whitcroft (apw) wrote :

This bug was fixed in the package linux - 2.6.38-2.29

---------------
linux (2.6.38-2.29) natty; urgency=low

  [ Andy Whitcroft ]

  * rebase to 1f0324caefd39985e9fe052fac97da31694db31e
  * [Config] updateconfigs following rebase to
    1f0324caefd39985e9fe052fac97da31694db31e
  * rebase to 70d1f365568e0cdbc9f4ab92428e1830fdb09ab0
  * [Config] reenable HIBERNATE
    - LP: #710877
  * rebase to v2.6.38-rc3
  * [Config] reenable CONFIG_CRASH_DUMP

  [ Kamal Mostafa ]

  * SAUCE: rtl8192se: fix source file perms
  * SAUCE: rtl8192se: fix source file newline
  * SAUCE: omnibook: fix source file newline

  [ Kees Cook ]

  * [Config] packaging: really make System.map mode 0600

  [ Ricardo Salveti de Araujo ]

  * SAUCE: OMAP3630: PM: don't warn the user with a trace in case of
    PM34XX_ERRATUM

  [ Soren Hansen ]

  * SAUCE: nbd: Remove module-level ioctl mutex

  [ Tim Gardner ]

  * SAUCE: Disable building the ACPI debugfs source

  [ Upstream Kernel Changes ]

  * Set physical start and alignment 1M for virtual i386
    - LP: #710754

  [ Upstream Kernel Changes ]

  * rebase from v2.6.38-rc2 + c723fdab8aa728dc2bf0da6a0de8bb9c3f588d84
    to v2.6.38-rc3
 -- Andy Whitcroft <email address hidden> Fri, 28 Jan 2011 16:30:32 +0000

Changed in linux (Ubuntu Natty):
status: Confirmed → Fix Released