Spawning multiple instances can cause race conditions with nbd

Bug #1207422 reported by Stanislaw Pitucha
This bug affects 3 people
Affects                    Status        Importance  Assigned to         Milestone
OpenStack Compute (nova)   Fix Released  Medium      Stanislaw Pitucha
Grizzly                    Fix Released  Medium      Vish Ishaya

Bug Description

Spawning a number of instances with a single request (or with several requests close enough together) can result in the same nbd device being chosen for more than one instance. See the log of a collision below: the Python threads switch after the device is selected and qemu-nbd is spawned, but before qemu-nbd gets a chance to connect and create a /sys/block/nbd*/pid file.

The solution is most likely to put a lock around the _inner_get_dev() part of NbdMount.
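
A minimal, simplified sketch of the failure mode and the proposed lock (not nova's actual nbd.py; the helper names, the module-level threading.Lock, and the direct subprocess call are assumptions made for illustration):

    import os
    import subprocess
    import threading

    # Hypothetical module-level lock; names here are made up for the sketch.
    _NBD_LOCK = threading.Lock()

    def _find_unused_nbd():
        # A device looks free while /sys/block/nbdX/pid is absent;
        # qemu-nbd only creates that file once it has connected.
        for dev in sorted(os.listdir('/sys/block')):
            if dev.startswith('nbd') and not os.path.exists(
                    '/sys/block/%s/pid' % dev):
                return '/dev/%s' % dev
        return None

    def inner_get_dev(image_path):
        # Without the lock, a thread switch between picking the device and
        # qemu-nbd creating the pid file lets a second spawn pick the same
        # /dev/nbdX. Holding the lock across both steps closes that window.
        # (Real code would also wait for the pid file to appear before
        # treating the attach as complete.)
        with _NBD_LOCK:
            device = _find_unused_nbd()
            if device is None:
                return None
            subprocess.check_call(['qemu-nbd', '-c', device, image_path])
            return device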

2013-08-01 12:18:02.893 23130 DEBUG nova.virt.disk.mount.nbd [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Get nbd device /dev/nbd14 for /var/lib/nova/instances/0fde8031-0152-49ca-bd59-fd7a9353afc0/disk _inner_get_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/nbd.py:87
2013-08-01 12:18:02.893 23130 DEBUG nova.openstack.common.processutils [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf qemu-nbd -c /dev/nbd14 /var/lib/nova/instances/0fde8031-0152-49ca-bd59-fd7a9353afc0/disk execute /usr/lib/python2.7/dist-packages/nova/openstack/common/processutils.py:142
2013-08-01 12:18:02.924 23130 DEBUG nova.virt.disk.mount.nbd [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Get nbd device /dev/nbd14 for /var/lib/nova/instances/10474a10-4bfa-41f2-8bd6-9add1e6aff1f/disk _inner_get_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/nbd.py:87
2013-08-01 12:18:02.924 23130 DEBUG nova.openstack.common.processutils [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf qemu-nbd -c /dev/nbd14 /var/lib/nova/instances/10474a10-4bfa-41f2-8bd6-9add1e6aff1f/disk execute /usr/lib/python2.7/dist-packages/nova/openstack/common/processutils.py:142
2013-08-01 12:18:02.968 23130 DEBUG nova.virt.disk.mount.api [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Map dev /dev/nbd14 map_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/api.py:135
2013-08-01 12:18:02.968 23130 DEBUG nova.virt.disk.mount.api [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Mount /dev/nbd14p1 on /tmp/openstack-vfs-localfsnW0rfa mnt_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/api.py:188
2013-08-01 12:18:02.968 23130 DEBUG nova.openstack.common.processutils [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf mount /dev/nbd14p1 /tmp/openstack-vfs-localfsnW0rfa execute /usr/lib/python2.7/dist-packages/nova/openstack/common/processutils.py:142
2013-08-01 12:18:03.000 23130 DEBUG nova.virt.disk.mount.api [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Map dev /dev/nbd14 map_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/api.py:135
2013-08-01 12:18:03.000 23130 DEBUG nova.openstack.common.processutils [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf kpartx -a /dev/nbd14 execute /usr/lib/python2.7/dist-packages/nova/openstack/common/processutils.py:142
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf mount /dev/nbd14p1 /tmp/openstack-vfs-localfsnW0rfa
Stderr: 'mount: special device /dev/nbd14p1 does not exist\n' mnt_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/api.py:193
2013-08-01 12:18:03.044 23130 DEBUG nova.virt.disk.mount.api [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Unmap dev /dev/nbd14 unmap_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/api.py:179
2013-08-01 12:18:03.045 23130 DEBUG nova.virt.disk.mount.nbd [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Release nbd device /dev/nbd14 unget_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/nbd.py:126
2013-08-01 12:18:03.045 23130 DEBUG nova.openstack.common.processutils [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf qemu-nbd -d /dev/nbd14 execute /usr/lib/python2.7/dist-packages/nova/openstack/common/processutils.py:142
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf mount /dev/nbd14p1 /tmp/openstack-vfs-localfsnW0rfa
Stderr: 'mount: special device /dev/nbd14p1 does not exist\n') setup /usr/lib/python2.7/dist-packages/nova/virt/disk/vfs/localfs.py:81
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf mount /dev/nbd14p1 /tmp/openstack-vfs-localfsnW0rfa
Stderr: 'mount: special device /dev/nbd14p1 does not exist\n')
2013-08-01 12:18:04.117 23130 DEBUG nova.virt.disk.mount.nbd [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Release nbd device /dev/nbd14 unget_dev /usr/lib/python2.7/dist-packages/nova/virt/disk/mount/nbd.py:126
2013-08-01 12:18:04.118 23130 DEBUG nova.openstack.common.processutils [req-4ed4037b-a707-4b25-9037-5cf3443f8267 10363648477724 10909817428811] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf qemu-nbd -d /dev/nbd14 execute /usr/lib/python2.7/dist-packages/nova/openstack/common/processutils.py:142

Changed in nova:
assignee: nobody → Stanislaw Pitucha (stanislaw-pitucha)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/39927

Changed in nova:
status: New → In Progress
Yaguang Tang (heut2008)
tags: added: grizzly-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/39927
Committed: http://github.com/openstack/nova/commit/e4ed9e3726b87e4f37138867f752b0588f3626a0
Submitter: Jenkins
Branch: master

commit e4ed9e3726b87e4f37138867f752b0588f3626a0
Author: Stanislaw Pitucha <email address hidden>
Date: Fri Aug 2 13:25:59 2013 +0000

    Make nbd reservation thread-safe

    Avoid the situation where two local threads choose the same nbd number
    for injecting files into the instance. If this happened quickly enough
    nova was left with one qemu-nbd hanging and blocking the device forever.

    Fixes bug 1207422

    Change-Id: I18864b19ebd30669534e45ca7a50e12f61207302
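
(Illustrative only: the same serialization is often expressed in nova-style code as a named-lock decorator. The sketch below uses oslo.concurrency's lockutils as a stand-in for the openstack.common lockutils module nova carried in-tree at the time; the lock name 'nbd-allocation' is hypothetical and the merged change is not reproduced here.)

    from oslo_concurrency import lockutils

    class NbdMount(object):
        @lockutils.synchronized('nbd-allocation')
        def _inner_get_dev(self):
            # Device selection and the qemu-nbd -c call run under the lock,
            # so concurrent spawns cannot grab the same /dev/nbdX before
            # /sys/block/nbdX/pid exists.
            ...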

Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
importance: Undecided → Medium
tags: removed: grizzly-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/grizzly)

Fix proposed to branch: stable/grizzly
Review: https://review.openstack.org/44107

Thierry Carrez (ttx)
Changed in nova:
milestone: none → havana-3
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/grizzly)

Reviewed: https://review.openstack.org/44107
Committed: http://github.com/openstack/nova/commit/bf71ab454d93759b9c3968a21c6661263e8a723b
Submitter: Jenkins
Branch: stable/grizzly

commit bf71ab454d93759b9c3968a21c6661263e8a723b
Author: Stanislaw Pitucha <email address hidden>
Date: Fri Aug 2 13:25:59 2013 +0000

    Make nbd reservation thread-safe

    Avoid the situation where two local threads choose the same nbd number
    for injecting files into the instance. If this happened quickly enough
    nova was left with one qemu-nbd hanging and blocking the device forever.

    Fixes bug 1207422

    Change-Id: I18864b19ebd30669534e45ca7a50e12f61207302
    (cherry picked from commit e4ed9e3726b87e4f37138867f752b0588f3626a0)

Thierry Carrez (ttx)
Changed in nova:
milestone: havana-3 → 2013.2
Sean Dague (sdague)
no longer affects: nova/folsom