LVM hang in volume creation

Bug #1366125 reported by Duncan Thomas
This bug affects 4 people
Affects  Status     Importance  Assigned to  Milestone
Cinder   Won't Fix  High        Unassigned

Bug Description

Sometimes, after running a bunch of cinder create-volumes in parallel (anecdotally this is easier to reproduce on a clear system), we see a bunch of stuck LVM processes, e.g.:

root@overcloud-controllermgmt0-ws55j3shqblj:/var/log/cinder# ps -Ao pid,tt,user,fname,tmout,f,wchan | grep lv
 7601 ? root lvcreate - 4 flock_lock_file_wait
10533 ? root lvcreate - 4 flock_lock_file_wait
13098 ? root lvcreate - 4 flock_lock_file_wait
13505 ? root lvcreate - 4 flock_lock_file_wait
13552 ? root lvcreate - 4 flock_lock_file_wait
16955 ? root lvs - 0 flock_lock_file_wait
17227 ? root lvcreate - 4 flock_lock_file_wait
19629 ? root lvcreate - 4 flock_lock_file_wait
22434 ? root lvs - 0 flock_lock_file_wait
23525 ? root lvcreate - 4 SYSC_semtimedop
23732 ? root lvcreate - 4 flock_lock_file_wait
27043 ? root lvcreate - 4 flock_lock_file_wait
29726 ? root lvcreate - 4 flock_lock_file_wait
29858 ? root lvs - 0 flock_lock_file_wait

One is waiting for the semaphore to be decremented, the others are queued up behind the flock held by that process.
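To make that split easier to see on a busy box, here is a minimal Python sketch that sorts a listing like the one above into the semaphore waiter and the processes queued on the flock. It assumes the exact seven-column layout produced by the ps command shown; the function name classify_lvm_processes is made up for illustration and is not part of Cinder or LVM.

```python
# wchan values as they appear in the ps listing above.
SEM_WCHAN = "SYSC_semtimedop"
FLOCK_WCHAN = "flock_lock_file_wait"


def classify_lvm_processes(ps_output):
    """Split LVM processes into semaphore waiters and flock waiters.

    Expects lines shaped like the output of
    `ps -Ao pid,tt,user,fname,tmout,f,wchan` (hypothetical helper).
    """
    sem_waiters, flock_waiters = [], []
    for line in ps_output.splitlines():
        fields = line.split()
        # Seven columns expected: pid tt user fname tmout f wchan
        if len(fields) != 7 or not fields[0].isdigit():
            continue
        pid, fname, wchan = int(fields[0]), fields[3], fields[6]
        if not fname.startswith("lv"):
            continue
        if wchan == SEM_WCHAN:
            sem_waiters.append((pid, fname))
        elif wchan == FLOCK_WCHAN:
            flock_waiters.append((pid, fname))
    return sem_waiters, flock_waiters


sample = """\
 7601 ? root lvcreate - 4 flock_lock_file_wait
16955 ? root lvs - 0 flock_lock_file_wait
23525 ? root lvcreate - 4 SYSC_semtimedop
"""
sem, queued = classify_lvm_processes(sample)
print(sem)     # [(23525, 'lvcreate')]
print(queued)  # [(7601, 'lvcreate'), (16955, 'lvs')]
```

The PID reported in the first list is the process waiting on the semaphore; everything in the second list is blocked behind the flock.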

Killing the process waiting for the semaphore, or running "dmsetup udevcomplete_all", unwedges the system.

John Griffith (john-griffith) wrote :
Changed in cinder:
status: New → Confirmed
importance: Undecided → High
Duncan Thomas (duncan-thomas) wrote :

As a workaround, set the following in /etc/lvm/lvm.conf:

activation {
    # Set to 0 to disable udev synchronisation (if compiled into the binaries).
    # Processes will not wait for notification from udev.
    # They will continue irrespective of any possible udev processing
    # in the background. You should only use this if udev is not running
    # or has rules that ignore the devices LVM2 creates.
    # The command line argument --noudevsync takes precedence over this setting.
    # If set to 1 when udev is not running, and there are LVM2 processes
    # waiting for udev, run 'dmsetup udevcomplete_all' manually to wake them up.
    udev_sync = 0

    # Set to 0 to disable the udev rules installed by LVM2 (if built with
    # --enable-udev_rules). LVM2 will then manage the /dev nodes and symlinks
    # for active logical volumes directly itself.
    # N.B. Manual intervention may be required if this setting is changed
    # while any logical volumes are active.
    udev_rules = 0
}
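
To check whether a host already has this workaround in place, something like the following Python sketch could be used. It is a deliberately naive reader, not a real lvm.conf parser: it strips '#' comments and looks for plain `key = <integer>` lines without tracking which section they belong to. The helper names read_activation_settings and workaround_applied are assumptions made for this example.

```python
import re


def read_activation_settings(conf_text, keys=("udev_sync", "udev_rules")):
    """Return {key: int} for the given keys found in lvm.conf-style text.

    Hypothetical helper: ignores section nesting and only understands
    simple `key = <integer>` lines.
    """
    settings = {}
    for raw in conf_text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        match = re.fullmatch(r"(\w+)\s*=\s*(\d+)", line)
        if match and match.group(1) in keys:
            settings[match.group(1)] = int(match.group(2))
    return settings


def workaround_applied(conf_text):
    """True when both udev_sync and udev_rules are explicitly set to 0."""
    found = read_activation_settings(conf_text)
    return found.get("udev_sync") == 0 and found.get("udev_rules") == 0


example = """\
activation {
    # Processes will not wait for notification from udev.
    udev_sync = 0
    udev_rules = 0
}
"""
print(workaround_applied(example))  # True
```

In practice the text would come from reading /etc/lvm/lvm.conf; the example string is used here so the sketch runs without touching the system.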

Duncan Thomas (duncan-thomas) wrote :

Actually, this looks like a different hang entirely. The one I saw does not appear in syslog; it's a userspace deadlock.

Ivan Kolodyazhny (e0ne) wrote :

Duncan, what OS do you use?

I've seen something like this on Ubuntu with the lvm2 package v2.02.66 (related Debian issue: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=659762). A custom build of lvm2 v2.02.111 (backported from Ubuntu 15.04) resolved the issue for me.

Szymon Wróblewski (bluex) wrote :

Cannot reproduce on latest DevStack. Tested with Rally - creating 20 volumes with 10 threads.

Changed in cinder:
status: Confirmed → Incomplete
Albert Syriy (asyriy)
Changed in cinder:
assignee: nobody → Albert Syriy (asyriy)
Albert Syriy (asyriy)
Changed in cinder:
assignee: Albert Syriy (asyriy) → nobody
Duncan Thomas (duncan-thomas) wrote :

Confirmed that this is fixed in later versions of Ubuntu. Not sure if the fix was ever backported to 14.04, though. Thanks, Ivan.

Changed in cinder:
status: Incomplete → Won't Fix