lvscan often times out during boot

Bug #833891 reported by Serge Hallyn on 2011-08-25
This bug affects 4 people
Affects                Importance  Assigned to
udev (Ubuntu)          High        Unassigned
udev (Ubuntu Oneiric)  High        Unassigned

Bug Description

I'm not 100% certain that mountall is the problem, but it seems a good place to start.

I have some LVM partitions on my laptop, and some filesystems on there are mounted through fstab:

/dev/schroots/schroots /srv/devel ext3 defaults,relatime 0 0
/dev/schroots/isos /iso ext4 defaults,relatime 0 0

Often (1/3 of the time?) boot hangs, and after a minute or so I get a message saying something about watershed and lvscan timing out. Then it proceeds to mount everything readonly.

Other times boot proceeds with no problem.

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: mountall 2.31
ProcVersionSignature: Ubuntu 3.0.0-9.12-generic 3.0.3
Uname: Linux 3.0.0-9-generic x86_64
NonfreeKernelModules: nvidia
Architecture: amd64
Date: Thu Aug 25 10:56:52 2011
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Alpha amd64 (20110627.1)
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: mountall
UpgradeStatus: No upgrade log present (probably fresh install)

Serge Hallyn (serge-hallyn) wrote :

I assume this is due to

/lib/udev/rules.d/85-lvm2.rules
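
For reference, that rules file is essentially a one-liner that runs the vgscan/vgchange pair under watershed. A rough reconstruction, based on the watershed command line visible in the ps listing later in this report (the exact match keys are an assumption):

```
# /lib/udev/rules.d/85-lvm2.rules (approximate reconstruction)
SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="lvm*|LVM*", \
    RUN+="watershed sh -c '/sbin/lvm vgscan; /sbin/lvm vgchange -a y'"
```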

affects: mountall (Ubuntu) → udev (Ubuntu)
Changed in udev (Ubuntu):
importance: Undecided → High
Kees Cook (kees) wrote :

Can you include a full dump of /proc/partitions, /proc/mdstat, "vgs" output, and "lvs" output?

(and FWIW, I have a full LVM-on-md setup and have never encountered this.)

Serge Hallyn (serge-hallyn) wrote :

/proc/partitions:

major minor #blocks name

   8 0 312571224 sda
   8 1 29295616 sda1
   8 2 29296640 sda2
   8 3 29296640 sda3
   8 4 1 sda4
   8 5 78124032 sda5
   8 6 7811072 sda6
   8 7 138742784 sda7
 253 0 20971520 dm-0
 253 1 10485760 dm-1

/proc/mdstat:

Personalities :
unused devices: <none>

vgs output:
  VG #PV #LV #SN Attr VSize VFree
  schroots 1 2 0 wz--n- 132.31g 102.31g

lvs output:
  LV VG Attr LSize Origin Snap% Move Log Copy% Convert
  isos schroots -wi-ao 10.00g
  schroots schroots -wi-ao 20.00g

fstab:
# /etc/fstab: static file system information.
#
# Use 'blkid -o value -s UUID' to print the universally unique identifier
# for a device; this may be used with UUID= as a more robust way to name
# devices that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc nodev,noexec,nosuid 0 0
# / was on /dev/sda1 during installation
UUID=02d11fd6-e779-42ff-8779-f45177bb5fd2 / xfs defaults 0 1
# /home was on /dev/sda5 during installation
UUID=26218024-89b0-4c9a-844e-68eb3b3dd31e /home xfs defaults 0 2
# swap was on /dev/sda6 during installation
UUID=785ee8dc-a987-46df-92e6-32c685324d72 none swap sw 0 0
tmpfs /var/lib/schroot/union/overlay/ tmpfs defaults 0 0
/dev/schroots/schroots /srv/devel ext3 defaults,relatime 0 0
/dev/schroots/isos /iso ext4 defaults,relatime 0 0

Boris Devouge (bdevouge) wrote :

So the same deadlock issue was hit again:

root 1378 1 0 1031 560 0 17:28 ? 00:00:00 /lib/udev/watershed sh -c /sbin/lvm vgscan; /sbin/lvm vgchange -a y
root 1379 1378 0 1055 588 0 17:28 ? 00:00:00 sh -c /sbin/lvm vgscan; /sbin/lvm vgchange -a y
root 1381 1379 0 8719 14080 0 17:28 ? 00:00:00 /sbin/lvm vgchange -a y
root 1387 1 0 1031 324 0 17:28 ? 00:00:00 /lib/udev/watershed sh -c /sbin/lvm vgscan; /sbin/lvm vgchange -a y

There are clearly two incarnations of the udev rule being triggered.

The vgchange command took the LVM lock and is waiting for a udev event:
(gdb) bt
#0 0x00007fd37e8750f7 in semop () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fd37ed3863c in dm_udev_wait () from /lib/libdevmapper.so.1.02.1
#2 0x0000000000476d8b in ?? ()
#3 0x0000000000477a9e in dev_manager_activate ()
#4 0x000000000042f15c in ?? ()
#5 0x000000000043069b in ?? ()
#6 0x00000000004311ee in lv_activate_with_filter ()
#7 0x000000000047c5c1 in ?? ()
#8 0x000000000044e8a8 in ?? ()
#9 0x000000000044f4a6 in lock_vol ()
#10 0x0000000000428103 in ?? ()
#11 0x0000000000424036 in ?? ()
#12 0x0000000000424ddd in process_each_vg ()
#13 0x000000000041acd9 in lvm_run_command ()
#14 0x000000000041b4d4 in lvm2_main ()
#15 0x00007fd37e7abeff in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x000000000040f1a9 in _start ()

Apparently the udev event does not arrive, because the second rule instance also triggered vgscan/vgchange, and those cannot proceed because of the flock() taken by the first vgchange.
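
The sequence can be sketched as a toy two-thread model (a heavily simplified assumption: the real actors are two udev-triggered vgchange processes, LVM's flock() on the VG metadata, and the libdevmapper udev cookie semaphore that dm_udev_wait() sleeps on):

```python
import threading
import time

vg_flock = threading.Lock()      # stands in for LVM's flock() on the VG
udev_cookie = threading.Event()  # stands in for the dm_udev_wait() semaphore

def first_vgchange():
    with vg_flock:                     # instance 1 takes the LVM lock...
        udev_cookie.wait(timeout=1.0)  # ...then sleeps in dm_udev_wait()

def second_vgchange():
    # Instance 2 would eventually deliver the cookie, but it first
    # blocks on the same lock that instance 1 is holding.
    if vg_flock.acquire(timeout=0.5):
        udev_cookie.set()
        vg_flock.release()

t1 = threading.Thread(target=first_vgchange)
t2 = threading.Thread(target=second_vgchange)
t1.start()
time.sleep(0.1)
t2.start()
t1.join()
t2.join()
print("cookie delivered" if udev_cookie.is_set() else
      "deadlock: cookie never delivered")
```

Instance 1 holds the lock while waiting for a completion that only instance 2 can deliver, and instance 2 cannot proceed until the lock is released; neither makes progress until a timeout fires.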

Any views? What version of liblvm do we plan to ship in 11.10 Oneiric? (2.02.66 seems dated)

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in udev (Ubuntu):
status: New → Confirmed
Serge Hallyn (serge-hallyn) wrote :

I would not be surprised if this actually was a side effect of bug 818177.

Serge Hallyn (serge-hallyn) wrote :

I took a look at the newest LVM package. It has a udev rules file, 11-dm-lvm.rules, which looks like it should suffice to create the symlinks for valid LVs. Does vgchange -ay do more than that? Here is the file:

# Copyright (C) 2009 Red Hat, Inc. All rights reserved.
#
# This file is part of LVM2.

# Udev rules for LVM.
#
# These rules create symlinks for LVM logical volumes in
# /dev/VG directory (VG is an actual VG name). Some udev
# environment variables are set (they can be used in later
# rules as well):
# DM_LV_NAME - logical volume name
# DM_VG_NAME - volume group name
# DM_LV_LAYER - logical volume layer (blank if not set)

# "add" event is processed on coldplug only!
ACTION!="add|change", GOTO="lvm_end"
ENV{DM_UDEV_RULES_VSN}!="?*", GOTO="lvm_end"
ENV{DM_UUID}!="LVM-?*", GOTO="lvm_end"

# Use DM name and split it up into its VG/LV/layer constituents.
IMPORT{program}="$env{DM_SBIN_PATH}/dmsetup splitname --nameprefixes --noheadings --rows $env{DM_NAME}"

ENV{DM_UDEV_DISABLE_SUBSYSTEM_RULES_FLAG}=="1", GOTO="lvm_end"

# Do not create symlinks for inappropriate subdevices.
ENV{DM_LV_NAME}=="pvmove?*|?*_vorigin", GOTO="lvm_disable"
ENV{DM_LV_LAYER}=="?*", GOTO="lvm_disable"

# Create symlinks for top-level devices only.
ENV{DM_VG_NAME}=="?*", ENV{DM_LV_NAME}=="?*", SYMLINK+="$env{DM_VG_NAME}/$env{DM_LV_NAME}", GOTO="lvm_end"

LABEL="lvm_disable"
ENV{DM_UDEV_DISABLE_DISK_RULES_FLAG}="1"
ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}="1"
OPTIONS:="nowatch"

LABEL="lvm_end"

Serge Hallyn (serge-hallyn) wrote :

For the record, that file alone doesn't seem to work (and the rest conflict with files from md).

Alex Lyakas (alyakas) wrote :

Hi Serge,
the main thing that vgchange -ay does is activating the LVs within VGs.

After uninstalling the ubuntu LVM package, and installing an official one
(without ubuntu's udev rule), we had to add a call to lvm_lv_activate() for
each LV. So when an existing VG comes up, it is required to go over its LVs
and activate them. But in our case, we don't have LVM on boot drives; and I
am not sure how to trigger LVM LV activation in this case.

Activating the LVs (that ubuntu's rule does) is perhaps not a bad thing. But
the problem is that this is done within the context of udev event
processing, which may cause the udev event processing to get stuck.

BTW, if what I am saying doesn't make sense, that's fine. I am just
reporting the issues that we have experienced.

Thanks,
  Alex.

On Fri, Sep 23, 2011 at 6:00 PM, Serge Hallyn <email address hidden> wrote:

> I took a look at the newest LVM package. It has a udev rules file,
> 11-dm-lvm.rules, which looks like it should do fine to create the
> symlinks for valid lvs. Does vgchange -ay do more than that? Here is
> the file:
>
> [quoted 11-dm-lvm.rules listing, identical to the one above, elided]

Serge Hallyn (serge-hallyn) wrote :

Quoting Alex Lyakas (<email address hidden>):

Thanks for the information, Alex.

> Activating the LVs (that ubuntu's rule does) is perhaps not a bad thing. But
> the problem is that this is done within the context of udev event
> processing, which may cause the udev event processing to get stuck.

So I wonder if it would suffice to fire off the vgchange call asynchronously,
so that vgchange can then spin on the lock while the udev rule proceeds to
exit. Then vgchange can proceed when udev drops the lv lock.

Does that make any sense? It might mess up the use of watershed though.

Kees Cook (kees) wrote :

On Fri, Sep 23, 2011 at 11:23:33PM -0000, Serge Hallyn wrote:
> Does that make any sense? It might mess up the use of watershed
> though.

watershed is there to keep multiple invocations of effectively the same
event in check. For example, each time a pv appears, we attempt to activate
all VGs. But if all the drives suddenly appear at once, we only need
to attempt the activation once. (See the extensive comments at the start of
watershed.c in the watershed package.)

It might be interesting to see if races are solved (or introduced) by
having watershed itself offer a "daemonize" option to put itself in the
background and immediately return to the caller (udev). This may solve it.
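
A "daemonize" option along those lines might look like this minimal sketch (assumptions: the real watershed is a C program and would keep its serialization logic; this shows only the fork-and-return shape, with a short sleep standing in for the vgscan/vgchange work, and run_in_background is a hypothetical name):

```python
import os
import time

def run_in_background(work):
    """Fork; the parent returns to its caller (udev) immediately."""
    pid = os.fork()
    if pid > 0:
        # Parent: return right away. The child's exit status is
        # deliberately neither collected nor propagated.
        return 0
    # Child: detach into its own session, then do the real work.
    os.setsid()
    try:
        work()
    finally:
        os._exit(0)

rc = run_in_background(lambda: time.sleep(0.05))  # placeholder workload
print("returned to caller immediately, rc =", rc)
```

Since udev ignores the exit code of RUN programs (as noted later in the thread), losing the child's status this way should be mostly harmless.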

On the other hand, perhaps the solution is to fix up the udev rule to only
apply to PVs, and nothing else, as it certainly wasn't designed to run when
creating an LV.

Serge Hallyn (serge-hallyn) wrote :

Quoting Kees Cook (<email address hidden>):
> On Fri, Sep 23, 2011 at 11:23:33PM -0000, Serge Hallyn wrote:
> > Does that make any sense? It might mess up the use of watershed
> > though.
>
> watershed is there to keep multiple invocations of effectively the same
> event in check. For example, each time a pv appears, we attempt to activate
> all VGs. But if all the drives suddenly appear at once, we only need
> to attempt the activation once. (See the extensive comments at the start of
> watershed.c in the watershed package.)
>
> It might be interesting to see if races are solved (or introduced) by
> having watershed itself offer a "daemonize" option to put itself in the
> background and immediately return to the caller (udev). This may solve it.

I'll try that out second, but

> On the other hand, perhaps the solution is to fix up the udev rule to only
> apply to PVs, and nothing else, as it certainly wasn't designed to run when
> creating an LV.

Ah, thanks, I didn't realize that. I can try that out.

thanks,
-serge

Serge Hallyn (serge-hallyn) wrote :

Quoting Kees Cook (<email address hidden>):
> It might be interesting to see if races are solved (or introduced) by
> having watershed itself offer a "daemonize" option to put itself in the
> background and immediately return to the caller (udev). This may solve it.

Kees, do you think it will introduce problems if the vgchange returns
an error but, now that watershed daemonizes and does not propagate
vgchange's exit status, the udev rule doesn't see it?

It seems to me there's no graceful recovery from that anyway so it
shouldn't really matter, but clearly I could be missing something.

thanks,
-serge

Kees Cook (kees) wrote :

udev doesn't do anything with the RUN exit code, so I suspect that'll be fine. Note that the daemonizing still needs to
be written. Did detecting PVs not work?

Serge Hallyn (serge-hallyn) wrote :

Near as I can tell from my udev logs, the lvm2 rule should in fact only be firing on PVs. With an LV, the ID_FS_TYPE=ext4, not LVM*.

I just tried a trivial update to watershed to allow it to daemonize (after doing the fork), and tweaking the udev rule to call watershed with -D. But my test VM still hung after 4 reboots. So that seems insufficient.

Although it occurs to me I've also had non-lvm-related (but udev-caused) hangs pretty frequently on these VMs, I'll have to check a bit more to make sure I didn't just now experience one of those.

Serge Hallyn (serge-hallyn) wrote :

So far the following two debdiffs together (for lvm2 and watershed) are working on my laptop. It's possible I just haven't run into the race, but it had been happening pretty frequently before. (If it is decided that this works and is acceptable, then before committing, the code would need to be better documented and an LP: reference added to the changelog, of course.)

Serge Hallyn (serge-hallyn) wrote :

The attachment "lvm2.debdiff" of this bug report has been identified as being a patch in the form of a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-sponsors please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Serge Hallyn (serge-hallyn) wrote :

So far with the new packages I've not gotten any more hangs. Seems to be working for me.

Kees, could you test whether your system boots ok with these?

Kees Cook (kees) wrote :

I won't be able to test until the weekend, unfortunately. But then, yes. :)

Serge Hallyn (serge-hallyn) wrote :

Still no recurrence of this bug with the debdiffs applied.

Could someone please have a look at what I have written in Comment 7 of Bug #802626 and check if it might apply to this bug?

Serge Hallyn (serge-hallyn) wrote :

@Kees,

any luck?

@Eduard,

what you describe isn't impossible, but also seems to be explained by comment #5 in this bug. Moreover, I've not seen this bug since using the debdiff attached to this bug. Can you try this debdiff and say whether that solves your hang?

So far, I'm a bit hesitant to test the patch. I assume the patch will work, because the "mapper-task-completion-event" will just be processed by the second udevd. There would only be a small time window (~1 ms) during exit of the first udevd where the completion event might be lost, causing the scan to hang. And due to the daemonizing, this would probably go unnoticed for the whole system uptime.

Actually, your pointer to bug 818177 is relevant. In comment 19, Steve proposes a solution for the problem I described.

Serge Hallyn (serge-hallyn) wrote :

@Eduard (& @slangasek),

hm, when I looked before I thought it was continuing to process already-queued messages from the kernel, but looking again, holy cow - it continues to *add* events to the queue from the kernel netlink socket, but then doesn't process them!

Steve Langasek (vorlon) on 2011-10-06
Changed in udev (Ubuntu Oneiric):
milestone: none → oneiric-updates
Steve Langasek (vorlon) wrote :

I don't see that daemonizing watershed gets to the root of the issue. If Eduard's analysis on bug #802626 is correct, backgrounding the child process doesn't change the fundamental race, which is that udev in the initramfs might die while vgscan is still holding a lock and waiting for the follow-up event.

I'm going to mark bug #802626 as the master bug for this issue.
