udevd: timeout: killing 'watershed sh -c '/sbin/lvm vgscan; /sbin/lvm vgchange -a y''

Bug #995645 reported by Brian J. Murrell
This bug affects 16 people
Affects        Status     Importance  Assigned to  Milestone
lvm2 (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Boot takes a long time during which the console reports:

[ 100.224928] device-mapper: table: 252:40: snapshot: Snapshot cow pairing for exception table handover failed
udevd[617]: timeout: killing 'watershed sh -c '/sbin/lvm vgscan; /sbin/lvm vgchange -a y'' [1148]

udevd[617]: 'watershed sh -c '/sbin/lvm vgscan; /sbin/lvm vgchange -a y'' [1148] terminated by signal 9 (Killed)

Then the boot stalls waiting to mount /usr and I have to use M to manually recover.

When I get into the recovery shell, a read from /dev/rootvol_tmp/ubuntu_usr returns 0 bytes and I have to run "vgchange -a y" before I can get a read from that LV to return data. At that point I can drop out of the recovery shell and boot will continue.
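
The manual recovery described above could be scripted roughly as follows. This is only a sketch: `lv_readable` and `recover_lv` are hypothetical helper names, and the only commands taken from this report are the read test on the LV and `vgchange -a y`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if at least one byte can be read from
# the given device (or file).
lv_readable() {
    bytes=$(dd if="$1" bs=512 count=1 2>/dev/null | wc -c | tr -d ' ')
    [ "${bytes:-0}" -gt 0 ]
}

# Hypothetical helper: activate all volume groups only when the LV
# returns no data, mirroring what was done by hand in the recovery shell.
recover_lv() {
    if lv_readable "$1"; then
        echo "$1 is readable"
    else
        echo "$1 returned no data, activating volume groups"
        /sbin/lvm vgchange -a y
    fi
}

# Usage (as root, from the recovery shell):
#   recover_lv /dev/rootvol_tmp/ubuntu_usr
```

Checking the read first keeps the script harmless on a boot where the LV came up normally.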

I have the following LVs on this system:

# lvs
  LV                   VG          Attr   LSize   Origin       Snap% Move Log Copy% Convert
  video                datavol     -wn-ao 698.19g
  video_tmp            datavol     -wi-a- 233.32g
  apt_archives         rootvol_tmp -wi-a-   1.00g
  home                 rootvol_tmp -wi-ao   1.78g
  lucid                rootvol_tmp -wi-a-   3.00g
  lucid_root           rootvol_tmp swi-a-   1.00g ubuntu_root  34.70
  lucid_usr            rootvol_tmp swi-a-   5.00g ubuntu_usr   42.22
  lucid_var            rootvol_tmp swi-a-   4.00g ubuntu_var   39.57
  myth-0.24_root       rootvol_tmp swi-a-   1.00g ubuntu_root  34.66
  myth-0.24_usr        rootvol_tmp swi-a-   5.00g ubuntu_usr   35.68
  natty_root           rootvol_tmp swi-a- 500.00m ubuntu_root  49.99
  natty_usr            rootvol_tmp swi-a-   2.49g ubuntu_usr   58.34
  swap                 rootvol_tmp -wi-ao 512.00m
  ubuntu_root          rootvol_tmp owi-ao 500.00m
  ubuntu_root-20111213 rootvol_tmp swi-a- 500.00m ubuntu_root  70.96
  ubuntu_root-20120209 rootvol_tmp swi-a- 500.00m ubuntu_root  70.95
  ubuntu_root-20120214 rootvol_tmp swi-a- 500.00m ubuntu_root  70.94
  ubuntu_root-20120220 rootvol_tmp swi-a- 500.00m ubuntu_root  70.93
  ubuntu_root-20120313 rootvol_tmp swi-a- 500.00m ubuntu_root  70.93
  ubuntu_root-20120319 rootvol_tmp swi-a- 500.00m ubuntu_root  70.90
  ubuntu_root-20120412 rootvol_tmp swi-a- 500.00m ubuntu_root  50.03
  ubuntu_usr           rootvol_tmp owi-ao   2.49g
  ubuntu_usr-20111213  rootvol_tmp swi-a-   2.49g ubuntu_usr   69.04
  ubuntu_usr-20120209  rootvol_tmp swi-a-   2.49g ubuntu_usr   68.64
  ubuntu_usr-20120214  rootvol_tmp swi-a-   2.49g ubuntu_usr   67.79
  ubuntu_usr-20120220  rootvol_tmp swi-a-   2.49g ubuntu_usr   67.52
  ubuntu_usr-20120313  rootvol_tmp swi-a-   2.49g ubuntu_usr   67.41
  ubuntu_usr-20120319  rootvol_tmp swi-a-   2.49g ubuntu_usr   66.21
  ubuntu_usr-20120412  rootvol_tmp swi-a-   2.49g ubuntu_usr   65.69
  ubuntu_var           rootvol_tmp owi-ao   1.99g

That's probably more than most people have, but certainly not enough to warrant an unbootable system, I should think.
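
To put a number on "more than most people", the snapshots can be counted per origin. The following is a minimal sketch: `count_snapshots` is a hypothetical helper, and it assumes `lvs` is asked for just the `lv_name` and `origin` fields so that only snapshot rows carry a second column.

```shell
#!/bin/sh
# Hypothetical helper: reads "lv_name origin" rows on stdin and prints
# "origin count" for every origin that has at least one snapshot.
# Rows without an origin (plain LVs and the origins themselves) are skipped.
count_snapshots() {
    awk 'NF >= 2 { n[$2]++ } END { for (o in n) print o, n[o] }' | sort
}

# Usage (as root):
#   lvs --noheadings -o lv_name,origin | count_snapshots
```

For the listing above, a hand count gives 10 snapshots each of ubuntu_root and ubuntu_usr and 1 of ubuntu_var.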

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: udev 175-0ubuntu9
ProcVersionSignature: Ubuntu 3.2.0-24.37-generic 3.2.14
Uname: Linux 3.2.0-24-generic i686
ApportVersion: 2.0.1-0ubuntu7
Architecture: i386
CurrentDmesg:
 [ 237.755758] NFSD: starting 90-second grace period
 [ 256.877022] SysRq : Changing Loglevel
 [ 256.880020] Loglevel set to 8
CustomUdevRuleFiles: 85-alsa.rules 85-ifupdown.rules 85-mdadm.rules 41-mythtv-permissions.rules
Date: Sun May 6 17:02:19 2012
MachineType: To Be Filled By O.E.M. To Be Filled By O.E.M.
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_CA.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-24-generic root=/dev/mapper/rootvol_tmp-ubuntu_root ro console=ttyS0,115200 console=tty0 resume=/dev/mapper/rootvol-swap crashkernel=384M-2G:64M,2G-:128M vmalloc=256M
SourcePackage: udev
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 03/26/2004
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 080009
dmi.board.name: P4P800SE
dmi.board.vendor: ASUSTeK Computer Inc.
dmi.board.version: Rev 2.xx
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr080009:bd03/26/2004:svnToBeFilledByO.E.M.:pnToBeFilledByO.E.M.:pvrToBeFilledByO.E.M.:rvnASUSTeKComputerInc.:rnP4P800SE:rvrRev2.xx:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: To Be Filled By O.E.M.
dmi.product.version: To Be Filled By O.E.M.
dmi.sys.vendor: To Be Filled By O.E.M.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in udev (Ubuntu):
status: New → Confirmed
Brian J. Murrell (brian-interlinx) wrote :

Not even a triage on this? This happens on *every* *single* boot, so I have plenty of opportunity to provide additional debugging information if needed.

Brian J. Murrell (brian-interlinx) wrote :

And still, a month and a half later, not even a triage of this bug.

I guess it might be time to move on to a new distro. Ubuntu seem like they have better things to do than fix bugs that seem to continue on from one release to another.

In any case, if I remove all of my LVM snapshots this problem goes away.
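
For anyone wanting to try the same workaround, the sketch below only prints the removal commands instead of running them, so they can be reviewed first. `print_snapshot_removals` is a hypothetical helper, and it assumes snapshots are the rows whose attribute string starts with `s`, as in the `lvs` listing in the bug description.

```shell
#!/bin/sh
# Hypothetical helper: reads "vg_name lv_name lv_attr" rows on stdin and
# prints one lvremove command per snapshot LV. Nothing is executed.
print_snapshot_removals() {
    awk 'NF == 3 && $3 ~ /^s/ { print "lvremove " $1 "/" $2 }'
}

# Usage (as root):
#   lvs --noheadings -o vg_name,lv_name,lv_attr | print_snapshot_removals
```

Removing a snapshot discards it permanently, so the printed list is worth checking against backups before any line is run.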

glance (glance-acc) wrote :

This bug affects me too, after adding some snapshots to my VG of virtual machines.

My root VG doesn't contain any snapshots, just the VG that backs my VMs.

Kannan Manickam (kannan) wrote :

This bug affects me as well. I have this problem a lot when I work with lvm commands. The process can't be killed and it takes up all my CPU.

Bertrand Mathieu (bmat) wrote :

As of lvm2 2.02.98-6ubuntu1 the bug is still here. It may appear only if you have snapshot LVs.

It seems to be caused by this udev rule: /lib/udev/rules.d/85-lvm2.rules

It should be noted that this file is not present in Debian's lvm2 2.02.98-6.

This person says that commenting out this rule removed the problem:
http://www.thomas-krenn.com/en/wiki/HA_Cluster_with_Linux_Containers_based_on_Heartbeat,_Pacemaker,_DRBD_and_LXC#Device_Mapper_error_when_activating_LVM_volume_group_with_snapshots
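
A minimal sketch of that workaround, under stated assumptions: `comment_out_rules` is a hypothetical helper, and the destination path relies on the general udev behaviour that a file in /etc/udev/rules.d overrides a file of the same name in /lib/udev/rules.d. Whether the initramfs copy of the rule also has to be regenerated is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of a udev rules file in which every
# active line is commented out. Blank lines and existing comments are
# copied unchanged.
comment_out_rules() {
    sed 's/^\([[:space:]]*[^#[:space:]]\)/#\1/' "$1" > "$2"
}

# Usage (as root; the destination path is an assumption):
#   comment_out_rules /lib/udev/rules.d/85-lvm2.rules \
#       /etc/udev/rules.d/85-lvm2.rules
```

Writing an override rather than editing the packaged file means a package upgrade does not silently restore the rule.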

And here is an interesting discussion about "Snapshot cow pairing for exception table handover failed" at boot, lvm2, and the Debian udev rules (which have changed since, I guess):
http://www.redhat.com/archives/linux-lvm/2012-February/msg00043.html

Bertrand Mathieu (bmat) wrote :

The guilty udev rule seems to be in the lvm2 package.

affects: udev (Ubuntu) → lvm2 (Ubuntu)
Dimitri John Ledkov (xnox) wrote :

What are the contents of /etc/fstab?

Heiko Helmle (hel) wrote :

I have pretty much the same problem. Once a snapshot is on the volume, the system fails to boot.

The problem is exactly as described here: http://www.redhat.com/archives/linux-lvm/2012-February/msg00028.html and the people on the mailing list suggest broken udev rules.

My fstab is pretty normal:

# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc nodev,noexec,nosuid 0 0
/dev/mapper/helpc--13-system / btrfs defaults,subvol=@,compress,discard,space_cache,inode_cache 0 1
# /boot was on /dev/sdb2 during installation
UUID=084c2f84-902f-4fa9-aadd-e7c4e19cd6bb /boot ext3 defaults 0 2
/dev/mapper/helpc--13-system /home btrfs defaults,subvol=@home,compress,discard,space_cache,inode_cache 0 2
/dev/mapper/helpc--13-swap none swap sw 0 0
/dev/mapper/bigstor-vms /home/hel/vms ext4 defaults 0 2
/dev/mapper/bigstor-archive /home/hel/archive ext4 defaults 0 2

Seth (bugs-sehe) wrote :

@Brian Funny. You think three months to triage is long? Well, I reported this very bug in April 2009... https://bugs.launchpad.net/lvm2/+bug/360237

Your move :)

TheDreamer (thedreamer-lhaven) wrote :

Wow, I don't know which is sadder: the age of this problem, or that it took me this long to finally do a search on it.

Instead I've been trying to allow for reboots after unattended upgrades that require one, but have them wait until snapshots are removed (they get created by my backup system).

When I first started searching, I found a similar-sounding bug from back in 2006... or is this now a duplicate of a newer bug, which at least seems to finally be getting attention?

But at least this answers the question of whether upgrading from 12.04 to 14.04 would fix the problem... though I hear the cookbooks are close to creating production 14.04 instances here...
