lvremove fails

Bug #533493 reported by K Richard Pixley on 2010-03-06
This bug affects 9 people
Affects        Status        Importance  Assigned to  Milestone
lvm2 (Debian)  Fix Released  Unknown
lvm2 (Fedora)  Fix Released  High
lvm2 (Ubuntu)                Undecided   Unassigned

Bug Description

Binary package hint: lvm2

Create 2 logical volumes on the same volume group.

On each of these logical volumes, create one snapshot each.

Now attempt to remove them all using something like "lvremove -f vg".

I expect them all to be removed. What I see instead is:

(v4c-dev)rich@adriatic> sudo lvremove -f v4c
  Logical volume "ss-test.0" successfully removed
  Logical volume "lv-test.0" successfully removed
  Logical volume "ss-test.1" successfully removed
  Logical volume "lv-test.1" successfully removed
  Logical volume "ss-test.2" successfully removed
  LV v4c/lv-test.2 in use: not deactivating
  Unable to deactivate logical volume "lv-test.2"
(v4c-dev)rich@adriatic> sudo lvs --noheadings -o name v4c
  lv-test.2
(v4c-dev)rich@adriatic> sudo lvremove -f v4c
  Logical volume "lv-test.2" successfully removed

Note that there are no dependents of lv-test.2 at that time, that the other two logical volumes with snapshots were successfully removed, and that a subsequent lvremove does indeed remove the remaining logical volume.

(v4c-dev)rich@adriatic> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu lucid (development branch)
Release: 10.04
Codename: lucid
(v4c-dev)rich@adriatic> apt-cache policy lvm2
lvm2:
  Installed: 2.02.54-1ubuntu3
  Candidate: 2.02.54-1ubuntu3
  Version table:
 *** 2.02.54-1ubuntu3 0
        500 http://ubunturep.palm.com lucid/main Packages
        500 http://us.archive.ubuntu.com lucid/main Packages
        100 /var/lib/dpkg/status
(v4c-dev)rich@adriatic> dpkg -l | grep lvm2
ii lvm2 2.02.54-1ubuntu3 The Linux Logical Volume Manager

ProblemType: Bug
Architecture: amd64
Date: Sat Mar 6 13:26:10 2010
DistroRelease: Ubuntu 10.04
Package: lvm2 2.02.54-1ubuntu3
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-15.22-server
SourcePackage: lvm2
Uname: Linux 2.6.32-15-server x86_64

K Richard Pixley (rich-noir) wrote :
Zdenek Kabelac (zdenek-kabelac) wrote :

Looks like the 'famous' udev issue - it is currently still being resolved. There is no clear way to prevent a 'random' device open from udev rules; the 'watch' rule is usually the main source of trouble.

The udev rules need to be changed...

EAB (erwin-true) wrote :

Ubuntu 10.04.1 LTS
Kernel 2.6.32-25-server and 2.6.32-24-server x86_64
lvm2 2.02.54-1ubuntu4.1
udev 151-12.2

libvirt-bin 0.7.5-5ubuntu27.7
kvm 1:84+dfsg-0ubuntu16+0.12.3+noroms+0ubun

We're running VMs with KVM on several hosts, using one LV per VM.
Every night the LVM-volumes are snapshotted one-by-one. After saving the snapshotted volume to another server we need to remove the snapshot.

/sbin/lvremove -f /dev/someVG/snap-VMvolumeXYZ hangs sometimes.
On the 4 hosts where we're using LVM this is the second time in 4 days.
It's impossible to kill this process, and from this moment on all other LVM commands freeze too.

I have to remove /dev/mapper/someVG-VMvolumeXYZ manually to be able to use other LVM commands again.

It seems that the device is SUSPENDED.

# dmsetup info someVG-VMvolumeXYZ
Name: someVG-VMvolumeXYZ
State: SUSPENDED
/dev/mapper/hl-someVG-VMvolumeXYZ: open failed: No such file or directory
Tables present: None
Open count: 2
Event number: 0
Major, minor: 251, 9
Number of targets: 0
UUID: LVM-R6AybI9pE2adk8jBZuRc837oZl9Kh2k3p8WzNdpuyQT7zb1xfFb0pJ3CbdkNyx4K

Is there a way to remove this lock or suspended state? Rebooting the host would probably solve this, but that's not an option twice a week. There are 15 to 30 VMs running on these hosts.

Searching on this matter gives a lot of results going back to 2006. They seem to describe exactly the same issue.
Link: http://readlist.com/lists/redhat.com/linux-lvm/0/422.html

EAB (erwin-true) wrote :

I came across a PDF document from HP: a recent white paper on LVM snapshots.

http://h20000.www2.hp.com/bizsupport/TechSupport/CoreRedirect.jsp?redirectReason=DocIndexPDF&prodSeriesId=4296010&targetPage=http%3A%2F%2Fbizsupport2.austin.hp.com%2Fbc%2Fdocs%2Fsupport%2FSupportManual%2Fc02054539%2Fc02054539.pdf

Most interesting part:
"In very low system memory conditions, deletion of a single snapshot can hang indefinitely for memory to become available. Ensure that sufficient memory is available during deletion of a single snapshot that requires data to be copied to its predecessor. If the lvremove command hangs in these cases, increase the system memory or free some existing system memory to proceed with the snapshot deletion."

No further explanation is given...

Our host contains 64GB RAM and two 6-core Intel CPUs.
We're using Munin to graph memory usage. The graphs are updated every 5 minutes, so we don't have exact numbers on usage at the moment the snapshot was removed.
At the moment the removal of the snapshot was initiated the host used approximately 51GB RAM, 6GB buffers, 10GB unused and 3GB swap.

I'm thinking of some NUMA issues I researched in recent weeks; they probably have nothing to do with this issue.

Some memory-statistics:
# free -m
             total used free shared buffers cached
Mem: 64549 64062 487 0 23579 780
-/+ buffers/cache: 39702 24847
Swap: 7627 377 7250

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
node 0 size: 32768 MB
node 0 free: 63 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
node 1 size: 32758 MB
node 1 free: 437 MB
node distances:
node 0 1
  0: 10 20

The host is swapping a little now, but every day it swaps out 4GB of RAM, despite vm.swappiness=0.
swapoff -a && swapon -a is run a couple of times a day.
It should not swap, but it seems to be an issue with multiple CPU sockets and processes not using the same NUMA node (CPU pinning). Hosts with multiple sockets (not cores) seem to swap out a lot more.

It could be that the lvremove action thinks there is not enough RAM and hangs indefinitely.

Hopefully someone can confirm some of this.

EAB (erwin-true) wrote :

Ah, by the way, an accidental reboot fixed the locked LV.


+++ This bug was initially created as a clone of Bug #712100 +++

+++ This bug was initially created as a clone of Bug #577798 +++

Description of problem:

# lvdisplay

  --- Logical volume ---
  LV Name /dev/vg01/.local.backup
  VG Name vg01
  LV UUID hkAyO4-M31g-LJw5-Kdcu-AfK1-Bquw-buVrWA
  LV Write Access read only
  LV snapshot status active destination for /dev/vg01/local
  LV Status available
  # open 0
  LV Size 2,00 GiB
  Current LE 512
  COW-table size 1,00 GiB
  COW-table LE 256
  Allocated to snapshot 0,01%
  Snapshot chunk size 4,00 KiB
  Segments 1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device 253:29

# lvremove /dev/vg01/.local.backup
  Can't remove open logical volume ".local.backup"

  ... repeating this several times ...

# lvremove /dev/vg01/.local.backup
Do you really want to remove active logical volume .local.backup? [y/n]: y
  Logical volume ".local.backup" successfully removed

Version-Release number of selected component (if applicable):

kernel-2.6.33.1-19.fc13.x86_64
lvm2-2.02.61-1.fc13.x86_64

--- Additional comment from <email address hidden> on 2010-03-29 05:52:00 EDT ---

Created attachment 403250
strace in non-working case

--- Additional comment from <email address hidden> on 2010-03-29 05:52:28 EDT ---

Created attachment 403251
strace in working case

--- Additional comment from <email address hidden> on 2010-03-29 06:08:36 EDT ---

Do you have "udisks" package installed? There is one udev rule that could have possibly caused this...

For starters, just a quick check - could you please try to kill udev daemon temporarily and see if you can reproduce the problem? Thanks.

--- Additional comment from <email address hidden> on 2010-03-30 07:36:02 EDT ---

Yes; udisks is installed, and I cannot reproduce the issue after its removal.

'udevadm control --stop-exec-queue' before lvremove seems to work too.

--- Additional comment from <email address hidden> on 2010-03-30 08:08:22 EDT ---

Just for the record, the rule we have problem supporting is this one exactly (in /lib/udev/rules.d/80-udisks.rules which is a part of udisks package):

# Make udevd synthesize a 'change' uevent when last opener of a rw-fd closes the fd - this
# should be part of the device-mapper rules
KERNEL=="dm-*", OPTIONS+="watch"

We have recently added a udev synchronisation feature in device-mapper/lvm2, so we always wait until udev processing has settled down, to cope with such problems where devices are accessed from within udev rules, and also to provide a way to wait for nodes/symlinks to be created.

However, we can't synchronize with events synthesized as a result of this rule (just as we can't with events originating in "udevadm trigger", which generates such events as well). The synchronisation can only be done on events we know about (events originating in device-mapper itself).

There are still ongoing discussions with udev team to properly deal with this issue though...

Now, could you plea...
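For reference, a hedged sketch of disabling that watch rule locally. `disable_dm_watch_rule` is a hypothetical helper name; this assumes the usual udev convention that a file in /etc/udev/rules.d with the same name overrides the packaged copy in /lib/udev/rules.d:

```shell
# Hypothetical helper: write a copy of a udev rules file with the
# dm 'watch' rule (quoted in the comment above) commented out.
disable_dm_watch_rule() {
    sed 's/^\(KERNEL=="dm-\*", OPTIONS+="watch"\)/# \1/' "$1" > "$2"
}

# Assumed usage (paths from the comment above); run as root, then
# reload the rules, e.g. with: udevadm control --reload-rules
# disable_dm_watch_rule /lib/udev/rules.d/80-udisks.rules \
#                       /etc/udev/rules.d/80-udisks.rules
```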

(In reply to comment #0)
> +++ This bug was initially created as a clone of Bug #712100 +++
>
> Still present in F15

The original report is already for F15. I assume you meant F16 so I'm changing the version.

There are patches upstream for this (a new "retry_deactivation = 1" option in lvm.conf). This appears in lvm2 v2.02.89. However, this version is still not released, since it includes a lot of other (and more important) changes that still require some review. We'd like to test all these changes in Fedora rawhide first to avoid any other regressions. Once this is tested in rawhide, we'll backport this patch for other Fedora releases. I'm sorry for any inconvenience.

(In reply to comment #1)
« snip - snip »
> There are patches upstream for this (a new "retry_deactivation = 1" option in
> lvm.conf). This appears in lvm2 v2.02.89. However, this version is stil not

« snip - snip »

Peter,
I presume that the "retry_deactivation" is a boolean parameter to "retry the volume deactivation process" and not a count of "maximum deactivate retries"? The parameter name *is* open to interpretation.

Can you point us in the direction of a "NEW_FEATURES" document of the other you-beaut stuff coming in the later versions of lvm2? (I've found the Wiki and other doco, but it doesn't seem current - last updated in 2009 &/or 2010).

Cheers!

It is nice to know what is coming soon. But how can I remove a volume under Fedora 16 in a reliable way?

(In reply to comment #3)
> It is nice to know what is coming soon. But how can I remove a volume under
> Fedora 16 in a reliable way?

For snapshot volumes, I've been performing the following which has worked reliably for me:

### I have to use "dmsetup remove" to deactivate the snapshots first
### Volume list for dmsetup looks like "vg-vol1 vg-vol2 vg-vol3" etc
### ie dmsetup uses hyphens to separate the VG component from the LV
for SNAPVOL in ${DM_VOLUME_LIST}; do
  printf "Deactivating snapshot volume %s\n" ${SNAPVOL}
  dmsetup remove ${SNAPVOL}
  dmsetup remove ${SNAPVOL}-cow
## for some reason, the copy-on-write devices aren't cleaned up auto-magically
## so I have to remove them auto-manually.
done

## Okay - now we can remove the snapshot logical volumes
## Volume list for lvremove looks like "vg/vol1 vg/vol2 vg/vol3" etc
## ie lvremove uses slashes to separate the VG component from the LV
lvremove -f ${LV_VOLUME_LIST}

The above is taken from a working script I use to snapshot cyrus-imap file systems (after quiescing cyrus first) so that they can be backed up and still let cyrus-imap operate.

I hope this gives you some ideas for your own needs.
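As an aside on the naming convention noted above: device-mapper additionally doubles any hyphen that occurs inside the VG or LV name itself, which is why a volume like Home-backup in VG Backup appears as Backup-Home--backup under /dev/mapper later in this thread. A small sketch of the mapping (`lv_to_dm` is a hypothetical helper name):

```shell
# Hypothetical helper: map "vg/lv" (lvremove style) to the dm name
# "vg-lv" (dmsetup style), doubling hyphens inside each component.
lv_to_dm() {
    vg=${1%%/*}
    lv=${1#*/}
    printf '%s-%s\n' "$(printf '%s' "$vg" | sed 's/-/--/g')" \
                     "$(printf '%s' "$lv" | sed 's/-/--/g')"
}

lv_to_dm Backup/Home-backup   # prints: Backup-Home--backup
```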

I use

run_lvremove() {
    $DMSETUP remove "/dev/$1" || :
    $DMSETUP remove "/dev/$1-cow" 2>/dev/null || :

    /sbin/udevadm control --stop-exec-queue || :
    $LVM lvchange $NOUDEVSYNC --quiet -an "$1" || :
    /sbin/udevadm control --start-exec-queue || :

    $LVM lvremove --quiet -f "$1" && sleep 5
}

but still get

| Can't change snapshot logical volume ".nfs4.backup"
| LV vg01/.nfs4.backup in use: not deactivating
| Unable to deactivate logical volume ".nfs4.backup"

The leftover from this is not marked as a snapshot anymore:

# lvscan
  ACTIVE '/dev/vg01/nfs4' [4,00 MiB] inherit
  ACTIVE Original '/dev/vg01/data' [135,00 GiB] inherit
...
  ACTIVE '/dev/vg01/.nfs4.backup' [1,00 GiB] inherit
  ACTIVE '/dev/vg01/.virt.backup' [1,00 GiB] inherit
  ACTIVE Snapshot '/dev/vg01/.data.backup' [5,00 GiB] inherit

And somehow, subsequent 'mount' operations fail with obscure errors like

| mount: unknown filesystem type 'DM_snapshot_cow'

F16 rendered working with LVM nearly impossible :(

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lvm2 (Ubuntu):
status: New → Confirmed
Yongzhi Pan (fossilet) on 2012-03-19
no longer affects: lvm2 (Fedora)
Changed in lvm2 (Debian):
importance: Undecided → Unknown
status: New → Unknown
Changed in lvm2 (Debian):
status: Unknown → New

IMHO the examples here are not valid - lvm2 commands and dmsetup commands can't be mixed together; they are incompatible in terms of udev synchronization.

So e.g. the example in comment 5 isn't really a good idea at all - what is it supposed to be doing?

While saying this - I have a patch proposal for upstream inclusion. The current retry code only understands mounted ext4 and fuse filesystems and will not try the retry mechanism for other filesystems. My patch proposal is a bit smarter, as it goes through /proc/mounts entries - it's not smart enough yet, I think, but it should give much better results.

Created attachment 573282
Patch for scanning /proc/mounts

Patch proposal to check /proc/mounts entries to find out whether a device is mounted.

The file /proc/self/mountinfo contains the dev_t; it's generally safer to use.

Also be careful with stat(): the field is a 'name', not a 'device'. It's not likely to cause problems, but it's possible in theory - the name can contain anything, as it is copied verbatim from the mount() system call. It is recommended to use lstat() here to avoid triggering the automounter in case you run over a path name.
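A shell sketch of the mountinfo approach suggested above (the actual patch is C code inside lvm2; `is_mounted` is a hypothetical helper that compares the device's major:minor against field 3 of /proc/self/mountinfo):

```shell
# Hypothetical helper: report whether the block device at $1 backs
# any mount, by matching its major:minor (the dev_t) against field 3
# of /proc/self/mountinfo.
is_mounted() {
    devno=$(stat -Lc '%t %T' "$1") || return 2     # hex major minor
    devno=$(printf '%d:%d' "0x${devno% *}" "0x${devno#* }")
    awk -v d="$devno" '$3 == d { found = 1 } END { exit !found }' \
        /proc/self/mountinfo
}

# e.g. is_mounted /dev/mapper/vg01-nfs4 && echo "still mounted"
```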

I have been reading this thread since its creation back on F13. None of the techniques worked for me, running F16, even once.

If I just create the snapshot, I can remove it. But, if I just mount it for a second, and then unmount it, I cannot remove it. It does not appear in /proc/mounts after unmounting.

When I try dmsetup remove, I just get:

# dmsetup remove vg_corsair-ss
device-mapper: remove ioctl failed: Device or resource busy
Command failed

I have not seen anyone report this when they tried dmsetup.

uname -a:

Linux localhost 3.3.0-4.fc16.x86_64 #1 SMP Tue Mar 20 18:05:40 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

lvm2-2.02.86-6.fc16.x86_64
system-config-lvm-1.1.16-4.fc16.noarch
lvm2-libs-2.02.86-6.fc16.x86_64
llvm-libs-2.9-6.fc16.x86_64

Let me know if you need any more diagnostics from me.

With F16 I also saw the new "not deactivating" error and had to
work around it by repeating the lvremove if it failed the first
time. I put up to 5 retries on the lvremove in a script, and so far
it's only had to be repeated once every few days.
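The retry workaround described above can be sketched as a small generic helper (`retry` is a hypothetical name; the pause is shortened here, whereas the scripts elsewhere in this thread sleep 5 seconds between attempts):

```shell
# Hypothetical helper: run a command up to $1 times, pausing between
# attempts, and stop at the first success.
retry() {
    tries=$1; shift
    i=0
    while [ "$i" -lt "$tries" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep 1        # the scripts in this thread use 'sleep 5'
    done
    return 1
}

# e.g. retry 5 lvremove -f vg01/.nfs4.backup
```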

I also have the
   KERNEL=="dm-*", OPTIONS+="watch"
line commented out in /lib/udev/rules.d/80-udisks.rules, a change I
had to make for F15 (or F14)...

I have a test machine running FC16 (x86_64), and I've run a number of snapshot tests without failure.

I'm going to attach a copy of my test script along with its output for people to review.

There have been no local changes to udev rules, ie the environment is essentially a bog-standard install, fully patched.

I will make one comment here, which may or may not have an impact.
I absolutely *HATE* gnome3 and have uninstalled the GNOME Desktop Environment because I find it inherently unusable.
There are some GNOME libraries for apps that need them, but the GNOME VFS package is NOT present. Whether its absence has an impact on udev etc. and is allowing LVM to work correctly, I don't know.

Anyway, LVM & kernel packages for my system are shown here:

[root@central ~]# rpm -qa | egrep -ie '(lvm)|(kernel)' | sort -fu
abrt-addon-kerneloops-2.0.7-2.fc16.x86_64
kernel-3.2.10-3.fc16.x86_64
kernel-3.2.9-1.fc16.x86_64
kernel-3.2.9-2.fc16.x86_64
kernel-3.3.0-4.fc16.x86_64
kernel-3.3.0-8.fc16.x86_64
kernel-headers-3.3.0-8.fc16.x86_64
libreport-plugin-kerneloops-2.0.8-4.fc16.x86_64
llvm-libs-2.9-9.fc16.i686
llvm-libs-2.9-9.fc16.x86_64
lvm2-2.02.86-6.fc16.x86_64
lvm2-libs-2.02.86-6.fc16.x86_64

[root@central ~]# uname -a
Linux central.treetops 3.3.0-8.fc16.x86_64 #1 SMP Thu Mar 29 18:37:19 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Created attachment 574829
Quick & Dirty test snapshot script

Created attachment 574830
Output from the snapshot test script

Here is an extract from /var/log/messages showing the logged events pertaining to the last 4 runs of my script:

Apr 3 21:05:07 central kernel: [176887.592380] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:07 central kernel: [176887.592389] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:07 central kernel: [176887.592395] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:08 central lvm[18439]: Monitoring snapshot vg00-snap_varlog
Apr 3 21:05:08 central kernel: [176888.183588] EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts: (null)
Apr 3 21:05:26 central lvm[18439]: No longer monitoring snapshot vg00-snap_varlog
Apr 3 21:05:56 central kernel: [176936.489527] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:56 central kernel: [176936.489536] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:56 central kernel: [176936.489542] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:56 central lvm[18439]: Monitoring snapshot vg00-snap_varlog
Apr 3 21:05:57 central kernel: [176937.026962] EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts: (null)
Apr 3 21:06:15 central lvm[18439]: No longer monitoring snapshot vg00-snap_varlog
Apr 3 21:17:27 central kernel: [177627.082392] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:17:27 central kernel: [177627.082401] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:17:27 central kernel: [177627.082408] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:17:27 central lvm[18439]: Monitoring snapshot vg00-snap_varlog
Apr 3 21:17:28 central kernel: [177627.966626] EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts: (null)
Apr 3 21:17:47 central lvm[18439]: Extension of snapshot vg00/snap_varlog finished successfully.
Apr 3 21:17:51 central lvm[18439]: No longer monitoring snapshot vg00-snap_varlog
Apr 3 21:18:21 central kernel: [177681.486416] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:18:21 central kernel: [177681.486421] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:18:21 central kernel: [177681.486424] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:18:21 central lvm[18439]: Monitoring snapshot vg00-snap_varlog
Apr 3 21:18:22 central kernel: [177682.067903] EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts: (null)
Apr 3 21:18:42 central lvm[18439]: Extension of snapshot vg00/snap_varlog finished successfully.
Apr 3 21:18:45 central lvm[18439]: No longer monitoring snapshot vg00-snap_varlog

Commenting out the rule does not help me at all. Retrying 1000 times does not help either. The only way I can remove the snapshot is by rebooting first.

Can't say I've used the dmsetup commands; have you tried using the lvm commands instead?

I normally only use the lvm commands. lvremove doesn't work without a reboot, either.

I use snapshots to create consistent backups, so my backup is a bit crippled right now on this computer. Works great on another computer where I launch from Ubuntu 10.

This is an excerpt of what I use for backups using lvm snapshots... works fine on FC16.

We retain a few GB of free space in the LVM storage for the snapshot,
which generally suffices on most systems unless there are a lot
of updates happening during backups, which is rare on the systems
we use this on.

--
DTYPE=$1
# NB: level contains level and 'u' option (eg: 0u)
LEVEL=$2
TAPE=$3
FS=$4
RHOST=$5
STATUS=0

case "${TAPE}${RHOST}" in
--)
        # dumping to stdout
        RTAPE="-"
        ;;
*)
        # using rsh/rmt(8)
        RTAPE="${RHOST}:${TAPE}"
        ;;
esac

LVCREATE=""
if [ -x /sbin/lvcreate ]
then
        LVCREATE="/sbin/lvcreate"
        LVREMOVE="/sbin/lvremove"
        LVDISPLAY="/sbin/lvdisplay"
elif [ -x /usr/sbin/lvcreate ]
then
        LVCREATE="/usr/sbin/lvcreate"
        LVREMOVE="/usr/sbin/lvremove"
        LVDISPLAY="/usr/sbin/lvdisplay"
fi

if [ "`df $FS | grep /dev/mapper`" -a "$LVCREATE" != "" ] ; then
        DUMPDEV=`df $FS |grep mapper | cut -d/ -f4 | cut -d' ' -f1 | tr - /`
        VOL=`echo $DUMPDEV | cut -d/ -f1`
        SNAPVOL=`echo $DUMPDEV | cut -d / -f2`-snap
        SNAPVOL2=`echo $DUMPDEV | cut -d / -f2`--snap
        SNAPDEV=/dev/$VOL/$SNAPVOL
        SNAPRDEV=/dev/mapper/$VOL-$SNAPVOL2
        echo DUMPDEV=$DUMPDEV 1>&2
        echo VOL=$VOL 1>&2
        echo SNAPVOL=$SNAPVOL 1>&2
        echo SNAPVOL2=$SNAPVOL2 1>&2
        echo SNAPDEV=$SNAPDEV 1>&2
        echo SNAPRDEV=$SNAPRDEV 1>&2

        # cleanup from last backup
        $LVREMOVE -f $SNAPDEV >/dev/null 2>&1

        echo `date` starting snapshot 1>&2
        $LVCREATE -l 100%FREE -s -n $SNAPVOL /dev/$DUMPDEV 1>&2
        echo `date` starting backup 1>&2

        dump "${LEVEL}bfL" 32 "$RTAPE" "$FS" $SNAPRDEV

        # workaround FC16 bug; delay before clearing
        sleep 5

        echo `date` clearing snapshot 1>&2
        $LVREMOVE -f $SNAPDEV 1>&2

        # workaround FC16 bug; do it again if needed
        for i in 1 2 3 4 5
        do
                $LVDISPLAY | grep $SNAPDEV >/dev/null
                if [ $? = 0 ]
                then
                        # its still there!
                        sleep 5
                        echo `date` clearing snapshot again 1>&2
                        $LVREMOVE -f $SNAPDEV 1>&2
                else
                        break
                fi
        done
        $LVDISPLAY | grep $SNAPDEV >/dev/null
        if [ $? = 0 ]
        then
                echo `date` gave up clearing snapshot - manual intervention required 1>&2
                STATUS=1
        fi
fi

exit $STATUS

I'm experiencing the same problem; looping the lvremove statement doesn't succeed, and the snapshot can only be removed after a reboot. I also notice a similar problem with fsck on logical volumes - once the volume has been mounted and then unmounted, it cannot be checked with fsck (e2fsck).

# fsck /dev/mapper/Backup-Home--backup
fsck from util-linux 2.20.1
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext4: Device or resource busy while trying to open /dev/mapper/Backup-Home--backup
Filesystem mounted or opened exclusively by another program?

# mount | grep Backup | wc -l
0

Could this be a clue?

What happens if you use fsck -n (a read-only open)?

Also see bug 809188 - it seems there is some more generic bug, not related to lvm.

(In reply to comment #20)
> what happens if you use fsck -n (read-only open) ?

# fsck -n /dev/mapper/Backup-Home--backup
fsck from util-linux 2.20.1
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext4: Device or resource busy while trying to open /dev/mapper/Backup-Home--backup
Filesystem mounted or opened exclusively by another program?

In reply to comment #21 (ref bug 809188): I do use gnome-shell, but my backup script runs overnight when there usually isn't a user logged in.

Sorry, that should have been ref bug 808795.

I added comment #1 to bug 808795. It may be of interest to followers of this bug.


I can't see the problem on my FC16 servers, all of which use
md mirroring and lvm2.

However they were all upgraded via yum distro-sync from FC15, FC14
and were all fresh installed as FC13, from memory. Perhaps
that has some bearing on it.

Below is a bunch of tests I just ran on one of them; fsck r/o, fsck rw/w,
mount r/o, mount r/w; all good.

--

# df
Filesystem 1K-blocks Used Available Use% Mounted on
rootfs 27354712K 4680956K 21302416K 19% /
devtmpfs 504932K 4K 504928K 1% /dev
tmpfs 513108K 0K 513108K 0% /dev/shm
/dev/mapper/VolGroup00-LogVol01 27354712K 4680956K 21302416K 19% /
tmpfs 513108K 40504K 472604K 8% /run
tmpfs 513108K 0K 513108K 0% /sys/fs/cgroup
tmpfs 513108K 0K 513108K 0% /media
/dev/md0 196877K 73096K 113542K 40% /boot

# mdadm --detail /dev/md1
/dev/md1:
        Version : 1.1
  Creation Time : Fri Sep 16 08:30:27 2011
     Raid Level : raid1
     Array Size : 35838908 (34.18 GiB 36.70 GB)
  Used Dev Size : 35838908 (34.18 GiB 36.70 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Wed Apr 11 01:48:37 2012
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : hostname.domain:1 (local to host hostname.domain)
           UUID : a2b3dacc:8163523e:20813db4:2b122e3d
         Events : 7261

    Number Major Minor RaidDevice State
       0 8 1 0 active sync /dev/sda1
       1 8 17 1 active sync /dev/sdb1

# pvdisplay
  --- Physical volume ---
  PV Name /dev/md1
  VG Name VolGroup00
  PV Size 34.18 GiB / not usable 22.93 MiB
  Allocatable yes
  PE Size 32.00 MiB
  Total PE 1093
  Free PE 128
  Allocated PE 965
  PV UUID d2CXHD-Aumo-shdQ-p8Bm-ZpQz-7i1T-5ZqlM1

# lvdisplay
  --- Logical volume ---
  LV Name /dev/VolGroup00/LogVol01
  VG Name VolGroup00
  LV UUID pHt4NT-V3fk-VyPf-dMqu-yIwC-nJHJ-TLKeKu
  LV Write Access read/write
  LV Status available
  # open 1
  LV Size 26.16 GiB
  Current LE 837
  Segments 1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device 253:0

  --- Logical volume ---
  LV Name /dev/VolGroup00/LogVol00
  VG Name VolGroup00
  LV UUID cR4csQ-C2vd-NBuq-mzG2-VYrp-NapC-M85fmb
  LV Write Access read/write
  LV Status available
  # open 2
  LV Size 4.00 GiB
  Current LE 128
  Segments 1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device 253:1

# lvcr...

Just FYI - if you still see the problem, you perhaps need to disable the sandbox service and reboot; see https://bugzilla.redhat.com/show_bug.cgi?id=808795#c31

Interesting. This is probably why it doesn't affect me: all the
boxes I use LVM snapshots on either don't have sandbox enabled
(probably because they didn't have it enabled in prior Fedoras and
were upgraded) or are servers running in runlevel 3.
(sandbox only seems to be enabled in runlevel 5 on freshly installed F16 boxes.)

The "retry_deactivation" lvm.conf option is included in lvm2 v2.02.89 and later. If set, this option causes LVM to retry volume deactivation several times if it does not succeed at first (it is enabled by default).

The same logic works for the dmsetup remove command; you can use the "--retry" option there.

I'm closing this bug with NEXTRELEASE as lvm2 version >= 2.02.89 is part of newer Fedora releases only (Fedora >= 17).
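For reference, a sketch of the lvm.conf fragment implied above (the option lives in the activation section; this assumes lvm2 >= 2.02.89):

```
activation {
    # Retry deactivation a few times before failing; enabled by
    # default in lvm2 >= 2.02.89.
    retry_deactivation = 1
}
```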

Changed in lvm2 (Debian):
status: New → Fix Released
Changed in lvm2 (Fedora):
importance: Unknown → High
status: Unknown → Fix Released