lvremove fails

Bug #533493 reported by K Richard Pixley on 2010-03-06
This bug affects 9 people
Affects        Status        Importance  Assigned to  Milestone
lvm2 (Debian)  Fix Released  Unknown
lvm2 (Fedora)  Fix Released  High
lvm2 (Ubuntu)                Undecided   Unassigned

Bug Description

Binary package hint: lvm2

Create 2 logical volumes on the same volume group.

On each of these logical volumes, create one snapshot each.

Now attempt to remove them all using something like "lvremove -f vg".

I expect them all to be removed. What I see instead is:

(v4c-dev)rich@adriatic> sudo lvremove -f v4c
  Logical volume "ss-test.0" successfully removed
  Logical volume "lv-test.0" successfully removed
  Logical volume "ss-test.1" successfully removed
  Logical volume "lv-test.1" successfully removed
  Logical volume "ss-test.2" successfully removed
  LV v4c/lv-test.2 in use: not deactivating
  Unable to deactivate logical volume "lv-test.2"
(v4c-dev)rich@adriatic> sudo lvs --noheadings -o name v4c
  lv-test.2
(v4c-dev)rich@adriatic> sudo lvremove -f v4c
  Logical volume "lv-test.2" successfully removed

Note that there are no dependents of lv-test.2 at that time, that the other two logical volumes with snapshots were successfully removed, and that a subsequent lvremove does indeed remove the remaining logical volume.

(v4c-dev)rich@adriatic> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu lucid (development branch)
Release: 10.04
Codename: lucid
(v4c-dev)rich@adriatic> apt-cache policy lvm2
lvm2:
  Installed: 2.02.54-1ubuntu3
  Candidate: 2.02.54-1ubuntu3
  Version table:
 *** 2.02.54-1ubuntu3 0
        500 http://ubunturep.palm.com lucid/main Packages
        500 http://us.archive.ubuntu.com lucid/main Packages
        100 /var/lib/dpkg/status
(v4c-dev)rich@adriatic> dpkg -l | grep lvm2
ii lvm2 2.02.54-1ubuntu3 The Linux Logical Volume Manager

ProblemType: Bug
Architecture: amd64
Date: Sat Mar 6 13:26:10 2010
DistroRelease: Ubuntu 10.04
Package: lvm2 2.02.54-1ubuntu3
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-15.22-server
SourcePackage: lvm2
Uname: Linux 2.6.32-15-server x86_64

K Richard Pixley (rich-noir) wrote :
Zdenek Kabelac (zdenek-kabelac) wrote :

Looks like the 'famous' udev issue - it is currently still being resolved. There is no clear way to prevent a 'random' device open from udev rules; the 'watch' rule is usually the main source of trouble.

The udev rules need to be changed...

EAB (erwin-true) wrote :

Ubuntu 10.04.1 LTS
Kernel 2.6.32-25-server and 2.6.32-24-server x86_64
lvm2 2.02.54-1ubuntu4.1
udev 151-12.2

libvirt-bin 0.7.5-5ubuntu27.7
kvm 1:84+dfsg-0ubuntu16+0.12.3+noroms+0ubun

We're running VMs with KVM on several hosts, using one LV per VM.
Every night the LVM-volumes are snapshotted one-by-one. After saving the snapshotted volume to another server we need to remove the snapshot.

/sbin/lvremove -f /dev/someVG/snap-VMvolumeXYZ hangs sometimes.
On the 4 hosts where we're using LVM this is the second time in 4 days.
It's impossible to kill this process, and from this moment on all other LVM commands freeze too.

I have to remove /dev/mapper/someVG-VMvolumeXYZ manually to be able to use other LVM commands again.

It seems that the device is SUSPENDED.

# dmsetup info someVG-VMvolumeXYZ
Name: someVG-VMvolumeXYZ
State: SUSPENDED
/dev/mapper/hl-someVG-VMvolumeXYZ: open failed: No such file or directory
Tables present: None
Open count: 2
Event number: 0
Major, minor: 251, 9
Number of targets: 0
UUID: LVM-R6AybI9pE2adk8jBZuRc837oZl9Kh2k3p8WzNdpuyQT7zb1xfFb0pJ3CbdkNyx4K

Is there a way to remove this lock or suspended state? Rebooting the host would probably solve this, but that's not an option twice a week. There are 15 to 30 VMs running on these hosts.

Searching on this matter gives a lot of results going back to 2006. They seem to describe exactly the same issue.
Link: http://readlist.com/lists/redhat.com/linux-lvm/0/422.html

EAB (erwin-true) wrote :

I came across a PDF document from HP: a recent white paper on LVM snapshots.

http://h20000.www2.hp.com/bizsupport/TechSupport/CoreRedirect.jsp?redirectReason=DocIndexPDF&prodSeriesId=4296010&targetPage=http%3A%2F%2Fbizsupport2.austin.hp.com%2Fbc%2Fdocs%2Fsupport%2FSupportManual%2Fc02054539%2Fc02054539.pdf

Most interesting part:
"In very low system memory conditions, deletion of a single snapshot can hang indefinitely for memory to become available. Ensure that sufficient memory is available during deletion of a single snapshot that requires data to be copied to its predecessor. If the lvremove command hangs in these cases, increase the system memory or free some existing system memory to proceed with the snapshot deletion."

No further explanation is given...

Our host contains 64GB RAM and two 6-core Intel CPUs.
We're using Munin to graph memory usage. The graphs are updated every 5 minutes, so we don't have exact numbers on usage at the moment the snapshot was removed.
At the moment the removal of the snapshot was initiated the host used approximately 51GB RAM, 6GB buffers, 10GB unused and 3GB swap.

I'm thinking of some NUMA issues I researched in recent weeks; they probably have nothing to do with this issue.

Some memory-statistics:
# free -m
             total used free shared buffers cached
Mem: 64549 64062 487 0 23579 780
-/+ buffers/cache: 39702 24847
Swap: 7627 377 7250

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
node 0 size: 32768 MB
node 0 free: 63 MB
node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
node 1 size: 32758 MB
node 1 free: 437 MB
node distances:
node 0 1
  0: 10 20

The host is swapping a little now, but every day it swaps out 4GB of RAM, despite vm.swappiness=0.
swapoff -a && swapon -a is run a couple of times a day.
It should not swap, but it seems to be an issue with multiple CPU sockets and processes not using the same NUMA node (CPU pinning). Hosts with multiple sockets (not cores) seem to swap out a lot more.

It could be that the lvremove action thinks there is not enough RAM and hangs indefinitely.

Hopefully someone can confirm some of this.

EAB (erwin-true) wrote :

Ah, by the way, an accidental reboot fixed the locked LV.


+++ This bug was initially created as a clone of Bug #712100 +++

+++ This bug was initially created as a clone of Bug #577798 +++

Description of problem:

# lvdisplay

  --- Logical volume ---
  LV Name /dev/vg01/.local.backup
  VG Name vg01
  LV UUID hkAyO4-M31g-LJw5-Kdcu-AfK1-Bquw-buVrWA
  LV Write Access read only
  LV snapshot status active destination for /dev/vg01/local
  LV Status available
  # open 0
  LV Size 2,00 GiB
  Current LE 512
  COW-table size 1,00 GiB
  COW-table LE 256
  Allocated to snapshot 0,01%
  Snapshot chunk size 4,00 KiB
  Segments 1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device 253:29

# lvremove /dev/vg01/.local.backup
  Can't remove open logical volume ".local.backup"

  ... repeating this several times ...

# lvremove /dev/vg01/.local.backup
Do you really want to remove active logical volume .local.backup? [y/n]: y
  Logical volume ".local.backup" successfully removed

Version-Release number of selected component (if applicable):

kernel-2.6.33.1-19.fc13.x86_64
lvm2-2.02.61-1.fc13.x86_64

--- Additional comment from <email address hidden> on 2010-03-29 05:52:00 EDT ---

Created attachment 403250
strace in non-working case

--- Additional comment from <email address hidden> on 2010-03-29 05:52:28 EDT ---

Created attachment 403251
strace in working case

--- Additional comment from <email address hidden> on 2010-03-29 06:08:36 EDT ---

Do you have "udisks" package installed? There is one udev rule that could have possibly caused this...

For starters, just a quick check - could you please try to kill udev daemon temporarily and see if you can reproduce the problem? Thanks.

--- Additional comment from <email address hidden> on 2010-03-30 07:36:02 EDT ---

Yes; udisks is installed, and I cannot reproduce the issue after its removal.

'udevadm control --stop-exec-queue' before lvremove seems to work too.

--- Additional comment from <email address hidden> on 2010-03-30 08:08:22 EDT ---

Just for the record, the rule we have problem supporting is this one exactly (in /lib/udev/rules.d/80-udisks.rules which is a part of udisks package):

# Make udevd synthesize a 'change' uevent when last opener of a rw-fd closes the fd - this
# should be part of the device-mapper rules
KERNEL=="dm-*", OPTIONS+="watch"

We have recently added a udev synchronisation feature in device-mapper/lvm2, so we always wait until udev processing has settled down, to cope with such problems where devices are accessed from within udev rules, and also to provide a way to wait for nodes/symlinks to be created.

However, we can't synchronize with events synthesized as a result of this rule (just as we can't with events originating in "udevadm trigger", which generates such events as well). The synchronisation can only be done on events we know about (events originating in device-mapper itself).

There are still ongoing discussions with udev team to properly deal with this issue though...

Now, could you plea...
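For reference, a hedged sketch of disabling that watch rule locally. `disable_dm_watch_rule` is a hypothetical helper name; this assumes the usual udev convention that a file in /etc/udev/rules.d with the same name overrides the packaged copy in /lib/udev/rules.d:

```shell
# Hypothetical helper: write a copy of a udev rules file with the
# dm 'watch' rule (quoted in the comment above) commented out.
disable_dm_watch_rule() {
    sed 's/^\(KERNEL=="dm-\*", OPTIONS+="watch"\)/# \1/' "$1" > "$2"
}

# Assumed usage (paths from the comment above); run as root, then
# reload the rules, e.g. with: udevadm control --reload-rules
# disable_dm_watch_rule /lib/udev/rules.d/80-udisks.rules \
#                       /etc/udev/rules.d/80-udisks.rules
```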

(In reply to comment #0)
> +++ This bug was initially created as a clone of Bug #712100 +++
>
> Still present in F15

The original report is already for F15. I assume you meant F16 so I'm changing the version.

There are patches upstream for this (a new "retry_deactivation = 1" option in lvm.conf). This appears in lvm2 v2.02.89. However, this version is still not released, since it includes a lot of other (and more important) changes that still require some review. We'd like to test all these changes in Fedora rawhide first to avoid any other regressions. Once this is tested in rawhide, we'll backport this patch for other Fedora releases. I'm sorry for any inconvenience.

(In reply to comment #1)
« snip - snip »
> There are patches upstream for this (a new "retry_deactivation = 1" option in
> lvm.conf). This appears in lvm2 v2.02.89. However, this version is stil not

« snip - snip »

Peter,
I presume that the "retry_deactivation" is a boolean parameter to "retry the volume deactivation process" and not a count of "maximum deactivate retries"? The parameter name *is* open to interpretation.

Can you point us in the direction of a "NEW_FEATURES" document of the other you-beaut stuff coming in the later versions of lvm2? (I've found the Wiki and other doco, but it doesn't seem current - last updated in 2009 &/or 2010).

Cheers!

It is nice to know what is coming soon. But how can I remove a volume under Fedora 16 in a reliable way?

(In reply to comment #3)
> It is nice to know what is coming soon. But how can I remove a volume under
> Fedora 16 in a reliable way?

For snapshot volumes, I've been performing the following which has worked reliably for me:

### I have to use "dmsetup remove" to deactivate the snapshots first
### Volume list for dmsetup looks like "vg-vol1 vg-vol2 vg-vol3" etc
### ie dmsetup uses hyphens to separate the VG component from the LV
for SNAPVOL in ${DM_VOLUME_LIST}; do
  printf "Deactivating snapshot volume %s\n" ${SNAPVOL}
  dmsetup remove ${SNAPVOL}
  dmsetup remove ${SNAPVOL}-cow
## for some reason, the copy-on-write devices aren't cleaned up auto-magically
## so I have to remove them auto-manually.
done

## Okay - now we can remove the snapshot logical volumes
## Volume list for lvremove looks like "vg/vol1 vg/vol2 vg/vol3" etc
## ie lvremove uses slashes to separate the VG component from the LV
lvremove -f ${LV_VOLUME_LIST}

The above is taken from a working script I use to snapshot cyrus-imap file systems (after quiescing cyrus first) so that they can be backed up and still let cyrus-imap operate.

I hope this gives you some ideas for your own needs.
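As an aside on the naming convention noted above: device-mapper additionally doubles any hyphen that occurs inside the VG or LV name itself, which is why a volume like Home-backup in VG Backup appears as Backup-Home--backup under /dev/mapper later in this thread. A small sketch of the mapping (`lv_to_dm` is a hypothetical helper name):

```shell
# Hypothetical helper: map "vg/lv" (lvremove style) to the dm name
# "vg-lv" (dmsetup style), doubling hyphens inside each component.
lv_to_dm() {
    vg=${1%%/*}
    lv=${1#*/}
    printf '%s-%s\n' "$(printf '%s' "$vg" | sed 's/-/--/g')" \
                     "$(printf '%s' "$lv" | sed 's/-/--/g')"
}

lv_to_dm Backup/Home-backup   # prints: Backup-Home--backup
```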

I use

run_lvremove() {
    $DMSETUP remove "/dev/$1" || :
    $DMSETUP remove "/dev/$1-cow" 2>/dev/null || :

    /sbin/udevadm control --stop-exec-queue || :
    $LVM lvchange $NOUDEVSYNC --quiet -an "$1" || :
    /sbin/udevadm control --start-exec-queue || :

    $LVM lvremove --quiet -f "$1" && sleep 5
}

but still get

| Can't change snapshot logical volume ".nfs4.backup"
| LV vg01/.nfs4.backup in use: not deactivating
| Unable to deactivate logical volume ".nfs4.backup"

The leftover from this is not marked as a snapshot anymore:

# lvscan
  ACTIVE '/dev/vg01/nfs4' [4,00 MiB] inherit
  ACTIVE Original '/dev/vg01/data' [135,00 GiB] inherit
...
  ACTIVE '/dev/vg01/.nfs4.backup' [1,00 GiB] inherit
  ACTIVE '/dev/vg01/.virt.backup' [1,00 GiB] inherit
  ACTIVE Snapshot '/dev/vg01/.data.backup' [5,00 GiB] inherit

And somehow, subsequent 'mount' operations fail with obscure errors like

| mount: unknown filesystem type 'DM_snapshot_cow'

F16 rendered working with LVM nearly impossible :(

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lvm2 (Ubuntu):
status: New → Confirmed
Yongzhi Pan (fossilet) on 2012-03-19
no longer affects: lvm2 (Fedora)
Changed in lvm2 (Debian):
importance: Undecided → Unknown
status: New → Unknown
Changed in lvm2 (Debian):
status: Unknown → New

IMHO the examples here are not valid - lvm2 commands and dmsetup commands can't be mixed together; they are incompatible in terms of udev synchronization.

So e.g. the example in comment 5 isn't really a good idea at all - what is it supposed to be doing?

While saying this - I have a patch proposal for upstream inclusion. The current retry code only understands mounted ext4 and fuse filesystems and will not try the retry mechanism for other filesystems. My patch proposal is a bit smarter, as it goes through /proc/mounts entries - it's not smart enough yet, I think, but it should give much better results.

Created attachment 573282
Patch for scanning /proc/mounts

Patch proposal to check /proc/mounts entries to find out whether a device is mounted.

The file /proc/self/mountinfo contains the dev_t; it's generally safer to use.

Also be careful with stat(): the field is a 'name', not a 'device'. It's not likely to cause problems, but it's possible in theory - the name can contain anything, as it is copied verbatim from the mount() system call. It is recommended to use lstat() here to avoid triggering the automounter in case you run over a path name.
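A shell sketch of the mountinfo approach suggested above (the actual patch is C code inside lvm2; `is_mounted` is a hypothetical helper that compares the device's major:minor against field 3 of /proc/self/mountinfo):

```shell
# Hypothetical helper: report whether the block device at $1 backs
# any mount, by matching its major:minor (the dev_t) against field 3
# of /proc/self/mountinfo.
is_mounted() {
    devno=$(stat -Lc '%t %T' "$1") || return 2     # hex major minor
    devno=$(printf '%d:%d' "0x${devno% *}" "0x${devno#* }")
    awk -v d="$devno" '$3 == d { found = 1 } END { exit !found }' \
        /proc/self/mountinfo
}

# e.g. is_mounted /dev/mapper/vg01-nfs4 && echo "still mounted"
```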

I have been reading this thread since its creation back on F13. None of the techniques worked for me, running F16, even once.

If I just create the snapshot, I can remove it. But, if I just mount it for a second, and then unmount it, I cannot remove it. It does not appear in /proc/mounts after unmounting.

When I try dmsetup remove, I just get:

# dmsetup remove vg_corsair-ss
device-mapper: remove ioctl failed: Device or resource busy
Command failed

I have not seen anyone report this when they tried dmsetup.

uname -a:

Linux localhost 3.3.0-4.fc16.x86_64 #1 SMP Tue Mar 20 18:05:40 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

lvm2-2.02.86-6.fc16.x86_64
system-config-lvm-1.1.16-4.fc16.noarch
lvm2-libs-2.02.86-6.fc16.x86_64
llvm-libs-2.9-6.fc16.x86_64

Let me know if you need any more diagnostics from me.

With F16 I also saw the new "not deactivating" error and had to
work around it by repeating the lvremove if it failed the first
time. I put up to 5 retries on the lvremove in a script, and so far
it's only had to be repeated once every few days.
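The retry workaround described above can be sketched as a small generic helper (`retry` is a hypothetical name; the pause is shortened here, whereas the scripts elsewhere in this thread sleep 5 seconds between attempts):

```shell
# Hypothetical helper: run a command up to $1 times, pausing between
# attempts, and stop at the first success.
retry() {
    tries=$1; shift
    i=0
    while [ "$i" -lt "$tries" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep 1        # the scripts in this thread use 'sleep 5'
    done
    return 1
}

# e.g. retry 5 lvremove -f vg01/.nfs4.backup
```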

I also have the
   KERNEL=="dm-*", OPTIONS+="watch"
line commented out in /lib/udev/rules.d/80-udisks.rules, a change I
had to make for F15 (or F14)...

I have a test machine running FC16 (x86_64), and I've run a number of snapshot tests without failure.

I'm going to attach a copy of my test script along with its output for people to review.

There have been no local changes to udev rules, ie the environment is essentially a bog-standard install, fully patched.

I will make one comment here, which may or may not have an impact.
I absolutely *HATE* gnome3 and have uninstalled the GNOME Desktop Environment because I find it inherently unusable.
There are some GNOME libraries for apps that need them, but the GNOME VFS package is NOT present. Whether its absence has an impact on udev etc. and is allowing LVM to work correctly, I don't know.

Anyway, LVM & kernel packages for my system are shown here:

[root@central ~]# rpm -qa | egrep -ie '(lvm)|(kernel)' | sort -fu
abrt-addon-kerneloops-2.0.7-2.fc16.x86_64
kernel-3.2.10-3.fc16.x86_64
kernel-3.2.9-1.fc16.x86_64
kernel-3.2.9-2.fc16.x86_64
kernel-3.3.0-4.fc16.x86_64
kernel-3.3.0-8.fc16.x86_64
kernel-headers-3.3.0-8.fc16.x86_64
libreport-plugin-kerneloops-2.0.8-4.fc16.x86_64
llvm-libs-2.9-9.fc16.i686
llvm-libs-2.9-9.fc16.x86_64
lvm2-2.02.86-6.fc16.x86_64
lvm2-libs-2.02.86-6.fc16.x86_64

[root@central ~]# uname -a
Linux central.treetops 3.3.0-8.fc16.x86_64 #1 SMP Thu Mar 29 18:37:19 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Created attachment 574829
Quick & Dirty test snapshot script

Created attachment 574830
Output from the snapshot test script

Here is an extract from /var/log/messages showing the logged events pertaining to the last 4 runs of my script:

Apr 3 21:05:07 central kernel: [176887.592380] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:07 central kernel: [176887.592389] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:07 central kernel: [176887.592395] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:08 central lvm[18439]: Monitoring snapshot vg00-snap_varlog
Apr 3 21:05:08 central kernel: [176888.183588] EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts: (null)
Apr 3 21:05:26 central lvm[18439]: No longer monitoring snapshot vg00-snap_varlog
Apr 3 21:05:56 central kernel: [176936.489527] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:56 central kernel: [176936.489536] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:56 central kernel: [176936.489542] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:05:56 central lvm[18439]: Monitoring snapshot vg00-snap_varlog
Apr 3 21:05:57 central kernel: [176937.026962] EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts: (null)
Apr 3 21:06:15 central lvm[18439]: No longer monitoring snapshot vg00-snap_varlog
Apr 3 21:17:27 central kernel: [177627.082392] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:17:27 central kernel: [177627.082401] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:17:27 central kernel: [177627.082408] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:17:27 central lvm[18439]: Monitoring snapshot vg00-snap_varlog
Apr 3 21:17:28 central kernel: [177627.966626] EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts: (null)
Apr 3 21:17:47 central lvm[18439]: Extension of snapshot vg00/snap_varlog finished successfully.
Apr 3 21:17:51 central lvm[18439]: No longer monitoring snapshot vg00-snap_varlog
Apr 3 21:18:21 central kernel: [177681.486416] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:18:21 central kernel: [177681.486421] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:18:21 central kernel: [177681.486424] lvcreate: sending ioctl 1261 to a partition!
Apr 3 21:18:21 central lvm[18439]: Monitoring snapshot vg00-snap_varlog
Apr 3 21:18:22 central kernel: [177682.067903] EXT4-fs (dm-15): mounted filesystem with ordered data mode. Opts: (null)
Apr 3 21:18:42 central lvm[18439]: Extension of snapshot vg00/snap_varlog finished successfully.
Apr 3 21:18:45 central lvm[18439]: No longer monitoring snapshot vg00-snap_varlog

Commenting out the rule does not help me at all. Retrying 1000 times does not help either. The only way I can remove the snapshot is by rebooting first.

Can't say I've used the dmsetup commands; have you tried using the lvm commands instead?

I normally only use the lvm commands. lvremove doesn't work without a reboot, either.

I use snapshots to create consistent backups, so my backup is a bit crippled right now on this computer. Works great on another computer where I launch from Ubuntu 10.

This is an excerpt of what I use for backups using lvm snapshots... works fine on FC16.

We retain a few GB of free space in the LVM storage for the snapshot,
which generally suffices on most systems unless there are a lot
of updates happening during backups, which is rare on the systems
we use this on.

--
DTYPE=$1
# NB: level contains level and 'u' option (eg: 0u)
LEVEL=$2
TAPE=$3
FS=$4
RHOST=$5
STATUS=0

case "${TAPE}${RHOST}" in
--)
        # dumping to stdout
        RTAPE="-"
        ;;
*)
        # using rsh/rmt(8)
        RTAPE="${RHOST}:${TAPE}"
        ;;
esac

LVCREATE=""
if [ -x /sbin/lvcreate ]
then
        LVCREATE="/sbin/lvcreate"
        LVREMOVE="/sbin/lvremove"
        LVDISPLAY="/sbin/lvdisplay"
elif [ -x /usr/sbin/lvcreate ]
then
        LVCREATE="/usr/sbin/lvcreate"
        LVREMOVE="/usr/sbin/lvremove"
        LVDISPLAY="/usr/sbin/lvdisplay"
fi

if [ "`df $FS | grep /dev/mapper`" -a "$LVCREATE" != "" ] ; then
        DUMPDEV=`df $FS |grep mapper | cut -d/ -f4 | cut -d' ' -f1 | tr - /`
        VOL=`echo $DUMPDEV | cut -d/ -f1`
        SNAPVOL=`echo $DUMPDEV | cut -d / -f2`-snap
        SNAPVOL2=`echo $DUMPDEV | cut -d / -f2`--snap
        SNAPDEV=/dev/$VOL/$SNAPVOL
        SNAPRDEV=/dev/mapper/$VOL-$SNAPVOL2
        echo DUMPDEV=$DUMPDEV 1>&2
        echo VOL=$VOL 1>&2
        echo SNAPVOL=$SNAPVOL 1>&2
        echo SNAPVOL2=$SNAPVOL2 1>&2
        echo SNAPDEV=$SNAPDEV 1>&2
        echo SNAPRDEV=$SNAPRDEV 1>&2

        # cleanup from last backup
        $LVREMOVE -f $SNAPDEV >/dev/null 2>&1

        echo `date` starting snapshot 1>&2
        $LVCREATE -l 100%FREE -s -n $SNAPVOL /dev/$DUMPDEV 1>&2
        echo `date` starting backup 1>&2

        dump "${LEVEL}bfL" 32 "$RTAPE" "$FS" $SNAPRDEV

        # workaround FC16 bug; delay before clearing
        sleep 5

        echo `date` clearing snapshot 1>&2
        $LVREMOVE -f $SNAPDEV 1>&2

        # workaround FC16 bug; do it again if needed
        for i in 1 2 3 4 5
        do
                $LVDISPLAY | grep $SNAPDEV >/dev/null
                if [ $? = 0 ]
                then
                        # its still there!
                        sleep 5
                        echo `date` clearing snapshot again 1>&2
                        $LVREMOVE -f $SNAPDEV 1>&2
                else
                        break
                fi
        done
        $LVDISPLAY | grep $SNAPDEV >/dev/null
        if [ $? = 0 ]
        then
                echo `date` gave up clearing snapshot - manual intervention required 1>&2
                STATUS=1
        fi
fi

exit $STATUS

I'm experiencing the same problem; looping the lvremove statement doesn't succeed, and the snapshot can only be removed after a reboot. I also notice a similar problem with fsck on logical volumes - once the volume has been mounted and then unmounted, it cannot be checked with fsck (e2fsck).

# fsck /dev/mapper/Backup-Home--backup
fsck from util-linux 2.20.1
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext4: Device or resource busy while trying to open /dev/mapper/Backup-Home--backup
Filesystem mounted or opened exclusively by another program?

# mount | grep Backup | wc -l
0

Could this be a clue?

What happens if you use fsck -n (a read-only open)?

Also see bug 809188 - it seems there is some more generic bug, not related to lvm.

(In reply to comment #20)
> what happens if you use fsck -n (read-only open) ?

# fsck -n /dev/mapper/Backup-Home--backup
fsck from util-linux 2.20.1
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext4: Device or resource busy while trying to open /dev/mapper/Backup-Home--backup
Filesystem mounted or opened exclusively by another program?

In reply to comment #21 (ref bug 809188): I do use gnome-shell, but my backup script runs overnight when there usually isn't a user logged in.

Sorry, that should have been ref bug 808795.

I added comment #1 to bug 808795. It may be of interest to followers of this bug.


I can't see the problem on my FC16 servers, all of which use
md mirroring and lvm2.

However they were all upgraded via yum distro-sync from FC15, FC14
and were all fresh installed as FC13, from memory. Perhaps
that has some bearing on it.

Below is a bunch of tests I just ran on one of them; fsck r/o, fsck rw/w,
mount r/o, mount r/w; all good.

--

# df
Filesystem 1K-blocks Used Available Use% Mounted on
rootfs 27354712K 4680956K 21302416K 19% /
devtmpfs 504932K 4K 504928K 1% /dev
tmpfs 513108K 0K 513108K 0% /dev/shm
/dev/mapper/VolGroup00-LogVol01 27354712K 4680956K 21302416K 19% /
tmpfs 513108K 40504K 472604K 8% /run
tmpfs 513108K 0K 513108K 0% /sys/fs/cgroup
tmpfs 513108K 0K 513108K 0% /media
/dev/md0 196877K 73096K 113542K 40% /boot

# mdadm --detail /dev/md1
/dev/md1:
        Version : 1.1
  Creation Time : Fri Sep 16 08:30:27 2011
     Raid Level : raid1
     Array Size : 35838908 (34.18 GiB 36.70 GB)
  Used Dev Size : 35838908 (34.18 GiB 36.70 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Wed Apr 11 01:48:37 2012
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : hostname.domain:1 (local to host hostname.domain)
           UUID : a2b3dacc:8163523e:20813db4:2b122e3d
         Events : 7261

    Number Major Minor RaidDevice State
       0 8 1 0 active sync /dev/sda1
       1 8 17 1 active sync /dev/sdb1

# pvdisplay
  --- Physical volume ---
  PV Name /dev/md1
  VG Name VolGroup00
  PV Size 34.18 GiB / not usable 22.93 MiB
  Allocatable yes
  PE Size 32.00 MiB
  Total PE 1093
  Free PE 128
  Allocated PE 965
  PV UUID d2CXHD-Aumo-shdQ-p8Bm-ZpQz-7i1T-5ZqlM1

# lvdisplay
  --- Logical volume ---
  LV Name /dev/VolGroup00/LogVol01
  VG Name VolGroup00
  LV UUID pHt4NT-V3fk-VyPf-dMqu-yIwC-nJHJ-TLKeKu
  LV Write Access read/write
  LV Status available
  # open 1
  LV Size 26.16 GiB
  Current LE 837
  Segments 1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device 253:0

  --- Logical volume ---
  LV Name /dev/VolGroup00/LogVol00
  VG Name VolGroup00
  LV UUID cR4csQ-C2vd-NBuq-mzG2-VYrp-NapC-M85fmb
  LV Write Access read/write
  LV Status available
  # open 2
  LV Size 4.00 GiB
  Current LE 128
  Segments 1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device 253:1

# lvcr...

Just FYI - if you still see the problem, you perhaps need to disable the sandbox service and reboot; see https://bugzilla.redhat.com/show_bug.cgi?id=808795#c31

Interesting. This is probably why it doesn't affect me: all the
boxes I use LVM snapshots on either don't have sandbox enabled
(probably because they didn't have it enabled in prior Fedoras and
were upgraded) or are servers running in runlevel 3.
(sandbox only seems to be enabled in runlevel 5 on freshly installed F16 boxes.)

The "retry_deactivation" lvm.conf option is included in lvm2 v2.02.89 and later. If set, this option causes LVM to retry volume deactivation several times if it does not succeed at first (it is enabled by default).

The same logic works for the dmsetup remove command; you can use the "--retry" option there.

I'm closing this bug with NEXTRELEASE as lvm2 version >= 2.02.89 is part of newer Fedora releases only (Fedora >= 17).
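For reference, a sketch of the lvm.conf fragment implied above (the option lives in the activation section; this assumes lvm2 >= 2.02.89):

```
activation {
    # Retry deactivation a few times before failing; enabled by
    # default in lvm2 >= 2.02.89.
    retry_deactivation = 1
}
```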

Changed in lvm2 (Debian):
status: New → Fix Released
Changed in lvm2 (Fedora):
importance: Unknown → High
status: Unknown → Fix Released