pvmove wipes data when issue_discards=1 on SSD

Bug #1082325 reported by Laurent Bigonville on 2012-11-23
This bug affects 3 people
Affects                Status        Importance  Assigned to           Milestone
lvm2                   Fix Released  Critical
lvm2 (Debian)          Fix Released  Unknown
lvm2 (Ubuntu)          Fix Released  High        Dimitri John Ledkov
lvm2 (Ubuntu Quantal)  Won't Fix     Medium      Unassigned

Bug Description

[Impact]

 * Setting issue_discards=1 in /etc/lvm/lvm.conf (non-default) results in data loss if pvmove is performed on a logical volume which is moved to or from an SSD or another block device that supports discards.
 * As this bug *directly causes a loss of user data*, this fix should be uploaded to quantal (lvm2 in precise is not affected, because it does not support the issue_discards option).

[Test Case]

 * Enable issue_discards=1 in /etc/lvm/lvm.conf
 * Create a volume group with two physical volumes (at least one of these must support discards, e.g. an SSD)
 * Create a test logical volume
 * Create a filesystem on this logical volume
 * With pvmove, move the underlying logical volume to the other physical volume
=> experience data loss (in my experiments the whole logical volume was zeroed, checked with hexdump /dev/vgtest/lvtest)
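The steps above can be sketched as a script. This is a reproduction sketch, not a verified one: it must be run as root, it destroys any data on the named devices, and /dev/sdb and /dev/sdc are placeholder device names (at least one should be an SSD or another discard-capable device).

```shell
# Reproduction sketch for LP#1082325 (assumes issue_discards = 1 has
# already been set in the devices section of /etc/lvm/lvm.conf).
# /dev/sdb and /dev/sdc are placeholders; running this DESTROYS their data.
reproduce_lp1082325() {
    pvcreate /dev/sdb /dev/sdc
    vgcreate vgtest /dev/sdb /dev/sdc
    # place the LV entirely on the first PV
    lvcreate -l 10 -n lvtest vgtest /dev/sdb
    mkfs.ext4 /dev/vgtest/lvtest
    # move all extents from the first PV to the second
    pvmove /dev/sdb /dev/sdc
    # with the bug present, the LV now reads back as zeros
    hexdump -C /dev/vgtest/lvtest | head
}
```

The function is only defined here, not called: the commands are destructive and require root.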

[Regression Potential]

 * The upstream fix is fairly self contained and separates discard and move operations.

The patches can be found at:

https://lists.fedorahosted.org/pipermail/lvm2-commits/2012-June/000037.html
https://lists.fedorahosted.org/pipermail/lvm2-commits/2012-June/000038.html
https://lists.fedorahosted.org/pipermail/lvm2-commits/2012-June/000039.html

An SRU of just the upstream-*.patches from the -5 upload fixes this bug.

Description of problem:

If a user enables issue_discards=1 and runs the pvmove command, and the underlying device supports discards, they will lose data from the moved chunks.

Here is a short log of what is going on (the chunk is released and discarded prior to its move):

#metadata/lv_manip.c:3013 Creating logical volume pvmove0
#metadata/lv_manip.c:3938 Inserting layer pvmove0 for segments of lvol0 on /dev/loop0
#metadata/lv_manip.c:3852 Matched PE range /dev/loop0:0-126 against /dev/loop0 0 len 1
#metadata/lv_manip.c:3798 Inserting /dev/loop0:0-0 of test/lvol0
#libdm-config.c:853 Setting devices/issue_discards to 1
#device/device.c:428 Device /dev/loop0 queue/discard_max_bytes is 4294966784 bytes.
#device/device.c:428 Device /dev/loop0 queue/discard_granularity is 4096 bytes.
#metadata/pv_manip.c:223 Discarding 1 extents offset 2048 sectors on /dev/loop0.
#device/dev-io.c:577 Closed /dev/loop0
#device/dev-io.c:524 Opened /dev/loop0 RW O_DIRECT
#device/dev-io.c:318 Discarding 4194304 bytes offset 1048576 bytes on /dev/loop0.
#metadata/lv_manip.c:432 Stack lvol0:0[0] on LV pvmove0:0
#metadata/lv_manip.c:86 Adding lvol0:0 as an user of pvmove0
#pvmove.c:164 Moving 1 extents of logical volume test/lvol0
#mm/pool-fast.c:59 Created fast mempool allocation at 0x16f1810
#libdm-config.c:866 allocation/mirror_logs_require_separate_pvs not found in config: defaulting to 0
#libdm-config.c:866 allocation/maximise_cling not found in config: defaulting to 1
#metadata/pv_map.c:55 Allowing allocation on /dev/loop1 start PE 0 length 127
#metadata/lv_manip.c:967 Parallel PVs at LE 0 length 1: /dev/loop0
#metadata/lv_manip.c:2023 Trying allocation using contiguous policy.
#metadata/lv_manip.c:1635 Still need 1 total extents:
#metadata/lv_manip.c:1638 1 (1 data/0 parity) parallel areas of 1 extents each
#metadata/lv_manip.c:1640 0 mirror logs of 0 extents each
#metadata/lv_manip.c:1329 Considering allocation area 0 as /dev/loop1 start PE 0 length 1 leaving 126.
#metadata/lv_manip.c:1112 Allocating parallel area 0 on /dev/loop1 start PE 0 length 1.

Version-Release number of selected component (if applicable):
lvm 2.02.96


Additional info:

As a workaround - set issue_discards=0
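Concretely, the workaround is the following stanza in /etc/lvm/lvm.conf (0 is also the built-in default, so it is enough to make sure no local change has flipped it to 1):

```
# /etc/lvm/lvm.conf
devices {
    # 0 (the default) disables discards when LVM releases space
    # (lvremove, lvreduce, pvmove); leave it at 0 until a fixed
    # lvm2 is installed
    issue_discards = 0
}
```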

This is a critical bug, by the way: it means you will lose your data completely if you use pvmove with issue_discards=1.

Note that issue_discards is NOT enabled by default. So this only affects people who explicitly turned it on.

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.

    New Contents:
When issue_discards=1 is configured in the /etc/lvm/lvm.conf file, moving physical volumes via the pvmove command results in data loss. To work around this issue, ensure that issue_discards=0 is set in your lvm.conf file before moving any physical volumes.

Discards are issued when releasing a PV segment. But this code is used both when deleting the space AND when moving it elsewhere. (Need to check whether or not some lvconvert code paths are affected too.)

The code was added in 2.02.85, so RHEL6.2 might be affected too.

Adding QA ack for 6.4.

Devel will need to provide unit testing results however before this bug can be
ultimately verified by QA.

If I understand correctly, if this patch goes in, then the current Technical Note

("...issue_discards=1...results in data loss...ensure that issue_discards=0 is set in your lvm.conf file before moving any physical volumes.")

is not needed in 6.4.

It is needed in earlier releases. (Should we propose the fix for 6.3.z?)

    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.

    Diffed Contents:
@@ -1 +1 @@
-When issue_discards=1 is configured in the /etc/lvm/lvm.conf file, moving physical volumes via the pvmove command results in data loss. To work around this issue, ensure that issue_discards=0 is set in your lvm.conf file before moving any physical volumes.
+Without this fix, when issue_discards=1 is configured in the /etc/lvm/lvm.conf file, moving physical volumes via the pvmove command results in data loss.

Moving to Verified (SanityOnly).

All single machine pvmove regression tests passed.

kernel-2.6.32-330.el6.x86_64
lvm2-2.02.98-2.el6.x86_64
device-mapper-1.02.77-2.el6.x86_64

Changed in lvm2 (Ubuntu):
importance: Undecided → High
summary: - pvmove wipe data when issue_discards=1
+ pvmove wipes data when issue_discards=1 on SSD
Changed in lvm2 (Ubuntu):
assignee: nobody → Dmitrijs Ledkovs (xnox)
Changed in lvm2 (Debian):
status: Unknown → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lvm2 - 2.02.95-5ubuntu1

---------------
lvm2 (2.02.95-5ubuntu1) raring; urgency=low

  * Merge from Debian unstable, remaining changes (LP: #1082325):
    - debian/patches/avoid-dev-block.patch: Prefer any other device name over
      names in /dev/block/ since lvm.conf won't handle this.
    - debian/rules:
      - copy .po file to .pot file for Rosetta (Ubuntu specific).
    - debian/{dmsetup,lvm2}-udeb.install:
      - install initramfs and udev hooks in udebs (Debian bug 504341).
    - auto-start VGs as their PVs are discovered (Ubuntu specific):
      - add debian/tree/lvm2/lib/udev/rules.d/85-lvm2.rules: use watershed plus
        the sledgehammer of vgscan/vgchange to turn on VGs as they come online.
      - debian/tree/lvm2/usr/share/initramfs-tools/scripts/hooks/lvm2:
        - add 85-lvm2.rules to the list of udev rules to copy.
        - depend on udev.
      - debian/control:
        - add versioned Depend on watershed in lvm2 for udev rules.
        - add Depends on watershed-udeb in lvm2-udeb for udev rules.
        - add versioned Depend/Breaks on udev in dmsetup for udev rules.
        - add Depend on initramfs-tools in dmsetup so system is not potentially
          rendered unbootable by out-of-order dpkg configuration.
        - In libdevmapper-event1.02.1 add Breaks: dmeventd
          (<< 2.02.95-4ubuntu1) due to debian symbol rename
      - debian/rules:
        - do not install local-top scripts since Ubuntu mounts root using udev.
        - do not install init scripts for lvm2, since udev starts LVM.
      - debian/lvm2.postinst: handle missing lvm2 init script.
      - debian/tree/dmsetup/lib/udev/rules.d/60-persistent-storage-dm.rules:
        watch dm devices for changes with inotify
    - add mountroot failure hooks to help fix bad boots (Debian bug 468115):
      - debian/tree/lvm2/usr/share/initramfs-tools/scripts/init-premount/lvm2
    - remaining changes to upstream event manager packages (Debian bug 514706):
      - debian/rules:
        - enable dmeventd during configure.
      - debian/dmeventd.{8,manpages}: install dmeventd files.
    - rename debian/clvm.defaults to debian/clvm.default so it is installed
      correctly.
    - debian/control: add dmsetup-udeb to libdevmapper1.02.1-udeb recommends.
    - debian/rules: make sure dmsetup and lvm2 initramfs-tools scripts are
      executable. When the Ubuntu-specific ones are added with a patch,
      they may lose their executable bit.
    - Add and install clvmd resource agent
    - Add dependency on libudev-dev to libdevmapper-dev so that the .pc file
      works.
    - debian/{clvmd.ra,clvm.init}:
      - create /run/lvm if it doesn't exist.
    - debian/clvm.init:
      - exit 3 if not running on status action.
    - Call dh_installman so that our dmeventd manpage actually gets installed
    - Install the missing fsadm manpage.
    - Complete libdevmapper-dev multiarch:
      - move .so symlinks and pkgconfig files to multiarched locations.
      - mark libdevmapper-dev M-A: same

  * Dropped changes debian/lvm2.{preinst,postinst,postrm}, not needed in Raring:
    - Implement removal of obsolete /etc/init.d/lvm2 conffile...


Changed in lvm2 (Ubuntu):
status: New → Fix Released
Arnd (arnd-arndnet) wrote :

Will this be fixed for quantal, too?

Arnd (arnd-arndnet) wrote :

Comments for SRU consideration (https://wiki.ubuntu.com/StableReleaseUpdates)

[Impact]

 * Setting issue_discards=1 in /etc/lvm/lvm.conf (non-default) results in data loss if pvmove is performed on a logical volume which is moved to or from an SSD or another block device that supports discards.
 * As this bug *directly causes a loss of user data*, this fix should be uploaded to quantal (lvm2 in precise is not affected, because it does not support the issue_discards option).

[Test Case]

 * Enable issue_discards=1 in /etc/lvm/lvm.conf
 * Create a volume group with two physical volumes (at least one of these must support discards, e.g. an SSD)
 * Create a test logical volume
 * Create a filesystem on this logical volume
 * With pvmove, move the underlying logical volume to the other physical volume
=> experience data loss (in my experiments the whole logical volume was zeroed, checked with hexdump /dev/vgtest/lvtest)
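The zeroing check in the last step (a hexdump of the LV) can be scripted. This is an illustrative helper, not an lvm2 tool (the name is_zeroed is hypothetical), that reports whether a device or file reads back as all zeros:

```shell
# is_zeroed: report whether a device (or file) is entirely zero bytes,
# e.g. `is_zeroed /dev/vgtest/lvtest` after the pvmove.
# Hypothetical helper, not part of lvm2; assumes GNU cmp/stat/blockdev.
is_zeroed() {
    dev="$1"
    # block devices report their size via blockdev; plain files via stat
    size=$(blockdev --getsize64 "$dev" 2>/dev/null || stat -c %s "$dev")
    if cmp -s -n "$size" "$dev" /dev/zero; then
        echo "ALL ZEROS"      # the volume was wiped
    else
        echo "data present"
    fi
}
```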

[Regression Potential]

 * The upload 2.02.95-5ubuntu1 pulls in some other changes; however, they do not seem to be critical.

description: updated
Changed in lvm2 (Ubuntu Quantal):
status: New → Triaged
importance: Undecided → Medium

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0501.html

Rolf Leggewie (r0lf) wrote :

quantal has seen the end of its life and is no longer receiving any updates. Marking the quantal task for this ticket as "Won't Fix".

Changed in lvm2 (Ubuntu Quantal):
status: Triaged → Won't Fix
Changed in lvm2:
importance: Unknown → Critical
status: Unknown → Fix Released