lvm thin corruption after lvresize

Bug #1480923 reported by B.
Affects: lvm2 (Ubuntu) | Status: New | Importance: Undecided | Assigned to: Unassigned

Bug Description

lvm2 version 2.02.98-6ubuntu2

After doing an lvresize of an LVM thin pool, all of the thin volumes inside it
were corrupted and I lost all of them. I then tried to dump/repair the
tmeta and ended up with empty thin volumes (no more filesystem signatures on them).

To sum up: the thin pool was 2T and I tried to increase it to 3T...

As far as I know, none of the partitions were full, but I increased the main
thin pool because it was close to the sum of all its thin volumes.

I assume that using LVM thin is still not stable on 14.04 LTS, right?

I guess that lvm2 2.02.98 does not properly handle the metadata resize
of a thin pool? (Maybe add a warning somewhere in the documentation?)
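(For later readers: in older lvm2 the pool's metadata LV has a fixed size, and growing the pool's data device does not necessarily grow it. A sketch of checking and growing the metadata explicitly, reusing the VG/pool names from this report; the `metadata_percent` field and `--poolmetadatasize` option exist in newer lvm2 and may not be available in 2.02.98:)

```shell
# Show how full the pool's data and metadata devices are
# (metadata_percent is a reporting field in newer lvm2)
lvs -a -o lv_name,lv_size,data_percent,metadata_percent mainvg

# Grow the metadata LV explicitly before growing the data device
lvextend --poolmetadatasize +1G mainvg/thin_pool

# Then grow the pool's data device
lvextend -L 3T mainvg/thin_pool
```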

Maybe related to:
http://comments.gmane.org/gmane.linux.kernel.device-mapper.devel/19190
https://www.redhat.com/archives/lvm-devel/2013-June/msg00371.html

I managed to recover some files from the raw thin pool (tdata/tpool) with scalpel,
but that is it.

Do you know of any other tools to recover LVM thin volumes, or the partitions/data on them?
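(The usual recovery path goes through thin-provisioning-tools. A rough sketch, with caveats: `lvconvert --repair` for thin pools appeared in lvm2 releases newer than 2.02.98, and the `_meta0` backup LV is what newer lvm2 leaves behind after a repair:)

```shell
# Deactivate the damaged pool first
lvchange -an mainvg/thin_pool

# Newer lvm2 drives thin_repair itself; the old metadata is kept
# as mainvg/thin_pool_meta0 for inspection afterwards
lvconvert --repair mainvg/thin_pool

# The backup metadata can then be dumped to XML for manual inspection
lvchange -ay mainvg/thin_pool_meta0
thin_dump /dev/mainvg/thin_pool_meta0 > tmeta.xml
```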

Errors

  attempt to access beyond end of device
  dm-6: rw=0, want=7753528, limit=262144
  attempt to access beyond end of device
  dm-6: rw=0, want=7753528, limit=262144
  attempt to access beyond end of device
  dm-6: rw=0, want=7753528, limit=262144
  attempt to access beyond end of device
  dm-6: rw=0, want=7753528, limit=262144
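(A quick way to read those numbers: device-mapper reports both `want` and `limit` in 512-byte sectors, so the read landed far beyond where the device now ends.)

```shell
# Convert the dm error's sector counts to MiB (512-byte sectors)
want_sector=7753528
limit_sector=262144
echo "requested offset: $(( want_sector * 512 / 1024 / 1024 )) MiB"   # 3785 MiB
echo "device size now:  $(( limit_sector * 512 / 1024 / 1024 )) MiB"  # 128 MiB
```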

  /dev/mainvg/thin_rsnapshot: read failed after 0 of 4096 at 2199023190016: Input/output error
  /dev/mainvg/thin_rsnapshot: read failed after 0 of 4096 at 2199023247360: Input/output error
  /dev/mainvg/thin_rsnapshot: read failed after 0 of 4096 at 0: Input/output error
  /dev/mainvg/thin_rsnapshot: read failed after 0 of 4096 at 4096: Input/output error
  /dev/mainvg/thin_archive: read failed after 0 of 4096 at 805306302464: Input/output error
  /dev/mainvg/thin_archive: read failed after 0 of 4096 at 805306359808: Input/output error
  /dev/mainvg/thin_archive: read failed after 0 of 4096 at 0: Input/output error
  /dev/mainvg/thin_archive: read failed after 0 of 4096 at 4096: Input/output error

lvs
  LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert
  thin_archive mainvg Vwi-aotz- 500.00g thin_pool 94.65
  thin_rsnapshot mainvg Vwi-aotz- 1.50t thin_pool 94.01
  thin_pool mainvg twi-a-tz- 3.00t 71.65

lvresize -L 2T /dev/mapper/mainvg-thin_rsnapshot
  /dev/mainvg/thin_rsnapshot: read failed after 0 of 4096 at 1649267376128: Input/output error
  /dev/mainvg/thin_rsnapshot: read failed after 0 of 4096 at 1649267433472: Input/output error
  /dev/mainvg/thin_archive: read failed after 0 of 4096 at 536870846464: Input/output error
  /dev/mainvg/thin_archive: read failed after 0 of 4096 at 536870903808: Input/output error
  Extending logical volume thin_rsnapshot to 2.00 TiB
  Logical volume thin_rsnapshot successfully resized

Tags: lvm
affects: update-manager (Ubuntu) → lvm2 (Ubuntu)
description: updated
Revision history for this message
Peik (pakezonite) wrote :

Hi!

I'm not sure but this might be somehow related:
http://askubuntu.com/questions/758693/using-lvm-and-thin-provisioning-thinpool-on-a-pv-larger-than-2t

Just thought the thin pool size being over 2T might have something to do with it...

Also, I don't know if increasing the pool size automatically also increases the pool metadata portion,
so you might want to check that you're not running out.
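(Concretely, the metadata fill level shows up as a Meta% column in newer lvm2; the VG/pool names are taken from the report above:)

```shell
# Data% and Meta% report pool usage; a full metadata device
# puts the thin pool into a read-only/failed mode
lvs -o lv_name,lv_size,data_percent,metadata_percent mainvg/thin_pool
```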

mai ling (ml35) wrote :

I have just been bitten by this. I have deployed dozens of boxes with the same cloned disk image, so I expect more will hit me sooner or later. Does anyone know if there is a Red Hat Bugzilla issue for it?

RHEL clone (OL8.4), kernel 5.4.17-2102.202.5.el8uek.x86_64

[root@localhost ~]# journalctl --since '2022-02-09 10:47:54' --until '2022-02-09 10:47:56' --no-pager
-- Logs begin at Fri 2021-04-09 13:02:56 EEST, end at Wed 2022-03-23 16:02:07 EET. --
Feb 09 10:47:54 localhost.localdomain systemd[1]: Starting Cleanup of Temporary Directories...
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: btree spine: node_check failed: blocknr 10012793332687714485 != wanted 94
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: block manager: btree_node validator check failed for block 94
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: thin: process_cell: dm_thin_find_block() failed: error = -15
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: btree spine: node_check failed: blocknr 10012793332687714485 != wanted 94
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: block manager: btree_node validator check failed for block 94
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: thin: process_cell: dm_thin_find_block() failed: error = -15
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: btree spine: node_check failed: blocknr 10012793332687714485 != wanted 94
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: block manager: btree_node validator check failed for block 94
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: thin: process_cell: dm_thin_find_block() failed: error = -15
Feb 09 10:47:54 localhost.localdomain kernel: EXT4-fs error (device dm-10): __ext4_get_inode_loc:4713: inode #652801: block 2621472: comm systemd-tmpfile: unable to read itable block
Feb 09 10:47:54 localhost.localdomain kernel: device-mapper: btree spine: node_check failed: blocknr 10012793332687714485 != wanted 94
Feb 09 10:47:55 localhost.localdomain kernel: device-mapper: block manager: btree_node validator check failed for block 94
Feb 09 10:47:55 localhost.localdomain kernel: device-mapper: thin: process_cell: dm_thin_find_block() failed: error = -15
Feb 09 10:47:55 localhost.localdomain kernel: Buffer I/O error on dev dm-10, logical block 0, lost sync page write
Feb 09 10:47:55 localhost.localdomain kernel: EXT4-fs (dm-10): I/O error while writing superblock
Feb 09 10:47:55 localhost.localdomain kernel: device-mapper: btree spine: node_check failed: blocknr 10012793332687714485 != wanted 94
Feb 09 10:47:55 localhost.localdomain kernel: device-mapper: block manager: btree_node validator check failed for block 94
Feb 09 10:47:55 localhost.localdomain kernel: EXT4-fs warning (device dm-10): htree_dirblock_to_tree:997: inode #130564: lblock 0: comm systemd-tmpfile: error -5 reading directory block
Feb 09 10:47:55 localhost.localdomain kernel: device-mapper: btree spine: node_check failed: blocknr 10012793332687714485 != wanted 94
Feb 09 10:47:55 localhost.localdomain kernel: device-mapper: block manager: btree_node validator check failed for block 94
Feb 09 10:47:55 localhost.localdomain k...
