LVM install broken if other disks have meta-data on the VG name already

Bug #1905412 reported by Christian Ehrhardt 
This bug affects 1 person
Affects                            Status        Importance  Assigned to                             Milestone
The Ubuntu-power-systems project   Fix Released  Undecided   Ubuntu on IBM Power Systems Bug Triage
Ubuntu on IBM z Systems            Fix Released  Undecided   Skipper Bug Screeners
subiquity                          Fix Released  Undecided   Unassigned
curtin (Ubuntu)                    Confirmed     Undecided   Canonical Foundations Team

Bug Description

Hi,
I was puzzled today when my install aborted, until I looked in the crash file.
There I found:
 Running command ['vgcreate', '--force', '--zero=y', '--yes', 'ubuntu-vg', '/dev/dasda2'] with allowed return codes [0] (capture=True)
 An error occured handling 'lvm_volgroup-0': ProcessExecutionError - Unexpected error while running command.
 Command: ['vgcreate', '--force', '--zero=y', '--yes', 'ubuntu-vg', '/dev/dasda2']
 Exit code: 5
 Reason: -
 Stdout: ''
 Stderr: A volume group called ubuntu-vg already exists.

And now things fall into place.

I had a default VG, laid out the way the installer creates it, spanning a few disks.
Later my main root disk broke and I replaced it.

At install time I activated all the disks I eventually wanted to use. (This is s390x, hence the explicit disk activation, but from what I see in the crash I'd expect the same behavior on e.g. x86 if you replaced one disk and tried to re-install.)

What happens is that the disks I'm not installing onto still carry LVM metadata.
That metadata already defines "ubuntu-vg", which makes the install crash.

I think we need to harden the installer: it probably needs to wipe some signatures and re-probe LVM before it can proceed.
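
One way to picture the missing check is a pre-flight test that looks for the VG name before vgcreate runs. The following is only a sketch, not curtin's code: the helper names parse_vgs, vg_name_in_use and list_vgs are made up for illustration, and it assumes the plain two-column output of "vgs --noheadings -o vg_name,vg_uuid".

```python
import subprocess


def parse_vgs(output):
    """Turn `vgs --noheadings -o vg_name,vg_uuid` output into (name, uuid) pairs."""
    pairs = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2:  # skip blank or unexpected lines
            pairs.append((fields[0], fields[1]))
    return pairs


def vg_name_in_use(name, vgs_output):
    """Return True if any visible VG already carries this name."""
    return any(vg_name == name for vg_name, _uuid in parse_vgs(vgs_output))


def list_vgs():
    """Query LVM for all visible VGs (needs root and the lvm2 tools)."""
    result = subprocess.run(
        ["vgs", "--noheadings", "-o", "vg_name,vg_uuid"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Hypothetical sample: one stale VG left over on a disk that is not being installed to.
sample = "  ubuntu-vg 6j9dUF-KO8t-7Svv-RRze-VcMu-pMUY-FJxgA8\n"
print(vg_name_in_use("ubuntu-vg", sample))
print(vg_name_in_use("other-vg", sample))
```

On a live system the check would be vg_name_in_use("ubuntu-vg", list_vgs()); what to do on a hit (wipe, rename, or ask the user) is a separate decision.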

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Revision history for this message
Frank Heimes (fheimes) wrote :

Even if I think that this is not s390x-specific, I'm marking this as affecting s390x, since one bumped into this while doing an (LPAR) install on s390x.

tags: added: installer s390x
no longer affects: subiquity (Ubuntu)
Changed in ubuntu-z-systems:
assignee: nobody → Canonical Foundations Team (canonical-foundations)
description: updated
Steve Langasek (vorlon)
tags: added: fr-964
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

I think the issue here is that the guided lvm install always creates a vg named ubuntu-vg -- even if there is already a vg with that name on other disks. Installing over the top of the disk that has the ubuntu-vg vg on it *ought* to work but I wouldn't be that surprised if it didn't.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI I just looked at this system after installing with only one disk enabled.

I found this which might be interesting for LVM handling:

#1 two VGs with the same name co-exist
$ sudo vgs -o vg_name,vg_uuid
  VG VG UUID
  ubuntu-vg 6j9dUF-KO8t-7Svv-RRze-VcMu-pMUY-FJxgA8
  ubuntu-vg bKYrCn-gMDT-UFAW-XUlX-oRVS-hUxr-ksINed

Usual commands would fail:
$ sudo vgremove ubuntu-vg
  Multiple VGs found with the same name: skipping ubuntu-vg
  Use --select vg_uuid=<uuid> in place of the VG name.

We can look at the UUIDs and maybe even tell which VG was created by which install.

$ sudo vgdisplay
  --- Volume group ---
  VG Name ubuntu-vg
  System ID
  Format lvm2
  Metadata Areas 2
  Metadata Sequence No 6
  VG Access read/write
  VG Status resizable
  MAX LV 0
  Cur LV 1
  Open LV 0
  Max PV 0
  Cur PV 4
  Act PV 2
  VG Size <169.12 GiB
  PE Size 4.00 MiB
  Total PE 43294
  Alloc PE / Size 43294 / <169.12 GiB
  Free PE / Size 0 / 0
  VG UUID 6j9dUF-KO8t-7Svv-RRze-VcMu-pMUY-FJxgA8

  --- Volume group ---
  VG Name ubuntu-vg
  System ID
  Format lvm2
  Metadata Areas 1
  Metadata Sequence No 2
  VG Access read/write
  VG Status resizable
  MAX LV 0
  Cur LV 1
  Open LV 1
  Max PV 0
  Cur PV 1
  Act PV 1
  VG Size <19.63 GiB
  PE Size 4.00 MiB
  Total PE 5025
  Alloc PE / Size 5025 / <19.63 GiB
  Free PE / Size 0 / 0
  VG UUID bKYrCn-gMDT-UFAW-XUlX-oRVS-hUxr-ksINed

In my case for example I can actually remove one of them via:

$ sudo vgremove --select vg_uuid=bKYrCn-gMDT-UFAW-XUlX-oRVS-hUxr-ksINed
Do you really want to remove volume group "ubuntu-vg" containing 1 logical volumes? [y/n]: y
  Logical volume ubuntu-vg/ubuntu-lv contains a filesystem in use.

Maybe this helps with implementing some of the details around "potentially the same VG name".
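
As a rough illustration of the UUID-based handling shown above, the sketch below groups "vgs -o vg_name,vg_uuid" output by name and builds the "--select vg_uuid=<uuid>" arguments that LVM asks for. The function names duplicate_vg_names and select_by_uuid are hypothetical and not part of any installer.

```python
from collections import defaultdict


def duplicate_vg_names(vgs_output):
    """Map each VG name that occurs more than once to the list of its UUIDs."""
    by_name = defaultdict(list)
    for line in vgs_output.splitlines():
        fields = line.split()
        if len(fields) == 2:  # the "VG VG UUID" header splits into three fields
            by_name[fields[0]].append(fields[1])
    return {name: uuids for name, uuids in by_name.items() if len(uuids) > 1}


def select_by_uuid(uuid):
    """Arguments that address one VG unambiguously, as the LVM hint suggests."""
    return ["--select", "vg_uuid=" + uuid]


# Output copied from the vgs call above.
sample = """\
  VG VG UUID
  ubuntu-vg 6j9dUF-KO8t-7Svv-RRze-VcMu-pMUY-FJxgA8
  ubuntu-vg bKYrCn-gMDT-UFAW-XUlX-oRVS-hUxr-ksINed
"""

for name, uuids in duplicate_vg_names(sample).items():
    for uuid in uuids:
        print(name, ["vgremove"] + select_by_uuid(uuid))
```

The sketch only prints candidate commands; deciding which of the two VGs is the stale one still needs a human or extra information such as the PV list.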

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

One can make these things unambiguous.

curtin could generate a uuid on the fly, and then pass vg_uuid to all the commands it uses.

Or we could start making VG names slightly more unique, e.g. ubuntu-vg-3a185.

I'm not sure what's best.

Indeed this is a long-standing issue we have been observing with d-i, ubiquity, curtin, subiquity =)

Even mounting a backup of one Ubuntu system on another leads to similar confusion.
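
The "slightly more unique name" idea could look roughly like the sketch below. It is an assumption-laden illustration: unique_vg_name is a made-up helper, the five-character hex suffix merely mirrors the ubuntu-vg-3a185 example, and the set of existing names would have to come from a real LVM probe.

```python
import secrets


def unique_vg_name(base, existing, suffix_len=5):
    """Append a random hex suffix to base, retrying if the result is already taken."""
    while True:
        # token_hex(n) yields 2*n hex digits, so trim to the wanted length.
        suffix = secrets.token_hex(suffix_len)[:suffix_len]
        candidate = f"{base}-{suffix}"
        if candidate not in existing:
            return candidate


existing = {"ubuntu-vg"}  # hypothetical: names found on the other disks
print(unique_vg_name("ubuntu-vg", existing))
```

A random suffix avoids clashes without probing every disk first, at the cost of a VG name that users can no longer predict.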

Changed in curtin (Ubuntu):
status: New → Confirmed
Changed in subiquity:
status: New → Confirmed
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Confirmed
Changed in ubuntu-power-systems:
status: New → Confirmed
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2021-02-22 08:37 EDT-------
Is this a regression? It's been a long time since I installed Ubuntu, but I don't recall any issues when installing multiple instances on a lab machine. Perhaps I never chose LVM, though.

Other distros do ensure that the LVM volume name is unique, for example by appending digits such as "001", "002", "003", ... until it is unique.
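
A minimal sketch of that counter scheme, under stated assumptions: next_free_vg_name is a hypothetical helper, the "-" separator and three-digit width are guesses at the format, and no particular distro's implementation is being reproduced.

```python
import itertools


def next_free_vg_name(base, existing):
    """Return base if it is free, else base-001, base-002, ... until one is unused."""
    if base not in existing:
        return base
    for n in itertools.count(1):
        candidate = f"{base}-{n:03d}"
        if candidate not in existing:
            return candidate


print(next_free_vg_name("ubuntu-vg", set()))
print(next_free_vg_name("ubuntu-vg", {"ubuntu-vg", "ubuntu-vg-001"}))
```

Unlike a random suffix, this keeps names predictable, but it only works if every disk that might hold a clashing VG is visible when the name is chosen.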

tags: added: architecture-ppc64le bugnameltc-191598 severity-high targetmilestone-inin---
Changed in subiquity:
status: Confirmed → Fix Committed
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Confirmed → Fix Committed
Changed in ubuntu-power-systems:
status: Confirmed → Fix Committed
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
Changed in curtin (Ubuntu):
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Changed in ubuntu-z-systems:
assignee: Canonical Foundations Team (canonical-foundations) → Skipper Bug Screeners (skipper-screen-team)
Revision history for this message
Frank Heimes (fheimes) wrote :

Today I received a note from an IBMer who had issues (re-)installing an IBM Z system with various DASD ECKD disks in an LVM configuration.
I was able to recreate the situation on our system, and it is as follows:

more than 2 DASDs (used 8 DASD mod3) with:
dasda1 0.75 GB for /boot
dasda2 rest for a lvm vg
all other DASD entirely for that same vg
and a single lv with entire space on top of that vg as root

1) running this installation with pristine DASDs (never low-level formatted before) works fine
2) re-running that installation and only slightly changing the config of dasda (doing a reformat and defining /boot with 0.5 GB, rest for the vg) makes the installer fail
3) re-running that installation and reformatting ALL disks (and defining them with dasda1 /boot 0.5 GB and rest in vg like before) works again

case 2) shows:

 Verifying /dev/dasdc part 1 size, expecting 50027 tracks, found 50027 tracks
 /dev/dasdc partition 1 already present, skipping create
 An error occured handling 'partition-dasdc1': UnboundLocalError - local variable 'part_path' referenced before assignment
 finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: configuring partition: partition-dasdc1
 TIMED BLOCK_META: 5.790
 finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: curtin command block-meta
 Traceback (most recent call last):
   File "/snap/subiquity/2364/lib/python3.6/site-packages/curtin/commands/main.py", line 202, in main
     ret = args.func(args)
   File "/snap/subiquity/2364/lib/python3.6/site-packages/curtin/log.py", line 97, in wrapper
     return log_time("TIMED %s: " % msg, func, *args, **kwargs)
   File "/snap/subiquity/2364/lib/python3.6/site-packages/curtin/log.py", line 79, in log_time
     return func(*args, **kwargs)
   File "/snap/subiquity/2364/lib/python3.6/site-packages/curtin/commands/block_meta.py", line 111, in block_meta
     return meta_custom(args)
   File "/snap/subiquity/2364/lib/python3.6/site-packages/curtin/commands/block_meta.py", line 1935, in meta_custom
     handler(command, storage_config_dict)
   File "/snap/subiquity/2364/lib/python3.6/site-packages/curtin/commands/block_meta.py", line 974, in partition_handler
     LOG.debug('Wiping partition %s mode=%s', part_path, wipe_mode)
 UnboundLocalError: local variable 'part_path' referenced before assignment
 local variable 'part_path' referenced before assignment
 curtin: Installation failed with exception: Unexpected error while running command.
 Command: ['curtin', 'block-meta', 'simple']
 Exit code: 3

This message differs from the one in the bug description, but I've added this case here since the situation is very similar.
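
For readers unfamiliar with this class of error, here is a generic illustration of how an UnboundLocalError like the one in the traceback can arise when a variable is only assigned on the "create" path. This is not curtin's partition_handler; the function names and the /dev/example1 path are invented for the example.

```python
def describe_wipe_buggy(already_present):
    """part_path is only assigned when the partition gets created."""
    if not already_present:
        part_path = "/dev/example1"  # hypothetical device path
    # When creation was skipped, part_path was never bound.
    return "Wiping partition %s" % part_path


def describe_wipe_fixed(already_present):
    """Resolve the path up front so both branches can use it."""
    part_path = "/dev/example1"  # hypothetical device path
    if not already_present:
        pass  # partition creation would happen here
    return "Wiping partition %s" % part_path


try:
    describe_wipe_buggy(already_present=True)
except UnboundLocalError as exc:
    print("buggy:", exc)

print("fixed:", describe_wipe_fixed(already_present=True))
```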

(importance is probably 'Medium', since a workaround is to reformat all in case of a modified re-install)

And btw. a 21.04 daily was used (timestamp Apr 12th)

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Oh that's yet another new bug, but a pretty simple one: https://code.launchpad.net/~mwhudson/curtin/+git/curtin/+merge/401068

Revision history for this message
Thierry FAUCK (thierry-j) wrote :

I have tested the installation on a multipath disk with an LVM partition enabled, and it installed without any problem.
However, at reboot I got the following messages, though they could be related to other devices:
  Volume group "ubuntu-vg" not found
  Cannot process volume group ubuntu-vg
....
[FAILED] Failed to start Wait for u…omplete Device Initialization.
[ 657.094223] device-mapper: table: 253:78: multipath: error getting device
[ 658.357618] device-mapper: table: 253:85: multipath: error getting device

and after a while it booted completely and I can check

$ sudo vgscan
[sudo] password for ubuntu:
  Found volume group "rhel_ltc-boston11302" using metadata type lvm2
  Found volume group "ubuntu-vg" using metadata type lvm2
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
tmpfs 53598016 36800 53561216 1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv 205375464 10896284 183977036 6% /
tmpfs 267989952 0 267989952 0% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 4096 0 4096 0% /sys/fs/cgroup
/dev/mapper/mpathak-part2 999320 150216 780292 17% /boot
tmpfs 53597952 64 53597888 1% /run/user/1000

$ sudo vgdisplay ubuntu-vg
  --- Volume group ---
  VG Name ubuntu-vg
  System ID
  Format lvm2
  Metadata Areas 1
  Metadata Sequence No 2
  VG Access read/write
  VG Status resizable
  MAX LV 0
  Cur LV 1
  Open LV 1
  Max PV 0
  Cur PV 1
  Act PV 1
  VG Size 9.03 TiB
  PE Size 4.00 MiB
  Total PE 2367485
  Alloc PE / Size 51200 / 200.00 GiB
  Free PE / Size 2316285 / <8.84 TiB
  VG UUID NKRI0V-Vl3E-vWAr-EAPg-EKlz-ViDy-wk4Z5m

~$ sudo multipath
Apr 14 09:35:59 | sdbb: No SAS end device for 'end_device-2:5'
Apr 14 09:35:59 | sddt: No SAS end device for 'end_device-3:5'
Apr 14 09:35:59 | mpathbo: addmap [0 19396558848 multipath 0 0 2 1 service-time 0 1 1 67:80 1 service-time 0 1 1 71:176 1]
[ 1118.078930] device-mapper: table: 253:94: multipath: error getting device
Apr 14 09:35:59 | libdevmapper: ioctl/libdm-iface.c(1927): device-mapper: reload ioctl on mpathbo (253:94) failed: Device or resource busy
Apr 14 09:35:59 | dm_addmap: libdm task=0 error: Success
Apr 14 09:35:59 | mpathbo: ignoring map
Apr 14 09:35:59 | sdbb: No SAS end device for 'end_device-2:5'
Apr 14 09:35:59 | sddt: No SAS end device for 'end_device-3:5'
Apr 14 09:35:59 | mpathbo: addmap [0 19396558848 multipath 0 0 2 1 service-time 0 1 1 67:80 1 service-time 0 1 1 71:176 1]
[ 1118.117843] device-mapper: table: 253:94: multipath: error getting device
Apr 14 09:35:59 | libdevmapper: ioctl/libdm-iface.c(1927): device-mapper: reload ioctl on mpathbo (253:94) failed: Device or resource busy
Apr 14 09:35:59 | dm_addmap: libdm task=0 error: Success
Apr 14 09:35:59 | mpathbo: ignoring map

Revision history for this message
Thierry FAUCK (thierry-j) wrote :

As such I think the system is operational and the bug can be closed. If I missed something, let me know.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote : Re: [Bug 1905412] Re: LVM install broken if other disks have meta-data on the VG name already

On Wed, 14 Apr 2021 at 21:52, Thierry FAUCK <email address hidden>
wrote:

> I have tested the installation on multipath disk with LVM partition
> enabled - and it installed without any problem.
> However at reboot I got following messages but that could be related to
> other devices:
> Volume group "ubuntu-vg" not found
> Cannot process volume group ubuntu-vg
> ....
> [FAILED] Failed to start Wait for u…omplete Device Initialization.
> [ 657.094223] device-mapper: table: 253:78: multipath: error getting
> device
> [ 658.357618] device-mapper: table: 253:85: multipath: error getting
> device
>

Hm, is this your system with 160+ paths? Maybe something isn't waiting long
enough.

> and after a while it booted completely and I can check

Well if it booted OK it doesn't sound toooo bad :) Can you attach the
journal output? Possibly to a new bug report, as I'm pretty sure whatever
is going on here is not related to the problem this bug was about.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
Changed in subiquity:
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin2004
removed: targetmilestone-inin---
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released