Wallaby c8 and c9 OVB jobs are failing the modify image step - mount point does not exist

Bug #1993730 reported by Ronelle Landy
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

Wallaby c8 and c9 OVB jobs are failing the modify image step with the following error:

2022-10-20 14:21:25.169304 | primary | fatal: [undercloud]: FAILED! => {
2022-10-20 14:21:25.169342 | primary | "msg": {
2022-10-20 14:21:25.169350 | primary | "stderr": "+ type tripleo-mount-image\n+ tripleo-mount-image -a /home/zuul/overcloud-hardened-uefi-full.raw -m /tmp/tmp.H48GWoVkkC\n+ qemu-img info --output json /home/zuul/overcloud-hardened-uefi-full.raw\n+ grep '\"format\": \"raw\"'\n+ image_format='--format raw'\n+ qemu-nbd --format raw --connect /dev/nbd0 /home/zuul/overcloud-hardened-uefi-full.raw\n+ vgscan\n+ vgchange --refresh\n+ vgchange -ay\n+ '[' -b /dev/nbd0p3 ']'\n+ '[' -b /dev/mapper/vg-lv_root ']'\n+ mount /dev/mapper/vg-lv_root /tmp/tmp.H48GWoVkkC\n+ for m in $MOUNTS\n+ device=/dev/mapper/vg-lv_var\n+ path=/var\n+ mount_volume /dev/mapper/vg-lv_var /tmp/tmp.H48GWoVkkC/var\n+ '[' -b /dev/mapper/vg-lv_var ']'\n+ '[' '!' -d /tmp/tmp.H48GWoVkkC/var ']'\n+ mount /dev/mapper/vg-lv_var /tmp/tmp.H48GWoVkkC/var\n+ for m in $MOUNTS\n+ device=/dev/mapper/vg-lv_log\n+ path=/var/log\n+ mount_volume /dev/mapper/vg-lv_log /tmp/tmp.H48GWoVkkC/var/log\n+ '[' -b /dev/mapper/vg-lv_log ']'\n+ '[' '!' -d /tmp/tmp.H48GWoVkkC/var/log ']'\n+ mount /dev/mapper/vg-lv_log /tmp/tmp.H48GWoVkkC/var/log\n+ for m in $MOUNTS\n+ device=/dev/mapper/vg-lv_audit\n+ path=/var/log/audit\n+ mount_volume /dev/mapper/vg-lv_audit /tmp/tmp.H48GWoVkkC/var/log/audit\n+ '[' -b /dev/mapper/vg-lv_audit ']'\n+ '[' '!' -d /tmp/tmp.H48GWoVkkC/var/log/audit ']'\n+ mount /dev/mapper/vg-lv_audit /tmp/tmp.H48GWoVkkC/var/log/audit\n+ for m in $MOUNTS\n+ device=/dev/mapper/vg-lv_home\n+ path=/home\n+ mount_volume /dev/mapper/vg-lv_home /tmp/tmp.H48GWoVkkC/home\n+ '[' -b /dev/mapper/vg-lv_home ']'\n+ '[' '!' -d /tmp/tmp.H48GWoVkkC/home ']'\n+ mount /dev/mapper/vg-lv_home /tmp/tmp.H48GWoVkkC/home\n+ for m in $MOUNTS\n+ device=/dev/mapper/vg-lv_tmp\n+ path=/tmp\n+ mount_volume /dev/mapper/vg-lv_tmp /tmp/tmp.H48GWoVkkC/tmp\n+ '[' -b /dev/mapper/vg-lv_tmp ']'\n+ '[' '!' 
-d /tmp/tmp.H48GWoVkkC/tmp ']'\n+ mount /dev/mapper/vg-lv_tmp /tmp/tmp.H48GWoVkkC/tmp\n+ for m in $MOUNTS\n+ device=/dev/mapper/vg-lv_srv\n+ path=/srv\n+ mount_volume /dev/mapper/vg-lv_srv /tmp/tmp.H48GWoVkkC/srv\n+ '[' -b /dev/mapper/vg-lv_srv ']'\n+ '[' '!' -d /tmp/tmp.H48GWoVkkC/srv ']'\n+ mount /dev/mapper/vg-lv_srv /tmp/tmp.H48GWoVkkC/srv\n+ blkid -t PARTLABEL=ESP /dev/nbd0p1\n+ mount /dev/nbd0p1 /tmp/tmp.H48GWoVkkC/boot/efi\nmount: /tmp/tmp.H48GWoVkkC/boot/efi: mount point does not exist.",
2022-10-20 14:21:25.169405 | primary | "stdout": " \"format\": \"raw\",\n Found volume group \"vg\" using metadata type lvm2\n 8 logical volume(s) in volume group \"vg\" now active\n/dev/nbd0p1: SEC_TYPE=\"msdos\" LABEL=\"MKFS_ESP\" UUID=\"4E18-3B23\" BLOCK_SIZE=\"512\" TYPE=\"vfat\" PARTLABEL=\"ESP\" PARTUUID=\"e01da6ac-34a9-4ac3-8e8a-c98afb8f577f\""

Example logs:

https://logserver.rdoproject.org/openstack-periodic-integration-stable1-cs8/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/2bcd4ce/job-output.txt

https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-wallaby/d6e9128/job-output.txt

https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-wallaby/15798ef/job-output.txt
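
The trace above ends with the ESP being mounted at boot/efi under the temporary root, where no such directory exists. One possible reading, which is an assumption and not something the logs confirm, is that the filesystem holding that directory had not been mounted yet. The sketch below shows two small guards for that situation: ordering mount paths so parents come before children, and checking a mount point before calling mount. The helper names `mount_order` and `require_mount_point` are hypothetical and are not part of tripleo-mount-image.

```shell
# Hypothetical helpers for illustration; not taken from tripleo-mount-image.

# Print the given mount paths one per line, parents before children.
# In the C locale a path always sorts before any path nested beneath it.
mount_order() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

# Fail early with a clear message when a mount point is missing,
# instead of letting mount(8) fail later.
# $1: directory where the image root is mounted
# $2: path inside the image, e.g. /boot/efi
require_mount_point() {
    if [ ! -d "$1$2" ]; then
        echo "mount point $2 is missing under $1; is its parent filesystem mounted?" >&2
        return 1
    fi
}
```

For example, `mount_order /boot/efi / /boot` lists `/`, then `/boot`, then `/boot/efi`, so a loop over that output would mount the boot filesystem before trying to use a directory inside it.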

Ronelle Landy (rlandy)
Changed in tripleo:
milestone: none → antelope-1
importance: Undecided → Critical
status: New → Triaged
tags: added: promotion-blocker
Ronelle Landy (rlandy) wrote :
Steve Baker (steve-stevebaker) wrote :

The wallaby-bootpart series actually depended on this series, which makes tripleo-mount-image more robust:

https://review.opendev.org/q/topic:wallaby-partitions

Could we please consider reviewing this series instead of backporting? I'd expect this change [1] to fix this particular issue in featureset001.

Steve Baker (steve-stevebaker) wrote :

s/backporting/reverting/

Steve Baker (steve-stevebaker) wrote :
Marios Andreou (marios-b) wrote :

For clarity:

(FYI @sbaker: Launchpad now supports editing and deleting comments.)

This bug was caused by the topic:wallaby-bootpart [1] patches merging into wallaby.

Instead of reverting them, we are going to try to merge topic:wallaby-partitions [2] as the fix.

[1] https://review.opendev.org/q/topic:wallaby-bootpart
[2] https://review.opendev.org/q/topic:wallaby-partitions

Marios Andreou (marios-b) wrote :

The patches at https://review.opendev.org/q/topic:wallaby-partitions have merged.

We have had a green run on fs1 for wallaby/centos9 at [1], but not yet for wallaby/centos8.

The last run on the centos8/wallaby line [2] failed with:

  2022-10-23 15:26:38.655471 | primary | TASK [modify-image : Debug image mount] ****************************************
  2022-10-23 15:26:38.655528 | primary | Sunday 23 October 2022 15:26:38 +0000 (0:00:01.110) 0:10:10.179 ********
  2022-10-23 15:26:38.695437 | primary | fatal: [undercloud]: FAILED! => {
  2022-10-23 15:26:38.695478 | primary | "msg": {
  2022-10-23 15:26:38.695488 | primary | "stderr": "+ type tripleo-mount-image\n+ tripleo-mount-image -a /home/zuul/overcloud-hardened-uefi-full.raw -m /tmp/tmp.prlglYUZEk\nlsblk: unknown column: PTTYPE",
  2022-10-23 15:26:38.695496 | primary | "stdout": "/tmp/tmp.prlglYUZEk is not a mountpoint"
  2022-10-23 15:26:38.695503 | primary | }

It is not yet clear whether that is related to the issue being tracked here (or to any of the fixes merged).

[1] https://review.rdoproject.org/zuul/build/1693650e7c424121a7645b4e5a10ce82
[2] https://logserver.rdoproject.org/openstack-periodic-integration-stable1-cs8/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/848e230/job-output.txt

Marios Andreou (marios-b) wrote :

We have more patches for the new issue seen in comment #6:

Fix issue in extract image https://review.opendev.org/c/openstack/diskimage-builder/+/850882

tripleo-mount-image: replace most lsblk calls with blkid https://review.opendev.org/c/openstack/tripleo-common/+/862249

Testing those with https://review.rdoproject.org/r/c/testproject/+/45812
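
The failure in comment #6 is lsblk rejecting the PTTYPE column, and the second patch title points at blkid as the replacement. As a rough sketch of that idea (not the code from the patch), the helper below tries lsblk first and falls back to blkid low-level probing when lsblk fails for any reason, including an unknown column. The function name `pttype_of` is hypothetical.

```shell
# Hypothetical helper for illustration; not the code from the patch above.
# Print the partition-table type (e.g. gpt or dos) of a block device.
# $1: block device, e.g. /dev/nbd0
pttype_of() {
    # lsblk exits non-zero when it does not know the PTTYPE column,
    # so its exit status decides whether the fallback is needed.
    # Note: the pttype variable is global in this sketch.
    if pttype=$(lsblk --nodeps --noheadings --output PTTYPE "$1" 2>/dev/null); then
        printf '%s\n' "$pttype"
    else
        # Low-level probing reads the device directly and reports PTTYPE.
        blkid -p -o value -s PTTYPE "$1"
    fi
}
```

Calling `pttype_of /dev/nbd0` needs a connected device and usually root privileges. The fallback has the advantage that it does not depend on which columns the installed lsblk happens to support.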

Marios Andreou (marios-b) wrote :

test looks good @ https://review.rdoproject.org/r/c/testproject/+/45812/3#message-185e1c5dd95f155e0eb920101d07f3137a7f4ac9

we need those patches to merge into master & wallaby

Alan Pevec (apevec) wrote :

> we need those patches to merge into master & wallaby

@marios DIB is master-only and we are not pinning it in Wallaby rdoinfo, so we should be good now that https://review.opendev.org/c/openstack/diskimage-builder/+/850882 has merged. Can you verify?

Amol Kahat (amolkahat) wrote :

I can see that modify-image is now using blkid, which verifies the fix. Marking this as Fix Released.

```
2022-12-19 17:10:35.654587 | primary | "stderr": "+ type tripleo-mount-image\n+ tripleo-mount-image -a /home/zuul/overcloud-hardened-uefi-full.raw -m /tmp/tmp.hB0FyDCaX9\n+ qemu-img info --output json /home/zuul/overcloud-hardened-uefi-full.raw\n+ grep '\"format\": \"raw\"'\n+ image_format='--format raw'\n+ qemu-nbd --format raw --connect /dev/nbd0 /home/zuul/overcloud-hardened-uefi-full.raw\n+ vgscan\n+ vgchange --refresh\n+ vgchange -ay\n+ root_device=\n+ boot_device=\n+ efi_device=\n+ timeout 5 sh -c 'while ! ls /dev/nbd0p* ; do sleep 1; done'\n+ set +e\n++ ls /dev/nbd0p1 /dev/nbd0p2 /dev/nbd0p3 /dev/nbd0p4\n+ devices='/dev/nbd0p1\n/dev/nbd0p2\n/dev/nbd0p3\n/dev/nbd0p4'\n+ set -e\n++ echo /dev/nbd0p1 /dev/nbd0p2 /dev/nbd0p3 /dev/nbd0p4\n++ wc -w\n+ device_count=4\n+ '[' 4 == 0 ']'\n+ '[' 4 == 1 ']'\n+ for device in ${devices}\n+ lsblk --nodeps -P --output-all /dev/nbd0p1\n++ blkid -o value -s TYPE -p /dev/nbd0p1\n+ fstype=vfat\n++ blkid -o value -s LABEL -p /dev/nbd0p1\n+ label=MKFS_ESP\n++ lsblk --all --nodeps --noheadings --output PARTTYPENAME /dev/nbd0p1\n+ part_type_name='EFI System'\n++ blkid -o value -s PART_ENTRY_TYPE -p /dev/nbd0p1\n+ part_type=c12a7328-f81f-11d2-ba4b-00a0c93ec93b\n+ '[' -z vfat ']'\n+ '[' -z '' ']'\n+ [[ c12a7328-f81f-11d2-ba4b-00a0c93ec93b == c12a7328-f81f-11d2-ba4b-00a0c93ec93b ]]\n+ efi_device=/dev/nbd0p1\n+ continue\n+ for device in ${devices}\n+ lsblk --nodeps -P --output-all /dev/nbd0p2\n++ blkid -o value -s TYPE -p /dev/nbd0p2\n+ fstype=\n++ blkid -o value -s LABEL -p /dev/nbd0p2\n+ label=\n++ lsblk --all --nodeps --noheadings --output PARTTYPENAME /dev/nbd0p2\n+ part_type_name='BIOS boot'\n++ blkid -o value -s PART_ENTRY_TYPE -p /dev/nbd0p2\n+ part_type=21686148-6449-6e6f-744e-656564454649\n+ '[' -z '' ']'\n+ continue\n+ for device in ${devices}\n+ lsblk --nodeps -P --output-all /dev/nbd0p3\n++ blkid -o value -s TYPE -p /dev/nbd0p3\n+ fstype=ext4\n++ blkid -o value -s LABEL -p /dev/nbd0p3\n+ label=mkfs_boot\n++ lsblk --all 
--nodeps --noheadings --output PARTTYPENAME /dev/nbd0p3\n+ part_type_name='Linux extended boot'\n++ blkid -o value -s PART_ENTRY_TYPE -p /dev/nbd0p3\n+ part_type=bc13c2ff-59e6-4262-a352-b275fd6f7172\n+ '[' -z ext4 ']'\n+ '[' -z /dev/nbd0p1 ']'\n+ '[' -z '' ']'\n+ [[ bc13c2ff-59e6-4262-a352-b275fd6f7172 == bc13c2ff-59e6-4262-a352-b275fd6f7172 ]]\n+ boot_device=/dev/nbd0p3\n+ continue\n+ for device in ${devices}\n+ lsblk --nodeps -P --output-all /dev/nbd0p4\n++ blkid -o value -s TYPE -p /dev/nbd0p4\n+ fstype=LVM2_member\n++ blkid -o value -s LABEL -p /dev/nbd0p4\n+ label=\n++ lsblk --all --nodeps --noheadings --output PARTTYPENAME /dev/nbd0p4\n+ part_type_name='Linux filesystem'\n++ blkid -o value -s PART_ENTRY_TYPE -p /dev/nbd0p4\n+ part_type=0fc63daf-8483-4772-8e79-3d69d8477de4\n+ '[' -z LVM2_member ']'\n+ '[' -z /dev/nbd0p1 ']'\n+ '[' -z /dev/nbd0p3 ']'\n+ '[' -z '' ']'\n+ root_device=/dev/nbd0p4\n+ continue\n+ '[' -z /dev/nbd0p4 ']'\n+ '[' -b /dev/mapper/vg-lv_root ']'\n+ mount /dev/mapper/vg-lv_root /tmp/tmp.hB0FyDCaX9\n+ for m in $MOUNTS\n+ device=/dev/mapper/vg-lv_var\n+ path=/var...

```

Changed in tripleo:
status: Triaged → Fix Released
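
The verification trace identifies each partition by comparing its PART_ENTRY_TYPE GUID, as reported by blkid, against known values. A minimal sketch of that classification follows. The function name `partition_role` and the role labels are illustrative assumptions; only the GUIDs and their lsblk type names come from the trace.

```shell
# Hypothetical classifier for illustration; the GUIDs are the ones seen in the trace.
# $1: partition type GUID, e.g. from: blkid -o value -s PART_ENTRY_TYPE -p <device>
partition_role() {
    case "$1" in
        c12a7328-f81f-11d2-ba4b-00a0c93ec93b) echo efi ;;        # EFI System
        21686148-6449-6e6f-744e-656564454649) echo bios-boot ;;  # BIOS boot
        bc13c2ff-59e6-4262-a352-b275fd6f7172) echo boot ;;       # Linux extended boot
        *) echo other ;;                                         # e.g. Linux filesystem
    esac
}
```

In the trace, the partition that matches none of the special GUIDs (the LVM2_member on /dev/nbd0p4) ends up as the root device, which is why the sketch leaves everything else as `other`.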