GRUB (re)installation failing due to stale grub-{pc,efi}/install_devices

Bug #1940723 reported by Redouan Benazzi
This bug affects 30 people
Affects         Status     Importance  Assigned to  Milestone
grub2 (Ubuntu)  Confirmed  Medium      Mate Kukri

Bug Description

package shim-signed 1.40.6+15.4-0ubuntu7 failed to install/upgrade: installed shim-signed package post-installation script subprocess returned error exit status 32

ProblemType: Package
DistroRelease: Ubuntu 20.04
Package: shim-signed 1.40.6+15.4-0ubuntu7
ProcVersionSignature: Ubuntu 5.11.0-27.29~20.04.1-generic 5.11.22
Uname: Linux 5.11.0-27-generic x86_64
.proc.sys.kernel.moksbstate_disabled: Error: [Errno 2] No such file or directory: '/proc/sys/kernel/moksbstate_disabled'
ApportVersion: 2.20.11-0ubuntu27.18
Architecture: amd64
BootEFIContents:
 BOOTX64.CSV
 grub.cfg
 grubx64.efi
 mmx64.efi
 shimx64.efi
CasperMD5CheckResult: skip
Date: Sat Aug 21 00:04:46 2021
DuplicateSignature:
 package:shim-signed:1.40.6+15.4-0ubuntu7
 Setting up shim-signed (1.40.6+15.4-0ubuntu7) ...
 mount: /var/lib/grub/esp: special device /dev/disk/by-id/ata-WDC_WD5000AAKX-001CA0_WD-WMAYU5896427-part1 does not exist.
 dpkg: error processing package shim-signed (--configure):
  installed shim-signed package post-installation script subprocess returned error exit status 32
EFITables:
 Aug 20 23:52:39 benazzi-HP-EliteDesk-800-G1-SFF kernel: efi: EFI v2.31 by American Megatrends
 Aug 20 23:52:39 benazzi-HP-EliteDesk-800-G1-SFF kernel: efi: ACPI=0xd9900000 ACPI 2.0=0xd9900000 SMBIOS=0xd9f7e498
 Aug 20 23:52:39 benazzi-HP-EliteDesk-800-G1-SFF kernel: secureboot: Secure boot disabled
 Aug 20 23:52:39 benazzi-HP-EliteDesk-800-G1-SFF kernel: secureboot: Secure boot disabled
ErrorMessage: installed shim-signed package post-installation script subprocess returned error exit status 32
InstallationDate: Installed on 2021-03-08 (165 days ago)
InstallationMedia: Ubuntu 20.04.1 LTS "Focal Fossa" - Release amd64 (20200731)
Python3Details: /usr/bin/python3.8, Python 3.8.10, python3-minimal, 3.8.2-0ubuntu2
PythonDetails: /usr/bin/python2.7, Python 2.7.18, python-is-python2, 2.7.17-4
RelatedPackageVersions:
 dpkg 1.19.7ubuntu3
 apt 2.0.6
SecureBoot: 6 0 0 0 0
ShimDiff: Binary files /boot/efi/EFI/ubuntu/shimx64.efi and /usr/lib/shim/shimx64.efi.signed differ
SourcePackage: shim-signed
Title: package shim-signed 1.40.6+15.4-0ubuntu7 failed to install/upgrade: installed shim-signed package post-installation script subprocess returned error exit status 32
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Redouan Benazzi (axisysteme) wrote :
tags: removed: need-duplicate-check
Revision history for this message
Steve Langasek (vorlon) wrote : Re: shim-signed fails to install if grub disk config is incorrect

Please show the output of 'debconf-show grub-pc|grep install'.

summary: - package shim-signed 1.40.6+15.4-0ubuntu7 failed to install/upgrade:
- installed shim-signed package post-installation script subprocess
- returned error exit status 32
+ shim-signed fails to install if grub disk config is incorrect
Changed in shim-signed (Ubuntu):
status: New → Incomplete
Changed in grub2 (Ubuntu):
status: New → Incomplete
importance: Undecided → Critical
Revision history for this message
Steve Langasek (vorlon) wrote :

Looking at the dpkg history, it appears update-manager/aptdaemon was installing the package, so this isn't a problem of a non-interactive upgrade with no debconf frontend available (and if it were, that should result in an exit code of 1, not 32).

So there appears to be a bug in /usr/lib/grub/grub-multi-install, which should prompt to correct the disk list whenever there is an absent disk, and not exit with an error.

The problem appears to be that we do not check the return value of the 'mount' command, and only add disks to the "failed_devices" list if mount succeeds but grub-install fails.
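
To illustrate the failure mode described above, here is a minimal sketch of a per-device install step that records a device as failed when the mount itself fails, not only when grub-install fails. This is not the actual grub-multi-install code: the function name try_install and the overridable MOUNT_CMD, UMOUNT_CMD and INSTALL_CMD variables are hypothetical and exist only so the logic can be tried without root.

```shell
# Hypothetical sketch, NOT the real grub-multi-install code.
# The three commands are overridable so the logic can be tried without root.
MOUNT_CMD=${MOUNT_CMD:-mount}
UMOUNT_CMD=${UMOUNT_CMD:-umount}
INSTALL_CMD=${INSTALL_CMD:-grub-install}

failed_devices=""

# try_install DEVICE MOUNTPOINT
# Appends DEVICE to failed_devices if either the mount or the install fails.
try_install() {
    dev=$1
    mnt=$2
    if ! $MOUNT_CMD "$dev" "$mnt"; then
        # The case described above as unchecked: the ESP device is absent.
        failed_devices="$failed_devices $dev"
        return 0
    fi
    if ! $INSTALL_CMD --efi-directory="$mnt"; then
        failed_devices="$failed_devices $dev"
    fi
    $UMOUNT_CMD "$mnt"
}
```

With a check like this, an absent disk would end up in the failed list, so the caller could prompt for a corrected disk list instead of exiting with mount's status 32.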

tags: added: fr-1637
Changed in grub2 (Ubuntu):
status: Incomplete → New
Changed in shim-signed (Ubuntu):
status: Incomplete → New
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu):
status: New → Confirmed
Changed in shim-signed (Ubuntu):
status: New → Confirmed
Revision history for this message
Anders Bälter (abaelter) wrote :

Is it safe to remove shim-signed?

Revision history for this message
Steve Langasek (vorlon) wrote :

> Is it safe to remove shim-signed?

Certainly not. And it wouldn't fix the problem of grub-multi-install failing.

You can run dpkg-reconfigure grub-pc manually, which should allow you to update the disk configuration.

Revision history for this message
Anders Bälter (abaelter) wrote :

Ok, seemed like some Microsoft thing I don't need. But ok, too bad I have to do that on ~500 VMs.

How come this is only an issue on Azure Ubuntu vms?

Revision history for this message
Steve Langasek (vorlon) wrote :

It's not specific to Azure VMs, it's a problem whenever the disk configuration has changed vs what it was when grub was installed. It's unclear why this is happening for you in Azure; I defer to Pat from our Certified Public Cloud team who was investigating this.

Revision history for this message
Pat Viafore (patviafore) wrote :

I have been looking into why this has been happening on Azure Ubuntu VMs, but I have not been able to reproduce it.

I'm running:

 az vm create --name bug-test --resource-group canonical-patviafore --location southcentralus --image Canonical:0001-com-ubuntu-server-focal:20_04-lts:20.04.202107200 --size Standard_A2 --admin-username ubuntu --ssh-key-value <ssh keys>

And then doing an apt dist-upgrade.

I will need some way to reproduce the issue to investigate further why only your Azure instances are affected.

Revision history for this message
Pat Viafore (patviafore) wrote :

Do you mind providing your grub configuration? I know you said in the other bug that it was the system default, but I'd like to compare it against our fresh VM boot to see if anything is awry.

Revision history for this message
Anders Bälter (abaelter) wrote :

We are only seeing it on Azure, but other cloud providers' Ubuntu images don't come with shim-signed, so that's why I thought we could remove it.

lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 55.5M 1 loop /snap/core18/2074
loop2 7:2 0 32.3M 1 loop /snap/snapd/12398
sda 8:0 0 4G 0 disk /var/lib/rabbitmq
sdb 8:16 0 30G 0 disk
├─sdb1 8:17 0 29.9G 0 part /
├─sdb14 8:30 0 4M 0 part
└─sdb15 8:31 0 106M 0 part /boot/efi
sdc 8:32 0 4G 0 disk
└─sdc1 8:33 0 4G 0 part /mnt
sr0 11:0 1 638K 0 rom
zram0 252:0 0 229.2M 0 disk [SWAP]

/etc/fstab
# CLOUD_IMG: This file was created/modified by the Cloud Image build process
UUID=7339cdbb-1045-46fc-99df-ed81a4d0b313 / ext4 defaults,discard 0 1
UUID=BD61-C33D /boot/efi vfat umask=0077 0 1
/dev/disk/cloud/azure_resource-part1 /mnt auto defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig 0 2
UUID=face62a9-e9d1-40a2-b94b-d18497d7423d /var/lib/rabbitmq xfs defaults,noatime 0 0

/etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
GRUB_CMDLINE_LINUX=""

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"

Revision history for this message
Jason C. McDonald (codemouse92) wrote :

Hi @abaelter, I'm from the Canonical Public Cloud team. I'll be looking into this further.

Can you tell me what Azure VM size(s) you've observed this occurring on?

Revision history for this message
Anders Bälter (abaelter) wrote :

Great!

Standard_B1ms
Standard_B1s
Standard_D16as_v4
Standard_D2as_v4
Standard_D8as_v4

Revision history for this message
Jason C. McDonald (codemouse92) wrote :

Are you using packer or anything else to modify the image? How are you modifying the disk layout?

I'll double check against some of the provided VM sizes, but I don't think this is a problem with the image. It's likely a configuration issue...which is *not* to say it's not a bug.

Revision history for this message
Anders Bälter (abaelter) wrote :

We are using the REST API. Storage profiles look something like:

{
  "imageReference": {
    "publisher": "canonical",
    "offer": "0001-com-ubuntu-server-focal",
    "sku": "20_04-lts",
    "version": "latest"
  },
  "osDisk": {
    "name": "yadayada-os",
    "managedDisk": {
      "storageAccountType": "Standard_LRS"
    },
    "caching": "ReadWrite",
    "createOption": "fromImage",
  },
  "dataDisks": [
    {
      "name": "yadayada-data1",
      "diskSizeGB": 8,
      "lun": 0,
      "managedDisk": {
        "storageAccountType": "StandardSSD_LRS"
      },
      "caching": "ReadOnly",
      "createOption": "empty"
    }
  ]
}

The data disk storageAccountType can also be "Premium_LRS". Disk size varies between 4, 8, 128, 256, 512 and 1024 GB.

We are configuring the data disk like this:
mkfs.xfs -f /dev/sdc
mkdir -p /var/lib/rabbitmq
uuid=$(blkid -s UUID -o value /dev/sdc)
echo "UUID=$uuid /var/lib/rabbitmq xfs defaults,noatime 0 0" >> /etc/fstab
mount -a

Revision history for this message
Jason C. McDonald (codemouse92) wrote :

Did you remove or reconfigure any data disks before this bug appeared?

Revision history for this message
Anders Bälter (abaelter) wrote :

No

Revision history for this message
Anders Bälter (abaelter) wrote :

More context:

cat /boot/grub/device.map
cat: /boot/grub/device.map: No such file or directory

sudo debconf-show grub-pc | grep grub-pc/install_devices
  grub-pc/install_devices_failed: false
  grub-pc/install_devices_failed_upgrade: true
* grub-pc/install_devices_empty: false
  grub-pc/install_devices_disks_changed:
* grub-pc/install_devices: /dev/disk/by-id/scsi-14d534654202020206b732779d9c8429f997bc9f57f0b9212

ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root 9 Aug 18 11:36 ata-Virtual_CD -> ../../sr0
lrwxrwxrwx 1 root root 9 Aug 18 11:36 scsi-3600224800de237bbe8e76773044489a1 -> ../../sdc
lrwxrwxrwx 1 root root 9 Aug 18 11:36 scsi-360022480400bb6ba207caad0981cbca9 -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 18 11:36 scsi-360022480400bb6ba207caad0981cbca9-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 9 Aug 18 11:36 scsi-3600224806b732779d9c8c9f57f0b9212 -> ../../sda
lrwxrwxrwx 1 root root 10 Aug 18 11:36 scsi-3600224806b732779d9c8c9f57f0b9212-part1 -> ../../sda1
lrwxrwxrwx 1 root root 11 Aug 18 11:36 scsi-3600224806b732779d9c8c9f57f0b9212-part14 -> ../../sda14
lrwxrwxrwx 1 root root 11 Aug 18 11:36 scsi-3600224806b732779d9c8c9f57f0b9212-part15 -> ../../sda15
lrwxrwxrwx 1 root root 9 Aug 18 11:36 wwn-0x600224800de237bbe8e76773044489a1 -> ../../sdc
lrwxrwxrwx 1 root root 9 Aug 18 11:36 wwn-0x60022480400bb6ba207caad0981cbca9 -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 18 11:36 wwn-0x60022480400bb6ba207caad0981cbca9-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 9 Aug 18 11:36 wwn-0x600224806b732779d9c8c9f57f0b9212 -> ../../sda
lrwxrwxrwx 1 root root 10 Aug 18 11:36 wwn-0x600224806b732779d9c8c9f57f0b9212-part1 -> ../../sda1
lrwxrwxrwx 1 root root 11 Aug 18 11:36 wwn-0x600224806b732779d9c8c9f57f0b9212-part14 -> ../../sda14
lrwxrwxrwx 1 root root 11 Aug 18 11:36 wwn-0x600224806b732779d9c8c9f57f0b9212-part15 -> ../../sda15
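
The mismatch above (the configured install device does not appear among the current /dev/disk/by-id entries) can be checked mechanically. The following is a minimal sketch; stale_devices is a hypothetical helper name, and it assumes the debconf value is a comma-separated list of device paths.

```shell
# stale_devices LIST
# LIST is a comma-separated list of device paths (assumed debconf format).
# Prints every entry that does not currently exist on the system.
stale_devices() {
    printf '%s\n' "$1" | tr ',' '\n' | while read -r dev; do
        # read strips surrounding whitespace; skip empty entries
        [ -n "$dev" ] || continue
        [ -e "$dev" ] || printf '%s\n' "$dev"
    done
}
```

On an affected system the list could be fed in with something like `stale_devices "$(sudo debconf-show grub-pc | sed -n 's/^[* ] grub-pc\/install_devices: //p')"`. The sed expression is an assumption based on the debconf-show output format shown above.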

Revision history for this message
Marc Landtwing (marland123) wrote :

We seem to be hitting the same issue, also on Azure.

It happens when we deploy a VM from a Shared Image gallery and change the OS disk size during deployment. We use a template spec to deploy the VM. If we set diskSizeGB in the storageProfile, the issue appears.

mount: /var/lib/grub/esp: special device /dev/disk/by-id/scsi-14d534654202020203d0cbbcd0bb74669a97057319f6cd140-part15 does not exist.

ls -al /dev/disk/by-id/ | grep sda15
lrwxrwxrwx 1 root root 11 Sep 15 10:23 scsi-14d53465420202020ad9d668420ceb0419516df3a7418668d-part15 -> ../../sda15
lrwxrwxrwx 1 root root 11 Sep 15 10:23 scsi-360022480ad9d668420cedf3a7418668d-part15 -> ../../sda15
lrwxrwxrwx 1 root root 11 Sep 15 10:23 wwn-0x60022480ad9d668420cedf3a7418668d-part15 -> ../../sda15

It does not happen when we keep the original OS disk size from the image:
ls -al /dev/disk/by-id/ | grep sda15
lrwxrwxrwx 1 root root 11 Sep 14 09:04 scsi-14d534654202020203d0cbbcd0bb74669a97057319f6cd140-part15 -> ../../sda15
lrwxrwxrwx 1 root root 11 Sep 14 09:04 scsi-3600224803d0cbbcd0bb757319f6cd140-part15 -> ../../sda15
lrwxrwxrwx 1 root root 11 Sep 14 09:04 wwn-0x600224803d0cbbcd0bb757319f6cd140-part15 -> ../../sda15

Revision history for this message
Anders Bälter (abaelter) wrote :

We "solved" it by reinstalling grub.

I.e.

export DEBIAN_FRONTEND=noninteractive

apt-get purge grub\* -y --allow-remove-essential
apt-get install grub-efi -y
apt-get autoremove -y
update-grub

Revision history for this message
Marc Landtwing (marland123) wrote :

This should really be fixed properly and officially.

It's an issue both in the default Azure image and in the way Azure deploys a VM with a root disk larger than the image's, so it should be looked into by whoever is in charge of supporting Ubuntu on Azure.

Steve Langasek (vorlon)
tags: added: foundations-todo
Mate Kukri (mkukri)
Changed in grub2 (Ubuntu):
assignee: nobody → Mate Kukri (mkukri)
Mate Kukri (mkukri)
no longer affects: shim-signed (Ubuntu)
summary: - shim-signed fails to install if grub disk config is incorrect
+ GRUB (re)installation failing due to stale grub-{pc,efi}/install_devices
Revision history for this message
Mate Kukri (mkukri) wrote :

As far as we know, these issues are all caused by the install_devices debconf entry becoming invalid due to hardware configuration changes, and a subsequent upgrade in non-interactive mode then failing because of this.

Interactive upgrade or re-installation will simply prompt to specify the new install device path.

A way to make this nicer in the future would be UUID-based install_devices tracking, but I consider that a new feature, so I am lowering the priority to Medium.
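
As a rough illustration of what UUID-based tracking could look like, the sketch below resolves an ESP by filesystem UUID through /dev/disk/by-uuid instead of a stored by-id path. This is only one possible reading of the idea, not a description of any planned implementation. The helper name resolve_esp and the BY_UUID_DIR override are hypothetical, and it assumes the filesystem UUID stays stable when the disk's by-id name changes.

```shell
# Hypothetical sketch of UUID-based lookup; not part of GRUB or its packaging.
# BY_UUID_DIR is overridable so the lookup can be tried against a test directory.
BY_UUID_DIR=${BY_UUID_DIR:-/dev/disk/by-uuid}

# resolve_esp UUID
# Prints the device node the UUID currently points to, or returns 1 if absent.
resolve_esp() {
    link="$BY_UUID_DIR/$1"
    [ -e "$link" ] || return 1
    readlink -f "$link"
}
```

For the fstab posted earlier in this report, `resolve_esp BD61-C33D` would be expected to print whichever device currently holds /boot/efi, regardless of how its by-id name was generated.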

Changed in grub2 (Ubuntu):
importance: Critical → Medium