grub-pc.postinst script fails to detect virtio vda disk in KVM guest

Reported by nutznboltz on 2010-07-11
This bug affects 22 people
Affects: grub2 (Ubuntu)
Importance: High
Assigned to: Unassigned

Bug Description

A statement explaining the impact of the bug on users and justification for backporting the fix to the stable release: Multiple people report they cannot install grub2 on virtio disk on Ubuntu 10.04 LTS

An explanation of how the bug has been addressed in the development branch, including the relevant version numbers of packages modified in order to implement the fix.: Fixed (at least) in udev 167-0ubuntu3 in Natty 11.04

A minimal patch applicable to the stable version of the package. If preparing a patch is likely to be time-consuming, it may be preferable to get a general approval from the SRU team first.: see provided patch

Detailed instructions how to reproduce the bug. These should allow someone who is not familiar with the affected package to reproduce the bug and verify that the updated package fixes the problem. Please mark this with a line "TEST CASE:": On a KVM/qemu guest whose only disk is /dev/vda, run "dpkg-reconfigure grub-pc". The scripts in that package will fail because /dev/disk/by-id does not exist, due to missing udev rules.

A discussion of the regression potential of the patch and how users could get inadvertently affected.: The patch adds only two lines of udev rules and affects only /dev/vd* devices.

Binary package hint: grub-pc

The grub-pc.postinst script fails to detect a virtio "vda" disk in a KVM guest because it looks for entries in /dev/disk/by-id/*, and there is no /dev/disk/by-id directory at all when the only disk is a virtio vda one.
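
A rough way to check for this condition by hand, as a minimal sketch only. has_by_id_links is a hypothetical helper, not code from grub-pc.postinst, and the directory argument exists only so the check can be pointed at a scratch directory:

```shell
#!/bin/sh
# Hypothetical helper (not taken from grub-pc.postinst): succeeds only if
# the by-id directory exists and contains at least one entry.
has_by_id_links() {
    dir="${1:-/dev/disk/by-id}"
    [ -d "$dir" ] || return 1
    for entry in "$dir"/*; do
        # If the directory is empty the glob stays unexpanded,
        # so check that the name really exists.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            return 0
        fi
    done
    return 1
}

if has_by_id_links; then
    echo "by-id links present"
else
    echo "no by-id links"
fi
```

On an affected guest this should take the second branch, which is the situation in which the postinst script has nothing to offer as an install target.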

You can test this either by creating a new KVM VM guest running Ubuntu 10.04 with only virtio vda disk or by converting an existing hda disk VM guest into a vda one and then trying to purge and re-install grub-pc.

If the KVM VM guest running Ubuntu 10.04 is configured to boot from LVM this issue is not seen; the failure requires a guest with a bare /dev/vda disk.

Applying the patch (attached) to grub-pc.postinst in the KVM VM guest allows you to install (or re-install) grub-pc; without it the grub-pc package will not configure.

Patch is against grub-pc 1.98-1ubuntu6

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: grub-pc 1.98-1ubuntu6
ProcVersionSignature: Ubuntu 2.6.32-22.36-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic i686
NonfreeKernelModules: nvidia
Architecture: i386
Date: Sun Jul 11 12:15:58 2010
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: grub2

Colin Watson (cjwatson) wrote :

I would like to fix this somehow, although it seems to me that it's also a udev bug that it doesn't provide by-id links for virtio disks. However, using by-uuid is definitely wrong; by-uuid links only identify filesystems, and the most important entries in this context are those for the top-level disk device which does not typically contain a filesystem.

The patch I provided here is not meant for production use; it merely illustrates the bug.

Ah, fair enough then. On my list ...

 status triaged
 importance high

Changed in grub2 (Ubuntu):
importance: Undecided → High
status: New → Triaged

The same situation for xen guests. No by-id things, just by-label, by-path and by-uuid.

Jan Jonas (jj-learnbit) wrote :

As written in https://bugs.launchpad.net/ubuntu/+bug/524434 I had the same problem with a virtualized Ubuntu 10.04 under XenServer 5.6: the grub installer does not find the virtual hard disks /dev/xvdX.

Mark - Syminet (mark-syminet) wrote :

Same here - switched a few kvm guests from ide -> virtio disks. Ran apt-get upgrade and voila, broken packaging system. The patch provided by nutznboltz up there solved the issue.

Luís Silva (luis) wrote :

Same here, but on xen...

Luís Silva (luis) wrote :

However, reading comment 3, I didn't try the patch. Also, no by-id... just by-path and by-uuid...

James Stevens (jstevens) wrote :

Has this bug been addressed yet ???

We run 3 XenServers and we strictly run Ubuntu for the Linux side guests. I just tried to upgrade one of our name servers today and caught this bug. We have 39 active guests running Ubuntu. This could prove to be problematic unless it has been patched and apt-get update is just not picking up the patched version.

James Stevens (jstevens) on 2010-09-18
description: updated
Ronnie Jespersen (rj-tabulex) wrote :

Confirmed on xen...

Distributor ID: Ubuntu
Description: Ubuntu 10.04.1 LTS
Release: 10.04
Codename: lucid

Linux version 2.6.32-24-server (buildd@yellow) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) )

My current work-around for this is to build the KVM guest using grub2 and LVM encapsulation. For some reason grub2 has no issues installing when using that configuration.

1. Build using Ubuntu 9.10 or later server ISO
2. Select manual disk partitioning
3. Create a single primary partition for the entire disk
4. Configure that partition for LVM
5. Configure LVM to have a single vg (vg0) with one swap LV and one root filesystem LV.

One more thing: in order to boot from LVM you must use the linux-server kernel package as the linux-virtual one lacks an initrd with LVM modules.

Mark - Syminet (mark-syminet) wrote :

I can't believe you guys actually blackholed stdout during fsck... I updated the code for this to provide a status bar, like it should be with "e2fsck -C0 -y" as it's been since like, forever:

https://launchpad.net/~mark-syminet/+archive/syminet

...not applying this to any machine, for obvious reasons but would like to see this option in /etc/default/grub for reasons which should be obvious. The diffs are simple two-liners.

@Mark It is very important to realize that you can't submit bugs or comments in any arbitrary format to launchpad and expect them to be recognized. Take some time to learn how launchpad works as input which is not very precisely formatted gets ignored.
Some videos to watch:
http://www.youtube.com/CanonicalLaunchpad

@Mark if you really want to submit a real patch, learn how to validate that the change is upstream and how to generate a debdiff and attach it here.

Appears there are four different udev disk directories not three.

/dev/disk/by-id

/dev/disk/by-label

/dev/disk/by-path

/dev/disk/by-uuid

If you have only virtio disk (/dev/vda) then you will not have /dev/disk/by-id

If you have only virtio disk (/dev/vda) plus a disk label you will have /dev/disk/by-label and that is good enough for grub-pc wrt. installing/updating.

If you have only virtio disk (/dev/vda) then you will not have /dev/disk/by-id

should say:

If you have only virtio disk (/dev/vda) then you will not have /dev/disk/by-id unless you have an LVM vg.

Looking at it closer I don't think /dev/disk/by-label makes a difference.

It is as if there are at least four committees: grub2, kvm/virtio, udev and Canonical who need to have a meeting to discuss this.

/dev/disk/by-id is created by udev rules.

This is really a udev bug.

This is really a udev bug that was fixed (at least) in Natty.

udev 167-0ubuntu3 has

# virtio-blk
KERNEL=="vd*[!0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}"
KERNEL=="vd*[0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}-part%n"

in 60-persistent-storage.rules
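
To make the effect of those two rules concrete, here is a small sketch that mimics the link names they would produce. virtio_by_id_link is a hypothetical helper written for illustration, not part of udev, and the serial used in the example is just a sample value:

```shell
#!/bin/sh
# Illustration only: mimics the symlink names the two virtio-blk rules
# would create. virtio_by_id_link is a hypothetical helper, not udev code.
virtio_by_id_link() {
    kernel_name="$1"   # kernel device name, e.g. vda or vda1
    serial="$2"        # value of the "serial" sysfs attribute
    # ATTRS{serial}=="?*" only matches a non-empty serial, so an empty
    # serial yields no link at all.
    [ -n "$serial" ] || return 1
    case "$kernel_name" in
        vd*[!0-9])
            # First rule: whole disk
            echo "disk/by-id/virtio-${serial}"
            ;;
        vd*[0-9])
            # Second rule: partition; %n is the kernel number,
            # taken here as the trailing digits of the name
            echo "disk/by-id/virtio-${serial}-part${kernel_name##*[!0-9]}"
            ;;
        *)
            return 1
            ;;
    esac
}

virtio_by_id_link vda  WD-NUTZ9A966149
virtio_by_id_link vda1 WD-NUTZ9A966149
```

The point to notice is the first check: when the serial attribute is empty, neither rule matches and no by-id link is created.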

LP: #307845

marked invalid.

SRU me.

description: updated

TEST CASE:

Environment:

VM with a virtio boot disk (/dev/vda), a partition (like /dev/vda1) mounted on /, possibly also a partition (like /dev/vda1) mounted on /boot, and no /dev/disk/by-id directory (because of missing udev rules).

Example Environment:

$ ls /dev/[shv]d*
/dev/vda /dev/vda1 /dev/vda2
$ ls -ld /dev/disk/by-id
ls: cannot access /dev/disk/by-id: No such file or directory

Run:

sudo dpkg-reconfigure grub-pc

If you get these error messages:

``You chose not to install GRUB to any devices. If you continue, the boot loader may not be properly configured, and when your computer next starts up it will use whatever was previously in the boot sector. If there is an earlier version of GRUB 2 in the boot sector, it may be unable to load modules or handle the current configuration file.

If you are already running a different boot loader and want to carry on doing so, or if this is a special environment where you do not need a boot loader, then you should continue anyway. Otherwise, you should install GRUB somewhere.

Continue without installing GRUB?''

Then the bug this ticket (LP: #604335) is about is present.

If you get

$ ls -ld /dev/disk/by-id
drwxr-xr-x 2 root root 120 2011-06-09 16:00 /dev/disk/by-id

and, when you run "sudo dpkg-reconfigure grub-pc", you see /dev/vda and /dev/vda1 offered as targets for installing grub2 and do not get the above error messages, then you do not have the bug.
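
The environment part of this test case can be pre-checked with something like the sketch below. virtio_by_id_state is a hypothetical helper and only looks at device names and the by-id directory; running dpkg-reconfigure as described above is still the actual test. The root directory is a parameter (defaulting to /dev) only so the function can be tried on a scratch tree:

```shell
#!/bin/sh
# Hypothetical pre-check for the test environment; dev_root defaults to /dev.
# Prints one of: no-virtio-disks, by-id-missing, by-id-present.
virtio_by_id_state() {
    dev_root="${1:-/dev}"
    found_virtio=no
    for node in "$dev_root"/vd[a-z]*; do
        if [ -e "$node" ]; then
            found_virtio=yes
        fi
    done
    if [ "$found_virtio" = no ]; then
        echo "no-virtio-disks"
    elif [ -d "$dev_root/disk/by-id" ]; then
        echo "by-id-present"
    else
        echo "by-id-missing"
    fi
}

virtio_by_id_state
```

A result of "by-id-missing" corresponds to the example environment shown above. Note that "by-id-present" alone does not prove the virtio disks have links there, since the directory may exist for other devices.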

This requires modifying the kernel.

http://lists.gnu.org/archive/html/qemu-devel/2010-06/msg02502.html
 ``[Qemu-devel] [PATCH 1/2] Add 'serial' attribute to virtio-blk devices''

The refusal to SRU the original ("grub-pc.postinst.udiff") patch means that kernel changes and udev changes are needed for a bug that was fixed in 10.10.

The grub-install package does not look for /dev/disk/by-id

A peek inside the bash script in that package named /usr/bin/grub-installer shows it just has code to match "/dev/[hsv]d[a-z]".
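
For illustration, this is how such a name-based pattern behaves. This is a sketch, not the code from that script: matches_disk_pattern is a hypothetical helper, and it assumes the pattern is applied to the whole device path:

```shell
#!/bin/sh
# Sketch only: assumes the pattern is matched against the whole path.
# matches_disk_pattern is a hypothetical name, not from grub-installer.
matches_disk_pattern() {
    case "$1" in
        /dev/[hsv]d[a-z]) return 0 ;;
        *) return 1 ;;
    esac
}

for dev in /dev/hda /dev/sda /dev/vda /dev/vda1 /dev/xvda; do
    if matches_disk_pattern "$dev"; then
        echo "$dev: match"
    else
        echo "$dev: no match"
    fi
done
```

Under that assumption a virtio disk such as /dev/vda matches purely by name, with no dependence on /dev/disk/by-id, while a Xen /dev/xvda does not.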

PPA with "grub-pc.postinst.udiff" patch integrated since it's less intrusive than kernel patches + udev patches and the issue was fixed in 10.10 here:

https://launchpad.net/~nutznboltz/+archive/working-grub2-on-virtio-disk-for-lts

Patch attachment is a debdiff against the latest proposed grub2-pc package plus the postinstall script hack.

I'm not done working on the patch but it's better to have this than to only have an LTS with a broken grub-pc package in "uF" state.

With that PPA you can run "sudo dpkg-reconfigure grub-pc" on virtio block disk but you get a warning message about swap partitions (e.g. /dev/vda1). You can click "Yes" to continue past it and the package status is still "ii" not "uF" or anything indicating configuration failed.

As to why it takes virtio block driver changes for /dev/disk/by-id to work, compare

sudo sg_vpd --page=0x80 /dev/sda

with

sudo sg_vpd --page=0x80 /dev/vda

$ lsb_release -sd
Ubuntu 11.04
$ ls -l /sys/block/vda/serial
-r--r--r-- 1 root root 4096 2011-06-16 13:22 /sys/block/vda/serial
$ sudo cat /sys/block/vda/serial
$

That file should contain a serial number but it doesn't.
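
To check every virtio disk at once, something like the following sketch could be used. report_virtio_serials is a hypothetical helper; the sysfs root is a parameter (defaulting to /sys/block) only so it can be exercised against a scratch directory:

```shell
#!/bin/sh
# Hypothetical helper: prints each vd* block device that has a "serial"
# attribute, together with the serial or "(empty)".
report_virtio_serials() {
    sys_block="${1:-/sys/block}"
    for dev in "$sys_block"/vd*; do
        # Skip when nothing matched the glob or the attribute is absent.
        [ -e "$dev/serial" ] || continue
        serial=$(cat "$dev/serial" 2>/dev/null) || serial=""
        if [ -n "$serial" ]; then
            echo "${dev##*/}: $serial"
        else
            echo "${dev##*/}: (empty)"
        fi
    done
}

report_virtio_serials
```

Any device reported as "(empty)" is in the state shown in the transcript above.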

Same thing on Fedora (by way of the dm-devel mailing list)

http://www.redhat.com/archives/dm-devel/2010-September/msg00095.html

Follow up message has the quote "The qemu on the host isn't new enough to handle the request."
http://www.redhat.com/archives/dm-devel/2010-September/msg00145.html

Re: qemu newness. VIRTIO_BLK_T_GET_ID only appears twice inside the kernel and one of the times is the #define.

This is because VIRTIO_BLK_T_GET_ID is acted upon in qemu hw/virtio-blk.c

So in order to SRU you will need to SRU linux (kernel) virtio_blk.c changes, udev changes and qemu changes.

This has not been fixed in 10.10 or 11.04 either (despite my prior claims.)

from qemu git:

commit 2930b313dd602d67a568815b0b031b824916cec9
Author: john cooper
Date: Fri Jul 2 13:44:25 2010 -0400

    Add virtio disk identification support

    This patch adds the final missing bits for support of
    passing a serial/id string to a virtio-blk guest driver.

    The guest-side component already exists in the virtio
    driver, and has recently been reworked by Ryan to export
    a /sys interface for retrieval of the id from guest userland.

    Signed-off-by: john cooper
    Signed-off-by: Kevin Wolf

Still not fixed on 11.10

nutznboltz@hanuman:~$ lsb_release -sd
Ubuntu oneiric (development branch)
nutznboltz@hanuman:~$ uname -a
Linux hanuman 3.0-0-server #1-Ubuntu SMP Thu Jun 9 16:50:35 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
nutznboltz@hanuman:~$ sudo cat /sys/block/vda/serial
nutznboltz@hanuman:~$

/dev/vda symlinks are also not under /dev/disk/by-id

/sys/block/vda/serial should not be an empty file.

Heh, just thought "what if you have to set the serial number from the command line". Then I went to check and

nutznboltz@lakshmi:~$ qemu --help | egrep 'drive|serial' | head -2
-drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]
       [,serial=s][,addr=A][,id=name][,aio=threads|native]

libvirt wiki documents support for virtio disk serial numbers
http://libvirt.org/formatdomain.html#elementsDisks
``serial
    If present, this specify serial number of virtual hard drive. For example, it may look as <serial>WD-WMAP9A966149</serial>. Since 0.7.1 ''

virt-manager 0.8.7 (on 11.10) does not support virtio block serial numbers. If you provide a serial number at all, virt-manager will not start the VM guest.

Meanwhile I've confirmed that if you just edit the libvirt domain with "virsh edit ..." to insert <serial>FOO</serial> and boot the VM with "virsh start ...", then /dev/disk/by-id works on an 11.04 guest on an 11.04 host:

nutznboltz@grub-bug:~$ lsb_release -ds
Ubuntu 11.04
nutznboltz@grub-bug:~$ uname -a
Linux grub-bug 2.6.38-8-virtual #42-Ubuntu SMP Mon Apr 11 04:06:34 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
nutznboltz@grub-bug:~$ ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root 9 2011-06-17 11:04 ata-QEMU_DVD-ROM_QM00003 -> ../../sr0
lrwxrwxrwx 1 root root 9 2011-06-17 11:04 virtio-WD-NUTZ9A966149 -> ../../vda
lrwxrwxrwx 1 root root 10 2011-06-17 11:04 virtio-WD-NUTZ9A966149-part1 -> ../../vda1
lrwxrwxrwx 1 root root 10 2011-06-17 11:04 virtio-WD-NUTZ9A966149-part2 -> ../../vda2

Only Natty and Oneiric are working and only with the caveat that qemu must be invoked with options to set the serial number of the virtio block devices.
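
For guests started with qemu directly rather than through libvirt, the serial= sub-option from the -drive help text quoted earlier is the relevant knob. A minimal sketch of building such an argument follows. virtio_drive_arg is a hypothetical helper, the image path is made up, and if=virtio is assumed to be the interface type in use; the script only prints the command line and does not run qemu:

```shell
#!/bin/sh
# Hypothetical helper: builds the value for qemu's -drive option with a
# serial number set. It does not handle commas in the image path.
virtio_drive_arg() {
    image="$1"
    serial="$2"
    if [ -z "$image" ] || [ -z "$serial" ]; then
        return 1
    fi
    printf 'file=%s,if=virtio,serial=%s\n' "$image" "$serial"
}

# Example values only; the image path is made up.
echo "qemu -drive $(virtio_drive_arg /var/lib/images/guest.img WD-NUTZ9A966149)"
```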

Lucid will never be fixed correctly. The number of changes needed is too large. I went back to working on the script.

New version of PPA with script hack has been uploaded plus new debdiff attached.

$ lsb_release -sd
Ubuntu 10.04.2 LTS

$ sudo apt-get safe-upgrade
... skip ...
Setting up grub-pc (1.98-1ubuntu12.1~ppa2) ...
Installation finished. No error reported.
... skip ...

$ sudo dpkg-reconfigure grub-pc
Installation finished. No error reported.
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-2.6.32-32-server
Found initrd image: /boot/initrd.img-2.6.32-32-server
Found memtest86+ image: /boot/memtest86+.bin
done

Could 1.98-1ubuntu12.1~ppa2 be absolutely flawless?

1.98-1ubuntu12.1~ppa4 is more thorough about insisting on using /dev/vda devices. The previous versions would only try if /dev/disk/by-id was completely missing; this one looks for all the /dev/vd* devices it can find.

FWIW VMWare Fusion Version 3.1.3 (416484)

$ lsb_release -ds
Ubuntu 10.04.2 LTS

$ sudo sg_vpd --page=0x80 /dev/sda
Unit serial number VPD page:
fetching VPD page failed

$ lsmod | grep vm
vmblock 12995 1
vmmemctl 8572 0
vmci 31256 1 vsock
vmxnet 18624 0

I was told that SCSI serial numbers are an optional feature. Is grub-pc designed to require an optional feature?

tags: added: testcase