xend fails to connect guest to dom0 block device or loopback file

Bug #205450 reported by Andrew Shugg on 2008-03-23
Affects             Status     Importance  Assigned to  Milestone
xen-3.2 (Ubuntu)    Confirmed  Undecided   Unassigned
xen-tools (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

I have had trouble with Xen being unable to run paravirt domUs ever since I built a gutsy server a while ago. When booting, a domU would drop to the initrd busybox shell because it could not find its root device. (The same config works fine on an Ubuntu 7.04 Xen server.) I used dist-upgrade to track hardy, trying the xen-3.1 and then the xen-3.2 packages. For a while I blamed EVMS, because I was using EVMS volumes as the root and swap devices for the domUs. However, when I tried the loopback file method, the domU would not start either. The error message is the same as the one posted in another bug report:

https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/199533/comments/31

So today I put a new system disk into the problem server and installed from scratch using the hardy i386 alpha CD of 2008-03-16, then used apt-get to update to current (2008-03-23) packages. I installed ubuntu-xen-server and rebooted into the Xen dom0 kernel. The system is an old Pentium 4 (no HVM, no x86_64) with a single 80 GB IDE hard drive partitioned by the Ubuntu installer.

Linux dom0 2.6.24-12-xen #1 SMP Thu Mar 13 01:23:51 UTC 2008 i686 GNU/Linux

Built a test domU system:

sudo xen-create-image --verbose --accounts --cache --image=full --size=4Gb --swap=1024Mb --memory=256Mb --dhcp --dir=/xen --hostname=domU

Trying to start it up gives this error:

sysadmin@dom0:~$ sudo xm create -c /etc/xen/domU.cfg
Using config file "/etc/xen/domU.cfg".
Error: Device 2049 (vbd) could not be connected. losetup /dev/loop0 /xen/domains/domU/swap.img failed

However Xen seems to be fibbing:

sysadmin@dom0:~$ sudo losetup -a
/dev/loop0: [fe00]:1015812 (/xen/domains/domU/swap.img)
/dev/loop1: [fe00]:1015813 (/xen/domains/domU/disk.img)

With the default xen-tools.conf the device names are sdX. I tried changing them to xvdX, but this did not fix the problem. Of course, before trying 'xm create' again I ran:

sudo losetup -d /dev/loop0
sudo losetup -d /dev/loop1

So losetup works, and I can loopback-mount the disk.img filesystem onto /mnt without any problems.
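The manual check above (run losetup -a, then detach whatever is still attached) can be sketched in Python. This is only an illustration: the function names are made up for this example, and the regular expression assumes the older losetup -a output format pasted in this report.

```python
import re

# Matches the `losetup -a` format shown in this report, e.g.
#   /dev/loop0: [fe00]:1015812 (/xen/domains/domU/swap.img)
LOSETUP_LINE = re.compile(r"^(/dev/loop\d+): \[[0-9a-fA-F]+\]:\d+ \((.+)\)$")


def attached_images(losetup_output):
    """Map each backing file to the loop device it is attached to."""
    attached = {}
    for line in losetup_output.splitlines():
        match = LOSETUP_LINE.match(line.strip())
        if match:
            device, backing_file = match.groups()
            attached[backing_file] = device
    return attached


def stale_attachments(losetup_output, image_paths):
    """Return (image, loop device) pairs for images that are still attached."""
    attached = attached_images(losetup_output)
    return [(path, attached[path]) for path in image_paths if path in attached]


# Sample text copied from the losetup -a output in this report.
sample = """/dev/loop0: [fe00]:1015812 (/xen/domains/domU/swap.img)
/dev/loop1: [fe00]:1015813 (/xen/domains/domU/disk.img)"""

images = ["/xen/domains/domU/swap.img", "/xen/domains/domU/disk.img"]
for image, device in stale_attachments(sample, images):
    # Only prints the command; nothing is detached by this sketch.
    print("detach before retrying: losetup -d " + device + "  # " + image)
```

In real use the text would come from running losetup -a as root rather than from a pasted sample.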

The following are xen-related entries from /var/log/syslog at the time I was trying to start this domU.

Mar 23 17:30:59 dom0 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/1/2049
Mar 23 17:30:59 dom0 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/1/2050
Mar 23 17:30:59 dom0 logger: /etc/xen/scripts/vif-bridge: online XENBUS_PATH=backend/vif/1/0
Mar 23 17:31:00 dom0 logger: /etc/xen/scripts/block: Writing backend/vbd/1/2049/hotplug-error losetup /dev/loop0 /xen/domains/domU/swap.img failed backend/vbd/1/2049/hotplug-status error to xenstore.
Mar 23 17:31:00 dom0 logger: /etc/xen/scripts/block: losetup /dev/loop0 /xen/domains/domU/swap.img failed
Mar 23 17:31:00 dom0 logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge online for vif1.0, bridge eth1.
Mar 23 17:31:00 dom0 logger: /etc/xen/scripts/vif-bridge: Writing backend/vif/1/0/hotplug-status connected to xenstore.
Mar 23 17:31:01 dom0 logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/console/1/0
Mar 23 17:31:01 dom0 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/1/2049
Mar 23 17:31:01 dom0 logger: /etc/xen/scripts/vif-bridge: offline XENBUS_PATH=backend/vif/1/0
Mar 23 17:31:01 dom0 logger: /etc/xen/scripts/block: Writing backend/vbd/1/2049/hotplug-error xenstore-read backend/vbd/1/2049/node failed. backend/vbd/1/2049/hotplug-status error to xenstore.
Mar 23 17:31:02 dom0 logger: /etc/xen/scripts/block: xenstore-read backend/vbd/1/2049/node failed.
Mar 23 17:31:02 dom0 logger: /etc/xen/scripts/block: Writing backend/vbd/1/2049/hotplug-error /etc/xen/scripts/block failed; error detected. backend/vbd/1/2049/hotplug-status error to xenstore.
Mar 23 17:31:02 dom0 logger: /etc/xen/scripts/block: /etc/xen/scripts/block failed; error detected.
Mar 23 17:31:02 dom0 logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/vbd/1/2049
Mar 23 17:31:03 dom0 logger: /etc/xen/scripts/vif-bridge: brctl delif eth1 vif1.0 failed
Mar 23 17:31:03 dom0 logger: /etc/xen/scripts/vif-bridge: ifconfig vif1.0 down failed
Mar 23 17:31:03 dom0 logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif1.0, bridge eth1.
Mar 23 17:31:03 dom0 logger: /etc/xen/scripts/xen-hotplug-cleanup: XENBUS_PATH=backend/vif/1/0

/var/log/xen/xen-hotplug.log tells me unhelpful things like this:

xenstore-read: couldn't read path backend/vbd/8/2049/node
xenstore-read: couldn't read path backend/vbd/8/2050/node

And the last section of /var/log/xen/xend.log:

[2008-03-23 17:31:00 4665] DEBUG (DevController:150) Waiting for devices vif.
[2008-03-23 17:31:00 4665] DEBUG (DevController:155) Waiting for 0.
[2008-03-23 17:31:00 4665] DEBUG (DevController:594) hotplugStatusCallback /local/domain/0/backend/vif/1/0/hotplug-status.
[2008-03-23 17:31:00 4665] DEBUG (DevController:594) hotplugStatusCallback /local/domain/0/backend/vif/1/0/hotplug-status.
[2008-03-23 17:31:00 4665] DEBUG (DevController:608) hotplugStatusCallback 1.
[2008-03-23 17:31:00 4665] DEBUG (DevController:150) Waiting for devices vbd.
[2008-03-23 17:31:00 4665] DEBUG (DevController:155) Waiting for 2049.
[2008-03-23 17:31:01 4665] DEBUG (DevController:594) hotplugStatusCallback /local/domain/0/backend/vbd/1/2049/hotplug-status.
[2008-03-23 17:31:01 4665] DEBUG (DevController:608) hotplugStatusCallback 2.
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:1913) XendDomainInfo.destroy: domid=1
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:1930) XendDomainInfo.destroyDomain(1)
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:1548) Destroying device model
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:1555) Releasing devices
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:1561) Removing vif/0
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:1561) Removing vbd/2049
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/2049
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:1561) Removing vbd/2050
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/2050
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:1561) Removing console/0
[2008-03-23 17:31:01 4665] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0

So I'm not really sure where my problem is. xenstore? /etc/xen/scripts/block? /etc/xen/scripts/xen-hotplug-common.sh?

Can someone please help. It is driving me seriously up the wall that I cannot build a working Xen server with Ubuntu.

Andrew S.

Todd Deshane (deshantm) wrote :

Did you try tap:aio instead of file: in the guest config file?

Also, there have been a number of problems fixed, but a couple lingering issues with the latest packages.

see:
https://bugs.launchpad.net/ubuntu/+source/xen-3.2

Also, make sure that the modules loaded by /etc/init.d/xend are actually being loaded.
For reference, see:
https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/199533

Todd Deshane (deshantm) wrote :

I can confirm this behavior.

I think that it is *also* a bug with xen-tools since it should use tap:aio by default instead of file:/

changing to tap:aio lets the guest boot, but brings me back to this bug:

https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/144631

Changed in xen-3.2:
status: New → Confirmed
Todd Deshane (deshantm) wrote :

Additional information:

Attaching bootup process that dies at:
[ 0.242383] /build/buildd/linux-2.6.24/debian/build/custom-source-xen/drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[ 0.242392] Freeing unused kernel memory: 216k freed
Loading, please wait...
Begin: Loading essential drivers... ...
[ 0.459390] fuse init (API version 7.9)
[ 0.478378] thermal: Unknown symbol acpi_processor_set_thermal_limit
Done.
Begin: Running /scripts/init-premount ...
Done.
Begin: Mounting root file system... ...
Begin: Running /scripts/local-top ...
Done.
Begin: Waiting for root file system... ...
Done.
 Check root= bootarg cat /proc/cmdline
 or missing modules, devices: cat /proc/modules ls /dev
ALERT! /dev/sda2 does not exist. Dropping to a shell!

BusyBox v1.1.3 (Debian 1:1.1.3-5ubuntu12) Built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs)

The complete boot process and the output of the following commands are attached:

cat /proc/cmdline, cat /proc/modules and ls /dev

Todd Deshane (deshantm) wrote :

attaching xenguest1.cfg

Todd Deshane (deshantm) wrote :

My best guess is that it is a problem with the xen initrd (/boot/initrd.img-2.6.24-12-xen).

It seems to be only loading the fuse module and not the SCSI modules or any of the xen modules.

The following are loaded by xend:
blktap
blkbk
xenblk
netloop
netbk

Do any of these, and possibly also the SCSI modules, need to be explicitly in the initrd too?

I tried adding the following to /etc/initramfs-tools/modules and then running update-initramfs -u

sd_mod
scsi_mod
blktap
blkbk
xenblk
netloop
netbk

The following similar thing happens:

[ 0.215498] Freeing unused kernel memory: 216k freed
Loading, please wait...
Begin: Loading essential drivers... ...
[ 0.433778] SCSI subsystem initialized
[ 0.434788] Driver 'sd' needs updating - please use bus_type methods
[ 0.458819] register_blkdev: cannot get major 8 for sd
[ 0.458836] vbd vbd-2049: 19 xlvbd_add at /local/domain/0/backend/tap/3/2049
[ 0.459117] register_blkdev: cannot get major 8 for sd
[ 0.459128] vbd vbd-2050: 19 xlvbd_add at /local/domain/0/backend/tap/3/2050
[ 5.547967] XENBUS: Waiting for devices to initialise: 295s...290s...285s...280s...275s...270s...265s...260s...255s...250s...245s...240s...235s...230s...225s...220s...215s...210s...205s...200s...195s...190s...185s...180s...175s...170s...165s...160s...155s...150s...145s...140s...135s...130s...125s...120s...115s...110s...105s...100s...95s...90s...85s...80s...75s...70s...65s...60s...55s...50s...45s...40s...35s...30s...25s...20s...15s...10s...5s...0s...
[ 298.329595] XENBUS: Device not ready: device/vbd/2049
[ 298.329599] XENBUS: Device not ready: device/vbd/2050
[ 298.348441] fuse init (API version 7.9)
[ 298.373876] thermal: Unknown symbol acpi_processor_set_thermal_limit
Done.
Begin: Running /scripts/init-premount ...
Done.
Begin: Mounting root file system... ...
Begin: Running /scripts/local-top ...
Done.
Begin: Waiting for root file system... ...
Done.
 Check root= bootarg cat /proc/cmdline
 or missing modules, devices: cat /proc/modules ls /dev
ALERT! /dev/sda2 does not exist. Dropping to a shell!

BusyBox v1.1.3 (Debian 1:1.1.3-5ubuntu12) Built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) cat /proc/modules
fuse 56368 1 - Live 0xffffffff8808c000
netbk 111600 0 [permanent], Live 0xffffffff8806f000
netloop 8960 0 - Live 0xffffffff8806b000
xenblk 22496 0 - Live 0xffffffff88064000
blkbk 26936 0 [permanent], Live 0xffffffff8805c000
blktap 123908 0 [permanent], Live 0xffffffff8803c000
xenbus_be 5888 3 netbk,blkbk,blktap, Live 0xffffffff88039000
sd_mod 33280 0 - Live 0xffffffff8802f000
scsi_mod 179000 1 sd_mod, Live 0xffffffff88002000
(initramfs)

I also noticed that /var/log/xen/xen-hotplug.log has:
Nothing to flush.
xenstore-read: couldn't read path backend/vbd/1/2049/node
xenstore-read: couldn't read path backend/vbd/1/2050/node
Nothing to flush.
xenstore-read: couldn't read path /local/domain/2/vm
xenstore-read: couldn't read path /local/domain/2/vm
Nothing to flush.

I still think it might have to do with the initrd, but not sure what exactly. Building from source did work, but it would be better to be able to use the ubuntu packages.
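A small Python sketch for comparing the module list added to /etc/initramfs-tools/modules with what /proc/modules reports inside the guest. The function names are hypothetical. Note that in the output pasted above every listed module is loaded and the guest still cannot find /dev/sda2, so a clean result here rules out only one possible cause.

```python
# Modules that were added to /etc/initramfs-tools/modules in this comment.
EXPECTED = ["sd_mod", "scsi_mod", "blktap", "blkbk", "xenblk", "netloop", "netbk"]


def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The module name is the first whitespace-separated field.
            names.add(fields[0])
    return names


def missing_modules(proc_modules_text, expected=EXPECTED):
    """Return the expected modules that are not loaded, in list order."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in expected if name not in loaded]


# Sample copied from the `cat /proc/modules` output in this comment.
sample = """fuse 56368 1 - Live 0xffffffff8808c000
netbk 111600 0 [permanent], Live 0xffffffff8806f000
netloop 8960 0 - Live 0xffffffff8806b000
xenblk 22496 0 - Live 0xffffffff88064000
blkbk 26936 0 [permanent], Live 0xffffffff8805c000
blktap 123908 0 [permanent], Live 0xffffffff8803c000
xenbus_be 5888 3 netbk,blkbk,blktap, Live 0xffffffff88039000
sd_mod 33280 0 - Live 0xffffffff8802f000
scsi_mod 179000 1 sd_mod, Live 0xffffffff88002000"""

print(missing_modules(sample))
```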

Todd Deshane (deshantm) wrote :

UPDATE:

Changing sda to hda in the guest config gets a little farther into the boot:

[ 1.252536] thermal: Unknown symbol acpi_processor_set_thermal_limit
Done.
Begin: Running /scripts/init-premount ...
Done.
Begin: Mounting root file system... ...
Begin: Running /scripts/local-top ...
Done.
Begin: Waiting for root file system... ...
Done.
Begin: Running /scripts/local-premount ...
/scripts/local-premount/resume: /scripts/local-premount/resume: 57: log_begin_msg: not found
/scripts/local-premount/resume: /scripts/local-premount/resume: 57: log_end_msg: not found
Done.
[ 6.782336] kjournald starting. Commit interval 5 seconds
[ 6.782349] EXT3-fs: mounted filesystem with ordered data mode.
Begin: Running /scripts/local-bottom ...
Done.
Done.
Begin: Running /scripts/init-bottom ...
Done.
INIT: version 2.86 booting
* Mount point '/dev/shm' does not exist. Skipping mount.
Activating swap...done.
Checking root file system...fsck 1.40-WIP (14-Nov-2006)
/lib/init/rw/rootdev: Superblock last mount time is in the future. FIXED.
/lib/init/rw/rootdev: Superblock last write time is in the future. FIXED.
/lib/init/rw/rootdev: clean, 10896/524288 files, 118298/1048576 blocks
done.
[ 8.028390] EXT3 FS on hda2, internal journal
Setting the system clock..
Cannot access the Hardware Clock via any known method.
Use the --debug option to see the details of our search for an access method.
Cleaning up ifupdown....
Loading kernel modules...done.
Loading device-mapper support[ 8.393666] device-mapper: uevent: version 1.0.3
[ 8.393690] device-mapper: ioctl: 4.12.0-ioctl (2007-10-02) initialised: <email address hidden>
.
Checking file systems...fsck 1.40-WIP (14-Nov-2006)
done.
Setting kernel variables...done.
Mounting local filesystems...done.
Activating swapfile swap...done.
Setting up networking....
Configuring network interfaces...Internet Systems Consortium DHCP Client V3.0.4
Copyright 2004-2006 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/

SIOCSIFADDR: No such device
eth0: ERROR while getting interface flags: No such device
eth0: ERROR while getting interface flags: No such device
Bind socket to interface: No such device
Failed to bring up eth0.
done.
INIT: Entering runlevel: 2
Starting system log daemon: syslogd.
Starting kernel log daemon: klogd.
* Not starting internet superserver: no services enabled.
Starting OpenBSD Secure Shell server: sshd.
Starting periodic command scheduler: crond.
<freezes here>

adding the extra='xencons=tty' gives a login prompt.

This leads to the networking bug(s)...

Unlike the reporters of the following bug, I don't (yet) have an eth0:

https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/204010

Todd Deshane (deshantm) wrote :

ok so a modprobe xennet gives me an eth0...

no connectivity yet though.

There are several strange bugs documented here, but I will move to bug #204010 to see if we can figure out the network stuff.

deti (deti) wrote :

The new kernel package 2.6.24-14 fixed this problem for me. Can anyone confirm this?
The networking bug #204010 seems not to be fixed by now.

deti said:
> The new kernel package 2.6.24-14 fixed this problem for me. Can anyone confirm this?
> The networking bug #204010 seems not to be fixed by now.

Not fixed for me by this package, no. =/

Andrew.

--
Andrew Shugg <email address hidden> http://www.neep.com.au/

"Just remember, Mr Fawlty, there's always someone worse off than yourself."
"Is there? Well I'd like to meet him. I could do with a good laugh."

Todd Deshane (deshantm) wrote :

It is actually fixed for me, but with the workarounds listed above:

sda --> hda

file:/ --> tap:aio/

a way to get the console:

      adding extra='xencons=tty' to domU config

      OR

      adding xvc0 to /etc/inittab in domU file system

The bugs above could be fixed by fixing xen-tools and/or are different separate bugs.

So, the bug as filed has basically been fixed and has workarounds for things that are broken otherwise
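The first two workarounds above amount to a mechanical rewrite of the disk entries in the domU config, which can be sketched in Python (the language xm config files are written in). This is only an illustration: apply_workarounds is a made-up helper, the input entries are an assumed example of what xen-tools generated, and each entry is assumed to have exactly three comma-separated fields. The root= line in the config would need the same device rename by hand.

```python
def apply_workarounds(disk_entries, rename_sd_to_hd=True):
    """Rewrite xm disk entries: file: -> tap:aio:, and optionally sdX -> hdX."""
    fixed = []
    for entry in disk_entries:
        # Assumes the "target,guest_device,mode" form, e.g.
        # "file:/path/disk.img,sda2,w".
        target, device, mode = entry.split(",")
        if target.startswith("file:"):
            target = "tap:aio:" + target[len("file:"):]
        if rename_sd_to_hd and device.startswith("sd"):
            device = "hd" + device[2:]
        fixed.append(",".join([target, device, mode]))
    return fixed


# Assumed example input, modelled on the image paths used in this report.
original = [
    "file:/xen/domains/domU/swap.img,sda1,w",
    "file:/xen/domains/domU/disk.img,sda2,w",
]

for line in apply_workarounds(original):
    print(line)
```

Passing rename_sd_to_hd=False keeps the device names untouched, for setups where the sda/hda problem does not occur.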

Andrew Shugg (ashugg) wrote :

Todd Deshane said:
> It is actually fixed for me, but with the workarounds listed above:
>
> sda --> hda
>
> file:/ --> tap:aio/

You're right, with the following config I have a working domU - as far
as the block devices are concerned anyway. (I did not have the hda/sda
problem: strange.)

kernel = '/boot/vmlinuz-2.6.24-15-xen'
ramdisk = '/boot/initrd.img-2.6.24-15-xen'
memory = '256'
root = '/dev/xvda2 ro'
disk = [
                  'tap:aio:/xen/domains/domU/swap.img,xvda1,w',
                  'tap:aio:/xen/domains/domU/disk.img,xvda2,w',
              ]
name = 'domU'
dhcp = 'dhcp'
vif = [ 'mac=00:16:3E:6C:07:90' ]
on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
extra = '2 console=xvc0'

Now I have to look at what you and other people have done to get a
functional console (got boot output but no getty) and networking.

> The bugs above could be fixed by fixing xen-tools and/or are different
> separate bugs.

And indeed xen-tools (3.8-4ubuntu4) has slipped in quietly to fix the
xvda problem and to add support for hardy guests. It hasn't fixed the
tap:aio: instead of file: yet.

> So, the bug as filed has basically been fixed and has workarounds for
> things that are broken otherwise

Agreed, I suppose.

Andrew.


kochab (kochab-gmail) wrote :

I've a problem similar to what Todd found in:
https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/205450/comments/3

I've filed a separate bug report here:
https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/214821

I'm using LVM instead of a loopback image, and I've tried changing to hda without success.

Todd says:
"I still think it might have to do with the initrd, but not sure what exactly. Building from source did work, but it would be better to be able to use the ubuntu packages."
Which package should I build from source, and with which options?
Thanks

Wido den Hollander (wido) wrote :

I can confirm this.

I added this to my initramfs modules:

sd_mod
scsi_mod
blkbk
xenblk
netloop
netbk
xennet

After I rebuilt my initramfs and used hda1 in my dom0.cfg, the guest starts and the network works.

Andrew Shugg (ashugg) wrote :

In linux-image-2.6.24-16-xen (2.6.24-16.30) I am back to the old problem:

sysadmin@dom0:~$ sudo xm create -c /etc/xen/gutsy.cfg
Using config file "/etc/xen/gutsy.cfg".
Error: Device 51713 (vbd) could not be connected. losetup /dev/loop4 /xen/domains/gutsy/swap.img failed
sysadmin@dom0:~$

Possibly this is because all the /lib/modules/`uname -r`/kernel/drivers/xen modules are missing... more detail here:

https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/199533/comments/41
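The device numbers in the two errors in this report (2049 earlier, 51713 here) appear to follow the classic Linux encoding of major * 256 + minor, so they can be decoded to see which guest device name the config asked for. A minimal Python sketch, assuming that encoding and covering only the hd, sd and xvd majors; decode_vbd is a hypothetical helper, not part of Xen.

```python
# major number -> (device name prefix, minor numbers reserved per disk)
DISK_TYPES = {
    3: ("hd", 64),
    8: ("sd", 16),
    202: ("xvd", 16),
}


def decode_vbd(number):
    """Turn a vbd device number such as 2049 into a name such as 'sda1'."""
    major, minor = number >> 8, number & 0xFF
    if major not in DISK_TYPES:
        return "unknown (major %d, minor %d)" % (major, minor)
    prefix, minors_per_disk = DISK_TYPES[major]
    disk_letter = chr(ord("a") + minor // minors_per_disk)
    partition = minor % minors_per_disk
    name = prefix + disk_letter
    # Partition 0 means the whole disk, so no number is appended.
    return name + str(partition) if partition else name


for number in (2049, 2050, 51713):
    print(number, decode_vbd(number))
```

Read this way, the first failure was on the sda1 entry (swap.img) of the original config and this one is on the xvda1 entry of gutsy.cfg, so the losetup failure shows up with both naming schemes.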

deti (deti) wrote :

Same here. Is there any known fix or workaround?

deti (deti) wrote :

The "tap:aio:" and "xvda" workaround still seems to work, but should this be the solution for the upcoming 8.04 LTS version? Networking still does not work either... this is depressing!

Todd Deshane (deshantm) wrote :

I don't think the file:/ problem is a high priority. I am surprised that it doesn't work; I bet it is just a module not loading or something similar.

deti: Did you notice that someone posted a patch and some links to custom packages in the network bug? You might want to try those.

Todd Deshane wrote:
> I don't think the file:/ is a high priority. I am surprised that it doesn't
> work, I bet it is just a module not loading or something similar.
>
> deti: Did you notice that in the network bug that someone posted a patch
> and some links to custom packages, you might want to try those.
>
I know, and I can confirm the packages are working well. It's just a little
annoying that we'll have to get packages from somewhere other than
the official repository.
Now, after the code freeze, it's very likely that we'll have to update
8.04 first before using Xen. That would be no real 'out of the box'
experience... but I understand well that it's a hard job bringing out
a new release of a distribution where everything works at first
glance.

brainstorm (brainstorm) wrote :

I can confirm the bug, and I've tried every single workaround/fix/variant of domU config files with no success (it keeps waiting to mount the root filesystem)... Is there one unified way to deal with this issue?

Todd Deshane (deshantm) wrote :

another workaround:

# see what losetup -f and -a report
$ sudo losetup -f
$ sudo losetup -a

# attach the image to the first free loop device, then check that it worked and which device it picked
$ sudo losetup `sudo losetup -f` <iso_file>
$ sudo losetup -a

# then use phy:/dev/loop<num> in your xen config file

I wouldn't recommend this for disk partitions; use tap:aio: instead.

But for cdrom image files, it is a good workaround.

Also, remember that file: is deprecated because of the problems with loopback devices,
and this workaround relies on loopback devices too, which is why you should only use it for ISO (read-only) files.
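The same steps can be sketched as a small Python wrapper. This is a hedged illustration only: attach_iso and phy_disk_entry are hypothetical helpers, the losetup calls need root, another process could claim the free loop device between the two calls, and the "hdc:cdrom" guest device is an assumed example rather than something stated in this thread.

```python
import subprocess


def attach_iso(iso_path, run=subprocess.check_output):
    """Attach iso_path to the first free loop device and return that device."""
    # `losetup -f` prints the first unused loop device, e.g. /dev/loop3.
    loop_dev = run(["losetup", "-f"]).decode().strip()
    # Attach the image to that device (same as: losetup /dev/loopN <iso_file>).
    run(["losetup", loop_dev, iso_path])
    return loop_dev


def phy_disk_entry(loop_dev, guest_dev="hdc:cdrom"):
    """Build a read-only phy: disk entry for the xen config file."""
    return "phy:%s,%s,r" % (loop_dev, guest_dev)
```

A call such as phy_disk_entry(attach_iso("/path/to/image.iso")) would then give the string to place in the disk list; the run parameter exists so the commands can be swapped for a stub when trying the sketch without root.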

brainstorm (brainstorm) wrote :

Thanks, fortunately it was just a "kept back" packages issue:

1) I installed hardy heron when it was devel
2) During subsequent upgrades, the 2.6...*-15* kernel was kept back instead of being upgraded to *-16*
3) apt-get dist-upgrade did the trick
4) tap:aio fixed the remaining vbd issue :)

Thanks !

Matey (mehdi-1) wrote :

I have had this problem (the "swap.img failed" error) for a while too.
I looked where Andrew Shugg pointed to here:
https://bugs.launchpad.net/ubuntu/+source/xen-3.2/+bug/199533/comments/41
and sure enough, none of those files exist. The whole xen directory is indeed missing.
So what's the cure? Where can I get those files?
ls: cannot access /lib/modules/2.6.24-21-xen/kernel/drivers/xen/: No such file or directory
(There are plenty of other directories under "drivers", from acpi to watchdog.)

BTW, I ran apt-get dist-upgrade, to no avail.

Thanks!
mehdi

Confirming because it happens to several users.

Changed in xen-tools (Ubuntu):
status: New → Confirmed