CONFIG_XEN_PLATFORM_PCI should be "y" when building 3.1 kernel

Bug #886521 reported by Boris Derzhavets
This bug affects 3 people
Affects         Status        Importance  Assigned to     Milestone
linux (Ubuntu)  Fix Released  Medium      Andy Whitcroft
Natty           Fix Released  Medium      Stefan Bader
Oneiric         Fix Released  Medium      Stefan Bader
Precise         Fix Released  Medium      Andy Whitcroft

Bug Description

SRU justification:

Impact: Xen HVM guests (since around 2.6.36) will by default disable (unplug) the emulated devices in favour of the paravirtualized drivers when support for both is present (module or built-in). This would make Xen installs more complicated (even if we put them into a udeb because they are not getting autoprobed).

Fix: Change the configurations to have those (and the platform driver which is recommended and actually non-configurable in 3.2 and later) built-in.

Testcase: Installing a normal cdimage in HVM mode.

---

This allows a PVHVM domain to load with xen_platform_pci=1 in the Python HVM profile.
Also, the xen-blkfront and xen-netfront drivers are not included in initrd.img on the ISO,
which requires the following at first boot:
(initramfs) modprobe xen-blkfront
(initramfs) modprobe xen-netfront
(initramfs) exit
to continue loading the PV-on-HVM domain.
---
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
DistroRelease: Ubuntu 11.10
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
Package: linux (not installed)
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
Tags: oneiric running-unity
Uname: Linux 3.1.0-030100-generic x86_64
UnreportableReason: The running kernel is not an Ubuntu kernel
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

affects: ubuntu → linux (Ubuntu)
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 886521

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
description: updated
Boris Derzhavets (bderzhavets) wrote :

root@boris-System-P5Q3:~# uname -a
Linux boris-System-P5Q3 3.1.0-030100-generic #201110241006 SMP Mon Oct 24 14:07:10 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Yes, I am running Ubuntu's 3.1 kernel in Dom0. I also believe this is a trivial issue that kernel developers have
nothing to do with. That is why I actually filed the bug against Ubuntu (12.04).

tags: added: apport-collected oneiric running-unity
description: updated
Boris Derzhavets (bderzhavets) wrote :

apport-collect 886521 has been run on Xen Host where I tested Oneiric with Ubuntu's 3.1 kernel as PV-on-HVM domain

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Brad Figg (brad-figg) wrote : Test with newer development kernel (3.0.0-12.20)

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel than the one you last tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

If you want this bot to quit automatically requesting kernel tests, add a tag named: kernel-bot-stop-nagging.

 Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.0.0-12.20
Boris Derzhavets (bderzhavets) wrote :

> We have noted that there is a newer version of the development kernel than the
>one you last tested when this
> issue was found.
> Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

I did all testing with the most recent Ubuntu 3.1 kernel in Dom0 and in the HVM DomU as well.
I am just saying that CONFIG_XEN_PLATFORM_PCI should be set to "y", as on Fedora 16 for instance.
Please see how RH's developers tuned the mainline kernel for Fedora 16. That is an excellent example.

You might lose the option of loading Precise in a PVHVM domain, because not every customer can rebuild
the kernel the Ubuntu way, e.g. reproduce:
http://blog.avirtualhome.com/2011/10/28/how-to-compile-a-new-ubuntu-11-10-oneiric-kernel/
which is what I did to load PV-on-HVM. I will attach the dmesg report for a PVHVM domain loaded with Ubuntu's 3.1
kernel with the xen-platform-pci driver built in, to make clearer what I am talking about.
Extract from the dmesg report in the PVHVM DomU:

[ 0.000000] DMI: Xen HVM domU, BIOS 4.1.2 10/23/2011
[ 0.000000] Hypervisor detected: Xen HVM
[ 0.000000] Xen version 4.1.
[ 0.000000] Xen Platform PCI: I/O protocol version 1
[ 0.000000] Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
[ 0.000000] Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
[ 0.000000] You might have to change the root device
[ 0.000000] from /dev/hd[a-d] to /dev/xvd[a-d]
[ 0.000000] in your root= kernel command line option
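
Read together with the rest of this report, the messages above suggest a simple rule: by default the kernel unplugs an emulated device class when both the Xen platform PCI driver and the matching frontend are compiled, whether built in or as modules. The Python sketch below is a simplified model of that rule, not the kernel's code. The function names are hypothetical, it assumes the modules are absent from the initrd (as reported for the ISO), and it models only the default case, xen_emul_unplug=never and xen_platform_pci=0.

```python
# Simplified model of the default unplug decision discussed in this report.
# Each option value is 'y' (built in), 'm' (module) or 'n' (not compiled).

def default_unplug(platform_pci, blkfront, netfront,
                   unplug_never=False, platform_device=True):
    """Return which emulated device classes the guest kernel would unplug."""
    def compiled(value):
        return value in ("y", "m")

    # xen_emul_unplug=never on the kernel command line, or
    # xen_platform_pci=0 in the guest config, leaves emulated devices alone.
    if unplug_never or not platform_device:
        return {"disks": False, "nics": False}
    return {
        "disks": compiled(platform_pci) and compiled(blkfront),
        "nics": compiled(platform_pci) and compiled(netfront),
    }


def root_disk_at_early_boot(platform_pci, blkfront, **kwargs):
    """True if some disk is usable before any module can be loaded.

    Assumes the PV modules are not in the initrd, as reported for the ISO.
    """
    unplug = default_unplug(platform_pci, blkfront, "n", **kwargs)
    if not unplug["disks"]:
        return True  # the emulated disk is still plugged in
    return platform_pci == "y" and blkfront == "y"


print(root_disk_at_early_boot("m", "m"))  # stock Oneiric settings
print(root_disk_at_early_boot("y", "y"))  # settings proposed in this report
```

Under these assumptions the stock settings leave no disk at early boot, while the built-in settings do, which matches the behaviour described in this report.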

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Boris Derzhavets (bderzhavets) wrote :

View also config-3.1.0-7.fc16.x86_64

Boris Derzhavets (bderzhavets) wrote :

Changes made to the config of the rebuilt kernel:
--- xenconf.3.1.0-030100-generic 2011-11-06 11:03:39.225828270 +0400
+++ xenconf.3.1.0-2-vnc 2011-11-06 11:04:58.621827223 +0400
@@ -16,19 +16,20 @@
 CONFIG_INPUT_XEN_KBDDEV_FRONTEND=m
 CONFIG_HVC_XEN=y
 CONFIG_XEN_WDT=m
-CONFIG_XEN_FBDEV_FRONTEND=m
+CONFIG_XEN_FBDEV_FRONTEND=y
 CONFIG_XEN_BALLOON=y
-# CONFIG_XEN_SELFBALLOONING is not set
+CONFIG_XEN_SELFBALLOONING=y
+CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y
 CONFIG_XEN_SCRUB_PAGES=y
 CONFIG_XEN_DEV_EVTCHN=m
 CONFIG_XEN_BACKEND=y
 CONFIG_XENFS=m
 CONFIG_XEN_COMPAT_XENFS=y
 CONFIG_XEN_SYS_HYPERVISOR=y
-CONFIG_XEN_XENBUS_FRONTEND=m
+CONFIG_XEN_XENBUS_FRONTEND=y
 CONFIG_XEN_GNTDEV=m
 CONFIG_XEN_GRANT_DEV_ALLOC=m
-CONFIG_XEN_PLATFORM_PCI=m
+CONFIG_XEN_PLATFORM_PCI=y
 CONFIG_SWIOTLB_XEN=y
 CONFIG_XEN_TMEM=y
 CONFIG_XEN_PCIDEV_BACKEND=m

Boris Derzhavets (bderzhavets) wrote :

Reproduced in Dom0
host : boris-System-P5Q3
release : 3.0.0-12-generic
version : #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011
machine : x86_64
nr_cpus : 4
nr_nodes : 1
cores_per_socket : 4
threads_per_core : 1
cpu_mhz : 2833
hw_caps : bfebfbff:20100800:00000000:00000940:0008e3fd:00000000:00000001:00000000
virt_caps : hvm
total_memory : 8190
free_memory : 3296
free_cpus : 0
xen_major : 4
xen_minor : 1
xen_extra : .2
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : unavailable
xen_commandline : placeholder
cc_compiler : gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)
cc_compile_by : bderzhavets
cc_compile_domain : yahoo.com
cc_compile_date : Sun Oct 23 09:14:14 UTC 2011
xend_config_format : 4

Oneiric PV-on-HVM loaded with kernel 3.0.0-13-pvhvm. dmesg log for DomU attached

Boris Derzhavets (bderzhavets) wrote :

I believe that the patch above has to be applied to the .config of any mainline kernel starting
with 3.0, e.g. 3.1, 3.2. Otherwise, Ubuntu PVHVM DomUs will be available
only to customers prepared to rebuild the kernel the Ubuntu way.

tags: added: patch
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Boris Derzhavets (bderzhavets) wrote :

One more suggestion - please, include xen-blkfront, xen-netfront drivers into initrd.img on ISO.

Boris Derzhavets (bderzhavets) wrote :

Official commit for 3.2

commit 5fbdc10395cd500d6ff844825a918c4e6f38de37
Author: Stefano Stabellini <email address hidden>
Date: Thu Sep 29 12:05:58 2011 +0100

    xen: remove XEN_PLATFORM_PCI config option

    Xen PVHVM needs xen-platform-pci, on the other hand xen-platform-pci is
    useless in any other cases.
    Therefore remove the XEN_PLATFORM_PCI config option and compile
    xen-platform-pci built-in if XEN_PVHVM is selected.

    Changes to v1:

    - remove xen-platform-pci.o and just use platform-pci.o since it is not
    externally visible anymore.

Changed in linux (Ubuntu):
importance: Undecided → Medium
Alex Bligh (ubuntu-alex-org) wrote :

I can confirm that no Ubuntu kernels (excepting some pre-Lucid) will boot on Xen4 in HVM mode if dom0 is configured to allow PCI access (the norm, and what you would want in case your guest OS's need it). What happens is early on in the boot process, the pci unplug stuff is run, resulting in /dev/sdX being unplugged. The PV drivers will not load due to no platform_pci module. So the OS will not boot. It is possible to fix this using command line magic (xen_emul_unplug=never from memory) but this is not practical in a cloud environment where users just expect their operating systems to install and run.

Another option would be putting this module in the udeb (possibly - untested).

However, every other distribution that builds in Xen4 support also puts this as a builtin. Therefore every other distro boots first time.

In my view this is a bug in Lucid, Maverick, Natty, Oneiric and Precise.

Boris Derzhavets (bderzhavets) wrote :

Daily build of precise (11/25/11) is already running 3.2 kernel. Bug seems to be fixed upstream.

boris@ubuntu1204:~$ uname -a
Linux ubuntu1204 3.2.0-1-generic #3-Ubuntu SMP Tue Nov 22 11:30:27 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

boris@ubuntu1204:~$ cat /boot/config-3.2.0-1-generic | grep XEN
CONFIG_XEN=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PRIVILEGED_GUEST=y
CONFIG_XEN_PVHVM=y <= is selected
CONFIG_XEN_MAX_DOMAIN_MEMORY=128
CONFIG_XEN_SAVE_RESTORE=y
# CONFIG_XEN_DEBUG_FS is not set
CONFIG_PCI_XEN=y
CONFIG_XEN_PCIDEV_FRONTEND=m
CONFIG_XEN_BLKDEV_FRONTEND=y <=
CONFIG_XEN_BLKDEV_BACKEND=m
CONFIG_NETXEN_NIC=m
CONFIG_XEN_NETDEV_FRONTEND=y <=
CONFIG_XEN_NETDEV_BACKEND=m
CONFIG_INPUT_XEN_KBDDEV_FRONTEND=m
CONFIG_HVC_XEN=y
CONFIG_XEN_WDT=m
CONFIG_XEN_FBDEV_FRONTEND=m
CONFIG_XEN_BALLOON=y
CONFIG_XEN_SELFBALLOONING=y
CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=m
CONFIG_XEN_BACKEND=y
CONFIG_XENFS=m
CONFIG_XEN_COMPAT_XENFS=y
CONFIG_XEN_SYS_HYPERVISOR=y
CONFIG_XEN_XENBUS_FRONTEND=y
CONFIG_XEN_GNTDEV=m
CONFIG_XEN_GRANT_DEV_ALLOC=m
CONFIG_SWIOTLB_XEN=y
CONFIG_XEN_TMEM=y
CONFIG_XEN_PCIDEV_BACKEND=m

Alex Bligh (ubuntu-alex-org) wrote :

Boris, are you suggesting that for pre-3.2 kernels, the fix is to backport the 3.2 fix (i.e. commit 5fbdc10395cd500d6ff844825a918c4e6f38de37, remove XEN_PLATFORM_PCI config option)? Whilst that would work, surely a less intrusive result would be achieved by configuring CONFIG_XEN_PLATFORM_PCI=y, CONFIG_XEN_BLKDEV_FRONTEND=y, CONFIG_XEN_NETDEV_FRONTEND=y on all kernels where XEN_PVHVM=y?

From memory, I think on 3.1 Oneiric, CONFIG_XEN_BLKDEV_FRONTEND=y, CONFIG_XEN_NETDEV_FRONTEND=y are already fixed but CONFIG_XEN_PLATFORM_PCI=y is not. On Lucid, Maverick, Natty, from memory all 3 need fixing.

Alex Bligh (ubuntu-alex-org) wrote :

(also, if we want to make sure install from DVD works, we should make sure the install DVD kernel, which from memory is -generic, has XEN_PVHVM=y and hence the other 3 options)

Boris Derzhavets (bderzhavets) wrote :

> Boris, are you suggesting the for pre-3.2 kernels, the fix is to backport the 3.2 fix (i.e commit
> 5fbdc10395cd500d6ff844825a918c4e6f38de37, remove XEN_PLATFORM_PCI config option)?

No, I am not suggesting this backport.
XEN_PLATFORM_PCI should be set to "y" for Oneiric kernels (3.0.X and 3.1.X).

> From memory, I think on 3.1 Oneiric, CONFIG_XEN_BLKDEV_FRONTEND=y,
> CONFIG_XEN_NETDEV_FRONTEND=y are already fixed

I don't think so. Please view :-

boris@boris-System-P5Q3:/boot$ cat config-3.0.0-13-generic | grep XEN
CONFIG_XEN=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PRIVILEGED_GUEST=y
CONFIG_XEN_PVHVM=y
CONFIG_XEN_MAX_DOMAIN_MEMORY=128
CONFIG_XEN_SAVE_RESTORE=y
# CONFIG_XEN_DEBUG_FS is not set
# CONFIG_XEN_DEBUG is not set
CONFIG_PCI_XEN=y
CONFIG_XEN_PCIDEV_FRONTEND=m
CONFIG_XEN_BLKDEV_FRONTEND=m <= not fixed
CONFIG_XEN_BLKDEV_BACKEND=m
CONFIG_NETXEN_NIC=m
CONFIG_XEN_NETDEV_FRONTEND=m <= not fixed
CONFIG_XEN_NETDEV_BACKEND=m
CONFIG_INPUT_XEN_KBDDEV_FRONTEND=m
CONFIG_HVC_XEN=y
CONFIG_XEN_WDT=m
CONFIG_XEN_FBDEV_FRONTEND=m
CONFIG_XEN_BALLOON=y
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=m
CONFIG_XEN_BACKEND=y
CONFIG_XENFS=m
CONFIG_XEN_COMPAT_XENFS=y
CONFIG_XEN_SYS_HYPERVISOR=y
CONFIG_XEN_XENBUS_FRONTEND=m
CONFIG_XEN_GNTDEV=m
CONFIG_XEN_GRANT_DEV_ALLOC=m
CONFIG_XEN_PLATFORM_PCI=m
CONFIG_SWIOTLB_XEN=y

I am just saying that, because Precise will ship with a 3.2.X kernel, the issue is already closed upstream.
I have already tested Precise (11/26/11) as PV-on-HVM on a Xen 4.1.2 Oneiric Dom0.
It works as expected. Ubuntu has nothing to fix with this kernel.

Boris Derzhavets (bderzhavets) wrote :

Sorry, check 3.1.1 kernel on Oneiric :-
boris@boris-System-P5Q3:/boot$ cat config-3.1.1-030101-generic | grep XEN_BLKDEV_FRONTEND
CONFIG_XEN_BLKDEV_FRONTEND=m
boris@boris-System-P5Q3:/boot$ cat config-3.1.1-030101-generic | grep XEN_NETDEV_FRONTEND
CONFIG_XEN_NETDEV_FRONTEND=m

Alex Bligh (ubuntu-alex-org) wrote :

So, in summary, for all pre-3.2 kernels in all flavours that have CONFIG_XEN_PVHVM=y (i.e. the appropriate flavours in Lucid, Maverick, Natty, Oneiric) we need to modify/add:
   CONFIG_XEN_BLKDEV_FRONTEND=y
   CONFIG_XEN_NETDEV_FRONTEND=y
   CONFIG_XEN_PLATFORM_PCI=y

For kernels without CONFIG_XEN_PVHVM=y, we need do nothing, as they don't have PV driver support anyway.

For Precise, we don't need to do anything, as CONFIG_XEN_PLATFORM_PCI has been deprecated, and per comment #13, CONFIG_XEN_BLKDEV_FRONTEND and CONFIG_XEN_NETDEV_FRONTEND have already moved from 'm' to 'y'.

Assuming that's correct, I might try knocking up some patches tomorrow.
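
As a rough aid for reviewing flavour configs against the summary above, here is a minimal Python sketch. It assumes a pre-3.2 config in the usual /boot/config-* text format; missing_builtins and REQUIRED_BUILTIN are hypothetical names, not part of any Ubuntu tooling.

```python
import re

# Options that the summary above says must be 'y' when CONFIG_XEN_PVHVM=y.
REQUIRED_BUILTIN = (
    "CONFIG_XEN_BLKDEV_FRONTEND",
    "CONFIG_XEN_NETDEV_FRONTEND",
    "CONFIG_XEN_PLATFORM_PCI",
)


def missing_builtins(config_text):
    """List required options that are not built in (pre-3.2 kernels only)."""
    def value(name):
        match = re.search(rf"^{name}=(\S+)$", config_text, re.MULTILINE)
        return match.group(1) if match else None

    # Without PVHVM there is no PV-on-HVM support, so nothing to fix.
    if value("CONFIG_XEN_PVHVM") != "y":
        return []
    return [name for name in REQUIRED_BUILTIN if value(name) != "y"]


# Excerpt of the 3.0.0-13-generic settings quoted earlier in this report.
sample = """CONFIG_XEN_PVHVM=y
CONFIG_XEN_BLKDEV_FRONTEND=m
CONFIG_XEN_NETDEV_FRONTEND=m
CONFIG_XEN_PLATFORM_PCI=m
"""
print(missing_builtins(sample))
```

On 3.2 and later the CONFIG_XEN_PLATFORM_PCI option no longer exists, so this check would report it as missing there; it is only meant for the older kernels.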

Boris Derzhavets (bderzhavets) wrote :

That's correct.

Alex Bligh (ubuntu-alex-org) wrote :

On Oneiric (and I think Natty), the problem is that CONFIG_XEN_PVHVM is defined as 'y' in the common config, but CONFIG_XEN_PLATFORM_PCI is defined as 'm' in the common config. In various flavours, CONFIG_XEN_NETDEV_FRONTEND and CONFIG_XEN_BLKDEV_FRONTEND are either defined as 'm' or 'y'. However, these will not work, as at boot time, we need CONFIG_XEN_PLATFORM_PCI to be 'y'. As this will cause the emulated devices to be unplugged, then we will always also need CONFIG_XEN_BLKDEV_FRONTEND=y and CONFIG_XEN_NETDEV_FRONTEND=y (or there will be no block dev or net dev to boot from).

So this patch takes the easy approach, and makes CONFIG_XEN_PLATFORM_PCI=y, CONFIG_XEN_NETDEV_FRONTEND=y and CONFIG_XEN_BLKDEV_FRONTEND=y in the common config. This may result in Xen stuff being compiled into certain flavours where it is not welcome (in which case those flavours should override the common config and set these to 'no'). I do not have the compile & test resources to test that.

This was developed against Oneiric, but visual inspection suggests the same needs to be done on Natty (but not Lucid or Maverick - see next comment).

THIS PATCH IS FOR COMMENT AND REVIEW ONLY. IT IS NOT EVEN COMPILE TESTED. DO NOT APPLY WITHOUT TESTING.

Alex Bligh (ubuntu-alex-org) wrote :

On Lucid (and I think Maverick), the CONFIG_XEN_PVHVM and CONFIG_XEN_PLATFORM_PCI settings are not defined, and indeed do not exist in the code. However, Lucid (and I think Maverick) still do not boot on Xen4 in HVM mode. I believe this is because these kernels behave in the same way as later kernels with CONFIG_XEN_PLATFORM_PCI=y and CONFIG_XEN_PVHVM=y. I.e., early in the boot sequence the Xen code is writing to the magic port, and on the basis of the host domU supporting PCI, unplugging the sda and eth0 devices. The pv modules appear to be omitted from the various udebs and installers and/or the install CDs, and CONFIG_XEN_NETDEV_FRONTEND and CONFIG_XEN_BLKDEV_FRONTEND are defined as 'm' in the common config.

The manner to support these consistent with the Xen4 changes proposed on Natty and Oneiric, and the upstream plus common config fix already applied to Precise, is to build these modules as builtins (I think a change to the installer CDs and udeb builds might be possible, but that is inconsistent and complicated).

The attached patch therefore makes a simple change to the common config to make CONFIG_XEN_NETDEV_FRONTEND=y and CONFIG_XEN_BLKDEV_FRONTEND=y, replacing the previous value 'm' (both cases). I have not build-tested or install-tested this patch as I do not have the compile and test resources.

This was developed against Lucid, but visual inspection suggests the same needs to be done on Maverick (but not Oneiric or Natty - see previous comment).

No change is necessary on Precise.

THIS PATCH IS FOR COMMENT AND REVIEW ONLY. IT IS NOT EVEN COMPILE TESTED. DO NOT APPLY WITHOUT TESTING.

Boris Derzhavets (bderzhavets) wrote :

> However, Lucid (and I think Maverick) still do not boot on Xen4 in HVM mode

Ubuntu 10.04.3 does boot at Xen 4.1 Oneiric Dom0 in HVM mode ( not PV-on-HVM)

Alex Bligh (ubuntu-alex-org) wrote :

Hmm. So we have xen configuration files set up for PV-on-HVM (in case the guest OS can support it, as we have no way of knowing what the guest OS is, as that's provided by the client), and our experience is that Lucid does not install from the install CDs. Looking at the code, it can't be the 'unplug' stuff, as that is only introduced in the Natty kernel, unless the installer CD is trying to be clever by using the Natty backport kernel (which doesn't work for the reasons listed above). We assumed this was essentially the same problem, but perhaps we need to revisit what is happening.

What I think I would now expect is that the Lucid kernel does not contain PV-on-HVM PV drivers (i.e. CONFIG_XEN_BLKDEV_FRONTEND and CONFIG_XEN_NETDEV_FRONTEND are there for paravirtualised mode, not PV on HVM), so it should all run through emulated drivers, and install on /dev/sda, irrespective of the PV driver settings in the config file. Is that what you would expect too Boris? (that's what happens on, e.g. Centos 5). However that's not what's happening.

Boris Derzhavets (bderzhavets) wrote :

> Is that what you would expect too Boris? (that's what happens on, e.g. Centos 5).
Yes.

> However that's not what's happening

I never put "xen_platform_pci=1" (or 0) in the Lucid HVM Python profile.
I just used the usual HVM profile. It works for me, and via virt-install as well.
I don't remember when PV-on-HVM was merged into mainline Linux (<= 3.0).

Alex Bligh (ubuntu-alex-org) wrote :

I'll get the config we used that is giving us Lucid/Maverick problems plus more details on what actually happens.

But at least we have Oneiric/Natty sorted.

Andy Whitcroft (apw) wrote :

As this option is already removed in Precise (such that it is always enabled) we can close this out as fixed there.

Changed in linux (Ubuntu Oneiric):
status: New → Triaged
importance: Undecided → Medium
Changed in linux (Ubuntu Precise):
status: Triaged → Fix Released
assignee: nobody → Andy Whitcroft (apw)
Alex Bligh (ubuntu-alex-org) wrote :

Andy: I have Diana Crisan at our end figuring out what the problem is on Lucid and/or Maverick.

Stefan Bader (smb) wrote :

We also changed blkfront and netfront to be built in for Precise. Lucid, at least, I have booted in the past on a Xen4 dom0 in HVM mode, though usually using the emulated devices rather than the paravirtualized ones. Traditionally we had the paravirt drivers built in only for the virtual kernel package, which would not be available or used when running an install in HVM mode. Since, in theory, the paravirt drivers should not be a problem on bare metal (no Xen bus and hypervisor calls available), I would say it should do no harm (apart from a slightly bigger kernel) to have them built in for all of x86.

Alex/Diana, can you attach the xen domU cfg and the domU dmesg for Lucid?

Alex Bligh (ubuntu-alex-org) wrote :

Lucid runs fine (sounds like a testing error). We are testing Maverick (which we think will work) and Natty (which we think won't) now.

Alex Bligh (ubuntu-alex-org) wrote :

Stefan: To be clear, we are talking about domU HVM with PV drivers.

In Precise, the HVM drivers are built in for all flavours. This is logical because at least on domU installs which have an HVM backend (as any cloud will), the PCI unplug mechanism will occur whether or not you have PV driver modules available (i.e. whether 'y', 'm', or 'n') so otherwise you end up with no drivers. So I agree with you, built-in is the way to go for Natty and Oneiric (this is what my patch does).

If the concern about a bigger kernel wins the day, then I would suggest turning off XEN_PLATFORM_PCI (or perhaps XEN support in total) so the kernel will at least boot without PV drivers (i.e. disabling the PCI bridge so the unplug mechanism deliberately fails). But I think built-in is a much better answer.

Alex Bligh (ubuntu-alex-org) wrote :

Diana has confirmed Maverick works ok too (albeit without pv drivers, as expected). So this bug is solely Natty & Oneiric.

Stefan Bader (smb) wrote :

Alex, Diana, I think even with the built-in paravirt net and block drivers there is still a problem which I think I saw on oneiric and right now confirmed on natty. I think it may still be unresolved even upstream, but it was one issue I did not follow before oneiric release as there were other pieces not working.

Booting without any xen_emul_unplug option. With the built-in blkfront and netfront (and pci platform). All emulated devices get unplugged as expected. The xvd driver does take over and a disk is found. It looks like the netfront would take over as well but it is not set up. This only works if in the config, the vif is not using type=ioemu.

Can you see the same? If yes, I need to verify this with precise guest and host combinations and then this should get a new bug and be tracked there. Having the options changed for natty and oneiric should be independent. At least things get better.

Alex Bligh (ubuntu-alex-org) wrote :

Stefan,

We can test that (well Diana can :-) ). Some questions, as these bugs tend to have near infinite combinatorial complexity:

1. Do you have a ppa or something with the kernel that you are using?

2. Are you saying the virtual net driver works if it is configured type=ioemu, or the net driver works if it is NOT configured type=ioemu?

3. Just to check our test environment is the same as yours: (a) you are using an unpatched Oneiric kernel and xen4 installation on dom0, (b) have made sure gntdev and evttable modules are loaded, (c) precise works fine as domU, but there are just problems with Natty and Oneiric domU.

4. Example xen config files of it working and not working would be helpful, and we should do the same.

I know we have got pv network working before now, I think on SLES.

Boris Derzhavets (bderzhavets) wrote : Re: [Bug 886521] Re: CONFIG_XEN_PLATFORM_PCI should be "y" when building 3.1 kernel

> It looks like the netfront would take over as well but it is not set up.
> This only works if in the config, the vif is not using type=ioemu.

Is that something that surprises you?

Gihan Munasinghe (gihan-v) wrote :

Guys

If I understand correctly, what Stefan describes is the default Xen behaviour: net device emulation works a bit differently from disk emulation in Xen. If you give the vif "type=ioemu", the host dom0 will not even try to create a net_back device, and lets qemu-dm create a qemu NIC (much like what KVM does for non-virtio devices).

This is why you have to create two NICs with the same MAC address, one with "type=ioemu" and one with "type=xen" (this can be anything except ioemu). I think this is the correct behaviour: the host does not need to know what OS the guest is running or whether the guest has net_front drivers, and it is left to the guest to pick which device path it wants to use.

I think you cannot do this with an xm config (the Python-based client utility), but you can easily do it in an xl config (the new C tool), or using the libxl/libxen library.
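
A hedged sketch of the two-NIC arrangement described above, written in Python because the guest profiles mentioned in this report use Python-style list syntax. The MAC address and bridge name are placeholders, dual_path_vifs is a hypothetical helper, and whether a given toolstack accepts two entries with the same MAC has not been verified here.

```python
def dual_path_vifs(mac, bridge):
    """Build one emulated and one paravirtual vif entry sharing a MAC."""
    emulated = f"type=ioemu, mac={mac}, bridge={bridge}"
    # Per the comment above, any type other than ioemu selects the PV path;
    # leaving the type out entirely is an assumption made for this sketch.
    paravirt = f"mac={mac}, bridge={bridge}"
    return [emulated, paravirt]


# Placeholder values, not taken from any config attached to this report.
vif = dual_path_vifs("00:16:3e:00:00:01", "xenbr0")
print(f"vif = {vif!r}")
```

The printed line has the shape of a vif entry in a guest config, so it can be compared by eye against a real profile before trying it.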

Stefan Bader (smb) wrote :

@Boris, not completely surprising but different from the behaviour for the block device.

@Alex, Diana,

1. No I have no PPA, just compiled what is currently in the master-next branch for natty and oneiric with the
    config change applied.

2. The virtual net driver does *not* work with a device declared type=ioemu

3. a) Guest kernels from master-next git (ubuntu) plus built-in pv-drivers and pci platform.
         dom0 is standard Oneiric Xen4
    b) evttable? Maybe xen_evtchn? That is loaded. gntdev is not. Frankly things just worked
        without that so I never bothered to find out what that is required for again.
   c) All of them work fine as HVM when unplug is prevented. But precise works without that
       option but requires the net device be defined without type=ioemu.

4. Will do next.

@Gihan, did not know one could define two adapters with the same mac. scary... will try that
out...

Stefan Bader (smb) wrote :

So just generally trying to summarize my "issue" with Xen behaviour here:

For the block device, no matter whether you have xvd? or hd? in the cfg, there will be a pv and an emulated device. Depending on having the pci platform driver and the pv driver built-in or as module, xen will remove the emulated devices (without the unplug parameter). At least in that case the pv device does work and you got a working disk.

For the nic, there *is* a difference whether type=ioemu is given or not. When it is present, there is a vif on the xenbus, but there seems to be no mac address set and it does not work. Still xen just unplugs the emulated device, when both drivers are present somehow. And that leaves you without working network (without Gihan's trick).

<rant>
Unfortunately there seems to be a reason for this whole unplug abomination (don't really understand it, but it is likely historical). If my config file says hda, I want an emulated disk and when it says xvda, I want to use the pv driver. Same for the network device. Now we got a situation where half the OS decides and half the config needs to be "clever". And I think there is no way of having for example a 3.0 kernel running that uses the emulated disks without showing the pv disks as well (when the kernel supports both).
</rant>

Stefan Bader (smb) wrote :

I stripped the comments so it looks a bit simpler and smaller. And you likely do not want the keymap line. ;)

Alex Bligh (ubuntu-alex-org) wrote :

Stefan,

Yes, the unplug "feature" was added in Xen between 3.3 and 3.4 (the magic port to support it), and it, um, somewhat lacks OS friendliness. The config file to achieve desired results is also non-obvious. It generates significant problems in moving VMs from Xen3 to Xen4, and also upgrading domU kernels (for instance, an upgrade from Lucid/Maverick to Natty/Oneiric/Precise will cause the emulated disk to disappear as well as the PV disk to appear under a different name, which will cause a reboot failure). In my opinion, it would have been better to leave the emulated disk there, possibly doing something to lock it (so it couldn't be mounted) or remove it if the PV disk is opened. It would also have been useful to give more control to the dom0 in what might and might not be unplugged. But "we are where we are".

As far as having a 3.0 kernel which accesses emulated disks but not PV disks, you can do this by disabling the whole of the PCI system, either from xen.conf or by blacklisting the module, or possibly by one of the command line boot scripts. But of course this will affect your net drivers too.

Alex

Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :

Gihan/Diana: could you post a typical xen config file for xen4 that we use, which works with both PV and non-PV OS's?

Revision history for this message
Gihan Munasinghe (gihan-v) wrote :

Stefan

Attached is the config file for xen4. Normally I use the same config file for pv and non-pv guests, and let the guest pick which device driver it wants to use. Passing the disk as xvda will work with hvm guests both with and without pv drivers. The config file is used with the xl command.

As a side note, if you don't have the "xen_gntdev" module loaded in the host (it is needed for the pv net drivers, at least that's what I found), you might need to modprobe it.
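
A minimal sketch of checking for that module on the host before starting a guest, assuming a Linux dom0 where /proc/modules lists loaded modules with underscores in their names. The helper name `module_loaded` is hypothetical and not part of any Xen tool.

```python
def module_loaded(name, proc_modules_text):
    """Return True if `name` is listed as a loaded module.

    `proc_modules_text` is the content of /proc/modules, where the first
    whitespace-separated field of each line is the module name.
    """
    # The kernel reports module names with underscores, so normalise dashes.
    wanted = name.replace("-", "_")
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False
```

On a dom0 this would be called with the text read from /proc/modules, followed by `modprobe xen_gntdev` if it returns False. A driver built into the kernel does not appear in /proc/modules, so a False result only means it is not loaded as a module.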

Revision history for this message
Stefan Bader (smb) wrote :

It seems this would only be accepted with xl, while xm complains. And yes, it seems that xm works without gntdev loaded, but xl will fail to initialize any xenbus (pv) devices when it is not loaded before starting the instance... There has been a bug report about that. I cannot remember whether it was only for precise (but it might make sense for oneiric too).

Stefan Bader (smb)
description: updated
Changed in linux (Ubuntu Natty):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu Oneiric):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
status: Triaged → In Progress
Revision history for this message
Boris Derzhavets (bderzhavets) wrote :

Stefan,

Have you tried a profile with the following entries

vif = [ 'type=ioemu, mac=00:16:3f:03:01:14, bridge=virbr0 ' ]
...
xen_platform_pci=0

to load Precise HVM with emulated devices?

Stefan Bader (smb)
Changed in linux (Ubuntu Oneiric):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Natty):
status: In Progress → Fix Committed
Revision history for this message
Stefan Bader (smb) wrote :

@Boris, no. But yes, it works. So emulated devices can be forced through the guest config, pv devices are only the default through the OS unplugging, and both can be forced by an OS boot argument.

Btw, type=ioemu not working with unplugging seems to be limited to the xm stack. xl seems to work for me (now that gntdev is loaded).
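
The three cases above can be summarised as a small decision function. This is only a model of the behaviour described in this thread, not a transcription of Xen or kernel code. The function name `visible_devices` is hypothetical, and it assumes the boot argument meant is the kernel's `xen_emul_unplug` parameter with the value `never`.

```python
def visible_devices(xen_platform_pci, guest_has_pv_drivers, xen_emul_unplug=None):
    """Model which disk/NIC flavours an HVM guest ends up with.

    Simplified from the comments in this bug report; real behaviour
    depends on the Xen version, toolstack and kernel.
    """
    # Platform device disabled in the guest config: per the thread, the
    # guest then runs on emulated devices only.
    if not xen_platform_pci:
        return {"emulated"}
    # A kernel without PV drivers never asks for an unplug.
    if not guest_has_pv_drivers:
        return {"emulated"}
    # Assumed boot argument xen_emul_unplug=never: emulated devices stay
    # plugged and the PV devices show up next to them.
    if xen_emul_unplug == "never":
        return {"emulated", "pv"}
    # Default: the guest unplugs the emulated devices and uses PV only.
    return {"pv"}
```

For example, `visible_devices(1, True)` gives only the pv set, which is the default that caused the install problem when the pv drivers were modules missing from the initrd.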

Revision history for this message
Stefan Bader (smb) wrote :

So it seems that type=ioemu is not only unnecessary, but also fails to work with xm. Testing with

vif = [ 'mac=..., bridge=...']

was working with both stacks (xm or xl) and would default to the emulated devices which could be forced to all emulated by changing the xen_platform_pci to 0.
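
A sketch of the relevant guest config entries for the all-emulated case, written in the Python syntax that xm config files use. The mac and bridge values are the ones from Boris's example earlier in this thread and serve only as placeholders. That 1 is the default for xen_platform_pci is an assumption here.

```python
# Network device without type=ioemu, which was found to be unnecessary
# and to fail with the xm stack.
vif = ['mac=00:16:3f:03:01:14, bridge=virbr0']

# 1 (assumed default): platform device present; a guest kernel with pv
#   drivers unplugs the emulated devices.
# 0: platform device hidden; the guest uses only emulated devices.
xen_platform_pci = 0
```

Leaving out the xen_platform_pci line, or setting it to 1, lets the guest kernel decide.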

Revision history for this message
Herton R. Krzesinski (herton) wrote :

This bug is awaiting verification that the kernel for Oneiric in -proposed solves the problem (3.0.0-15.24). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-oneiric' to 'verification-done-oneiric'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-oneiric
Revision history for this message
Diana Crisan (dcrisan) wrote :

We checked on our end on cloud images, both downgrading precise to 3.0.0-15.24 and upgrading the release version of oneiric to 3.0.0-15.24 and it worked as expected.

Thank you!

Diana Crisan (dcrisan)
tags: added: verification-done-oneiric
removed: verification-needed-oneiric
Revision history for this message
Herton R. Krzesinski (herton) wrote :

This bug is awaiting verification that the kernel for Natty in -proposed solves the problem (2.6.38-13.54). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-natty' to 'verification-done-natty'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-natty
Revision history for this message
Stefan Bader (smb) wrote :

Upgraded a natty HVM installation to the proposed kernel and it was then possible to use the pv drivers.

tags: added: verification-done-natty
removed: verification-needed-natty
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.0.0-15.25

---------------
linux (3.0.0-15.25) oneiric-proposed; urgency=low

  [Brad Figg]

  * Release Tracking Bug
    - LP: #910894

  [ Upstream Kernel Changes ]

  * Revert "clockevents: Set noop handler in clockevents_exchange_device()"
    - LP: #904569

linux (3.0.0-15.24) oneiric-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #903188

  [ Alex Bligh ]

  * (config) Change Xen paravirt drivers to be built-in
    - LP: #886521

  [ Chase Douglas ]

  * Revert "SAUCE: HID: hid-ntrig: add support for 1b96:0006 model"
    - LP: #724831
  * Revert "SAUCE: hid: ntrig: Remove unused device ids"
    - LP: #724831

  [ Seth Forshee ]

  * SAUCE: dell-wmi: Demote unknown WMI event message to pr_debug
    - LP: #581312

  [ Upstream Kernel Changes ]

  * Revert "leds: save the delay values after a successful call to
    blink_set()"
    - LP: #893741
  * xfs: Fix possible memory corruption in xfs_readlink, CVE-2011-4077
    - LP: #887298
    - CVE-2011-4077
  * drm/i915: fix IVB cursor support
    - LP: #893222
  * drm/i915: always set FDI composite sync bit
    - LP: #893222
  * jbd/jbd2: validate sb->s_first in journal_get_superblock()
    - LP: #893148
    - CVE-2011-4132
  * ALSA: hda - Don't add elements of other codecs to vmaster slave
    - LP: #893741
  * virtio-pci: fix use after free
    - LP: #893741
  * ASoC: Don't use wm8994->control_data in wm8994_readable_register()
    - LP: #893741
  * sh: Fix cached/uncaced address calculation in 29bit mode
    - LP: #893741
  * drm/i915: Fix object refcount leak on mmappable size limit error path.
    - LP: #893741
  * drm/nouveau: initialize chan->fence.lock before use
    - LP: #893741
  * drm/radeon/kms: make an aux failure debug only
    - LP: #893741
  * ALSA: usb-audio - Check the dB-range validity in the later read, too
    - LP: #893741
  * ALSA: usb-audio - Fix the missing volume quirks at delayed init
    - LP: #893741
  * KEYS: Fix a NULL pointer deref in the user-defined key type
    - LP: #893741
  * hfs: add sanity check for file name length
    - LP: #893741
  * drm/radeon: add some missing FireMV pci ids
    - LP: #893741
  * sfi: table irq 0xFF means 'no interrupt'
    - LP: #893741
  * x86, mrst: use a temporary variable for SFI irq
    - LP: #893741
  * b43: refuse to load unsupported firmware
    - LP: #893741
  * md/raid5: abort any pending parity operations when array fails.
    - LP: #893741
  * mfd: Fix twl4030 dependencies for audio codec
    - LP: #893741
  * xen:pvhvm: enable PVHVM VCPU placement when using more than 32 CPUs.
    - LP: #893741
  * xen-gntalloc: integer overflow in gntalloc_ioctl_alloc()
    - LP: #893741
  * xen-gntalloc: signedness bug in add_grefs()
    - LP: #893741
  * powerpc/ps3: Fix lost SMP IPIs
    - LP: #893741
  * powerpc: Copy down exception vectors after feature fixups
    - LP: #893741
  * backing-dev: ensure wakeup_timer is deleted
    - LP: #893741
  * block: Always check length of all iov entries in blk_rq_map_user_iov()
    - LP: #893741
  * Linux 3.0.10
    - LP: #893741
  * drm/i915: add multi-threaded forcewake support
    - LP: #891270
  * (pre-sta...

Changed in linux (Ubuntu Oneiric):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.38-13.54

---------------
linux (2.6.38-13.54) natty-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #911195

  [ Alex Bligh ]

  * (config) Change Xen paravirt drivers to be built-in
    - LP: #886521

  [ Paolo Pisati ]

  * [Config] DEFAULT_MMAP_MIN_ADDR=32k on arm
    - LP: #903346

  [ Seth Forshee ]

  * SAUCE: dell-wmi: Demote unknown WMI event message to pr_debug
    - LP: #581312

  [ Upstream Kernel Changes ]

  * VFS: Fix vfsmount overput on simultaneous automount
    - LP: #769927
  * TPM: Zero buffer after copying to userspace, CVE-2011-1162
    - LP: #899463
    - CVE-2011-1162
  * hfs: fix hfs_find_init() sb->ext_tree NULL ptr oops, CVE-2011-2203
    - LP: #899466
    - CVE-2011-2203
  * KEYS: Fix a NULL pointer deref in the user-defined key type,
    CVE-2011-4110
    - LP: #894369
    - CVE-2011-4110
  * nfsd4: permit read opens of executable-only files
    - LP: #833300
  * Support for Terratec G1
    - LP: #821061
 -- Herton Ronaldo Krzesinski <email address hidden> Tue, 03 Jan 2012 10:03:15 -0200

Changed in linux (Ubuntu Natty):
status: Fix Committed → Fix Released