Hypervisor with QEMU-2.0/libvirtd 1.2.2 stack when launching VM with CirrOS or Ubuntu 12.04

Bug #1353947 reported by Eyal Perry
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Expired
Undecided
Unassigned
linux (Ubuntu)
Expired
Undecided
Unassigned

Bug Description

The issue observed when running an hypervisor with QEMU 2.0/libvirtd 1.2.2
The VM network interface is attached to a PCI virtual function (SR-IOV).

When we ran VM with guest OS CirrOS or Ubuntu 12.04 we observed an hipervisor hang shortly after the VM is loaded
We observed the same issue with Mellanox NIC and with Intel NIC

We’ve tried few combinations of {GuestOS}X{Hypervisor} and we got the following findings:
When a hypervisor is running QEMU 1.5/libvirtd 1.1.1 - no issue observed
When a hypervisor is running QEMU 2.0/libvirtd 1.2.2 - CirrOS and Ubuntu 12.04 guest OSes caused hypervisor hang
When a hypervisor is running QEMU 2.0/libvirtd 1.2.2 - CentOS 6.4 and Ubuntu 13.10 - no issue observed

The problematic guest OSes are with kernel versions ~3.2.y

Revision history for this message
Eyal Perry (eyalpe) wrote :
information type: Private Security → Public
summary: - Hypervisor with QEMU-2.0 stack when launching VM with CirrOS or Ubuntu
- 12.04
+ Hypervisor with QEMU-2.0/libvirtd 1.2.2 stack when launching VM with
+ CirrOS or Ubuntu 12.04
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Sorry, I'm having trouble following your findings. Could you please give a new table, like
this:

======================================================================================================================
GuestOS | Guestkernel | HostOS | Hostkernel | qemu version | libvirt version | nic type | Pass/Fail
======================================================================================================================
cirros | ??? | 12.04 | 3.2 | 1.0+noroms-0ubuntu13 | 0.9.8-2ubuntu17 | intel SR-IOV | F
======================================================================================================================
cirros | ??? | 12.04 | 3.13 | 1.0+noroms-0ubuntu13 | 0.9.8-2ubuntu17 | intel SR-IOV | P
======================================================================================================================
(...0
======================================================================================================================

Ideally we could determine whether the kernel version is at all related, or whether it
is purely tied to qemu version.

Revision history for this message
Eyal Perry (eyalpe) wrote :

Hi Serge,
1. Please see the table below
2. We also observed this issue with Intel NICs
===============================================================================================================================================
GuestOS | Guestkernel | HostOS | Hostkernel | qemu version | libvirt version | nic type | Pass/Fail
===============================================================================================================================================
Ubuntu 12.04 | 3.2.0-63-generic | Ubuntu 12.04 | 3.11.0-18-generic | 2.0.0+dfsg-2ubuntu1.1 | 1.2.2-0ubuntu13.2 | Mellanox SR-IOV | F
===============================================================================================================================================
Ubuntu 13.10 | 3.11.0-26-generic | Ubuntu 12.04 | 3.11.0-18-generic | 2.0.0+dfsg-2ubuntu1.1 | 1.2.2-0ubuntu13.2 | Mellanox SR-IOV | P
===============================================================================================================================================

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks, so the problem appears to be that a feature is missing from the guest's precise kernel, which was present in the saucy (13.10) kernel.

Could you verify whether using the LTS backport kernel packages in the precies guest fixes the issue?

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

(Note I don't believe this bug should be marked as affecting the QEMU project)

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1353947

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Eyal Perry (eyalpe) wrote :

Can't run apport-collect after the kernel hang

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Eyal Perry (eyalpe) wrote :

Hello Serge,
I agree that there is probably some missing feature in the guest kernel.
However, I don't believe it is acceptable for the hyper-visor kernel to be affected from this.
Actually, this is a security issue if a VM user can crash i't Host kernel and probably some other VMs as well.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Sorry, when the table only had the two entries i drew some bad assumptions from that, and I also missed the fact that hangs were in the host kernel.

To be clear, running qemu 1.5 on the same host kernel has no issues with any guest, while qemu 2.0 causes host kernel to hang (indefinately) with some guests?

Then indeed disregard my comment #5.

Is this 100% reproducible, every time? Would you be able to bisect qemu to figure out where the problem was introduced?

Revision history for this message
Eyal Perry (eyalpe) wrote :

Indeed, with qemu 1.5 we did not observed this issue at all.
Sorry, but I don't have the resources at the moment to do the bisecting.

Revision history for this message
Thomas Huth (th-huth) wrote :

Triaging old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays?

Changed in qemu:
status: New → Incomplete
Thomas Huth (th-huth)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for QEMU because there has been no activity for 60 days.]

Changed in qemu:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.