NIC doesn't work when it had been used before

Bug #754591 reported by Yongjie Ren
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
QEMU      Fix Released   Undecided    Unassigned

Bug Description

Environment:
------------
Host OS (ia32/ia32e/IA64): All
Guest OS (ia32/ia32e/IA64): All
Guest OS Type (Linux/Windows):All
kvm.git Commit:3b840cc27e5b831a26e3303096b88e112d1cf59f
qemu-kvm Commit:df85c051d780bca0ee2462cfeb8ef6d9552a19b0
Host Kernel Version:2.6.38+
Hardware:Westmere-EP / Sandy Bridge

Bug detailed description:
--------------------------
A NIC does not work once it has already been assigned to a guest.
Statically assign a NIC to a guest; it works well. Then shut down the guest and
assign the same NIC to a guest again. This time the NIC does not work. The
guest's dmesg output is as follows.
--
Intel(R) Gigabit Ethernet Network Driver - version 2.1.0-k2
Copyright (c) 2007-2009 Intel Corporation.
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
igb 0000:00:03.0: PCI INT A -> Link[LNKC] -> GSI 11 (level, high) -> IRQ 11
igb 0000:00:03.0: setting latency timer to 64
  alloc irq_desc for 24 on node -1
  alloc kstat_irqs on node -1
igb 0000:00:03.0: irq 24 for MSI/MSI-X
  alloc irq_desc for 25 on node -1
  alloc kstat_irqs on node -1
igb 0000:00:03.0: irq 25 for MSI/MSI-X
  alloc irq_desc for 26 on node -1
  alloc kstat_irqs on node -1
igb 0000:00:03.0: irq 26 for MSI/MSI-X
igb 0000:00:03.0: PHY reset is blocked due to SOL/IDER session.
igb 0000:00:03.0: The NVM Checksum Is Not Valid
igb 0000:00:03.0: PCI INT A disabled
igb: probe of 0000:00:03.0 failed with error -5
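
To check a saved guest log for the same failure, a small script can look for the messages shown above. This is only a sketch: the function name `scan_dmesg` and the choice of signature strings are assumptions made for illustration, taken from the log lines in this report. The error `-5` in the probe message corresponds to `-EIO`.

```python
import re

# Signature strings copied from the dmesg output in this report.
SIGNATURES = (
    "PHY reset is blocked due to SOL/IDER session",
    "The NVM Checksum Is Not Valid",
)

# Matches lines such as "igb: probe of 0000:00:03.0 failed with error -5".
PROBE_FAILED = re.compile(
    r"^\s*(?:kernel: )?(\w+): probe of (\S+) failed with error (-?\d+)"
)


def scan_dmesg(text):
    """Return the signature strings and probe failures found in a log.

    scan_dmesg is a hypothetical helper, not part of any existing tool.
    """
    signatures = []
    probe_failures = []
    for line in text.splitlines():
        for sig in SIGNATURES:
            if sig in line:
                signatures.append(sig)
        match = PROBE_FAILED.search(line)
        if match:
            driver, device, err = match.groups()
            probe_failures.append((driver, device, int(err)))
    return {"signatures": signatures, "probe_failures": probe_failures}


SAMPLE = """\
igb 0000:00:03.0: PHY reset is blocked due to SOL/IDER session.
igb 0000:00:03.0: The NVM Checksum Is Not Valid
igb 0000:00:03.0: PCI INT A disabled
igb: probe of 0000:00:03.0 failed with error -5
"""

if __name__ == "__main__":
    print(scan_dmesg(SAMPLE))
```

In practice the text would come from the guest, for example from a file holding the output of `dmesg`.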

Reproduce steps:
----------------
1. Statically assign a NIC to a guest and check that it works.
2. Shut down the guest.
3. Statically assign the same NIC to a guest again (the NIC no longer works).

Tags: vt-d
Yongjie Ren (yongjie-ren) wrote :

qemu-kvm.git
bad:df85c051d780bca0ee2462cfeb8ef6d9552a19b0 and 9488459ff2ab113293586c1c36b1679bb15deee
good:2c9bb5d4e5ae3b12ad71bd6a0c1b32003661f53a

Yongjie Ren (yongjie-ren) wrote :

hot-plug NIC doesn't have this issue.

Alex Williamson (alex-l-williamson) wrote :

Are you using libvirt to launch the guest or qemu command line? Please provide xml/command line used. Are you able to reproduce this on a stable kernel? The referenced kvm.git commit is 2.6.29-rc2+, which I'm not even able to get to boot on my system. I can't reproduce the problem using command line qemu invocation on a RHEL6.x host kernel.

Alex Williamson (alex-l-williamson) wrote :

Never mind, I reproduced it on 2.6.38.

flypen (flypen) wrote :

I hit this problem too. A VM used a NIC in PCI passthrough mode. After shutting down the VM, I tried to give the NIC back to the host, but I could not assign the NIC to the host again. The problem did not occur every time: when I repeated these steps ten times, I saw it at least once.

The kernel version was 2.6.32. I used the Intel e1000e driver for the NIC. Here are some of the logs:
kernel: e1000e 0000:02:00.1: enabling device (0000 -> 0002)
kernel: e1000e 0000:02:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
kernel: 0000:02:00.1: 0000:02:00.1: PHY reset is blocked due to SOL/IDER session.
kernel: 0000:02:00.1: 0000:02:00.1: The NVM Checksum Is Not Valid
kernel: e1000e 0000:02:00.1: PCI INT B disabled

Yongjie Ren (yongjie-ren) wrote :

I can reproduce it reliably. I use the igb driver for my NIC. Alex Williamson is working on a fix.

Yongjie Ren (yongjie-ren) wrote :

This bug is fixed upstream in kvm.git, in commit f8fcfd7 by
<email address hidden>.

commit f8fcfd775523347afe460dc3a0f45d0479e784a2
Author: Alex Williamson <email address hidden>
Date: Tue May 10 10:02:39 2011 -0600

    KVM: Use pci_store/load_saved_state() around VM device usage

    Store the device saved state so that we can reload the device back
    to the original state when it's unassigned. This has the benefit
    that the state survives across pci_reset_function() calls via
    the PCI sysfs reset interface while the VM is using the device.

    Signed-off-by: Alex Williamson <email address hidden>
    Acked-by: Avi Kivity <email address hidden>
    Signed-off-by: Jesse Barnes <email address hidden>
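
The idea in the commit message can be illustrated with a toy model in plain Python. This is not kernel code: `ToyPciDevice`, its methods, the register values, and `assign_use_unassign` are all hypothetical, and modelling a function reset as a save followed by a restore into a single internal slot is an assumption made to show why a separately held copy of the saved state helps.

```python
class ToyPciDevice:
    """Hypothetical stand-in for a PCI function; not a kernel structure."""

    def __init__(self, config):
        self.config = dict(config)  # live configuration registers
        self._saved = None          # the device's single internal save slot

    def save_state(self):
        # Models pci_save_state(): snapshot into the internal slot.
        self._saved = dict(self.config)

    def restore_state(self):
        # Models pci_restore_state(): write the slot back to the device.
        if self._saved is not None:
            self.config = dict(self._saved)

    def store_saved_state(self):
        # Models pci_store_saved_state(): hand the caller a private copy.
        return None if self._saved is None else dict(self._saved)

    def load_saved_state(self, state):
        # Models pci_load_saved_state(): put a caller-held copy in the slot.
        self._saved = dict(state)

    def reset_function(self):
        # Assumption: the reset saves and restores around itself, so the
        # slot ends up holding the state the device had at reset time.
        self.save_state()
        self.restore_state()


def assign_use_unassign(keep_copy):
    """Return True if the device is back in its original state afterwards."""
    dev = ToyPciDevice({"command": 0x0007, "bar0": 0xF0000000})
    original = dict(dev.config)

    # Assignment: save the state, optionally keeping a private copy.
    dev.save_state()
    copy = dev.store_saved_state() if keep_copy else None

    # The guest reprograms the device, then a reset is triggered
    # while the VM still owns it.
    dev.config["bar0"] = 0xE0000000
    dev.reset_function()

    # Unassignment: reload the private copy if there is one, then restore.
    if copy is not None:
        dev.load_saved_state(copy)
    dev.restore_state()
    return dev.config == original


if __name__ == "__main__":
    print("without stored copy:", assign_use_unassign(keep_copy=False))
    print("with stored copy:   ", assign_use_unassign(keep_copy=True))
```

In this model, the intermediate reset overwrites the only saved snapshot unless a copy was stored at assignment time, which is the benefit the commit message describes.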

Changed in qemu:
status: New → Fix Committed
status: Fix Committed → Fix Released