NIC assignment order on the command line can make some NICs fail to work

Bug #799036 reported by Yongjie Ren
Affects | Status | Importance | Assigned to | Milestone
QEMU | Fix Released | Undecided | Unassigned |

Bug Description

Environment:
------------
Host OS (ia32/ia32e/IA64):All
Guest OS (ia32/ia32e/IA64):ia32e
Guest OS Type (Linux/Windows):Linux
kvm.git Commit:681fb677ace0754589792c92b8dbee9884d07158
qemu-kvm Commit:05f1737582ab6c075476bde931c5eafbc62a9349
Host Kernel Version:3.0.0-rc2+
Hardware: Westmere-EP platform

Bug detailed description:
--------------------------
When using qemu-system-x86_64 to create a Linux guest with an 82576 NIC and an 82572EI NIC statically assigned, the order of the two NICs on the command line can prevent the 82576 NIC from working in the guest.
command1: qemu-system-x86_64 -m 1024 -smp 2 -device pci-assign,host=0e:00.1 -device pci-assign,host=0c:00.0 -net none -hda qcow-rhel6.img
command2: qemu-system-x86_64 -m 1024 -smp 2 -device pci-assign,host=0c:00.0 -device pci-assign,host=0e:00.1 -net none -hda qcow-rhel6.img
With command1, both NICs work well in the guest.
With command2, the 82576 (Kawela) NIC 0c:00.0 cannot get an IP address in the guest.
BDF 0c:00.0 is the 82576 (Kawela) NIC, and BDF 0e:00.1 is the 82572EI NIC.
The only difference between the two command lines is the order of the NIC assignments.
Only the 82576 (Kawela) NIC has this problem; the 82572EI always works well.
SR-IOV VFs do not have this issue: a VF and an 82576 PF assigned to the same guest always work well.
I did not expect the order of NIC assignment in qemu-system-x86_64 to make a difference in the guest. Maybe it's a qemu bug?

Reproduce steps:
----------------
1. Bind the two NICs to pci-stub.
2. Create a guest: qemu-system-x86_64 -m 1024 -smp 2 -device pci-assign,host=0c:00.0 -device pci-assign,host=0e:00.1 -net none -hda qcow-rhel6.img
   (0c:00.0 is the 82576 NIC, and 0e:00.1 is the other NIC.)
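
For readers unfamiliar with step 1, here is a minimal sketch of what binding a device to pci-stub usually involves. It only prints the sysfs writes rather than running them. The helper name `stub_cmds`, the `0000:` PCI domain prefix, and the vendor/device ID pair in the usage line are assumptions for illustration, not values taken from this report; read the real ID pair from `lspci -n -s <BDF>` on the host.

```shell
# Print (do not run) the sysfs writes that hand one PCI device to pci-stub.
# $1 = full BDF including domain (assumed to be 0000 here)
# $2 = "vendor device" ID pair as reported by `lspci -n` (placeholder below)
stub_cmds() {
    # Tell pci-stub that it may claim devices with this ID pair.
    printf 'echo "%s" > /sys/bus/pci/drivers/pci-stub/new_id\n' "$2"
    # Detach the device from whatever host driver currently owns it.
    printf 'echo %s > /sys/bus/pci/devices/%s/driver/unbind\n' "$1" "$1"
    # Attach the device to pci-stub.
    printf 'echo %s > /sys/bus/pci/drivers/pci-stub/bind\n' "$1"
}

# Example: the ID pair is a placeholder; check it with `lspci -n -s 0c:00.0`.
stub_cmds 0000:0c:00.0 "8086 10c9"
```

The printed lines would need to be run as root, once per NIC, before starting the guest.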

Current result:
----------------
When the 82576 NIC is assigned first (command2), it cannot get an IP address in the guest.

Expected result:
----------------
The 82576 PF and the other NIC work well in the guest regardless of the order of assignment.

Alex Williamson (alex-l-williamson) wrote :

For both working and non-working cases, please provide:
  - guest dmesg
  - lspci -vvv for each NIC in the guest
  - lspci -vvv for each NIC in the host

Also, please test using a fixed address for each device, e.g. -device pci-assign,host=0e:00.1,addr=3. If you reverse the command line order, but keep the address of each device the same, does the problem still occur? If you keep the command line order the same, but swap the addresses of each device, does the problem still occur?
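
The test matrix described above can be sketched as a small shell helper that prints the command lines to try. The helper names `assign_arg` and `qemu_cmd` are hypothetical, and the choice of slots 3 and 4 is an assumption based on the `addr=3` example; the remaining options are copied from the commands in the bug description.

```shell
# Build one -device argument that pins the assigned NIC to a guest PCI slot.
# $1 = host BDF, $2 = guest slot number
assign_arg() {
    printf '%s' "-device pci-assign,host=$1,addr=$2"
}

# Print a full command line for two assigned devices in the given order.
# $1/$2 = first device BDF/slot, $3/$4 = second device BDF/slot
qemu_cmd() {
    printf 'qemu-system-x86_64 -m 1024 -smp 2 %s %s -net none -hda qcow-rhel6.img\n' \
        "$(assign_arg "$1" "$2")" "$(assign_arg "$3" "$4")"
}

# Test 1: reverse the command-line order, keep each device's slot.
qemu_cmd 0c:00.0 3 0e:00.1 4
qemu_cmd 0e:00.1 4 0c:00.0 3
# Test 2: same command-line order as the first line, slots swapped.
qemu_cmd 0c:00.0 4 0e:00.1 3
```

If the failure follows the slot rather than the position on the command line, the first two commands should behave the same and the third should differ.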

Yongjie Ren (yongjie-ren) wrote :

dmesg and lspci logs are attached as a tar package. It contains many logs; please read README.txt first, which explains which log belongs to which case.

Yongjie Ren (yongjie-ren) wrote :

Without a fixed address, the first device gets 00:03.0 in my guest and the second gets 00:04.0.
When the 82576 NIC comes first, it gets 00:03.0 in my guest and does not work.
If I pin the 82576 NIC with "addr=0x3" and the other NIC with "addr=0x4", the 82576 NIC gets 00:03.0 in the guest and the other gets 00:04.0.
With fixed addresses, the assignment order doesn't affect the result. With "addr=0x3", the 82576 NIC always works well in the guest; with "addr=0x4", an 82580 or some other NIC works well in my guest.

Alex Williamson (alex-l-williamson) wrote :

It looks like the critical difference in the working vs non-working case is the host interrupts on the 82576. When working, both the host and guest view of the device is using MSI-X interrupts. When not working, the host is using MSI but the guest still thinks it's using MSI-X. Those are not compatible. I'm curious if this is already fixed by qemu-kvm.git commit 096392ef which intends to prevent an update of the chipset irq routing registers from causing the legacy interrupt to be re-registered (which can legally use MSI) and it's stomping on the MSI-X setup. Please retest with latest bits (note that I just sent a couple patches to the mailing list that unbreak MSI-X support which you'll need to make this work).
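
As a rough way to spot the MSI vs MSI-X mismatch described above, the following sketch classifies the `lspci -vvv` output of one device by which interrupt capability is enabled. The function name `irq_mode` and its return strings are hypothetical, and it assumes the usual `MSI: Enable+` / `MSI-X: Enable+` capability lines that lspci prints; it is not taken from the attached logs.

```shell
# Read `lspci -vvv` output for a single device on stdin and report which
# message-signalled interrupt capability is currently enabled.
irq_mode() {
    irq_mode_input=$(cat)
    if printf '%s\n' "$irq_mode_input" | grep -q 'MSI-X: Enable+'; then
        echo msix
    elif printf '%s\n' "$irq_mode_input" | grep -q 'MSI: Enable+'; then
        echo msi
    else
        # Neither capability is enabled (legacy INTx, or no interrupt set up).
        echo intx-or-none
    fi
}
```

Running `lspci -vvv -s 0c:00.0 | irq_mode` on the host and `lspci -vvv -s 00:03.0 | irq_mode` in the guest (run as root so the capability list is readable) should give the same answer when the assignment is healthy; a host `msi` with a guest `msix` would match the non-working case.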

Yongjie Ren (yongjie-ren) wrote :

Yes, this bug has been fixed.

Changed in qemu:
status: New → Fix Released