[Oneiric] Xen HVM: transmit timeouts on emulated 8139cp device

Bug #854829 reported by Stefan Bader on 2011-09-20
14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
xen (Ubuntu)
Medium
Chuck Short
Oneiric
Medium
Chuck Short
Precise
Undecided
Unassigned

Bug Description

Using an Oneiric dom0 and an Oneiric HVM domU, the network comes initially up, DHCP succeeds and the interface has an IP address assigned. Pings usually are successful. But any higher traffic (like doing an apt-get update) causes the network to hang. After a while the net device watchdog triggers and resets the connection. In dmesg there are also messages like this:

8139cp: 0000:00:04.0: eth0: Transmit timeout, status d 3b 5 80ff

Stefan Bader (smb) on 2011-09-20
Changed in xen (Ubuntu):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
importance: Undecided → Medium
status: New → Confirmed
Stefan Bader (smb) wrote :

Testing various emulated devices shows:
- ne2k_pci gets the tx problems almost immediatly
- 8139cp on any slightly higher traffic
- e1000 seems to work

When pirq's are not used, 8139cp also works again. So there seems to be a subtle bug in the delivery of those.

Dave Walker (davewalker) on 2011-09-27
tags: added: server-o-ro
Stefan Bader (smb) wrote :
Download full text (3.3 KiB)

Replicating some information that was sent to the mailing list:

It took quite a bit of time but at least I got some hopefully useful information
now. So in general, whenever an interrupt is asserted,
the hypervisor runs through this:

__hvm_pci_intx_assert:
  when assert count was 0 before incrementing
    call assert_gsi
      call send_guest_pirq (when hvm uses pirq)

In the send_guest_pirq chain is a call to evtchn_set_pending which tests as one
of the first actions whether evtchn_pending in the shared_info is set. If that
is the case the call immediately returns with 1.

Adding printks to call_assert_gsi, I noticed that
- When things stop working, the last call to send_guest_pirq returned 1.
- But not every time the return code is one, the stall happens.
- e1000 also has cases where send_guest_pirq returns 1 but they happen much
  less often (than using the 8139cp).

Usually every intx_assert has a intx_deassert call that follows. when the stall
occurs, this does not happen. Right here I got some troubles to understand where
this intx_deassert is actually triggered. With an added WARN_ON the stack traces
seem odd, like this:

(XEN) [<ffff82c4801abd9c>] __hvm_pci_intx_deassert+0x6c/0x130
(XEN) [<ffff82c4801ac43e>] hvm_pci_intx_deassert+0x3e/0x60
(XEN) [<ffff82c4801a8148>] do_hvm_op+0x3b8/0x1e60
(XEN) [<ffff82c480168ea1>] do_update_descriptor+0x171/0x220
(XEN) [<ffff82c48017dba6>] copy_from_user+0x26/0x90
(XEN) [<ffff82c4801f9446>] do_iret+0xb6/0x1a0
(XEN) [<ffff82c4801f4f28>] syscall_enter+0x88/0x8d

Not really sure how one gets from do_update_descriptor to do_hvm_op and the only
thing in there which does the deassert is some irq level setting.

Actually the guest does not really do much do EOI (which I had been assuming).
But since domain_pirq_to_irq maps to 0 for emuirqs, the call to
PHYSDEVOP_irq_status_query will hit the following and not set the flag for
needing EOI.

        irq_status_query.flags = 0;
        if ( is_hvm_domain(v->domain) &&
             domain_pirq_to_irq(v->domain, irq) <= 0 )
        {
            ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
            break;
        }

So all the guest is doing is to clear evtchn_pending in the pirq EOI function. I
fail to understand what actually is doing the hvm_pci_intx_deassert calls but
the way the fasteoi code in the guest looks to be working, there seems to be
some gap between calling the handler and the eoi function... So from what I see,
I would assume the following:

dom0 domU
- intx_assert (count 0->1)
- send_guest_pirq = 0
  (evtchn_pending = 1)
                                         - upcall starts fasteoi handler
- something does intx_deassert
  (count 1->0)
- intx_assert (count 0->1)
- send_guest_pirq = 1
  (evtchn_pending still set)
                                         - handler->eoi sets evtchn to 0 but
                                           otherwise does nothing
- there is no intx_deassert, so even
  when another intx_assert would happen
  (which does not seem to be the case)
  no further send_guest_pirq would be
  called.

Unfortunately I do miss some details on the inner wo...

Read more...

Stefan Bader (smb) wrote :

Stefano Stabellini replied:

Thanks for the very detailed analysis. It seems to me that the problem is that if the interrupt is a level
triggered interrupt when the guest issues an EOI we should be reinjecting the interrupt again if it has been issued a second time in the meantime. However this doesn't happen if the interrupt has been remapped onto an even channel. In that case the guest is not even going to issue an EOI at all.
So I wrote a patch to force the guest to issue EOIs even on remapped irqs; in the hypercall handler we check whether we need to reinject the interrupt and if that is the case we set the corresponding event channel pending.
Could you please try the patch I appended? I haven't been able to reproduce your problem so I am not really sure if it works.

diff -r e042fb60e0ee xen/arch/x86/physdev.c
--- a/xen/arch/x86/physdev.c Thu Sep 29 11:23:01 2011 +0000
+++ b/xen/arch/x86/physdev.c Fri Sep 30 14:01:46 2011 +0000
@@ -11,6 +11,7 @@
 #include <asm/current.h>
 #include <asm/io_apic.h>
 #include <asm/msi.h>
+#include <asm/hvm/irq.h>
 #include <asm/hypercall.h>
 #include <public/xen.h>
 #include <public/physdev.h>
@@ -270,6 +271,18 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
         if ( !is_hvm_domain(v->domain) ||
              domain_pirq_to_irq(v->domain, eoi.irq) > 0 )
             pirq_guest_eoi(pirq);
+ if ( is_hvm_domain(v->domain) &&
+ domain_pirq_to_emuirq(v->domain, eoi.irq) > 0 )
+ {
+ struct hvm_irq *hvm_irq = &v->domain->arch.hvm_domain.irq;
+ int gsi = domain_pirq_to_emuirq(v->domain, eoi.irq);
+
+ /* if this is a level irq and count > 0, send another
+ * notification */
+ if ( gsi >= NR_ISAIRQS /* ISA irqs are edge triggered */
+ && hvm_irq->gsi_assert_count[gsi] )
+ send_guest_pirq(v->domain, pirq);
+ }
         spin_unlock(&v->domain->event_lock);
         ret = 0;
         break;
@@ -327,12 +340,6 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
         if ( (irq < 0) || (irq >= v->domain->nr_pirqs) )
             break;
         irq_status_query.flags = 0;
- if ( is_hvm_domain(v->domain) &&
- domain_pirq_to_irq(v->domain, irq) <= 0 )
- {
- ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
- break;
- }

         /*
          * Even edge-triggered or message-based IRQs can need masking from

Stefan Bader (smb) wrote :

The patch needed some changes because the code we use in Oneiric is 4.1.1 and is different here. Tested the adapted patch and did not see the network stalls anymore.

Stefan Bader (smb) wrote :

This is the (tested) minimal fix for the xen 4.1.1 hypervisor. The problem with the more complete patch against 4.2 is that it needs to keep the number of "lost" interrupts per pirq. It does that by adding an element to the pirq struct (which does not exist in 4.1.1). So for the backport of that one would need to find a good place to keep those counters. Not sure whether for example the domain info could be extended without causing some bad interaction. I may try that or we just go with the minimal approach (which does not fix a related gap in handling passthrough devices). At least that fixes the more obvious issues.

The attachment "Minimal fix against Xen 4.1.1" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-sponsors please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Stefan Bader (smb) wrote :

For completeness this is the backported version of the Xen 4.2 patch. But since it is more complex, the recommendation would be to go with the first (simpler) patch.

Stefan Bader (smb) on 2011-10-06
Changed in xen (Ubuntu):
assignee: Stefan Bader (stefan-bader-canonical) → Chuck Short (zulcss)
status: Confirmed → Triaged

Hello Stefan, or anyone else affected,

Accepted xen into oneiric-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in xen (Ubuntu Oneiric):
status: Triaged → Fix Committed
tags: added: verification-needed
Stefan Bader (smb) wrote :

Tested the proposed package and it does give a working 8139cp connection in HVM mode.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xen - 4.1.1-2ubuntu4.1

---------------
xen (4.1.1-2ubuntu4.1) oneiric-proposed; urgency=low

  * debian/patches/xen-pirq-resubmit-irq.patch: Retrigger
    pirq events when asserted while processing. Thanks to Stefan Bader
    (LP: #854829)
 -- Chuck Short <email address hidden> Mon, 10 Oct 2011 19:30:09 -0400

Changed in xen (Ubuntu Oneiric):
status: Fix Committed → Fix Released
Changed in xen (Ubuntu Precise):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers