Replicating some information that was sent to the mailing list:
It took quite a bit of time, but at least I now have some hopefully useful
information. In general, whenever an interrupt is asserted, the hypervisor
runs through this:
__hvm_pci_intx_assert:
  when assert count was 0 before incrementing
    call assert_gsi
      call send_guest_pirq (when hvm uses pirq)
The send_guest_pirq chain contains a call to evtchn_set_pending which, as one
of its first actions, tests whether evtchn_pending in the shared_info is
already set. If that is the case, the call immediately returns 1.
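To make the flow above easier to reason about, here is a minimal sketch in C. It is a simplified model of what is described here, not the Xen source: the struct, the model_* function names, the upcalls counter and the -1 return value for "nothing sent" are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of one INTx line; not the Xen code. */
struct intx_model {
    unsigned int assert_count;  /* how often the line is currently asserted */
    bool evtchn_pending;        /* stands in for evtchn_pending in shared_info */
    unsigned int upcalls;       /* new notifications actually raised */
};

/* Models evtchn_set_pending: returns 1 right away when the pending bit
 * is already set, otherwise sets it, counts an upcall and returns 0. */
static int model_evtchn_set_pending(struct intx_model *m)
{
    if (m->evtchn_pending)
        return 1;
    m->evtchn_pending = true;
    m->upcalls++;
    return 0;
}

/* Models the assert path: only the 0->1 transition of the assert count
 * reaches send_guest_pirq. Returns the send result, or -1 (an invented
 * marker) when the count was already non-zero and nothing was sent. */
static int model_intx_assert(struct intx_model *m)
{
    if (m->assert_count++ != 0)
        return -1;
    return model_evtchn_set_pending(m);
}

/* Models the deassert path: just drops the assert count again. */
static void model_intx_deassert(struct intx_model *m)
{
    assert(m->assert_count > 0);
    m->assert_count--;
}

/* Models the guest side: the pirq EOI only clears the pending bit. */
static void model_guest_eoi(struct intx_model *m)
{
    m->evtchn_pending = false;
}
```

In this model a return value of 1 simply means that no new upcall was raised for that assert, which is the case that shows up in the printk output.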
Adding printks to call_assert_gsi, I noticed that
- When things stop working, the last call to send_guest_pirq returned 1.
- But the stall does not happen every time the return code is 1.
- e1000 also has cases where send_guest_pirq returns 1 but they happen much
less often (than using the 8139cp).
Usually every intx_assert is followed by an intx_deassert call. When the stall
occurs, this does not happen. At this point I had some trouble understanding
where this intx_deassert is actually triggered. With an added WARN_ON the stack
traces seem odd, like this:
(XEN) [<ffff82c4801abd9c>] __hvm_pci_intx_deassert+0x6c/0x130
(XEN) [<ffff82c4801ac43e>] hvm_pci_intx_deassert+0x3e/0x60
(XEN) [<ffff82c4801a8148>] do_hvm_op+0x3b8/0x1e60
(XEN) [<ffff82c480168ea1>] do_update_descriptor+0x171/0x220
(XEN) [<ffff82c48017dba6>] copy_from_user+0x26/0x90
(XEN) [<ffff82c4801f9446>] do_iret+0xb6/0x1a0
(XEN) [<ffff82c4801f4f28>] syscall_enter+0x88/0x8d
Not really sure how one gets from do_update_descriptor to do_hvm_op, and the
only thing in there which does the deassert is some irq level setting.
Actually the guest does not really do much for EOI (which I had been assuming
it did). Since domain_pirq_to_irq maps to 0 for emuirqs, the call to
PHYSDEVOP_irq_status_query will hit the following and not set the flag for
needing EOI.
if ( is_hvm_
{
ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
break;
}
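As a rough illustration of the consequence, the early exit can be modeled like this. This is a sketch under assumptions: the exact condition is not reproduced here, so the check below (HVM guest and a pirq that maps to 0) is inferred from the description, and MODEL_NEEDS_EOI, model_irq_status_flags and its parameters are invented names, not the hypervisor interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "needs EOI" flag in the query result. */
#define MODEL_NEEDS_EOI 0x1u

/* Simplified model of the status query: flags start out as 0, and the
 * assumed early exit for an HVM guest whose pirq maps to 0 (an emuirq)
 * returns before the needs-EOI flag can ever be set. */
static unsigned int model_irq_status_flags(bool is_hvm, int mapped_irq,
                                           bool line_needs_eoi)
{
    unsigned int flags = 0;

    if (is_hvm && mapped_irq == 0)
        return flags;           /* early exit: guest is told "no EOI needed" */

    if (line_needs_eoi)
        flags |= MODEL_NEEDS_EOI;
    return flags;
}

/* Usage: an emuirq never reports needs-EOI, a mapped pirq can. */
static void model_usage(void)
{
    assert(model_irq_status_flags(true, 0, true) == 0);
    assert(model_irq_status_flags(false, 17, true) == MODEL_NEEDS_EOI);
}
```

With the flag never set for emuirqs, the guest has no reason to issue an EOI hypercall for them, which matches what is observed in the guest.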
So all the guest is doing is to clear evtchn_pending in the pirq EOI function. I
fail to understand what actually is doing the hvm_pci_intx_deassert calls, but
the way the fasteoi code in the guest looks to be working, there seems to be
some gap between calling the handler and the eoi function... So from what I see,
I would assume the following:
dom0                                      domU
- intx_assert (count 0->1)
  - send_guest_pirq = 0
    (evtchn_pending = 1)                  - upcall starts fasteoi handler
- something does intx_deassert
  (count 1->0)
- intx_assert (count 0->1)
  - send_guest_pirq = 1
    (evtchn_pending still set)            - handler->eoi sets evtchn to 0 but
                                            otherwise does nothing
- there is no intx_deassert, so even
  when another intx_assert would happen
  (which does not seem to be the case)
  no further send_guest_pirq would be
  called.
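The assumed sequence can be replayed with a small event-driven sketch. Again this is only a model of the theory above, not hypervisor or guest code: the event names, struct line_state, replay_events and line_is_stalled are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events for one INTx line, seen from both sides. */
enum line_event { EV_ASSERT, EV_DEASSERT, EV_GUEST_EOI };

struct line_state {
    unsigned int count;  /* assert count */
    bool pending;        /* evtchn_pending */
    int last_send;       /* last send_guest_pirq result, -1 if never sent */
};

/* Replays a list of events against the simplified rules described above. */
static struct line_state replay_events(const enum line_event *events, size_t n)
{
    struct line_state s = { 0, false, -1 };

    assert(events != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (events[i]) {
        case EV_ASSERT:
            /* Only the 0->1 transition sends anything to the guest. */
            if (s.count++ == 0) {
                s.last_send = s.pending ? 1 : 0;
                s.pending = true;
            }
            break;
        case EV_DEASSERT:
            if (s.count > 0)
                s.count--;
            break;
        case EV_GUEST_EOI:
            /* The guest's eoi only clears the pending bit. */
            s.pending = false;
            break;
        }
    }
    return s;
}

/* Stalled: the line is still asserted but nothing is pending, so a
 * further assert would not reach send_guest_pirq. */
static bool line_is_stalled(const struct line_state *s)
{
    return s->count > 0 && !s->pending;
}
```

Replaying assert, deassert, assert, guest EOI ends with count 1, nothing pending and a last send result of 1, which is the state seen when things stop working. Replaying assert, guest EOI, deassert ends with the count back at 0.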
Unfortunately I am missing some details of the inner workings here. Generally I
wonder whether not setting the needsEOI flag for those pirqs is simply the
problem. But it could also be intentional...