[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Interrupt issues with hvm_emulate_one_vm_event()



On 05/26/17 18:11, Andrew Cooper wrote:
> On 26/05/17 15:29, Jan Beulich wrote:
>>>>> On 25.05.17 at 11:40, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>> I've noticed that, with pages marked NX and vm_event emulation, we can
>>> end up emulating an ud2, for which hvm_emulate_one() returns
>>> X86EMUL_EXCEPTION in hvm_emulate_one_vm_event().
>> Could you explain what would lead to emulation of UD2?
>>
>>> This, in turn, causes a hvm_inject_event() call in the context of
>>> hvm_do_resume(), which can, if there's already a pending event there,
>>> cause a 101 BSOD (timer-related, if I understand correctly) or loss of
>>> input (mouse frozen, keyboard unresponsive).
>>>
>>> After much trial and error, I've been able to confirm this by leaving a
>>> guest on for almost a full day with this change:
>>>
>>>      case X86EMUL_EXCEPTION:
>>> -        hvm_inject_event(&ctx.ctxt.event);
>>> +        if ( !hvm_event_pending(current) )
>>> +            hvm_inject_event(&ctx.ctxt.event);
>>>
>>> and checking that there's been no BSOD or loss of input.
>>>
>>> However, just losing the event here, while fine to prove that this is
>>> indeed the problem, is not OK. But I'm not sure what an elegant / robust
>>> way of fixing this is.
>> Much depends on what the other event is: If it's an interrupt, I'd
>> assume there to be an ordering problem (interrupts shouldn't be
>> injected when there is a pending exception, their delivery instead
>> should be attempted on the first instruction of the exception
>> handler [if interrupts remain on] or whenever interrupts get
>> re-enabled).
> 
> I suspect it is an ordering issue, and something has processed and
> interrupt before the emulation occurs as part of the vm_event reply happens.
> 
> The interrupt ordering spec indicates that external interrupts take
> precedent over faults raised from executing an instruction, on the basis
> that once the interrupt handler returns, the instruction will generate
> the same fault again.  However, its not obvious how this is intended to
> interact with interrupt windows and vmexits.  I expect we can get away
> with ensuring that external interrupts are the final thing considered
> for injection on the return-to-guest path.
> 
> It might be an idea to leave an assert in vmx_inject_event() that an
> event is not already pending, but in the short term, this probably also
> wants debugging by trying to identify what sequence of actions is
> leading us to inject two events in this case (if indeed this is what is
> happening).

With some patience, I've been able to catch the problem: "(XEN)
vmx_inject_event(3, 14) but 0, 225 pending".

 63 /*
 64  * x86 event types. This enumeration is valid for:
 65  *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
 66  *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
 67  */
 68 enum x86_event_type {
 69     X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
 70     X86_EVENTTYPE_NMI = 2,          /* NMI */
 71     X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
 72     X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
 73     X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
 74     X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
 75 };

so an X86_EVENTTYPE_EXT_INTR is pending when we're trying to inject an
X86_EVENTTYPE_HW_EXCEPTION, as a result of the code quoted above.

This is the patch that has produced the above output:

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index c8ef18a..3b88eca 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1807,6 +1807,12 @@ void vmx_inject_nmi(void)
                            X86_EVENT_NO_EC);
 }

+static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
+{
+    info->cr2 = v->arch.hvm_vcpu.guest_cr[2];
+    return hvm_funcs.get_pending_event(v, info);
+}
+
 /*
  * Generate a virtual event in the guest.
  * NOTES:
@@ -1821,6 +1827,15 @@ static void vmx_inject_event(const struct
x86_event *event)
     struct vcpu *curr = current;
     struct x86_event _event = *event;

+    if ( hvm_event_pending(current) )
+    {
+       struct x86_event ev;
+       hvm_get_pending_event(current, &ev);
+
+       printk("vmx_inject_event(%d, %d) but %d, %d pending\n",
+              _event.type, _event.vector, ev.type, ev.vector);
+    }
+
     switch ( _event.vector | -(_event.type == X86_EVENTTYPE_SW_INTERRUPT) )
     {
     case TRAP_debug:


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.