Xen project Mailing List

Re: [Xen-devel] Emulation and active (valid) interrupts

To: "Razvan Cojocaru" <rcojocaru@xxxxxxxxxxxxxxx>

From: "Jan Beulich" <JBeulich@xxxxxxxx>

Date: Tue, 14 Aug 2018 01:19:06 -0600

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Tue, 14 Aug 2018 07:19:23 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

>>> On 13.08.18 at 21:17, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> On 8/13/18 4:45 PM, Razvan Cojocaru wrote:
>> On 8/13/18 4:38 PM, Jan Beulich wrote:
>>>>>> On 13.08.18 at 15:19, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> On 8/13/18 3:58 PM, Jan Beulich wrote:
>>>>>>>> On 13.08.18 at 14:51, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>> So first we've got that vmx_idtv_reinject() call writing to the VMCS,
>>>>>> then we emulate a CLI, then the failed vmentry. I can't tell if the CLI
>>>>>> ran first and then an interrupt popped up, or if an interrupt had
>>>>>> already been __vmwrit()ten and then CLI caused the invalid guest state.
>>>>>
>>>>> I'd expect it to be the latter - an external interrupt presumably
>>>>> can't be injected when EFLAGS.IF is clear. Why are we emulating
>>>>> CLI in the first place? With a pending external interrupt, shouldn't
>>>>> we just exit back to guest context without emulating anything?
>>>>
>>>> In this particular case we're emulating CLI because the vm_event
>>>> response requests it.
>>>>
>>>> Tamas' test marks all of the guest's pages XENMEM_access_x, and at some
>>>> point a vm_event arrives somewhere in a page where CLI is read from,
>>>> AFAICT. Doing nothing would get us into an infinite loop, and since we
>>>> don't want to mark the page rwx, we try to emulate CLI.
>>>
>>> Doing nothing would get you into an infinite loop only if at each
>>> attempt there's yet again an event to be re-injected. Of course
>>> the risk of this grows the longer it takes to process things in
>>> your tool, but if there is an event to be re-injected then I don't
>>> see what else you can do. Trying to ditch the event would
>>> certainly be the wrong thing. I suggest you try to get advice
>>> from the VMX maintainers - perhaps I'm simply overlooking an
>>> obvious route out of the state you're apparently in.
>>
>> [Missed hitting "Reply all" - sorry, and re-sent. Also, added Jun and
>> Kevin to the conversation.]
>>
>> You're of course right, what I meant to say was that if we don't
>> emulate, don't mark the page rwx, and don't move RIP we'll be in an
>> infinite loop of read-caused vm_event -> userspace tool gets event ->
>> does nothing, but responds to it -> guest resumes at the same RIP
>> (pointing, in this case at CLI, but it could be anything) -> goto begin.
>>
>> We need to do something to keep the guest going, and the generic way to
>> accomplish this is to ask Xen to emulate whatever instruction is at RIP
>> (because the Xen emulator, at least for the time being, ignores EPT
>> restrictions).
>
> On top of everything, there's also a basic design problem: the way the
> code is written now:
>
> 1. The "inject events" code seems to be advertised as living in intr.c -
> but here's an exception to the rule with vmx_idtv_reinject() living in
> vmx.c. But "re-inject" != "inject".
> 2. The single-step code implies that once we have vmx_intr_assist()
> return, event injection is blocked:
>
>     /* Block event injection when single step with MTF. */
>     if ( unlikely(v->arch.hvm_vcpu.single_step) )
>     {
>         v->arch.hvm_vmx.exec_control |= CPU_BASED_MONITOR_TRAP_FLAG;
>         vmx_update_cpu_exec_control(v);
>         return;
>     }
>
> an assumption that has turned out to be false.

As far as I'm aware it is well known that MTF handling isn't the
greatest.

> 3. Obviously the idea of injecting something just before taking an exit,
> for example caused by EXIT_REASON_EPT_VIOLATION is not natural to us.

Indeed, yet so far I've not seen a summary of all the conditions under
which you see this happening. Remember that an EPT violation with
valid IDT vectoring information means the violation has occurred
_while_ delivering an event. It is my understanding that this can only
occur if the EPT violation happens for an IDT, GDT, TSS, or stack
access. In particular the instruction pointed at does not matter here
at all.

I therefore wonder whether either you're removing permissions too
aggressively, or whether the state information passed to the tools
side handling code of the VM event is insufficient to recognize that
instruction emulation must not be attempted, and instead actions need
to be taken to make it possible for the pending event to be delivered
without incurring another EPT violation.

Note that single stepping is as little of an option as insn emulation
in this case. If anything, the event delivery would need emulating
(for which there is no code at all in the hypervisor, iirc).

> Furthermore, the way the code is written now, _first_ we call
> vmx_idtv_reinject() and only then do we handle EXIT_REASON_EPT_VIOLATION

I'm afraid if this was done in the opposite order, nothing would
change for you: The to-be-re-injected event would still need
re-injecting, and hence you still couldn't inject an event of your
liking (or allow e.g. CLI to be emulated).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
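
For illustration only, the check implied by the reply above (valid IDT
vectoring information means an event was mid-delivery, so instruction
emulation must not be attempted) could be sketched roughly as follows.
The bit layout is the one the Intel SDM gives for the IDT-vectoring
information field; all macro, enum, and function names are made up for
this sketch and are not Xen's. It also assumes the raw field value is
available to whoever makes the decision, which the message suggests may
not currently be the case on the tools side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IDT-vectoring information field layout, per the Intel SDM.
 * The macro names are hypothetical, chosen for this sketch only. */
#define IDTV_VECTOR_MASK 0x000000ffu /* bits 7:0: vector */
#define IDTV_TYPE_MASK   0x00000700u /* bits 10:8: event type */
#define IDTV_TYPE_SHIFT  8
#define IDTV_VALID       0x80000000u /* bit 31: field is valid */

/* Hypothetical outcomes; not part of Xen or its vm_event interface. */
enum vm_event_action {
    ACTION_EMULATE_INSN,      /* no event in flight: emulation may be requested */
    ACTION_LET_EVENT_DELIVER, /* event was mid-delivery: do not emulate */
};

/* True if the exit happened while an event was being delivered. */
bool idtv_valid(uint32_t info)
{
    return (info & IDTV_VALID) != 0;
}

/* Vector of the in-flight event; callers must check validity first. */
unsigned int idtv_vector(uint32_t info)
{
    assert(idtv_valid(info));
    return info & IDTV_VECTOR_MASK;
}

/* Event type (0 = external interrupt, 3 = hardware exception, ...). */
unsigned int idtv_type(uint32_t info)
{
    assert(idtv_valid(info));
    return (info & IDTV_TYPE_MASK) >> IDTV_TYPE_SHIFT;
}

/* Assumed policy: with valid vectoring info the instruction at RIP is
 * irrelevant, so the only sensible goal is letting the event deliver. */
enum vm_event_action choose_action(uint32_t idtv_info)
{
    return idtv_valid(idtv_info) ? ACTION_LET_EVENT_DELIVER
                                 : ACTION_EMULATE_INSN;
}
```

How ACTION_LET_EVENT_DELIVER would actually be carried out (for example
by relaxing permissions on the page touched during delivery) is left
open here, just as it is in the discussion above.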
