[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] xc_hvm_inject_trap() races
>>> On 07.11.16 at 12:49, <rcojocaru@xxxxxxxxxxxxxxx> wrote: > On 11/07/2016 11:57 AM, Jan Beulich wrote: >>>>> On 07.11.16 at 10:37, <rcojocaru@xxxxxxxxxxxxxxx> wrote: >>> The problem there is that vmx_idtv_reinject() is a corner case: it >>> writes VM_ENTRY_INTR_INFO directly, and this can happen _before_ the >>> guest exits with, let's say, an EPT vm_event. >>> >>> In that case, the guest has exited, there's already an interrupt >>> pending, and while the VCPU is paused waiting for the vm_event reply we >>> request an injection that effectively obliterates the pending interrupt. >>> >>> Going the singlestep way is satisfactory for us for most cases, but it >>> still leaves the described corner case. The only fix proposal we could >>> think of is to, instead of vmx_idtv_reinject() doing the work, simply >>> set some flags, and have a later function do the actual work, in >>> vmx_intr_assist() style. >>> >>> I hope I've been able to make this clearer (and I haven't misunderstood >>> something in the process :) ). >> >> You did, thanks. Yet I continue to remain on my earlier position: It's >> the non-architectural (injected) event which should get deferred, >> not architectural ones (either the variant you describe above, or >> any which hypervisor processing found necessary to deliver). The >> single step case can be viewed as an exception here, albeit ideally >> it also wouldn't need to be special cased. > > But the problem is that if we defer, say a #PF, by the time there's a VM > entry with no pending interrupt we may get the wrong context (wrong > address space, guest mode, etc.). > > The simple solution to this would be to save the context (for a #PF this > would additionally mean CR3 as well), and only inject at the first > opportunity where the context matches. > > However, this gets ugly quickly: for one, the right context for a #PF > may occur on a different VCPU, so right off the bat we can't hold this > information per-VCPU anymore (it would need to be per-domain). > > Second, it is conceivable that there will be several injection requests > not yet delivered, and in that case we need to save the context for all > of them (so an array or some sort of container of contexts becomes > necessary). > > And third, there's no guarantee that the guest will ever reach the > proper context for injecting any of the deferred interrupts for the > duration of it's life, which brings us back to the problem we were > trying to solve: we still can't guarantee trap delivery, but now in a > much more complicated manner. :) > > We haven't even discussed what to do if a new request comes when the > context buffer is full (we'd need to lose an undelivered trap), or that > different trap types may require different contexts and handling logic. All agreed, yet all are issues for you to solve in order to be able to half way sanely inject a non-architectural fault. (And btw., CR3 as context may not suffice - you may also need e.g. the altp2m index in use.) > At this point it looks like the best solution is to simply discard the > non-architectural event if there's a pending architectural one, and add > a new vm_event, as suggested by Tamas, that notifies the application > that a trap has failed, or succeeded, and let it do the best it can with > that information. Well, if the two of you think this is something which fits into the VM event model, then I guess that's an option. I, for one, am not convinced: It simply seems too special purpose. If this was a more generic event ("interruption delivered", perhaps with a type or vector qualifier) that can be subscribed to, and the app did that only for such transient periods of time, this would look more reasonable to me. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |