
Re: [Xen-devel] [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling




> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> Sent: Wednesday, February 10, 2016 8:36 PM
> To: Wu, Feng <feng.wu@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxx
> Cc: Keir Fraser <keir@xxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>; Andrew
> Cooper <andrew.cooper3@xxxxxxxxxx>; Tian, Kevin <kevin.tian@xxxxxxxxx>;
> George Dunlap <george.dunlap@xxxxxxxxxxxxx>; Dario Faggioli
> <dario.faggioli@xxxxxxxxxx>
> Subject: Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
> 
> On 28/01/16 05:12, Feng Wu wrote:
> > This is the core logic handling for VT-d posted-interrupts. Basically it
> > deals with how and when to update posted-interrupts during the following
> > scenarios:
> > - vCPU is preempted
> > - vCPU is put to sleep
> > - vCPU is blocked
> >
> > When a vCPU is preempted or put to sleep, we update the posted-interrupts
> > descriptor during scheduling by introducing two new architectural
> > scheduler hooks: vmx_pi_switch_from() and vmx_pi_switch_to(). When a vCPU
> > is blocked, we introduce a new architectural hook, arch_vcpu_block(), to
> > update the posted-interrupts descriptor.
> >
> > Besides that, before VM-entry, we make sure the 'NV' field is set to
> > 'posted_intr_vector' and the vCPU is not on any blocking list, which is
> > needed when the vCPU is running in non-root mode. We do this check
> > because we change the posted-interrupts descriptor in vcpu_block() but
> > don't change it back in vcpu_unblock() or when vcpu_block() returns
> > directly due to event delivery (in fact, we don't need to do it in
> > those two places, which is why we do it before VM-entry).
> >
> > When we handle the lazy context switch in the following two scenarios:
> > - The vCPU is preempted by a tasklet, which runs in an idle context.
> > - The previous vCPU is offline and there is no new vCPU available in
> >   the run queue.
> > we don't change the 'SN' bit in the posted-interrupt descriptor. This
> > may incur spurious PI notification events, but a PI notification event
> > is only sent when 'ON' is clear, and once the notification is sent,
> > hardware sets 'ON', so no more notification events arrive until 'ON' is
> > cleared again. Besides that, spurious PI notification events are going
> > to happen from time to time in the Xen hypervisor anyway: for example,
> > when a guest traps to Xen and a PI notification event arrives, there is
> > nothing Xen actually needs to do about it, since the interrupts will be
> > delivered to the guest the next time we do a VM-entry.
> >
> > CC: Keir Fraser <keir@xxxxxxx>
> > CC: Jan Beulich <jbeulich@xxxxxxxx>
> > CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > CC: Kevin Tian <kevin.tian@xxxxxxxxx>
> > CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> > CC: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
> > Suggested-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> > Suggested-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
> > Suggested-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> > Suggested-by: Jan Beulich <jbeulich@xxxxxxxx>
> > Signed-off-by: Feng Wu <feng.wu@xxxxxxxxx>
> 
> Feng,
> 
> Thanks for your work on this.
> 
> Coming back to this thread after 5 months, what strikes me first of all
> is that it would be good to have a comment somewhere laying out exactly
> all the things that need to change for the different runstates with
> posted interrupts, so that someone later trying to change things has an
> overview of what invariants need to be kept.
> 
> What do you think about adding the following comment somewhere near the
> PI callbacks? (Corrected for accuracy of course.)
> 
> ---
> To handle posted interrupts correctly, we need to set the following state:
> 
> * The PI notification vector (NV)
> * The PI notification destination processor (NDST)
> * The PI "suppress notification" bit (SN)
> * The vcpu pi "blocked" list
> 
> If a VM is currently running, we want the PI delivered to the guest vcpu
> on the proper pcpu (NDST = v->processor, SN clear).
> 
> If the VM is blocked, we want the PI delivered to Xen so that it can
> wake it up (SN clear, NV = pi_wakeup_vector, vcpu on block list).
> 
> If the VM is currently either preempted or offline (i.e., not running
> because of some reason other than blocking waiting for an interrupt),
> there's nothing Xen can do -- we want the interrupt pending bit set in
> the guest, but we don't want to bother Xen with an interrupt (SN set).
> 
> There's a brief window of time between vmx_intr_assist() and checking
> softirqs where, if an interrupt comes in, it may be lost; so we need Xen
> to get an interrupt and raise a softirq so that it will go through the
> vmx_intr_assist() path again (SN clear, NV = posted_intr_vector).
> 
> The way we implement this now is by looking at what needs to happen on
> the following runstate transitions:
> 
> A: runnable -> running
>  - SN = 0
>  - NDST = v->processor
> B: running -> runnable
>  - SN = 1
> C: running -> blocked
>  - NV = pi_wakeup_vector
>  - Add vcpu to blocked list
> D: blocked -> runnable
>  - NV = posted_intr_vector
>  - Take vcpu off blocked list
> 
> For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to} paths.
> 
> For transition C, we add a new arch hook, arch_vcpu_block(), which is
> called from vcpu_block() and vcpu_do_poll().
> 
> For transition D, rather than add an extra arch hook on vcpu_wake, we
> add a hook on the vmentry path which checks whether either of the two
> actions needs to be taken.
> 
> These hooks only need to be called when the domain in question actually
> has a physical device assigned to it, so we set and clear the callbacks
> as appropriate when device assignment changes.
> ---
> 
> Is that about right?

Perfect summary, I will add it. Thanks a lot, George!
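
To make the four transitions concrete, here is a rough, self-contained C
model of them. It is only a sketch, not the patch code: struct
pi_desc_model, struct vcpu_model, the pi_*() helper names and the two
vector values are all made up for illustration, and the real code has to
update the descriptor atomically and take the blocked-list lock, which
this model ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical vector numbers, chosen only for this sketch. */
enum { POSTED_INTR_VECTOR = 0xf2, PI_WAKEUP_VECTOR = 0xf1 };

/* Simplified model of the state listed above (not the hardware layout). */
struct pi_desc_model {
    unsigned int nv;    /* notification vector (NV) */
    unsigned int ndst;  /* notification destination pcpu (NDST) */
    bool sn;            /* suppress notification (SN) */
};

struct vcpu_model {
    struct pi_desc_model pi;
    unsigned int processor;  /* pcpu the vcpu is scheduled on */
    bool on_blocked_list;    /* stands in for the per-pcpu blocked list */
};

/* A: runnable -> running (context-switch "to" hook). */
static void pi_switch_to(struct vcpu_model *v)
{
    v->pi.ndst = v->processor;
    v->pi.sn = false;
}

/* B: running -> runnable (context-switch "from" hook). */
static void pi_switch_from(struct vcpu_model *v)
{
    v->pi.sn = true;
}

/* C: running -> blocked (called from the block/poll path). */
static void pi_vcpu_block(struct vcpu_model *v)
{
    /* Sketch assumption: a vcpu always passes through D before blocking again. */
    assert(!v->on_blocked_list);
    v->on_blocked_list = true;
    v->pi.nv = PI_WAKEUP_VECTOR;
}

/* D: blocked -> runnable, applied lazily on the vmentry path. */
static void pi_do_resume(struct vcpu_model *v)
{
    if (v->pi.nv != POSTED_INTR_VECTOR)
        v->pi.nv = POSTED_INTR_VECTOR;
    if (v->on_blocked_list)
        v->on_blocked_list = false;
}
```

Doing D as a check on the vmentry path means the wakeup path stays
untouched; the cost is two cheap comparisons before each VM-entry.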

Thanks,
Feng

