
Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN




> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Thursday, March 05, 2015 3:13 PM
> To: Wu, Feng
> Cc: Tian, Kevin; Zhang, Yang Z; xen-devel@xxxxxxxxxxxxx
> Subject: RE: VT-d Posted-interrupt (PI) design for XEN
> 
> >>> On 05.03.15 at 06:04, <feng.wu@xxxxxxxxx> wrote:
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: Wednesday, March 04, 2015 11:19 PM
> >> >>> On 04.03.15 at 14:30, <feng.wu@xxxxxxxxx> wrote:
> >> > - Introduce a new global vector which is used to wake up the HLT'ed vCPU.
> >> > Currently, there is a global vector 'posted_intr_vector', which is used
> >> > as the global notification vector for all vCPUs in the system. This
> >> > vector is stored in the VMCS, and the CPU treats it as a special vector:
> >> > it is used to notify the related pCPU when an interrupt is recorded in
> >> > the posted-interrupt descriptor.
> >> >
> >> > With VT-d PI, the VT-d engine can issue a notification event when an
> >> > assigned device issues an interrupt. We need to add a new global vector
> >> > to wake up the HLT'ed vCPU. Please refer to the following scenario for
> >> > the usage of this new global vector:
> >> >
> >> > 1. vCPU0 is running on pCPU0
> >> > 2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0
> >> > 3. An external interrupt from an assigned device occurs for vCPU0. If we
> >> > still use 'posted_intr_vector' as the notification vector for vCPU0, the
> >> > notification event for vCPU0 (the event will go to pCPU0) will be
> >> > incorrectly consumed by vCPU1. In the worst case, vCPU0 will never be
> >> > woken up again, since its wakeup event is always incorrectly consumed
> >> > by other vCPUs. So we need to introduce another global vector, named
> >> > 'pi_wakeup_vector', to wake up the HLT'ed vCPU.
> >>
> >> I'm afraid you describe a particular scenario here, but I don't see
> >> how this is related to the introduction of another global vector:
> >> Either the current (global) vector is sufficient, or another global
> >> vector also can't solve your problem. I'm sure I'm missing something
> >> here, so please be explicit.
> >>
> >
> > In fact, the new global vector is used for the above scenario. Let me
> > explain this a bit more:
> >
> > With VT-d PI, when an external interrupt from an assigned device
> > happens, the hardware processing flow is:
> >
> > 1. The interrupt happens.
> > 2. Find the associated IRTE.
> > 3. Find the destination vCPU from the IRTE (via the posted-interrupt
> > descriptor address).
> > 4. Sync the interrupt (stored in the IRTE as 'virtual vector') to the PIR
> > field in the posted-interrupt descriptor.
> > 5. If needed (please refer to the VT-d spec for the conditions under which
> > a notification event is issued), issue a notification event to the
> > destination CPU, which is stored in the posted-interrupt descriptor as
> > 'NDST'.
> >
> > Back to the above scenario:
> > 1. vCPU0 is running on pCPU0, and the 'NDST' field of vCPU0's
> > posted-interrupt descriptor is pCPU0.
> > 2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0.
> > 3. An external interrupt from an assigned device happens. Its destination
> > is determined by the flow above (IRTE --> posted-interrupt descriptor
> > address/vCPU --> notification event to 'NDST').
> > If this external interrupt is for vCPU0, the notification event will be
> > delivered to pCPU0, since the 'NDST' field of vCPU0's posted-interrupt
> > descriptor is pCPU0. If we use the current (global) vector for vCPU0's
> > notification event in this case, then, because the CPU treats the current
> > global vector (the notification vector) specially, vCPU1 will consume it
> > while running on pCPU0, and we fail to wake up the HLT'ed vCPU0.
> >
> > Please refer to Section 29.6 in the Intel SDM for how the CPU handles
> > this particular vector:
> >
> > http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
> >
> > After introducing a new global vector named 'pi_wakeup_vector', we set
> > the 'NV' (Notification Vector) field in the vCPU's posted-interrupt
> > descriptor to 'pi_wakeup_vector' before the vCPU is HLT'ed. This is a
> > normal vector to the CPU, which does nothing special for it (unlike the
> > current global vector). In the handler of this vector, we can wake up
> > the HLT'ed vCPU.
> 
> So suppose you have more than one vCPU which most recently ran on
> pCPU0 - how will the handler for the new vector know which of the
> vCPU-s it should kick? 

Oh, sorry, I thought I had described how to wake up the HLT'ed vCPU in this
design, but it seems I missed it. Here it is.

1. Define a per-cpu list 'blocked_vcpu_on_cpu_lock', which stores the vCPUs
blocked on the pCPU.
2. When the vCPU's state changes to RUNSTATE_blocked, insert the vCPU
into the per-cpu list belonging to the pCPU it was running on.
3. When the vCPU is unblocked, remove the vCPU from the related pCPU list.

In the handler of 'pi_wakeup_vector', we do:
1. Get the physical CPU.
2. Iterate over the list 'blocked_vcpu_on_cpu_lock' of the current pCPU; for
each vCPU whose posted-interrupt descriptor has 'ON' set, unblock that vCPU.
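
The list maintenance and the handler above could be sketched roughly as
follows. This is only an illustration, not the proposed patch: 'struct vcpu',
'struct pi_desc' (reduced to the 'ON' bit), NR_PCPUS, and the function names
vcpu_block(), vcpu_unblock() and pi_wakeup_interrupt() are all hypothetical,
the list array is called blocked_vcpu_on_cpu here, and the per-pCPU locking
that real code would need is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_PCPUS 4                  /* hypothetical number of pCPUs */

/* Simplified stand-in for the posted-interrupt descriptor: only 'ON'. */
struct pi_desc {
    bool on;                        /* Outstanding Notification bit */
};

/* Hypothetical, heavily reduced vCPU structure. */
struct vcpu {
    int id;
    bool blocked;
    struct pi_desc pi;
    struct vcpu *next_blocked;      /* link in the per-pCPU blocked list */
};

/* One singly linked list head per pCPU (locking omitted in this sketch). */
static struct vcpu *blocked_vcpu_on_cpu[NR_PCPUS];

/* The vCPU goes to RUNSTATE_blocked: put it on the list of its pCPU. */
static void vcpu_block(struct vcpu *v, int cpu)
{
    v->blocked = true;
    v->next_blocked = blocked_vcpu_on_cpu[cpu];
    blocked_vcpu_on_cpu[cpu] = v;
}

/* The vCPU is unblocked: unlink it from the list of the related pCPU. */
static void vcpu_unblock(struct vcpu *v, int cpu)
{
    struct vcpu **pp = &blocked_vcpu_on_cpu[cpu];

    while (*pp != NULL) {
        if (*pp == v) {
            *pp = v->next_blocked;
            v->next_blocked = NULL;
            v->blocked = false;
            return;
        }
        pp = &(*pp)->next_blocked;
    }
}

/*
 * Handler body for 'pi_wakeup_vector' on pCPU 'cpu': unblock every vCPU
 * on this pCPU's list whose descriptor has 'ON' set. Returns how many
 * vCPUs were unblocked. 'ON' itself is left untouched here.
 */
static int pi_wakeup_interrupt(int cpu)
{
    int woken = 0;
    struct vcpu *v = blocked_vcpu_on_cpu[cpu];

    while (v != NULL) {
        struct vcpu *next = v->next_blocked;   /* v may be unlinked below */

        if (v->pi.on) {
            vcpu_unblock(v, cpu);
            woken++;
        }
        v = next;
    }
    return woken;
}
```

With vCPU0 and vCPU1 both blocked on pCPU0 and only vCPU0's 'ON' set, the
handler unblocks vCPU0 and leaves vCPU1 on the list.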

> And if it can know, why couldn't the handler for
> posted_intr_vector not know either (i.e. after introducing a specific
> handler for it in place of the currently used event_check_interrupt)?

Coming back to the above scenario: vCPU1 is running on pCPU0 while vCPU0
is blocked. Suppose we still use posted_intr_vector for the blocked vCPU0.
If vCPU1 is running in non-root mode when the notification event arrives,
the event is handled automatically by the CPU hardware (in non-root mode),
so the handler for posted_intr_vector never gets control.
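
As a rough model of the difference between the two vectors (a sketch only:
the vector numbers, 'struct pcpu_state' and deliver_notification() are made
up for illustration, and it assumes posted-interrupt processing and
external-interrupt exiting are both enabled for the running guest):

```c
#include <stdbool.h>

/* Hypothetical vector numbers, chosen only for this illustration. */
#define POSTED_INTR_VECTOR 0xf2
#define PI_WAKEUP_VECTOR   0xf1

enum outcome {
    CONSUMED_BY_HARDWARE,   /* posted-interrupt processing, no handler runs */
    HANDLER_RUNS            /* the hypervisor's handler for the vector runs */
};

/* Hypothetical snapshot of what a pCPU is doing when the event arrives. */
struct pcpu_state {
    bool in_non_root_mode;      /* a guest vCPU is currently executing */
    int  vmcs_notify_vector;    /* notification vector in the current VMCS */
};

/*
 * In non-root mode, an interrupt whose vector matches the notification
 * vector in the current VMCS is processed by the CPU on behalf of the
 * running vCPU, so the hypervisor never sees it. Any other vector (or any
 * vector arriving in root mode) ends up in the hypervisor's handler.
 */
static enum outcome deliver_notification(const struct pcpu_state *p,
                                         int vector)
{
    if (p->in_non_root_mode && vector == p->vmcs_notify_vector)
        return CONSUMED_BY_HARDWARE;
    return HANDLER_RUNS;
}
```

In this model, with vCPU1 in non-root mode on pCPU0, an event carrying
POSTED_INTR_VECTOR is consumed by hardware, while one carrying
PI_WAKEUP_VECTOR reaches the handler.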

Thanks,
Feng

> (One of the reasons I'm asking, i.e. apart from wanting to
> understand the model, is the limited amount of vectors we have.)
> 
> Jan
