
[Xen-devel] [PATCH v8 15/17] vmx: VT-d posted-interrupt core logic handling



This patch includes the following aspects:
- Handling logic when vCPU is blocked:
    * Add a global vector to wake up the blocked vCPU
      when an interrupt is being posted to it (This part
      was suggested by Yang Zhang <yang.z.zhang@xxxxxxxxx>).
    * Define two per-cpu variables:
          1. pi_blocked_vcpu:
            A list of the vCPUs that are blocked
            on this pCPU.

          2. pi_blocked_vcpu_lock:
            The spinlock to protect pi_blocked_vcpu.

- Add the following hooks; this part was suggested
  by George Dunlap <george.dunlap@xxxxxxxxxxxxx> and
  Dario Faggioli <dario.faggioli@xxxxxxxxxx>.
    * arch_vcpu_block()
      Called before the vCPU blocks, to update the PID
      (posted-interrupt descriptor, sketched below).

    * arch_vcpu_block_cancel()
      Called when blocking is cancelled because an event arrives
      during the blocking process.

    * vmx_pi_switch_from()
      Called before a context switch; updates the PID when the
      vCPU is preempted or going to sleep.

    * vmx_pi_switch_to()
      Called after a context switch; updates the PID when the vCPU
      is about to run.

    * arch_vcpu_wake_prepare()
      Called when waking up the vCPU; updates the posted-interrupt
      descriptor when the vCPU is unblocked.
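
For reference, the "PID" mentioned above is the VT-d posted-interrupt
descriptor that these hooks operate on. A rough, illustrative sketch of its
layout (the real definition comes from an earlier patch in this series in
asm-x86/hvm/vmx/vmx.h; the field names and widths here may not match it
exactly):

    struct pi_desc {
        u32 pir[8];             /* posted-interrupt requests, one bit per vector */
        union {
            struct {
                u16 on     : 1, /* outstanding notification */
                    sn     : 1, /* suppress notification */
                    rsvd_1 : 14;
                u8  nv;         /* notification vector */
                u8  rsvd_2;
                u32 ndst;       /* notification destination (APIC ID) */
            };
            u64 control;
        };
        u32 rsvd[6];
    } __attribute__ ((aligned (64)));

In terms of this layout, vmx_vcpu_block() points 'NV' at pi_wakeup_vector and
'NDST' at the pCPU the vCPU blocks on, vmx_vcpu_wake_prepare() and
vmx_vcpu_block_cancel() restore 'NV' to posted_intr_vector, and
vmx_pi_switch_from()/vmx_pi_switch_to() set/clear 'SN' around context
switches.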

CC: Keir Fraser <keir@xxxxxxx>
CC: Jan Beulich <jbeulich@xxxxxxxx>
CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
CC: Kevin Tian <kevin.tian@xxxxxxxxx>
CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
CC: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
Suggested-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
Signed-off-by: Feng Wu <feng.wu@xxxxxxxxx>
---
v8:
- Remove the lazy context switch handling for PI state transition
- Change PI state in vcpu_block() and do_poll() when the vCPU
  is going to be blocked

v7:
- Merge "[PATCH v6 16/18] vmx: Add some scheduler hooks for VT-d posted
  interrupts" and "[PATCH v6 14/18] vmx: posted-interrupt handling when vCPU
  is blocked" into this patch, so it is self-contained and more convenient
  for code review.
- Make 'pi_blocked_vcpu' and 'pi_blocked_vcpu_lock' static
- Coding style
- Use per_cpu() instead of this_cpu() in pi_wakeup_interrupt()
- Move ack_APIC_irq() to the beginning of pi_wakeup_interrupt()
- Rename 'pi_ctxt_switch_from' to 'ctxt_switch_prepare'
- Rename 'pi_ctxt_switch_to' to 'ctxt_switch_cancel'
- Use 'has_hvm_container_vcpu' instead of 'is_hvm_vcpu'
- Use 'spin_lock' and 'spin_unlock' when the interrupt has been
  already disabled.
- Rename arch_vcpu_wake_prepare to vmx_vcpu_wake_prepare
- Define vmx_vcpu_wake_prepare in xen/arch/x86/hvm/hvm.c
- Call .pi_ctxt_switch_to() in __context_switch() instead of directly
  calling vmx_post_ctx_switch_pi() in vmx_ctxt_switch_to()
- Make .pi_block_cpu unsigned int
- Use list_del() instead of list_del_init()
- Coding style

One remaining item in v7:
Jan has a concern about calling vcpu_unblock() in vmx_pre_ctx_switch_pi();
Dario's or George's input is needed on this.

v6:
- Add two static inline functions for pi context switch
- Fix typos

v5:
- Rename arch_vcpu_wake to arch_vcpu_wake_prepare
- Make arch_vcpu_wake_prepare() inline for ARM
- Merge the ARM dummy hooks together
- Changes to some code comments
- Leave 'pi_ctxt_switch_from' and 'pi_ctxt_switch_to' NULL if
  PI is disabled or the vCPU is not in HVM
- Coding style

v4:
- Newly added

Changelog for "vmx: posted-interrupt handling when vCPU is blocked"
v6:
- Fix some typos
- Ack the interrupt right after the spin_unlock in pi_wakeup_interrupt()

v4:
- Use local variables in pi_wakeup_interrupt()
- Remove the vCPU from the blocked list when pi_desc.on == 1; this
  avoids kicking the vCPU multiple times.
- Remove tasklet

v3:
- This patch is generated by merging the following three patches in v2:
   [RFC v2 09/15] Add a new per-vCPU tasklet to wakeup the blocked vCPU
   [RFC v2 10/15] vmx: Define two per-cpu variables
   [RFC v2 11/15] vmx: Add a global wake-up vector for VT-d Posted-Interrupts
- rename 'vcpu_wakeup_tasklet' to 'pi_vcpu_wakeup_tasklet'
- Move the definition of 'pi_vcpu_wakeup_tasklet' to 'struct arch_vmx_struct'
- rename 'vcpu_wakeup_tasklet_handler' to 'pi_vcpu_wakeup_tasklet_handler'
- Make pi_wakeup_interrupt() static
- Rename 'blocked_vcpu_list' to 'pi_blocked_vcpu_list'
- move 'pi_blocked_vcpu_list' to 'struct arch_vmx_struct'
- Rename 'blocked_vcpu' to 'pi_blocked_vcpu'
- Rename 'blocked_vcpu_lock' to 'pi_blocked_vcpu_lock'

 xen/arch/x86/domain.c              |  12 ++
 xen/arch/x86/hvm/hvm.c             |  18 +++
 xen/arch/x86/hvm/vmx/vmcs.c        |   2 +
 xen/arch/x86/hvm/vmx/vmx.c         | 265 +++++++++++++++++++++++++++++++++++++
 xen/common/schedule.c              |   9 ++
 xen/include/asm-arm/domain.h       |   4 +
 xen/include/asm-x86/domain.h       |   4 +
 xen/include/asm-x86/hvm/hvm.h      |   4 +
 xen/include/asm-x86/hvm/vmx/vmcs.h |  11 ++
 xen/include/asm-x86/hvm/vmx/vmx.h  |   4 +
 10 files changed, 333 insertions(+)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 045f6ff..1d3eb15 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1608,6 +1608,18 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
     if ( (per_cpu(curr_vcpu, cpu) == next) ||
          (is_idle_domain(nextd) && cpu_online(cpu)) )
     {
+        /*
+         * We are handling the lazy context switch for the following
+         * two scenarios:
+         * - Preempted by a tasklet, which runs in the idle context
+         * - The prev vCPU is offline and no runnable vCPU is in the run queue
+         * We don't change the 'SN' bit in the posted-interrupt descriptor
+         * here, which may incur spurious PI notification events. However,
+         * a notification is only sent when 'ON' is clear, and hardware sets
+         * 'ON' once the notification is sent, so there will not be many
+         * spurious events and it is not a big deal.
+         */
+
         local_irq_enable();
     }
     else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c957610..e493e37 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6817,6 +6817,24 @@ bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
     return 0;
 }
 
+void arch_vcpu_block(struct vcpu *v)
+{
+    if ( v->arch.vcpu_block )
+        v->arch.vcpu_block(v);
+}
+
+void arch_vcpu_block_cancel(struct vcpu *v)
+{
+    if ( v->arch.vcpu_block_cancel )
+        v->arch.vcpu_block_cancel(v);
+}
+
+void arch_vcpu_wake_prepare(struct vcpu *v)
+{
+    if ( v->arch.vcpu_wake_prepare )
+        v->arch.vcpu_wake_prepare(v);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 5f67797..5abe960 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -661,6 +661,8 @@ int vmx_cpu_up(void)
     if ( cpu_has_vmx_vpid )
         vpid_sync_all();
 
+    vmx_pi_per_cpu_init(cpu);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index e448b31..cad70b4 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -83,7 +83,206 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
 static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 
+/*
+ * We maintain a per-CPU linked list of vCPUs, so that in the PI wake-up
+ * handler we can find which vCPU should be woken up.
+ */
+static DEFINE_PER_CPU(struct list_head, pi_blocked_vcpu);
+static DEFINE_PER_CPU(spinlock_t, pi_blocked_vcpu_lock);
+
 uint8_t __read_mostly posted_intr_vector;
+uint8_t __read_mostly pi_wakeup_vector;
+
+void vmx_pi_per_cpu_init(unsigned int cpu)
+{
+    INIT_LIST_HEAD(&per_cpu(pi_blocked_vcpu, cpu));
+    spin_lock_init(&per_cpu(pi_blocked_vcpu_lock, cpu));
+}
+
+void vmx_vcpu_block(struct vcpu *v)
+{
+    struct pi_desc old, new;
+    unsigned int dest;
+    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+    unsigned long flags;
+
+    if ( !has_arch_pdevs(v->domain) )
+        return;
+
+    spin_lock_irqsave(&v->arch.hvm_vmx.pi_lock, flags);
+
+    /*
+     * The vCPU is blocking, so add it to one of the per-pCPU lists.
+     * We save v->processor in v->arch.hvm_vmx.pi_block_cpu and use it to
+     * select the per-CPU list; we also write it to the posted-interrupt
+     * descriptor as the destination of the wake-up notification event.
+     */
+    v->arch.hvm_vmx.pi_block_cpu = v->processor;
+
+    spin_lock(&per_cpu(pi_blocked_vcpu_lock, v->arch.hvm_vmx.pi_block_cpu));
+    list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
+                  &per_cpu(pi_blocked_vcpu, v->arch.hvm_vmx.pi_block_cpu));
+    spin_unlock(&per_cpu(pi_blocked_vcpu_lock,
+                v->arch.hvm_vmx.pi_block_cpu));
+
+    do {
+        old.control = new.control = pi_desc->control;
+
+        /*
+         * Change the 'NDST' field to v->arch.hvm_vmx.pi_block_cpu,
+         * so that when an external interrupt from an assigned device
+         * arrives, the wake-up notification event goes to
+         * v->arch.hvm_vmx.pi_block_cpu; pi_wakeup_interrupt() can then
+         * find the vCPU in the right list to wake up.
+         */
+        dest = cpu_physical_id(v->arch.hvm_vmx.pi_block_cpu);
+
+        if ( x2apic_enabled )
+            new.ndst = dest;
+        else
+            new.ndst = MASK_INSR(dest, PI_xAPIC_NDST_MASK);
+
+        pi_clear_sn(&new);
+        new.nv = pi_wakeup_vector;
+    } while ( cmpxchg(&pi_desc->control, old.control, new.control) !=
+              old.control );
+
+    spin_unlock_irqrestore(&v->arch.hvm_vmx.pi_lock, flags);
+}
+
+void vmx_vcpu_block_cancel(struct vcpu *v)
+{
+    unsigned long flags;
+
+    if ( !has_arch_pdevs(v->domain) )
+        return;
+
+    spin_lock_irqsave(&v->arch.hvm_vmx.pi_lock, flags);
+
+    if ( !test_bit(_VPF_blocked, &v->pause_flags) )
+    {
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+        unsigned int pi_block_cpu;
+
+        /* the vCPU is not on any blocking list. */
+        pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
+        if ( pi_block_cpu == NR_CPUS )
+            goto out;
+
+        /*
+         * Set 'NV' field back to posted_intr_vector, so the
+         * Posted-Interrupts can be delivered to the vCPU by
+         * VT-d HW after it is scheduled to run.
+         */
+        write_atomic(&pi_desc->nv, posted_intr_vector);
+
+        spin_lock(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu));
+
+        /*
+         * v->arch.hvm_vmx.pi_block_cpu == NR_CPUS here means the vCPU was
+         * removed from the blocking list while we were acquiring the lock.
+         */
+        if ( v->arch.hvm_vmx.pi_block_cpu == NR_CPUS )
+        {
+            spin_unlock(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu));
+            goto out;
+        }
+
+        list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
+        v->arch.hvm_vmx.pi_block_cpu = NR_CPUS;
+        spin_unlock(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu));
+    }
+
+out:
+    spin_unlock_irqrestore(&v->arch.hvm_vmx.pi_lock, flags);
+}
+
+void vmx_vcpu_wake_prepare(struct vcpu *v)
+{
+    unsigned long flags;
+
+    if ( !has_arch_pdevs(v->domain) )
+        return;
+
+    spin_lock_irqsave(&v->arch.hvm_vmx.pi_lock, flags);
+
+    if ( !test_bit(_VPF_blocked, &v->pause_flags) )
+    {
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+        unsigned int pi_block_cpu;
+
+        /* the vCPU is not on any blocking list. */
+        pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
+        if ( pi_block_cpu == NR_CPUS )
+            goto out;
+
+        /*
+         * We cannot set 'SN' here since we don't change 'SN' during a lazy
+         * context switch; if we set 'SN' here, we may end up with the vCPU
+         * running with 'SN' set.
+         */
+
+        /*
+         * Set 'NV' field back to posted_intr_vector, so the
+         * Posted-Interrupts can be delivered to the vCPU by
+         * VT-d HW after it is scheduled to run.
+         */
+        write_atomic(&pi_desc->nv, posted_intr_vector);
+
+        spin_lock(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu));
+
+        /*
+         * v->arch.hvm_vmx.pi_block_cpu == NR_CPUS here means the vCPU was
+         * removed from the blocking list while we were acquiring the lock.
+         */
+        if ( v->arch.hvm_vmx.pi_block_cpu == NR_CPUS )
+        {
+            spin_unlock(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu));
+            goto out;
+        }
+
+        list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
+        v->arch.hvm_vmx.pi_block_cpu = NR_CPUS;
+        spin_unlock(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu));
+    }
+
+out:
+    spin_unlock_irqrestore(&v->arch.hvm_vmx.pi_lock, flags);
+}
+
+static void vmx_pi_switch_from(struct vcpu *v)
+{
+    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+    if ( !has_arch_pdevs(v->domain) || !iommu_intpost ||
+         test_bit(_VPF_blocked, &v->pause_flags) )
+        return;
+
+    /*
+     * The vCPU has been preempted or gone to sleep. We don't need to send
+     * a notification event to a non-running vCPU; the interrupt information
+     * will be delivered to it before VM entry when the vCPU is next
+     * scheduled to run.
+     */
+    pi_set_sn(pi_desc);
+}
+
+static void vmx_pi_switch_to(struct vcpu *v)
+{
+    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+    if ( !has_arch_pdevs(v->domain) || !iommu_intpost )
+        return;
+
+    if ( x2apic_enabled )
+        write_atomic(&pi_desc->ndst, cpu_physical_id(v->processor));
+    else
+        write_atomic(&pi_desc->ndst,
+                     MASK_INSR(cpu_physical_id(v->processor),
+                     PI_xAPIC_NDST_MASK));
+
+    pi_clear_sn(pi_desc);
+}
 
 static int vmx_domain_initialise(struct domain *d)
 {
@@ -106,10 +305,24 @@ static int vmx_vcpu_initialise(struct vcpu *v)
 
     spin_lock_init(&v->arch.hvm_vmx.vmcs_lock);
 
+    INIT_LIST_HEAD(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
+    INIT_LIST_HEAD(&v->arch.hvm_vmx.pi_vcpu_on_set_list);
+
+    v->arch.hvm_vmx.pi_block_cpu = NR_CPUS;
+
+    spin_lock_init(&v->arch.hvm_vmx.pi_lock);
+
     v->arch.schedule_tail    = vmx_do_resume;
     v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
     v->arch.ctxt_switch_to   = vmx_ctxt_switch_to;
 
+    if ( iommu_intpost && has_hvm_container_vcpu(v) )
+    {
+        v->arch.vcpu_block = vmx_vcpu_block;
+        v->arch.vcpu_block_cancel = vmx_vcpu_block_cancel;
+        v->arch.vcpu_wake_prepare = vmx_vcpu_wake_prepare;
+    }
+
     if ( (rc = vmx_create_vmcs(v)) != 0 )
     {
         dprintk(XENLOG_WARNING,
@@ -721,6 +934,7 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
     vmx_save_guest_msrs(v);
     vmx_restore_host_msrs();
     vmx_save_dr(v);
+    vmx_pi_switch_from(v);
 }
 
 static void vmx_ctxt_switch_to(struct vcpu *v)
@@ -745,6 +959,7 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
 
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
+    vmx_pi_switch_to(v);
 }
 
 
@@ -1975,6 +2190,53 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .altp2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
 };
 
+/* Handle VT-d posted-interrupt when VCPU is blocked. */
+static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
+{
+    struct arch_vmx_struct *vmx, *tmp;
+    struct vcpu *v;
+    spinlock_t *lock = &per_cpu(pi_blocked_vcpu_lock, smp_processor_id());
+    struct list_head *blocked_vcpus =
+                       &per_cpu(pi_blocked_vcpu, smp_processor_id());
+    LIST_HEAD(list);
+
+    ack_APIC_irq();
+    this_cpu(irq_count)++;
+
+    spin_lock(lock);
+
+    /*
+     * XXX: The length of the list depends on how many vCPUs are currently
+     * blocked on this specific pCPU. This may hurt the interrupt latency
+     * if the list grows too long.
+     */
+    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocked_vcpu_list)
+    {
+        if ( pi_test_on(&vmx->pi_desc) )
+        {
+            list_del(&vmx->pi_blocked_vcpu_list);
+            vmx->pi_block_cpu = NR_CPUS;
+
+            /*
+             * We cannot call vcpu_unblock() here, since it also needs
+             * 'pi_blocked_vcpu_lock'; instead we store the vCPUs with 'ON'
+             * set on another list and unblock them after releasing
+             * 'pi_blocked_vcpu_lock'.
+             */
+            list_add_tail(&vmx->pi_vcpu_on_set_list, &list);
+        }
+    }
+
+    spin_unlock(lock);
+
+    list_for_each_entry_safe(vmx, tmp, &list, pi_vcpu_on_set_list)
+    {
+        v = container_of(vmx, struct vcpu, arch.hvm_vmx);
+        list_del(&vmx->pi_vcpu_on_set_list);
+        vcpu_unblock(v);
+    }
+}
+
 /* Handle VT-d posted-interrupt when VCPU is running. */
 static void pi_notification_interrupt(struct cpu_user_regs *regs)
 {
@@ -2061,7 +2323,10 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_posted_intr_processing )
     {
         if ( iommu_intpost )
+        {
             alloc_direct_apic_vector(&posted_intr_vector, pi_notification_interrupt);
+            alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
+        }
         else
             alloc_direct_apic_vector(&posted_intr_vector, event_check_interrupt);
     }
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 3eefed7..383fd62 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -412,6 +412,8 @@ void vcpu_wake(struct vcpu *v)
     unsigned long flags;
     spinlock_t *lock = vcpu_schedule_lock_irqsave(v, &flags);
 
+    arch_vcpu_wake_prepare(v);
+
     if ( likely(vcpu_runnable(v)) )
     {
         if ( v->runstate.state >= RUNSTATE_blocked )
@@ -800,10 +802,13 @@ void vcpu_block(void)
 
     set_bit(_VPF_blocked, &v->pause_flags);
 
+    arch_vcpu_block(v);
+
     /* Check for events /after/ blocking: avoids wakeup waiting race. */
     if ( local_events_need_delivery() )
     {
         clear_bit(_VPF_blocked, &v->pause_flags);
+        arch_vcpu_block_cancel(v);
     }
     else
     {
@@ -837,6 +842,8 @@ static long do_poll(struct sched_poll *sched_poll)
     v->poll_evtchn = -1;
     set_bit(v->vcpu_id, d->poll_mask);
 
+    arch_vcpu_block(v);
+
 #ifndef CONFIG_X86 /* set_bit() implies mb() on x86 */
     /* Check for events /after/ setting flags: avoids wakeup waiting race. */
     smp_mb();
@@ -854,6 +861,7 @@ static long do_poll(struct sched_poll *sched_poll)
 #endif
 
     rc = 0;
+
     if ( local_events_need_delivery() )
         goto out;
 
@@ -887,6 +895,7 @@ static long do_poll(struct sched_poll *sched_poll)
     v->poll_evtchn = 0;
     clear_bit(v->vcpu_id, d->poll_mask);
     clear_bit(_VPF_blocked, &v->pause_flags);
+    arch_vcpu_block_cancel(v);
     return rc;
 }
 
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 56aa208..ec2b536 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -301,6 +301,10 @@ static inline register_t vcpuid_to_vaffinity(unsigned int vcpuid)
     return vaff;
 }
 
+static inline void arch_vcpu_wake_prepare(struct vcpu *v) {}
+static inline void arch_vcpu_block(struct vcpu *v) {}
+static inline void arch_vcpu_block_cancel(struct vcpu *v) {}
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 0fce09e..c37af4e 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -481,6 +481,10 @@ struct arch_vcpu
     void (*ctxt_switch_from) (struct vcpu *);
     void (*ctxt_switch_to) (struct vcpu *);
 
+    void (*vcpu_block) (struct vcpu *);
+    void (*vcpu_block_cancel) (struct vcpu *);
+    void (*vcpu_wake_prepare) (struct vcpu *);
+
     struct vpmu_struct vpmu;
 
     /* Virtual Machine Extensions */
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 3cac64f..4cd7fcb 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -545,6 +545,10 @@ static inline bool_t hvm_altp2m_supported(void)
     return hvm_funcs.altp2m_supported;
 }
 
+void arch_vcpu_wake_prepare(struct vcpu *v);
+void arch_vcpu_block(struct vcpu *v);
+void arch_vcpu_block_cancel(struct vcpu *v);
+
 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */
 extern bool_t opt_hvm_fep;
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 81c9e63..23f8192 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -160,6 +160,17 @@ struct arch_vmx_struct {
     struct page_info     *vmwrite_bitmap;
 
     struct page_info     *pml_pg;
+
+    struct list_head     pi_blocked_vcpu_list;
+    struct list_head     pi_vcpu_on_set_list;
+
+    /*
+     * Before the vCPU is blocked, it is added to the per-CPU blocking list
+     * of 'pi_block_cpu', so the VT-d engine can send the wake-up
+     * notification event to 'pi_block_cpu' and wake up the related vCPU.
+     */
+    unsigned int         pi_block_cpu;
+    spinlock_t           pi_lock;
 };
 
 int vmx_create_vmcs(struct vcpu *v);
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 70b254f..2eaea32 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -28,6 +28,8 @@
 #include <asm/hvm/trace.h>
 #include <asm/hvm/vmx/vmcs.h>
 
+extern uint8_t pi_wakeup_vector;
+
 typedef union {
     struct {
         u64 r       :   1,  /* bit 0 - Read permission */
@@ -557,6 +559,8 @@ int alloc_p2m_hap_data(struct p2m_domain *p2m);
 void free_p2m_hap_data(struct p2m_domain *p2m);
 void p2m_init_hap_data(struct p2m_domain *p2m);
 
+void vmx_pi_per_cpu_init(unsigned int cpu);
+
 /* EPT violation qualifications definitions */
 #define _EPT_READ_VIOLATION         0
 #define EPT_READ_VIOLATION          (1UL<<_EPT_READ_VIOLATION)
-- 
2.1.0

