Re: [Xen-devel] Fwd: [v3 14/15] Update Posted-Interrupts Descriptor during vCPU scheduling
> -----Original Message-----
> From: Dario Faggioli [mailto:dario.faggioli@xxxxxxxxxx]
> Sent: Friday, July 10, 2015 7:06 PM
> To: Jan Beulich
> Cc: Wu, Feng; andrew.cooper3@xxxxxxxxxx; George Dunlap; Tian, Kevin;
> Zhang, Yang Z; xen-devel; keir@xxxxxxx
> Subject: Re: [Xen-devel] Fwd: [v3 14/15] Update Posted-Interrupts
> Descriptor during vCPU scheduling
>
> On Fri, 2015-07-10 at 07:22 +0100, Jan Beulich wrote:
> > >>> On 10.07.15 at 07:59, <feng.wu@xxxxxxxxx> wrote:
> > > If you agree with doing all this in a central place, maybe we can
> > > create an arch hook for 'struct scheduler' to do this and call it
> > > in all the places where vcpu_runstate_change() gets called. What is
> > > your opinion about this?
> >
> > Doing this in a central place is certainly the right approach, but
> > adding an arch hook that needs to be called everywhere
> > vcpu_runstate_change() is called wouldn't serve that purpose.
> >
> Indeed.
>
> > Instead we'd need to replace all current vcpu_runstate_change() calls
> > with calls to a new function calling both this and the to-be-added
> > arch hook.
> >
> Well, I also see the value of having this done in one place, but not
> to the point of adding something like this.
>
> > But please wait for George's / Dario's feedback, because they
> > seem to be even less convinced than me about your model of
> > tying the updates to runstate changes.
> >
> Indeed. George stated very well the reason why vcpu_runstate_change()
> should not be used, and suggested arch hooks be added in the relevant
> places. I particularly like this idea as, not only would it leave
> vcpu_runstate_change() alone, but it would also help disentangle this
> from runstates, which, IMO, is also important.
>
> So, can we identify the state (runstate? :-/) transitions that need
> intercepting, and find suitable places for the hooks? I mean,
> something like this:
>
>  - running-->blocked: can be handled in the arch specific part of
>    context switch (similarly to CMT/CAT, which already hooks in
>    there). So, in this case, no need to add any hook, as arch
>    specific code is called already;
>
>  - running-->runnable: same as above;
>
>  - running-->offline: not sure if you need to take action on this.
>    If yes, context switch should be fine as well;
>
>  - blocked-->runnable: I think we need this, don't we? If yes, we
>    probably want an arch hook in vcpu_wake();
>
>  - blocked-->offline: do you need it? If yes, the hook in wake
>    should work for this as well;
>
>  - runnable/running-->offline: if necessary, we want a hook in
>    vcpu_sleep_nosync().
>
> Another way to look at this, less biased toward runstates (i.e., what
> I've been asking for for a while), would be:
>
>  - do you need to perform an action upon context switch (on prev
>    and/or next vcpu)? If yes, there's an arch specific path in there
>    already;
>  - do you need to perform an action when a vcpu wakes up? If yes, we
>    need an arch hook in vcpu_wake();
>  - do you need to perform an action when a vcpu goes to sleep? If
>    yes, we need an arch hook in vcpu_sleep_nosync();
>
> I think this makes a more than fair solution. I happen to like it even
> better than the centralized approach, actually! That is partly
> personal taste, but also because I think it may be useful for others
> too, in future, to be able to execute arch specific code, e.g., upon
> wake-up, in which case we will be able to use the hook that we're
> introducing here for PI.
>
> Thanks and Regards,
> Dario

Hi Dario,

Thanks for the suggestion!
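To restate the proposal in code, the hooks would be wired roughly as in the
sketch below. This is illustrative only: arch_vcpu_wake() is the hook the
draft patch further down actually adds, while arch_vcpu_sleep() is
hypothetical and just marks where a sleep-side hook would go if one turns
out to be needed.

/* xen/common/schedule.c -- sketch, not a real patch */
void vcpu_wake(struct vcpu *v)
{
    /* ... runstate updates, still under the schedule lock ... */
    arch_vcpu_wake(v);      /* covers blocked --> runnable/offline */
    /* ... */
}

void vcpu_sleep_nosync(struct vcpu *v)
{
    /* ... */
    arch_vcpu_sleep(v);     /* would cover runnable/running --> offline */
    /* ... */
}

/*
 * running --> blocked/runnable/offline needs no new hook: the arch
 * specific context switch path (vmx_ctxt_switch_from/to) already runs
 * on that transition.
 */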
I made a draft patch for this idea. It may have some issues since it is
just a draft, more of a prototype; I am posting it here to find out
whether it meets your expectations. If it does, I can continue in this
direction, which may speed up the upstreaming process. Thanks a lot!

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 6eebc1a..7e678c8 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -740,6 +740,81 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
     vmx_save_guest_msrs(v);
     vmx_restore_host_msrs();
     vmx_save_dr(v);
+
+    if ( iommu_intpost )
+    {
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+        struct pi_desc old, new;
+        unsigned long flags;
+
+        if ( vcpu_runnable(v) || !test_bit(_VPF_blocked, &v->pause_flags) )
+        {
+            /*
+             * The vCPU was preempted or put to sleep. We don't need to
+             * send a notification event to a non-running vCPU; the
+             * interrupt information will be delivered to it before
+             * VM-ENTRY when the vCPU is scheduled to run next time.
+             */
+            pi_set_sn(pi_desc);
+        }
+        else if ( test_bit(_VPF_blocked, &v->pause_flags) )
+        {
+            /* The vCPU is blocked */
+            ASSERT(v->arch.hvm_vmx.pi_block_cpu == -1);
+
+            /*
+             * The vCPU is blocked. Add it to the block list of
+             * v->arch.hvm_vmx.pi_block_cpu, which is the destination
+             * of the wake-up notification event.
+             */
+            v->arch.hvm_vmx.pi_block_cpu = v->processor;
+            spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
+                              v->arch.hvm_vmx.pi_block_cpu), flags);
+            list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
+                          &per_cpu(pi_blocked_vcpu,
+                                   v->arch.hvm_vmx.pi_block_cpu));
+            spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
+                                   v->arch.hvm_vmx.pi_block_cpu), flags);
+
+            do {
+                old.control = new.control = pi_desc->control;
+
+                /*
+                 * We should not block the vCPU if
+                 * an interrupt was posted for it.
+                 */
+                if ( old.on )
+                {
+                    /*
+                     * The vCPU will be removed from the block list
+                     * during its state transition from RUNSTATE_blocked
+                     * to RUNSTATE_runnable after the following tasklet
+                     * is executed.
+                     */
+                    tasklet_schedule(&v->arch.hvm_vmx.pi_vcpu_wakeup_tasklet);
+                    return;
+                }
+
+                /*
+                 * Change the 'NDST' field to v->arch.hvm_vmx.pi_block_cpu,
+                 * so when external interrupts from assigned devices happen,
+                 * the wake-up notification event will go to
+                 * v->arch.hvm_vmx.pi_block_cpu, and in pi_wakeup_interrupt()
+                 * we can find the vCPU in the right list to wake up.
+                 */
+                if ( x2apic_enabled )
+                    new.ndst = cpu_physical_id(v->arch.hvm_vmx.pi_block_cpu);
+                else
+                    new.ndst = MASK_INSR(cpu_physical_id(
+                                   v->arch.hvm_vmx.pi_block_cpu),
+                                   PI_xAPIC_NDST_MASK);
+                new.sn = 0;
+                new.nv = pi_wakeup_vector;
+            } while ( cmpxchg(&pi_desc->control, old.control, new.control)
+                      != old.control );
+        }
+    }
 }
 
 static void vmx_ctxt_switch_to(struct vcpu *v)
@@ -764,6 +839,22 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
 
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
+
+    if ( iommu_intpost )
+    {
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+        ASSERT(pi_desc->sn == 1);
+
+        if ( x2apic_enabled )
+            write_atomic(&pi_desc->ndst, cpu_physical_id(v->processor));
+        else
+            write_atomic(&pi_desc->ndst,
+                         MASK_INSR(cpu_physical_id(v->processor),
+                         PI_xAPIC_NDST_MASK));
+
+        pi_clear_sn(pi_desc);
+    }
 }
 
@@ -1900,6 +1991,42 @@ static void vmx_pi_desc_update(struct vcpu *v, int old_state)
     }
 }
 
+void arch_vcpu_wake(struct vcpu *v)
+{
+    if ( !iommu_intpost || (v->runstate.state != RUNSTATE_blocked) )
+        return;
+
+    if ( likely(vcpu_runnable(v)) ||
+         !test_bit(_VPF_blocked, &v->pause_flags) )
+    {
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+        unsigned long flags;
+
+        /*
+         * blocked -> runnable/offline
+         * If the state is transitioning from RUNSTATE_blocked,
+         * we should set the 'NV' field back to posted_intr_vector,
+         * so Posted-Interrupts can be delivered to the vCPU
+         * by VT-d HW after it is scheduled to run.
+         */
+        write_atomic((uint8_t*)&pi_desc->nv, posted_intr_vector);
+
+        /*
+         * Delete the vCPU from the related block list
+         * if we are resuming from the blocked state.
+         */
+        if ( v->arch.hvm_vmx.pi_block_cpu != -1 )
+        {
+            spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
+                              v->arch.hvm_vmx.pi_block_cpu), flags);
+            list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
+            spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
+                                   v->arch.hvm_vmx.pi_block_cpu), flags);
+            v->arch.hvm_vmx.pi_block_cpu = -1;
+        }
+    }
+}
+
 void vmx_hypervisor_cpuid_leaf(uint32_t sub_idx,
                                uint32_t *eax, uint32_t *ebx,
                                uint32_t *ecx, uint32_t *edx)
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 20727d6..7b08797 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -397,6 +397,8 @@ void vcpu_wake(struct vcpu *v)
         vcpu_runstate_change(v, RUNSTATE_offline, NOW());
     }
 
+    arch_vcpu_wake(v);
+
     vcpu_schedule_unlock_irqrestore(lock, flags, v);
 
     TRACE_2D(TRC_SCHED_WAKE, v->domain->domain_id, v->vcpu_id);
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 9603cf0..be5aebf 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -266,6 +266,7 @@ static inline unsigned int domain_max_vcpus(const struct domain *d)
 }
 
 static void arch_pi_desc_update(struct vcpu *v, int old_state) {}
+static void arch_vcpu_wake(struct vcpu *v) {}
 
 #endif /* __ASM_DOMAIN_H__ */
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index e175417..38c796c 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -511,6 +511,7 @@ bool_t nhvm_vmcx_hap_enabled(struct vcpu *v);
 enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v);
 
 void arch_pi_desc_update(struct vcpu *v, int old_state);
+void arch_vcpu_wake(struct vcpu *v);
 
 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */
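The wake-up side that the comments above refer to, pi_wakeup_interrupt(),
is not part of this draft. A rough sketch of how it could look is below;
pi_test_on() and the exact list handling are assumptions here, so treat
this as an illustration rather than the real handler:

/*
 * Sketch only: the handler for pi_wakeup_vector runs on the pCPU that
 * was written into 'NDST' (i.e. pi_block_cpu), walks that pCPU's
 * blocked-vCPU list, and wakes every vCPU whose 'ON' bit is set.
 */
static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
{
    struct arch_vmx_struct *vmx, *tmp;
    unsigned int cpu = smp_processor_id();

    spin_lock(&per_cpu(pi_blocked_vcpu_lock, cpu));

    list_for_each_entry_safe(vmx, tmp, &per_cpu(pi_blocked_vcpu, cpu),
                             pi_blocked_vcpu_list)
    {
        if ( pi_test_on(&vmx->pi_desc) )
        {
            /* An interrupt was posted: take the vCPU off the list and kick it. */
            list_del(&vmx->pi_blocked_vcpu_list);
            vmx->pi_block_cpu = -1;
            vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
        }
    }

    spin_unlock(&per_cpu(pi_blocked_vcpu_lock, cpu));

    ack_APIC_irq();
}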
>
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel