Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Wednesday, June 10, 2015 2:44 PM
> To: Wu, Feng
> Cc: andrew.cooper3@xxxxxxxxxx; george.dunlap@xxxxxxxxxxxxx; Tian, Kevin;
> Zhang, Yang Z; xen-devel@xxxxxxxxxxxxx; keir@xxxxxxx
> Subject: Re: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU
> scheduling
>
> >>> On 08.05.15 at 11:07, <feng.wu@xxxxxxxxx> wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -1711,6 +1711,131 @@ static void vmx_handle_eoi(u8 vector)
> >      __vmwrite(GUEST_INTR_STATUS, status);
> >  }
> >
> > +static void vmx_pi_desc_update(struct vcpu *v, int old_state)
> > +{
> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> > +    struct pi_desc old, new;
> > +    unsigned long flags;
> > +
> > +    if ( !iommu_intpost )
> > +        return;
>
> Considering how the adjustment to start_vmx(), this at best should
> be an ASSERT() (if a check is needed at all).
>
> > +    switch ( v->runstate.state )
> > +    {
> > +    case RUNSTATE_runnable:
> > +    case RUNSTATE_offline:
> > +        /*
> > +         * We don't need to send notification event to a non-running
> > +         * vcpu, the interrupt information will be delivered to it before
> > +         * VM-ENTRY when the vcpu is scheduled to run next time.
> > +         */
> > +        pi_desc->sn = 1;
> > +
> > +        /*
> > +         * If the state is transferred from RUNSTATE_blocked,
> > +         * we should set 'NV' field back to posted_intr_vector,
> > +         * so the Posted-Interrupts can be delivered to the vCPU
> > +         * by VT-d HW after it is scheduled to run.
> > +         */
> > +        if ( old_state == RUNSTATE_blocked )
> > +        {
> > +            do
> > +            {
> > +                old.control = new.control = pi_desc->control;
> > +                new.nv = posted_intr_vector;
> > +            }
> > +            while ( cmpxchg(&pi_desc->control, old.control, new.control)
> > +                    != old.control );
>
> So why is it okay to do the SN update non-atomically when the
> vector update is done atomically?

After thinking about this a bit more:
Updating the 'SN' field should be atomic as well, since the hardware performs
a coherent atomic read-modify-write operation on the posted-interrupt
descriptor when posting interrupts.

>
> Also the curly braces do _not_ go on separate lines for do/while
> constructs.
>
> And then do you really need to atomically update the whole 64 bits
> here, rather than just the nv field? By not making it a bit field
> you could perhaps just write_atomic() it?

I think atomically updating only the nv field is fine.

>
> > +
> > +            /*
> > +             * Delete the vCPU from the related block list
> > +             * if we are resuming from blocked state
> > +             */
> > +            spin_lock_irqsave(&per_cpu(blocked_vcpu_lock,
> > +                              v->pre_pcpu), flags);
> > +            list_del(&v->blocked_vcpu_list);
> > +            spin_unlock_irqrestore(&per_cpu(blocked_vcpu_lock,
> > +                                   v->pre_pcpu), flags);
> > +        }
> > +        break;
> > +
> > +    case RUNSTATE_blocked:
> > +        /*
> > +         * The vCPU is blocked on the block list.
> > +         * Add the blocked vCPU on the list of the
> > +         * vcpu->pre_pcpu, which is the destination
> > +         * of the wake-up notification event.
> > +         */
> > +        v->pre_pcpu = v->processor;
>
> Is latching this upon runstate change really enough? I.e. what about
> the v->processor changes that sched_move_domain() or individual
> schedulers do? Or if it really just matters on which CPU's blocked list
> the vCPU is (while its ->processor changing subsequently doesn't
> matter) I'd like to see the field named more after its purpose (e.g.
> pi_block_cpu; list and lock should btw also have a connection to PI
> in their names).

Yes, it doesn't matter if the vCPU's ->processor changes afterwards. The key
point is that we put the vCPU on a pCPU's list and set the NDST field to that
same pCPU, so the wakeup notification event will arrive there. You are right,
I need to rename the field, the list and the lock to reflect their real
purpose.

>
> In the end, if the placement on a list followed v->processor, you
> would likely get away without the extra new field.
> Are there synchronization constraints speaking against such a model?

I don't quite understand this. Do you mean using 'v->processor' to select the
pCPU list for the blocked vCPUs? Then what about changes to 'v->processor'?
It seems we could not handle that case.

>
> > +        spin_lock_irqsave(&per_cpu(blocked_vcpu_lock,
> > +                          v->pre_pcpu), flags);
> > +        list_add_tail(&v->blocked_vcpu_list,
> > +                      &per_cpu(blocked_vcpu, v->pre_pcpu));
> > +        spin_unlock_irqrestore(&per_cpu(blocked_vcpu_lock,
> > +                               v->pre_pcpu), flags);
> > +
> > +        do
> > +        {
> > +            old.control = new.control = pi_desc->control;
> > +
> > +            /*
> > +             * We should not block the vCPU if
> > +             * an interrupt was posted for it.
> > +             */
> > +
> > +            if ( old.on == 1 )
>
> I think I said so elsewhere, but just in case I didn't: Please don't
> compare boolean fields with 1 - e.g. in the case here "if ( old.on )"
> suffices.
>
> > +            {
> > +                /*
> > +                 * The vCPU will be removed from the block list
> > +                 * during its state transferring from RUNSTATE_blocked
> > +                 * to RUNSTATE_runnable after the following tasklet
> > +                 * is scheduled to run.
> > +                 */
>
> Precise comments please - I don't think the scheduling of the
> tasklet makes any difference; I suppose it's the execution of it
> that does.
>
> > +                tasklet_schedule(&v->vcpu_wakeup_tasklet);
> > +                return;
> > +            }
> > +
> > +            /*
> > +             * Change the 'NDST' field to v->pre_pcpu, so when
> > +             * external interrupts from assigned devices happen,
> > +             * wakeup notification event will go to v->pre_pcpu,
> > +             * then in pi_wakeup_interrupt() we can find the
> > +             * vCPU in the right list to wake up.
> > +             */
> > +            if ( x2apic_enabled )
> > +                new.ndst = cpu_physical_id(v->pre_pcpu);
> > +            else
> > +                new.ndst = MASK_INSR(cpu_physical_id(v->pre_pcpu),
> > +                                     PI_xAPIC_NDST_MASK);
> > +            new.sn = 0;
> > +            new.nv = pi_wakeup_vector;
> > +        }
> > +        while ( cmpxchg(&pi_desc->control, old.control, new.control)
> > +                != old.control );
> > +        break;
> > +
> > +    case RUNSTATE_running:
> > +        ASSERT( pi_desc->sn == 1 );
> > +
> > +        do
> > +        {
> > +            old.control = new.control = pi_desc->control;
> > +            if ( x2apic_enabled )
> > +                new.ndst = cpu_physical_id(v->processor);
> > +            else
> > +                new.ndst = (cpu_physical_id(v->processor) << 8) & 0xFF00;
>
> Why are you not using MASK_INSR() here like you do above?
>
> > +
> > +            new.sn = 0;
> > +        }
> > +        while ( cmpxchg(&pi_desc->control, old.control, new.control)
> > +                != old.control );
>
> Same here - can't this be a write_atomic() of ndst and a clear_bit()
> of sn instead of a loop over cmpxchg()?

Your suggestion is good!

>
> > +        break;
> > +
> > +    default:
> > +        break;
>
> This seems dangerous: I think you want at least an
> ASSERT_UNREACHABLE() here.
>
> > @@ -1842,7 +1967,12 @@ const struct hvm_function_table * __init start_vmx(void)
> >          alloc_direct_apic_vector(&posted_intr_vector, pi_notification_interrupt);
> >
> >      if ( iommu_intpost )
> > +    {
> >          alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
> > +        vmx_function_table.pi_desc_update = vmx_pi_desc_update;
> > +    }
> > +    else
> > +        vmx_function_table.pi_desc_update = NULL;
>
> Isn't that field NULL anyway?
>
> > @@ -157,7 +158,11 @@ static inline void vcpu_runstate_change(
> >          v->runstate.state_entry_time = new_entry_time;
> >      }
> >
> > +    old_state = v->runstate.state;
> >      v->runstate.state = new_state;
> > +
> > +    if ( is_hvm_vcpu(v) && hvm_funcs.pi_desc_update )
> > +        hvm_funcs.pi_desc_update(v, old_state);
>
> I don't see how this would build on ARM.

So what about adding "#ifdef CONFIG_X86 ..." here?
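
To make the write_atomic()/clear_bit() discussion above a bit more concrete,
here is a minimal standalone C11 sketch (not Xen code). It models the 64-bit
control word as one atomic integer. The bit positions, the PI_* names,
FIELD_INSR() and the pi_*() helpers are all assumptions made up for this
example; FIELD_INSR() only imitates the way MASK_INSR() is used in the patch.
Single-bit changes such as SN use one atomic OR/AND, while a multi-bit field
is replaced with a compare-and-swap loop so that a concurrent change to
another bit (for instance 'ON' being set by hardware) is not lost. Portable
C11 has no narrow store into part of an atomic word, so the field update
still loops here; a write_atomic() on a separately declared member, as Jan
suggests, would avoid even that.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed bit layout of the control word; illustrative only. */
#define PI_ON        (UINT64_C(1) << 0)            /* outstanding notification */
#define PI_SN        (UINT64_C(1) << 1)            /* suppress notification */
#define PI_NV_MASK   (UINT64_C(0xff) << 16)        /* notification vector */
#define PI_NDST_MASK (UINT64_C(0xffffffff) << 32)  /* notification destination */

/* Hypothetical helper: move v to the lowest set bit of mask m, then clip. */
#define FIELD_INSR(v, m) ((((uint64_t)(v)) * ((m) & -(m))) & (m))

typedef _Atomic uint64_t pi_control_t;

/* Single-bit updates: one atomic read-modify-write, no retry loop. */
static inline void pi_set_sn(pi_control_t *c)
{
    atomic_fetch_or(c, PI_SN);
}

static inline void pi_clear_sn(pi_control_t *c)
{
    atomic_fetch_and(c, ~PI_SN);
}

/* Replace one field, keeping whatever the other bits hold at commit time. */
static inline void pi_update_field(pi_control_t *c, uint64_t mask, uint64_t val)
{
    uint64_t old = atomic_load(c);
    uint64_t upd;

    do {
        upd = (old & ~mask) | FIELD_INSR(val, mask);
        /* On failure, 'old' is refreshed with the current value. */
    } while ( !atomic_compare_exchange_weak(c, &old, upd) );
}
```

With helpers like these, the RUNSTATE_running path could be written as
pi_update_field(&ctrl, PI_NDST_MASK, dest) followed by pi_clear_sn(&ctrl),
where ctrl and dest are placeholder names for the control word and the
destination APIC ID.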

Thanks,
Feng

>
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -142,6 +142,8 @@ struct vcpu
> >
> >      int processor;
> >
> > +    int pre_pcpu;
>
> This again doesn't belong into the common structure (and should
> be accompanied by a comment, and should be unsigned int).
>
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel