|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
>>> On 08.05.15 at 11:07, <feng.wu@xxxxxxxxx> wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1711,6 +1711,131 @@ static void vmx_handle_eoi(u8 vector)
> __vmwrite(GUEST_INTR_STATUS, status);
> }
>
> +static void vmx_pi_desc_update(struct vcpu *v, int old_state)
> +{
> + struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> + struct pi_desc old, new;
> + unsigned long flags;
> +
> + if ( !iommu_intpost )
> + return;
Considering how the adjustment to start_vmx(), this at best should
be an ASSERT() (if a check is needed at all).
> + switch ( v->runstate.state )
> + {
> + case RUNSTATE_runnable:
> + case RUNSTATE_offline:
> + /*
> + * We don't need to send notification event to a non-running
> + * vcpu, the interrupt information will be delivered to it before
> + * VM-ENTRY when the vcpu is scheduled to run next time.
> + */
> + pi_desc->sn = 1;
> +
> + /*
> + * If the state is transferred from RUNSTATE_blocked,
> + * we should set 'NV' feild back to posted_intr_vector,
> + * so the Posted-Interrupts can be delivered to the vCPU
> + * by VT-d HW after it is scheduled to run.
> + */
> + if ( old_state == RUNSTATE_blocked )
> + {
> + do
> + {
> + old.control = new.control = pi_desc->control;
> + new.nv = posted_intr_vector;
> + }
> + while ( cmpxchg(&pi_desc->control, old.control, new.control)
> + != old.control );
So why is it okay to do the SN update non-atomically when the
vector update is done atomically?
Also the curly braces do _not_ go on separate lines for do/while
constructs.
And then do you really need to atomically update the whole 64 bits
here, rather than just the nv field? By not making it a bit field
you could perhaps just write_atomic() it?
> +
> + /*
> + * Delete the vCPU from the related block list
> + * if we are resuming from blocked state
> + */
> + spin_lock_irqsave(&per_cpu(blocked_vcpu_lock,
> + v->pre_pcpu), flags);
> + list_del(&v->blocked_vcpu_list);
> + spin_unlock_irqrestore(&per_cpu(blocked_vcpu_lock,
> + v->pre_pcpu), flags);
> + }
> + break;
> +
> + case RUNSTATE_blocked:
> + /*
> + * The vCPU is blocked on the block list.
> + * Add the blocked vCPU on the list of the
> + * vcpu->pre_pcpu, which is the destination
> + * of the wake-up notification event.
> + */
> + v->pre_pcpu = v->processor;
Is latching this upon runstate change really enough? I.e. what about
the v->processor changes that sched_move_domain() or individual
schedulers do? Or if it really just matters on which CPU's blocked list
the vCPU is (while its ->processor changing subsequently doesn't
matter) I'd like to see the field named more after its purpose (e.g.
pi_block_cpu; list and lock should btw also have a connection to PI
in their names).
In the end, if the placement on a list followed v->processor, you
would likely get away without the extra new field. Are there
synchronization constraints speaking against such a model?
> + spin_lock_irqsave(&per_cpu(blocked_vcpu_lock,
> + v->pre_pcpu), flags);
> + list_add_tail(&v->blocked_vcpu_list,
> + &per_cpu(blocked_vcpu, v->pre_pcpu));
> + spin_unlock_irqrestore(&per_cpu(blocked_vcpu_lock,
> + v->pre_pcpu), flags);
> +
> + do
> + {
> + old.control = new.control = pi_desc->control;
> +
> + /*
> + * We should not block the vCPU if
> + * an interrupt was posted for it.
> + */
> +
> + if ( old.on == 1 )
I think I said so elsewhere, but just in case I didn't: Please don't
compare boolean fields with 1 - e.g. in the case here "if ( old.on )"
suffices.
> + {
> + /*
> + * The vCPU will be removed from the block list
> + * during its state transferring from RUNSTATE_blocked
> + * to RUNSTATE_runnable after the following tasklet
> + * is scheduled to run.
> + */
Precise comments please - I don't think the scheduling of the
tasklet makes any difference; I suppose it's the execution of it
that does.
> + tasklet_schedule(&v->vcpu_wakeup_tasklet);
> + return;
> + }
> +
> + /*
> + * Change the 'NDST' field to v->pre_pcpu, so when
> + * external interrupts from assigned deivces happen,
> + * wakeup notifiction event will go to v->pre_pcpu,
> + * then in pi_wakeup_interrupt() we can find the
> + * vCPU in the right list to wake up.
> + */
> + if ( x2apic_enabled )
> + new.ndst = cpu_physical_id(v->pre_pcpu);
> + else
> + new.ndst = MASK_INSR(cpu_physical_id(v->pre_pcpu),
> + PI_xAPIC_NDST_MASK);
> + new.sn = 0;
> + new.nv = pi_wakeup_vector;
> + }
> + while ( cmpxchg(&pi_desc->control, old.control, new.control)
> + != old.control );
> + break;
> +
> + case RUNSTATE_running:
> + ASSERT( pi_desc->sn == 1 );
> +
> + do
> + {
> + old.control = new.control = pi_desc->control;
> + if ( x2apic_enabled )
> + new.ndst = cpu_physical_id(v->processor);
> + else
> + new.ndst = (cpu_physical_id(v->processor) << 8) & 0xFF00;
Why are you not using MASK_INSR() here like you do above?
> +
> + new.sn = 0;
> + }
> + while ( cmpxchg(&pi_desc->control, old.control, new.control)
> + != old.control );
Same here - can't this be a write_atomic() of ndst and a clear_bit()
of sn instead of a loop over cmpxchg()?
> + break;
> +
> + default:
> + break;
This seems dangerous: I think you want at least an
ASSERT_UNREACHABLE() here.
> @@ -1842,7 +1967,12 @@ const struct hvm_function_table * __init
> start_vmx(void)
> alloc_direct_apic_vector(&posted_intr_vector,
> pi_notification_interrupt);
>
> if ( iommu_intpost )
> + {
> alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
> + vmx_function_table.pi_desc_update = vmx_pi_desc_update;
> + }
> + else
> + vmx_function_table.pi_desc_update = NULL;
Isn't that field NULL anyway?
> @@ -157,7 +158,11 @@ static inline void vcpu_runstate_change(
> v->runstate.state_entry_time = new_entry_time;
> }
>
> + old_state = v->runstate.state;
> v->runstate.state = new_state;
> +
> + if ( is_hvm_vcpu(v) && hvm_funcs.pi_desc_update )
> + hvm_funcs.pi_desc_update(v, old_state);
I don't see how this would build on ARM.
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -142,6 +142,8 @@ struct vcpu
>
> int processor;
>
> + int pre_pcpu;
This again doesn't belong into the common structure (and should
be accompanied by a comment, and should be unsigned int).
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |