[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 16/18] vmx: Add some scheduler hooks for VT-d posted interrupts




> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Wednesday, September 09, 2015 6:27 PM
> To: Wu, Feng
> Cc: Andrew Cooper; Dario Faggioli; George Dunlap; Tian, Kevin;
> xen-devel@xxxxxxxxxxxxx; Keir Fraser
> Subject: RE: [PATCH v6 16/18] vmx: Add some scheduler hooks for VT-d posted
> interrupts
> 
> >>> On 09.09.15 at 10:56, <feng.wu@xxxxxxxxx> wrote:
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: Monday, September 07, 2015 8:55 PM
> >> >>> On 25.08.15 at 03:57, <feng.wu@xxxxxxxxx> wrote:
> >> > +
> >> > +        /*
> >> > +         * Delete the vCPU from the related block list
> >> > +         * if we are resuming from blocked state.
> >> > +         */
> >> > +        if ( v->arch.hvm_vmx.pi_block_cpu != -1 )
> >> > +        {
> >> > +            spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
> >> > +                              v->arch.hvm_vmx.pi_block_cpu),
> >> flags);
> >> > +            list_del_init(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
> >>
> >> Shouldn't you set .pi_block_cpu back to -1 along with removing
> >> the vCPU from the list? Which then ought to allow using just
> >> list_del() here?
> >
> > Here is the story about using list_del_init() instead of list_del(), (the
> > same reason for using list_del_init() in pi_wakeup_interrupt() ).
> > If we use list_del(), that means we need to set .pi_block_cpu to
> > -1 after removing the vCPU from the block list, there are two
> > places where the vCPU is removed from the list:
> > - arch_vcpu_wake_prepare()
> > - pi_wakeup_interrupt()
> >
> > That means we also need to set .pi_block_cpu to -1 in
> > pi_wakeup_interrupt(), which seems problematic to me, since
> > .pi_block_cpu is used to get the per-CPU spin lock, it is not safe
> > to change it in pi_wakeup_interrupt(), maybe someone is using
> > it at the same time.
> >
> > And the check "if ( v->arch.hvm_vmx.pi_block_cpu != -1 )" here
> > is only used when .pi_block_cpu is set to -1 at the initial time.
> >
> > If you have any good suggestions about this, I will be all ears.
> 
> Latching pi_block_cpu into a local variable would take care of that
> part of the problem. Of course you then may also need to check
> that while waiting to acquire the lock pi_block_cpu didn't change.

Thanks for the suggestion! But I think it is not easy to "check
"that while waiting to acquire the lock pi_block_cpu didn't change".
Let's take the following pseudo code as an example:

int local_pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;

......

spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
               local_pi_block_cpu), flags);
list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
                   local_pi_block_cpu), flags);

If .pi_block_cpu is changed to -1 by others during calling list_del()
above, that means the vCPU is removed by others, then calling
list_del() here again would be problematic. I think there might be
other corner cases for this. So I suggest adding some comments
for list_del_init() as you mentioned below.

> And if that absolutely can't be made work correctly, these
> apparently strange uses of list_del_init() would each need a
> justifying comment.
> 
> >> > +        do {
> >> > +            old.control = new.control = pi_desc->control;
> >> > +
> >> > +            /* Should not block the vCPU if an interrupt was posted for
> it.
> >> */
> >> > +            if ( pi_test_on(&old) )
> >> > +            {
> >> > +                spin_unlock_irqrestore(&v->arch.hvm_vmx.pi_lock,
> >> gflags);
> >> > +                vcpu_unblock(v);
> >>
> >> Calling vcpu_unblock() in the middle of context_switch()? Why? And
> >> is this safe?
> >
> > I cannot see anything unsafe so far, can some scheduler maintainer
> > help to confirm it? Dario? George?
> 
> But first and foremost you ought to answer the "why".

I think if you think this solution is unsafe, you need point out where it is, 
not
just ask "is this safe ?", I don't think this is an effective comments.

Anyway, I go through the main path of the code as below, I really don't find
anything unsafe here.

context_switch() -->
    pi_ctxt_switch_from() -->
        vmx_pre_ctx_switch_pi() -->
            vcpu_unblock() -->
                vcpu_wake() -->
                    SCHED_OP(VCPU2OP(v), wake, v) -->
                        csched_vcpu_wake() -->
                            __runq_insert()
                            __runq_tickle()

If you continue to think it is unsafe, pointing out the place will be welcome!

> 
> >> And if really needed to cover something not dealt with
> >> elsewhere, wouldn't this need to happen _after_ having switched
> >> the notification mechanism below?
> >
> > If ON is set, we don't need to block the vCPU hence no need to change the
> > notification vector to the wakeup one, which is used when vCPU is blocked.
> >
> >>
> >> > +                return;
> 
> Oh, right, I somehow managed to ignore the "return" here.
> 
> >> > +static void vmx_post_ctx_switch_pi(struct vcpu *v)
> >> > +{
> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> >> > +
> >> > +    if ( !has_arch_pdevs(v->domain) )
> >> > +        return;
> >> > +
> >> > +    if ( x2apic_enabled )
> >> > +        write_atomic(&pi_desc->ndst, cpu_physical_id(v->processor));
> >> > +    else
> >> > +        write_atomic(&pi_desc->ndst,
> >> > +                     MASK_INSR(cpu_physical_id(v->processor),
> >> > +                     PI_xAPIC_NDST_MASK));
> >> > +
> >> > +    pi_clear_sn(pi_desc);
> >> > +}
> >>
> >> So you alter where notifications go, but not via which vector? How
> >> is the vCPU going to get removed from the blocked list then?
> >
> > vmx_post_ctx_switch_pi() is called when the vCPU is about to run,
> > in that case, the vector has been set properly and it has been removed
> > from the block list if it was blocked before.
> 
> So why do you two updates then (first [elsewhere] to set the vector
> and then here to set the destination)?

When the vCPU is unblocked and then removed from the blocking list, we
need to change the vector to the normal notification vector, since the
wakeup vector is only used when the vCPU is blocked and hence in the
blocked list. And here is the place we can decide which pCPU the vCPU
will be scheduled on, so we change the destination field here.

Thanks,
Feng

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.