
Re: [Xen-devel] [PATCH v6 16/18] vmx: Add some scheduler hooks for VT-d posted interrupts




> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Thursday, September 10, 2015 5:26 PM
> To: Wu, Feng
> Cc: Andrew Cooper; Dario Faggioli; George Dunlap; Tian, Kevin;
> xen-devel@xxxxxxxxxxxxx; Keir Fraser
> Subject: RE: [PATCH v6 16/18] vmx: Add some scheduler hooks for VT-d posted
> interrupts
> 
> >>> On 10.09.15 at 10:59, <feng.wu@xxxxxxxxx> wrote:
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: Thursday, September 10, 2015 4:28 PM
> >> >>> On 10.09.15 at 04:07, <feng.wu@xxxxxxxxx> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> >> Sent: Wednesday, September 09, 2015 6:27 PM
> >> >> >>> On 09.09.15 at 10:56, <feng.wu@xxxxxxxxx> wrote:
> >> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> >> >> Sent: Monday, September 07, 2015 8:55 PM
> >> >> >> >>> On 25.08.15 at 03:57, <feng.wu@xxxxxxxxx> wrote:
> >> >> >> > +
> >> >> >> > +        /*
> >> >> >> > +         * Delete the vCPU from the related block list
> >> >> >> > +         * if we are resuming from blocked state.
> >> >> >> > +         */
> >> >> >> > +        if ( v->arch.hvm_vmx.pi_block_cpu != -1 )
> >> >> >> > +        {
> >> >> >> > +            spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
> >> >> >> > +                              v->arch.hvm_vmx.pi_block_cpu), flags);
> >> >> >> > +            list_del_init(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
> >> >> >>
> >> >> >> Shouldn't you set .pi_block_cpu back to -1 along with removing
> >> >> >> the vCPU from the list? Which then ought to allow using just
> >> >> >> list_del() here?
> >> >> >
> >> >> > Here is the story about using list_del_init() instead of list_del()
> >> >> > (the same reason for using list_del_init() in pi_wakeup_interrupt()).
> >> >> > If we use list_del(), that means we need to set .pi_block_cpu to
> >> >> > -1 after removing the vCPU from the block list, there are two
> >> >> > places where the vCPU is removed from the list:
> >> >> > - arch_vcpu_wake_prepare()
> >> >> > - pi_wakeup_interrupt()
> >> >> >
> >> >> > That means we also need to set .pi_block_cpu to -1 in
> >> >> > pi_wakeup_interrupt(), which seems problematic to me, since
> >> >> > .pi_block_cpu is used to get the per-CPU spin lock, it is not safe
> >> >> > to change it in pi_wakeup_interrupt(), maybe someone is using
> >> >> > it at the same time.
> >> >> >
> >> >> > And the check "if ( v->arch.hvm_vmx.pi_block_cpu != -1 )" here
> >> >> > is only needed for the case where .pi_block_cpu still has its
> >> >> > initial value of -1.
> >> >> >
> >> >> > If you have any good suggestions about this, I will be all ears.
> >> >>
> >> >> Latching pi_block_cpu into a local variable would take care of that
> >> >> part of the problem. Of course you then may also need to check
> >> >> that while waiting to acquire the lock pi_block_cpu didn't change.
> >> >
> >> > Thanks for the suggestion! But I think it is not easy to "check
> >> > that while waiting to acquire the lock pi_block_cpu didn't change".
> >> > Let's take the following pseudo code as an example:
> >> >
> >> > int local_pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
> >> >
> >> > ......
> >> >
> >> > spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
> >> >                local_pi_block_cpu), flags);
> >> > list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
> >> > spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
> >> >                    local_pi_block_cpu), flags);
> >> >
> >> > If .pi_block_cpu is changed to -1 by someone else before the
> >> > list_del() above runs, that means the vCPU has already been removed
> >> > from the list, and calling list_del() here again would be
> >> > problematic. I think there might be other corner cases like this.
> >> > So I suggest adding some comments for list_del_init(), as you
> >> > mentioned below.
> >>
> >> That's why I said "check that while waiting to acquire the lock
> >> pi_block_cpu didn't change." The code you present does not do
> >> this.
> >
> > I didn't mean we need to implement it as the above code; I just listed
> > it here to show it is hard to do that.
> > First, how can we check, while waiting to acquire the lock, that
> > .pi_block_cpu didn't change?
> 
> Note the difference between "check while waiting" and "check that
> while waiting": The former is indeed hard to implement, while the
> latter is pretty straightforward (and we do so elsewhere).
> 
> > Secondly, even if we can check it, what should we do if we find that
> > .pi_block_cpu has changed after acquiring the lock, as I mentioned above?
> 
> Drop the lock and start over. I.e. (taking your pseudo code)
> 
> restart:
>     local_pi_block_cpu = ...;
>     bail-if-invalid (e.g. -1 in current model)
>     spin_lock_irqsave(&per_cpu(, local_pi_block_cpu), flags);
>     if(local_pi_block_cpu != actual_pi_block_cpu) {
>         spin_unlock_irqrestore(&per_cpu(,local_pi_block_cpu), flags);
>         goto restart;
>     }

Thanks a lot for showing me this pseudo code! My concern is that if
.pi_block_cpu is changed to -1 at this point, it doesn't work:
.pi_block_cpu being -1 here means the vCPU has been removed from
the blocking list by someone else, so we cannot delete it again via
list_del() here.
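
To make sure we are reading the suggestion the same way, here is the
pattern spelled out with the field names from this patch (just a sketch,
not compile-tested; the function name is made up for illustration):

/* Sketch only -- assumes the fields introduced by this series. */
static void pi_remove_vcpu_from_blocking_list(struct vcpu *v)
{
    int pi_block_cpu;
    unsigned long flags;

 restart:
    pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
    if ( pi_block_cpu == -1 )
        return;                     /* not on any blocking list */

    spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu), flags);

    /* Re-check now that we hold the per-CPU lock. */
    if ( pi_block_cpu != v->arch.hvm_vmx.pi_block_cpu )
    {
        spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu),
                               flags);
        goto restart;
    }

    /*
     * My concern is this spot: if pi_wakeup_interrupt() takes the vCPU
     * off the list without being able to update .pi_block_cpu, the
     * re-check above cannot notice it, and the list_del() below acts
     * on an entry that has already been removed.
     */
    list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
    v->arch.hvm_vmx.pi_block_cpu = -1;

    spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock, pi_block_cpu),
                           flags);
}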

BTW, I cannot see any performance overhead of list_del_init()
compared to list_del().

list_del():
static inline void list_del(struct list_head *entry)
{
    ASSERT(entry->next->prev == entry);
    ASSERT(entry->prev->next == entry);
    __list_del(entry->prev, entry->next);
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

list_del_init():
static inline void list_del_init(struct list_head *entry)
{
    __list_del(entry->prev, entry->next);
    INIT_LIST_HEAD(entry);
}
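
The difference that matters here is not speed anyway: list_del()
poisons the entry's pointers, so removing the same entry a second time
would be fatal, while list_del_init() leaves the entry as a valid empty
list, so a later removal is harmless. That is exactly the property the
two removal paths I mentioned earlier (arch_vcpu_wake_prepare() and
pi_wakeup_interrupt()) rely on.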

Thanks,
Feng

>     list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
>     spin_unlock_irqrestore(&per_cpu(,local_pi_block_cpu), flags);
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
