[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Strange PVM spinlock case revisited (nailed down)



>>> On 14.02.13 at 12:04, Stefan Bader <stefan.bader@xxxxxxxxxxxxx> wrote:
> There was a bit more on the stack but try_to_wake_up seemed a believable 
> current
> path. Even more so after looking into the function:
> 
> #ifdef CONFIG_SMP
>         /*
>          * If the owning (remote) cpu is still in the middle of schedule() 
> with
>          * this task as prev, wait until its done referencing the task.
>          */
>         while (p->on_cpu) {
> #ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
>                 /*
>                  * In case the architecture enables interrupts in
>                  * context_switch(), we cannot busy wait, since that
>                  * would lead to deadlocks when an interrupt hits and
>                  * tries to wake up @prev. So bail and do a complete
>                  * remote wakeup.
>                  */
>                 if (ttwu_activate_remote(p, wake_flags))
>                         goto stat;
> #else
>                 cpu_relax();
> #endif
> 
> And prying out the task in question from the stack, it was one currently
> being accounted for VCPU 6 and on_cpu. VCPU 6 is one of the other waiters 
> for
> the runq lock of VCPU 1. Which would get picked up as soon as this callback 
> is
> done. Unfortunately we "stay awhile... stay forever!".

When analyzing a similar problem with our old PV ticket lock
implementation (and the interrupt re-enabling there when possible
before going into polling mode), it was precisely this open coded
lock construct that caused problems for us. Back then I didn't,
however, realize that this would even affect the simplistic byte
locks used by the pv-ops Xen code (or else I would have pointed
this out).

Being relatively certain that with our new implementation we don't
have any such problems, I can only recommend against dropping
the re-enabling of interrupts - this needlessly introduces high
interrupt latencies in a broader range of cases. Instead, the
interactions ought to be fixed properly. And it's time for using
ticket locks in the Xen code too...

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.