Re: [Xen-devel] Strange PVM spinlock case revisited
On 11.02.2013 18:29, Ian Campbell wrote:
> An interesting hack^Wexperiment might be to make xen_poll_irq use a
> timeout and see if that unwedges things -- this would help confirm that
> the issue is on nested wakeup.
>
So I went ahead and replaced xen_poll_irq with xen_poll_irq_timeout, and that did
get rid of the hang. I think there is a big catch, though. There is only one
other user of poll_irq_timeout in the kernel code, and it passes
"jiffies + <timeout>*HZ". But when I look at the Xen side in do_poll, it looks
like the timeout is treated as an absolute "ns since boot" (of hv/dom0) value.
I am not sure how that can ever work: ns since boot in the guest is clearly
always behind the host (and jiffies are not ns either).
Effectively I have probably removed any wait time in the hypervisor and gone
back to mostly spinning, which matches the observation that the test run never
gets stuck waiting for a timeout. That may prove that the stacking is an issue,
but it is also likely a bit too aggressive in not waiting at all... :/
I will try to think of a better way. I am not sure this line of thinking is
realistic, but maybe the following could happen:
xen_spin_lock_slow(a)
  ...
  enables irq and upcalls are pending
  upcall processing wants lock b
  xen_spin_lock_slow(b)
  --- just before replacing lock_spinners ---
                                        xen_spin_unlock_slow(a)
                                          finds other vcpu, triggers IRQ
  lock b is top spinner
  going into poll_irq
  poll_irq returns
  lock a gets restored
  so maybe no spinners on b
  dropping out to xen_spin_lock
                                        unlock of b not finding any spinners
  lock b acquired
That way the irq for lock a may get lost...