
Re: [Xen-devel] [linux-linus test] 109469: regressions - FAIL



>>> On 17.05.17 at 16:59, <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 05/16/2017 06:43 PM, osstest service owner wrote:
>> flight 109469 linux-linus real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/109469/ 
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-amd64-i386-libvirt       6 xen-boot                 fail REGR. vs. 109449
> 
> http://logs.test-lab.xenproject.org/osstest/logs/109469/test-amd64-i386-libvirt/serial-rimava0.log
> 
> This looks like some sort of deadlock, with CPU2 waiting for a remote
> call to complete while CPU0 is waiting for flush_lock.

But these two don't block each other, as both run with interrupts
enabled (i.e. are available to process IPIs the other might have
sent).

> Only two CPUs are dumped though.

That's bad. CPU4 sitting at the final loop in flush_area_mask()
makes clear that it is the flush_lock holder, but we can infer it
has IRQs on just like CPUs 0 and 2. While the place CPU2 was
caught also doesn't allow us to deduce which other CPU(s)
is/are not responding, the main candidate would appear to be
CPU1, of which we know nothing except that it also sits in
_spin_lock(). Neither flush_lock nor call_lock would ever be
acquired with IRQs off, so a CPU spinning on either of them
could still service IPIs; I'd therefore conclude there must be
a third lock involved here.

Therefore I'm afraid the only way we could obtain a more
complete picture would be if this re-occurred and if at that
time we had "async-show-all" in place on the hypervisor
command line.
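
As a rough sketch of what that could look like (not taken from
the osstest setup): on a host whose GRUB2 packaging honours
GRUB_CMDLINE_XEN_DEFAULT, the option might be added as below.
The file path is an assumption, as is enabling "watchdog"
alongside it; the latter is included because "async-show-all"
acts on fatal asynchronous exceptions such as a watchdog NMI.

```
# /etc/default/grub  (assumed location; distro-specific)
# Hypervisor command line options, passed to Xen rather than to dom0.
# "watchdog" enables the NMI watchdog (assumed to be wanted here),
# "async-show-all" asks for the state of all CPUs to be logged when
# such an asynchronous fatal event fires.
GRUB_CMDLINE_XEN_DEFAULT="watchdog async-show-all"
```

The GRUB configuration would then need regenerating and the host
rebooting for the options to take effect.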

Jan



 

