
Re: [Xen-devel] [xen-unstable test] 145796: tolerable FAIL - PUSHED



On Wed, 8 Jan 2020 at 21:40, osstest service owner
<osstest-admin@xxxxxxxxxxxxxx> wrote:
>
> flight 145796 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/145796/
>
> Failures :-/ but no regressions.
>
> Tests which are failing intermittently (not blocking):
>  test-amd64-amd64-xl-rtds    15 guest-saverestore fail in 145773 pass in 145796
>  test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 16 guest-start/debianhvm.repeat fail in 145773 pass in 145796
>  test-armhf-armhf-xl-rtds     12 guest-start      fail in 145773 pass in 145796

It looks like test-armhf-armhf-xl-rtds has been failing for a while (although not reliably).
I looked at a few flights, and the cause seems to be the same:

Jan  8 15:02:14.700784 (XEN) Assertion '!unit_on_replq(svc)' failed at sched_rt.c:586
Jan  8 15:02:26.715030 (XEN) ----[ Xen-4.14-unstable  arm32  debug=y  Not tainted ]----
Jan  8 15:02:26.720756 (XEN) CPU:    1
Jan  8 15:02:26.722158 (XEN) PC:     0023a750 common/sched_rt.c#replq_insert+0x7c/0xcc
Jan  8 15:02:26.727851 (XEN) CPSR:   200300da MODE:Hypervisor
Jan  8 15:02:26.731334 (XEN)      R0: 002a51a4 R1: 400614a0 R2: 3d64b900 R3: 40061338
Jan  8 15:02:26.736830 (XEN)      R4: 400614a0 R5: 002a51a4 R6: 3cf1cbf0 R7: 000001cb
Jan  8 15:02:26.742600 (XEN)      R8: 4003d1b0 R9: 400614a8 R10:4003d1b0 R11:400ffe54 R12:400ffde4
Jan  8 15:02:26.749119 (XEN) HYP: SP: 400ffe2c LR: 0023b6e8
Jan  8 15:02:26.752296 (XEN)
Jan  8 15:02:26.753036 (XEN)   VTCR_EL2: 80003558
Jan  8 15:02:26.755479 (XEN)  VTTBR_EL2: 00020000bbff4000
Jan  8 15:02:26.758757 (XEN)
Jan  8 15:02:26.759366 (XEN)  SCTLR_EL2: 30cd187f
Jan  8 15:02:26.761755 (XEN)    HCR_EL2: 0078663f
Jan  8 15:02:26.764250 (XEN)  TTBR0_EL2: 00000000bc029000
Jan  8 15:02:26.767364 (XEN)
Jan  8 15:02:26.767980 (XEN)    ESR_EL2: 00000000
Jan  8 15:02:26.770485 (XEN)  HPFAR_EL2: 00030010
Jan  8 15:02:26.772795 (XEN)      HDFAR: e0800f00
Jan  8 15:02:26.775272 (XEN)      HIFAR: c0605744
Jan  8 15:02:26.777748 (XEN)
Jan  8 15:02:26.778505 (XEN) Xen stack trace from sp=400ffe2c:
Jan  8 15:02:26.781910 (XEN)    00000000 3cf1cbf0 400614a0 002a51a4 3cf1cbf0 000001cb 4003d1b0 6003005a
Jan  8 15:02:26.788991 (XEN)    400613f8 400ffe7c 0023b6e8 002f9300 4004c000 400613f8 3cf1cbf0 000001cb
Jan  8 15:02:26.796093 (XEN)    4003d1b0 6003005a 400613f8 400ffeac 00242988 4004c000 002425ac 40058000
Jan  8 15:02:26.803237 (XEN)    4004c000 4004f000 10f45000 10f45008 4004b080 40058000 60030013 400ffebc
Jan  8 15:02:26.810360 (XEN)    00209984 00000002 4004f000 400ffedc 0020eddc 0020caf8 db097cd4 00000020
Jan  8 15:02:26.817504 (XEN)    c13afbec 00000000 db15fd68 400ffee4 0020c9dc 400fff34 0020d5e8 4004e000
Jan  8 15:02:26.824615 (XEN)    00000000 400fff44 400fff44 00000002 00000000 4004e8fa 4004e8f4 400fff1c
Jan  8 15:02:26.831737 (XEN)    400fff1c 6003005a 0020caf8 400fff58 00000020 c13afbec 00000000 db15fd68
Jan  8 15:02:26.838798 (XEN)    60030013 400fff54 0026c150 c1204d08 c13afbec 00000000 00000000 00000000
Jan  8 15:02:26.845877 (XEN)    00000002 400fff58 002753b0 00000009 db097cd4 db173008 00000002 c1204d08
Jan  8 15:02:26.852986 (XEN)    00000000 00000002 c13afbec 00000000 db15fd68 60030013 db15fd3c 00000020
Jan  8 15:02:26.860044 (XEN)    ffffffff b6cdccb3 c0107ed0 a0030093 4a000ea1 be951568 c136edc0 c010d3a0
Jan  8 15:02:26.867171 (XEN)    db097cd0 c056c7f8 c136edcc c010d720 c136edd8 c010d7e0 00000000 00000000
Jan  8 15:02:26.874526 (XEN)    00000000 00000000 00000000 c136ede4 c136ede4 00030030 60070193 80030093
Jan  8 15:02:26.881450 (XEN)    60030193 00000000 00000000 00000000 00000001
Jan  8 15:02:26.886519 (XEN) Xen call trace:
Jan  8 15:02:26.888168 (XEN)    [<0023a750>] common/sched_rt.c#replq_insert+0x7c/0xcc (PC)
Jan  8 15:02:26.894240 (XEN)    [<0023b6e8>] common/sched_rt.c#rt_unit_wake+0xf4/0x274 (LR)
Jan  8 15:02:26.900246 (XEN)    [<0023b6e8>] common/sched_rt.c#rt_unit_wake+0xf4/0x274
Jan  8 15:02:26.905775 (XEN)    [<00242988>] vcpu_wake+0x1e4/0x688
Jan  8 15:02:26.909743 (XEN)    [<00209984>] domain_unpause+0x64/0x84
Jan  8 15:02:26.913956 (XEN)    [<0020eddc>] common/event_fifo.c#evtchn_fifo_unmask+0xd8/0xf0
Jan  8 15:02:26.920167 (XEN)    [<0020c9dc>] evtchn_unmask+0x7c/0xc0
Jan  8 15:02:26.924173 (XEN)    [<0020d5e8>] do_event_channel_op+0xaf0/0xdac
Jan  8 15:02:26.928922 (XEN)    [<0026c150>] do_trap_guest_sync+0x350/0x4d0
Jan  8 15:02:26.933647 (XEN)    [<002753b0>] entry.o#return_from_trap+0/0x4
Jan  8 15:02:26.938299 (XEN)
Jan  8 15:02:26.939039 (XEN)
Jan  8 15:02:26.939668 (XEN) ****************************************
Jan  8 15:02:26.943794 (XEN) Panic on CPU 1:
Jan  8 15:02:26.945872 (XEN) Assertion '!unit_on_replq(svc)' failed at sched_rt.c:586
Jan  8 15:02:26.951492 (XEN) ****************************************
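
For context, the assertion that fires is at the top of replq_insert() in
xen/common/sched_rt.c. A paraphrased sketch (the exact body in 4.14-unstable
may differ; I have simplified everything apart from the ASSERT itself):

static void replq_insert(const struct scheduler *ops, struct rt_unit *svc)
{
    struct list_head *replq = rt_replq(ops);

    /* A unit must not already be on the replenishment queue when it
     * is (re-)inserted; this is the check firing at sched_rt.c:586. */
    ASSERT( !unit_on_replq(svc) );

    /* Insert the unit, keeping the queue sorted by replenishment time
     * (simplified; the real code goes through a deadline queue helper). */
    deadline_replq_insert(svc, &svc->replq_elem, replq);
}

So by the time rt_unit_wake() calls this, the unit is apparently still
sitting on the replenishment queue.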

I believe the domain_unpause() is coming from guest_clear_bit(). This
would mean the guest atomic did not succeed and fell back to pausing
the domain. That is consistent with the boot log:

 CPU1: Guest atomics will try 1 times before pausing the domain
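
For reference, this is how I understand the Arm guest atomics fallback
(paraphrased from xen/include/asm-arm/guest_atomics.h; the retry helper
name below is made up for the sketch, the pause/unpause calls are real):

/* Simplified sketch of guest_clear_bit() on Arm. */
static inline void guest_clear_bit_sketch(struct domain *d, int nr,
                                          volatile void *p)
{
    /* Try the atomic a bounded number of times -- per the boot log
     * above, just once on this box. */
    if ( clear_bit_with_bounded_retries(nr, p) )   /* simplified name */
        return;

    /* Fallback: pause the domain (without waiting for its vCPUs to be
     * descheduled), perform the operation, then unpause it again. */
    domain_pause_nosync(d);
    clear_bit(nr, p);
    domain_unpause(d);   /* the domain_unpause() seen in the trace */
}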

I am under the impression that the crash could be reproduced with just
(note domain_pause_nosync() takes a struct domain *, hence current->domain):

domain_pause_nosync(current->domain);
domain_unpause(current->domain);
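
My reasoning: both pause variants go through the same helper and differ
only in whether they wait for each vCPU to actually stop running.
Roughly (paraphrasing xen/common/domain.c; names may differ slightly in
the tree under test):

static void do_domain_pause(struct domain *d,
                            void (*sleep_fn)(struct vcpu *v))
{
    struct vcpu *v;

    atomic_inc(&d->pause_count);

    for_each_vcpu ( d, v )
        sleep_fn(v);   /* vcpu_sleep_sync() vs vcpu_sleep_nosync() */
}

void domain_pause(struct domain *d)
{
    ASSERT(d != current->domain);   /* cannot sync-pause yourself... */
    do_domain_pause(d, vcpu_sleep_sync);
}

void domain_pause_nosync(struct domain *d)
{
    do_domain_pause(d, vcpu_sleep_nosync);
}

Presumably guest_clear_bit() has to use the nosync variant precisely
because it pauses the current domain. But with nosync, the vCPU may not
have been descheduled yet when domain_unpause() calls vcpu_wake(), so
RTDS could see a unit that never left the replenishment queue, which
would explain the ASSERT. I may well be missing something, though.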

Any insight into what's wrong? I am happy to try to reproduce it tomorrow morning.

Cheers,
