
Re: [Xen-devel] [xen-unstable test] 33690: regressions - FAIL



On 01/26/2015 09:56 AM, Andrew Cooper wrote:
On 26/01/15 14:51, Boris Ostrovsky wrote:
On 01/26/2015 09:49 AM, Andrew Cooper wrote:
On 26/01/15 11:38, Jan Beulich wrote:
On 26.01.15 at 12:04, <JBeulich@xxxxxxxx> wrote:
On 24.01.15 at 13:54, <Ian.Jackson@xxxxxxxxxxxxx> wrote:
   test-amd64-amd64-xl-qemut-win7-amd64  7 windows-install   fail REGR. vs. 33637
Jan 24 00:35:16.262627 (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Not tainted ]----
Jan 24 00:35:16.478599 (XEN) CPU:    1
Jan 24 00:35:16.478624 (XEN) RIP:    e008:[<0000000000000000>] 0000000000000000
Jan 24 00:35:16.486596 (XEN) RFLAGS: 0000000000010082   CONTEXT: hypervisor
...
Jan 24 00:35:16.678620 (XEN) Xen call trace:
Jan 24 00:35:16.678650 (XEN)    [<ffff82d0801d36d0>] vpmu_do_interrupt+0x2f/0x8a
Jan 24 00:35:16.686605 (XEN)    [<ffff82d08015e242>] pmu_apic_interrupt+0x33/0x35
Jan 24 00:35:16.698582 (XEN)    [<ffff82d080171bf0>] do_IRQ+0x9c/0x624
Jan 24 00:35:16.698615 (XEN)    [<ffff82d080234062>] common_interrupt+0x62/0x70
Jan 24 00:35:16.698653 (XEN)    [<ffff82d08012c6fe>] _spin_unlock_irq+0x30/0x31
Jan 24 00:35:16.706604 (XEN)    [<ffff82d08012bcf1>] __do_softirq+0x81/0x8c
Jan 24 00:35:16.706638 (XEN)    [<ffff82d08012bd49>] do_softirq+0x13/0x15
Jan 24 00:35:16.718591 (XEN)    [<ffff82d0801ec4da>] vmx_asm_do_vmentry+0x2a/0x50
I think I see what the problem here is: Commit 8097616fbd
("x86/VPMU: handle APIC_LVTPC accesses") gives the guest
control over LVTPC.mask regardless of whether the vPMU was
actually initialized for it. Supposedly in the case above the
guest is being run with core2_no_vpmu_ops, which in
particular has .do_interrupt == NULL. It's not immediately
clear whether vpmu_lvtpc_update() should do the check or its
(sole) caller. In any event I'm going to revert that commit as
the primary suspect for causing the regression.
I have just fallen over this as well.  I second a revert in the absence
of a clear way to fix the patch.
I can't reproduce this -- neither at this patch level nor at full series.

Yes, we can test for do_interrupt presence in vpmu_lvtpc_update() (or
in vpmu_interrupt() itself) but since we cannot arm the counters
(there is no do_wrmsr op) I am not sure I understand what can trigger
this interrupt.
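
To make the two candidate check sites concrete, here is a minimal sketch in C. It is not the Xen code: the struct layouts and the names sketch_do_interrupt(), sketch_lvtpc_update() and LVTPC_MASKED are hypothetical stand-ins, and it only assumes what the thread states, namely that an ops table such as core2_no_vpmu_ops can have .do_interrupt == NULL.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration; NOT the Xen definitions. */
struct vcpu;

struct arch_vpmu_ops {
    /* May be NULL, as described for core2_no_vpmu_ops in this thread. */
    int (*do_interrupt)(struct vcpu *v);
};

struct vpmu {
    const struct arch_vpmu_ops *ops;
    unsigned int hw_lvtpc;          /* value that would be written to LVTPC */
};

struct vcpu {
    struct vpmu vpmu;
};

#define LVTPC_MASKED (1u << 16)     /* mask bit of a local APIC LVT entry */

/* Check site 1: refuse to dispatch when no handler is installed. */
int sketch_do_interrupt(struct vcpu *v)
{
    const struct arch_vpmu_ops *ops = v->vpmu.ops;

    if (ops == NULL || ops->do_interrupt == NULL)
        return 0;                   /* nothing to call; report "not handled" */

    return ops->do_interrupt(v);
}

/* Check site 2: never let the guest unmask LVTPC without a handler. */
void sketch_lvtpc_update(struct vcpu *v, unsigned int val)
{
    const struct arch_vpmu_ops *ops = v->vpmu.ops;

    if (ops == NULL || ops->do_interrupt == NULL)
        val |= LVTPC_MASKED;        /* force the entry to stay masked */

    v->vpmu.hw_lvtpc = val;
}
```

Either check alone would avoid the call through a NULL pointer in this sketch; which one is appropriate in the real code is the open question above.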

-boris


As Jan explained, the patch in question allows guests (Windows in both
problematic cases) to arm LVTPC while the vpmu instance has a NULL
pointer for do_interrupt.

Right, I understand that. But you'd need to arm the counters (not just APIC) for the interrupt to happen, wouldn't you? And I don't see how that can happen.

This issue also appears to happen on Debian, so it's not Windows-only.

In any case, you should indeed revert this until I resend a safer patch. Before I do that I'd like to be able to reproduce this though.

-boris


When a pmu apic interrupt arrives, the interrupt handler dies from a
NULL function pointer dereference.

~Andrew




 

