Re: [Xen-devel] [PATCH 2/4] xen: x86 / cpupool: clear the proper cpu_valid bit on pCPU teardown
On Thu, 2015-06-25 at 16:52 +0100, Andrew Cooper wrote:
> On 25/06/15 16:04, Dario Faggioli wrote:
> > On Thu, 2015-06-25 at 15:20 +0100, Andrew Cooper wrote:
> >> On 25/06/15 13:15, Dario Faggioli wrote:
> >>> # xl cpupool-cpu-remove Pool-0 8-15
> >>> # xl cpupool-create name="Pool-1"
> >>> # xl cpupool-cpu-add Pool-1 8-15
> >>> --> suspend
> >>> --> resume
> >>> (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]----
> >>> (XEN) CPU: 8
> >>> (XEN) RIP: e008:[<ffff82d080123078>] csched_schedule+0x4be/0xb97
> >>> (XEN) RFLAGS: 0000000000010087 CONTEXT: hypervisor
> >>> (XEN) rax: 80007d2f7fccb780 rbx: 0000000000000009 rcx: 0000000000000000
> >>> (XEN) rdx: ffff82d08031ed40 rsi: ffff82d080334980 rdi: 0000000000000000
> >>> (XEN) rbp: ffff83010000fe20 rsp: ffff83010000fd40 r8: 0000000000000004
> >>> (XEN) r9: 0000ffff0000ffff r10: 00ff00ff00ff00ff r11: 0f0f0f0f0f0f0f0f
> >>> (XEN) r12: ffff8303191ea870 r13: ffff8303226aadf0 r14: 0000000000000009
> >>> (XEN) r15: 0000000000000008 cr0: 000000008005003b cr4: 00000000000026f0
> >>> (XEN) cr3: 00000000dba9d000 cr2: 0000000000000000
> >>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> >>> (XEN) ... ... ...
> >>> (XEN) Xen call trace:
> >>> (XEN) [<ffff82d080123078>] csched_schedule+0x4be/0xb97
> >>> (XEN) [<ffff82d08012c732>] schedule+0x12a/0x63c
> >>> (XEN) [<ffff82d08012f8c8>] __do_softirq+0x82/0x8d
> >>> (XEN) [<ffff82d08012f920>] do_softirq+0x13/0x15
> >>> (XEN) [<ffff82d080164791>] idle_loop+0x5b/0x6b
> >>> (XEN)
> >>> (XEN) ****************************************
> >>> (XEN) Panic on CPU 8:
> >>> (XEN) GENERAL PROTECTION FAULT
> >>> (XEN) [error_code=0000]
> >>> (XEN) ****************************************
> >> What is the actual cause of the #GP fault? There are no obviously
> >> poisoned registers.
> >>
> >     do
> >     {
> >         /*
> >          * Get ahold of the scheduler lock for this peer CPU.
> >          *
> >          * Note: We don't spin on this lock but simply try it. Spinning
> >          * could cause a deadlock if the peer CPU is also load
> >          * balancing and trying to lock this CPU.
> >          */
> >         spinlock_t *lock = pcpu_schedule_trylock(peer_cpu);
> >
> > We therefore enter the inner do{}while with, for instance (that's what
> > I've seen in my debugging), peer_cpu=9, but we've not yet done
> > cpu_schedule_up()-->alloc_pdata()-->etc. for that CPU, so we die at (or
> > shortly after) the end of the code snippet shown above.
>
> Aah - it is a dereference with %rax as a pointer, which is
>
> #define INVALID_PERCPU_AREA (0x8000000000000000L - (long)__per_cpu_start)
>
Exactly!
> That explains the #GP fault which is due to a non-canonical address.
>
> It might be better to use 0xDEAD000000000000L as the constant to make it
> slightly easier to spot as a poisoned pointer.
>
Indeed. :-)
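
For the record, here is a minimal sketch of why both that %rax value and
the suggested 0xDEAD... constant fault with #GP rather than with a page
fault. It assumes 48-bit virtual addresses (4-level paging), where bits
63..47 of a canonical address must all be equal. The helper names
is_canonical() and check_examples() are made up for illustration and are
not Xen functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses (4-level paging). An address is
 * canonical only if bits 63..47 are all zeros or all ones.
 */
static bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;              /* the 17 bits 63..47 */

    return top == 0 || top == 0x1ffff;
}

/* Values taken from this thread; hypothetical helper, not Xen code. */
static void check_examples(void)
{
    /* %rax from the register dump: bit 63 set, bit 47 clear. */
    assert(!is_canonical(0x80007d2f7fccb780ULL));

    /* The suggested poison constant is non-canonical as well. */
    assert(!is_canonical(0xDEAD000000000000ULL));

    /* The faulting RIP is an ordinary canonical hypervisor address. */
    assert(is_canonical(0xffff82d080123078ULL));
}
```

Any access through a non-canonical pointer raises #GP before the page
tables are even consulted, which is why cr2 is not helpful here.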
> > I can try to think about it and come up with something if you think it's
> > important...
>
> Not to worry. I was more concerned about working out why it was dying
> with an otherwise unqualified #GP fault.
>
Ok, thanks. So, just to clarify things for me: from your side, this patch
needs "just" a better changelog, right?
Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)