
Re: [Xen-devel] [PATCH v3] x86: correct socket_cpumask allocation



On Fri, 2015-07-10 at 16:13 +0100, Jan Beulich wrote:
> >>> On 10.07.15 at 16:57, <dario.faggioli@xxxxxxxxxx> wrote:

> > ...
> > (XEN) Preparing system for ACPI S5 state.
> > (XEN) Disabling non-boot CPUs ...
> > (XEN) Broke affinity for irq 9
> > (XEN)   cpu=1 cpu_to_socket=4294967295
> > (XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Tainted:    C ]----
> > (XEN) CPU:    0
> > (XEN) RIP:    e008:[<ffff82d0801886c2>] cpu_smpboot_free+0x43/0x28b
> > ...
> > 
> > I.e., it looks like phys_proc_id has already been reset to
> > XEN_INVALID_SOCKET_ID, as we're kind-of racing with
> > remove_siblinginfo().
> 
> Right. We have
> 
> cpu_down()
>   stop_machine_run(take_cpu_down, ...)
>     notifier_call_chain(&cpu_chain, CPU_DYING, ...)
>     __cpu_disable()
>       remove_siblinginfo()
>   __cpu_die()
>   notifier_call_chain(&cpu_chain, CPU_DEAD, ...)
>     cpu_smpboot_free()
> 
> I.e. a clear use-after-invalidate.
> 
Exactly. I don't have a box with CAT, but on one that does, I'd expect
similar problems to happen in:

  psr_cpu_fini()
    cat_cpu_fini() --> unsigned int socket = cpu_to_socket(cpu);

as that also runs from CPU_DEAD. :-/
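
To make the ordering problem concrete, here is a toy user-space model of it.
This is only a sketch, not Xen code: the arrays, free_on_socket(),
cpu_down_stale() and cpu_down_cached() are names made up for illustration,
and I'm assuming the invalid marker is UINT_MAX because that matches the
4294967295 in the log above. Caching the socket ID is just one conceivable
way out, not necessarily what the patch should do.

```c
#include <assert.h>
#include <limits.h>

#define NR_CPUS    4
#define NR_SOCKETS 2
/* Assumption: the invalid marker is UINT_MAX, matching the 4294967295 in the log. */
#define XEN_INVALID_SOCKET_ID UINT_MAX

/* Toy stand-ins for the per-CPU and per-socket state (hypothetical). */
static unsigned int phys_proc_id[NR_CPUS] = { 0, 0, 1, 1 };
static int socket_users[NR_SOCKETS] = { 2, 2 };

static unsigned int cpu_to_socket(unsigned int cpu)
{
    return phys_proc_id[cpu];
}

/* CPU_DYING side: the CPU-to-socket mapping is invalidated here. */
static void remove_siblinginfo(unsigned int cpu)
{
    phys_proc_id[cpu] = XEN_INVALID_SOCKET_ID;
}

/* Per-socket teardown; refuses an ID that would index out of bounds. */
static int free_on_socket(unsigned int socket)
{
    if (socket >= NR_SOCKETS)
        return -1;
    socket_users[socket]--;
    return 0;
}

/* Ordering from the trace: the socket is looked up only at CPU_DEAD time. */
static int cpu_down_stale(unsigned int cpu)
{
    remove_siblinginfo(cpu);
    return free_on_socket(cpu_to_socket(cpu));
}

/* Hypothetical alternative: remember the socket before it is invalidated. */
static int cpu_down_cached(unsigned int cpu)
{
    unsigned int socket = cpu_to_socket(cpu);

    assert(socket < NR_SOCKETS);
    remove_siblinginfo(cpu);
    return free_on_socket(socket);
}
```

In this model cpu_down_stale() fails because the lookup happens after the
invalidation, while cpu_down_cached() still reaches the right socket. Any
CPU_DEAD handler that calls cpu_to_socket() falls into the first pattern.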

I guess this hasn't seen any "let's shut the host down" kind of
testing...

Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

