[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v10 01/13] x86: add socket_cpumask
>>> On 08.07.15 at 04:43, <chao.p.peng@xxxxxxxxxxxxxxx> wrote: > On Tue, Jul 07, 2015 at 06:32:55PM -0400, Boris Ostrovsky wrote: >> >@@ -245,6 +248,8 @@ static void set_cpu_sibling_map(int cpu) >> > cpumask_set_cpu(cpu, &cpu_sibling_setup_map); >> >+ cpumask_set_cpu(cpu, socket_cpumask[cpu_to_socket(cpu)]); >> >> This patch crashes Xen on my 32-cpu Intel box here for cpu 16, which is the >> first CPU on the second socket (i.e. on socket 1). >> >> The reason appears to be that cpu_to_socket(16) is (correctly) 1 here, but >> ... >> >> >+ >> > if ( c[cpu].x86_num_siblings > 1 ) >> > { >> > for_each_cpu ( i, &cpu_sibling_setup_map ) >> >@@ -649,7 +654,13 @@ void cpu_exit_clear(unsigned int cpu) >> > static void cpu_smpboot_free(unsigned int cpu) >> > { >> >- unsigned int order; >> >+ unsigned int order, socket = cpu_to_socket(cpu); >> >+ >> >+ if ( cpumask_empty(socket_cpumask[socket]) ) >> >+ { >> >+ free_cpumask_var(socket_cpumask[socket]); >> >+ socket_cpumask[socket] = NULL; >> >+ } >> > free_cpumask_var(per_cpu(cpu_sibling_mask, cpu)); >> > free_cpumask_var(per_cpu(cpu_core_mask, cpu)); >> >@@ -694,6 +705,7 @@ static int cpu_smpboot_alloc(unsigned int cpu) >> > nodeid_t node = cpu_to_node(cpu); >> > struct desc_struct *gdt; >> > unsigned long stub_page; >> >+ unsigned int socket = cpu_to_socket(cpu); >> >> ... is zero here, meaning that socket_cpumask[1] is NULL. I suspect that >> phys_proc_id is probably not set at this point but is by the time we get to >> set_cpu_sibling_map(). I haven't looked any further yet. I might do this >> tomorrow unless Chao does it before me. > > Thanks for testing. Boris' report first of all raises the question: Did you test this at all on a multi-socket system? Considering you not having tested the CPU removal case either, I'm starting to wonder how much testing this series has seen overall... > I think I have found the reason. For AP, phys_proc_id is set in: > start_secondary()=>smp_callin()=>smp_store_cpu_info()=>identify_cpu() > which is behind cpu_smpboot_alloc() called from CPU_PREPARE. > > One way would move 'zalloc_cpumask_var(socket_cpumask + socket)' to > set_cpu_sibling_map() to fix it if Jan agrees that, otherwise other > solution needs to be found. Looks sensible at a first glance, but in order to be able to do proper error handling the allocation needs to remain in cpu_smpboot_alloc(). I.e. you'd add a static variable, pre- allocate a cpumask into it if it's currently NULL, and consume the allocation in set_cpu_sibling_map() (or maybe even better in smp_store_cpu_info() right after the identify_cpu() call) if socket_cpumask[socket] is NULL. And then you test this on an affected system, and submit asap, so we can preferably avoid reverting the whole series. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |