Re: [Xen-devel] [PATCH v6 00/10] vnuma introduction
Hi!

Another new series!

On Fri, Jul 18, 2014 at 01:49:59AM -0400, Elena Ufimtseva wrote:
[...]
> Current problems:
>
> Warning on CPU bringup on other node
>
> The cpus in the guest which belong to different NUMA nodes are configured
> to share the same l2 cache and thus are considered to be siblings and
> cannot be on the same node. One can see the following WARNING during boot
> time:
>
> [    0.022750] SMP alternatives: switching to SMP code
> [    0.004000] ------------[ cut here ]------------
> [    0.004000] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/smpboot.c:303 topology_sane.isra.8+0x67/0x79()
> [    0.004000] sched: CPU #1's smt-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
> [    0.004000] Modules linked in:
> [    0.004000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.15.0-rc8+ #43
> [    0.004000]  0000000000000000 0000000000000009 ffffffff813df458 ffff88007abe7e60
> [    0.004000]  ffffffff81048963 ffff88007abe7e70 ffffffff8102fb08 ffffffff00000100
> [    0.004000]  0000000000000001 ffff8800f6e13900 0000000000000000 000000000000b018
> [    0.004000] Call Trace:
> [    0.004000]  [<ffffffff813df458>] ? dump_stack+0x41/0x51
> [    0.004000]  [<ffffffff81048963>] ? warn_slowpath_common+0x78/0x90
> [    0.004000]  [<ffffffff8102fb08>] ? topology_sane.isra.8+0x67/0x79
> [    0.004000]  [<ffffffff81048a13>] ? warn_slowpath_fmt+0x45/0x4a
> [    0.004000]  [<ffffffff8102fb08>] ? topology_sane.isra.8+0x67/0x79
> [    0.004000]  [<ffffffff8102fd2e>] ? set_cpu_sibling_map+0x1c9/0x3f7
> [    0.004000]  [<ffffffff81042146>] ? numa_add_cpu+0xa/0x18
> [    0.004000]  [<ffffffff8100b4e2>] ? cpu_bringup+0x50/0x8f
> [    0.004000]  [<ffffffff8100b544>] ? cpu_bringup_and_idle+0x1d/0x28
> [    0.004000] ---[ end trace 0e2e2fd5c7b76da5 ]---
> [    0.035371] x86: Booted up 2 nodes, 2 CPUs
>
> The workaround is to specify cpuid in the config file and not use SMT. But
> soon I will come up with some other acceptable solution.
>

I've also encountered this. I suspect that even if you disable SMT with
cpuid in the config file, the CPU topology in the guest might still be
wrong.

What do hwloc-ls and lscpu show? Do you see any weird topology, like one
core belonging to one node while three belong to another? (I suspect not,
because your vcpus are already pinned to a specific node.)

What I did was to manipulate various "id"s in the Linux kernel, so that I
create a 1 core : 1 cpu : 1 socket mapping. In that case the guest
scheduler won't be able to make any assumptions about individual CPUs
sharing caches with each other.

In any case, we've already manipulated various ids of CPU0; I don't see
any harm in manipulating the other CPUs as well. Thoughts?

P.S. I'm benchmarking your v5, tell me if you're interested in the result.

Wei.
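For reference, the cpuid workaround mentioned above would look something
like the fragment below in the guest config file. This is only a sketch
assuming the libxl cpuid syntax; the exact feature name ("htt") and
whether masking it alone is enough are assumptions on my part.

    # Hypothetical xl guest config fragment: mask the HTT feature bit so
    # the guest does not report hyperthread siblings (libxl cpuid syntax).
    cpuid = "host,htt=0"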
(This patch should be applied to Linux and is by no means suitable for
upstream as is.)

---8<---
From be2b33088e521284c27d6a7679b652b688dba83d Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@xxxxxxxxxx>
Date: Tue, 17 Jun 2014 14:51:57 +0100
Subject: [PATCH] XXX: CPU topology hack!

Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
---
 arch/x86/xen/smp.c   | 17 +++++++++++++++++
 arch/x86/xen/vnuma.c |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 7005974..89656fe 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -81,6 +81,15 @@ static void cpu_bringup(void)
 	cpu = smp_processor_id();
 	smp_store_cpu_info(cpu);
 	cpu_data(cpu).x86_max_cores = 1;
+	cpu_physical_id(cpu) = cpu;
+	cpu_data(cpu).phys_proc_id = cpu;
+	cpu_data(cpu).cpu_core_id = cpu;
+	cpu_data(cpu).initial_apicid = cpu;
+	cpu_data(cpu).apicid = cpu;
+	per_cpu(cpu_llc_id, cpu) = cpu;
+	if (numa_cpu_node(cpu) != NUMA_NO_NODE)
+		cpu_data(cpu).phys_proc_id = numa_cpu_node(cpu);
+
 	set_cpu_sibling_map(cpu);

 	xen_setup_cpu_clockevents();
@@ -326,6 +335,14 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)

 	smp_store_boot_cpu_info();
 	cpu_data(0).x86_max_cores = 1;
+	cpu_physical_id(0) = 0;
+	cpu_data(0).phys_proc_id = 0;
+	cpu_data(0).cpu_core_id = 0;
+	per_cpu(cpu_llc_id, cpu) = 0;
+	cpu_data(0).initial_apicid = 0;
+	cpu_data(0).apicid = 0;
+	if (numa_cpu_node(0) != NUMA_NO_NODE)
+		per_cpu(x86_cpu_to_node_map, 0) = numa_cpu_node(0);

 	for_each_possible_cpu(i) {
 		zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
diff --git a/arch/x86/xen/vnuma.c b/arch/x86/xen/vnuma.c
index a02f9c6..418ced2 100644
--- a/arch/x86/xen/vnuma.c
+++ b/arch/x86/xen/vnuma.c
@@ -81,7 +81,9 @@ int __init xen_numa_init(void)
 	setup_nr_node_ids();
 	/* Setting the cpu, apicid to node */
 	for_each_cpu(cpu, cpu_possible_mask) {
+		/* Use cpu id as apicid */
 		set_apicid_to_node(cpu, cpu_to_node[cpu]);
+		cpu_data(cpu).initial_apicid = cpu;
 		numa_set_node(cpu, cpu_to_node[cpu]);
 		cpumask_set_cpu(cpu, node_to_cpumask_map[cpu_to_node[cpu]]);
 	}
--
1.7.10.4

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel