Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest
On 07/22/2015 10:50 AM, Dario Faggioli wrote:
> On Wed, 2015-07-22 at 16:09 +0200, Juergen Gross wrote:
>> On 07/22/2015 03:58 PM, Boris Ostrovsky wrote:
>>> What if I configure a guest to follow HW topology? I.e. I pin VCPUs
>>> to appropriate cores/threads? With elfnote I am stuck with disabled
>>> topology.
>>
>> Add an option to do exactly that: follow HW topology (pin vcpus,
>> configure vnuma)?
>
> I thought about configuring things in such a way that they match the
> host topology, as Boris is suggesting, too. And in that case, I think
> arranging for doing so in the toolstack, if PV vNUMA is identified (as
> I think Juergen is suggesting), seems a good approach.
>
> However, when I tried to do that on my box, manually, I didn't seem to
> be able to. Here's what I tried. Since I have this host topology:
>
> cpu_topology           :
> cpu:    core    socket  node
>   0:       0       1       0
>   1:       0       1       0
>   2:       1       1       0
>   3:       1       1       0
>   4:       9       1       0
>   5:       9       1       0
>   6:      10       1       0
>   7:      10       1       0
>   8:       0       0       1
>   9:       0       0       1
>  10:       1       0       1
>  11:       1       0       1
>  12:       9       0       1
>  13:       9       0       1
>  14:      10       0       1
>  15:      10       0       1
>
> I configured the guest like this:
>
> vcpus   = '4'
> memory  = '1024'
> vnuma   = [ [ "pnode=0","size=512","vcpus=0-1","vdistances=10,20" ],
>             [ "pnode=1","size=512","vcpus=2-3","vdistances=20,10" ] ]
> cpus    = ["0","1","8","9"]
>
> This means vcpus 0 and 1, which are assigned to vnode 0, are pinned to
> pcpus 0 and 1, which are siblings, per the host topology. Similarly,
> vcpus 2 and 3, assigned to vnode 1, are pinned to two sibling pcpus on
> pnode 1. This seems to be honoured:
>
> # xl vcpu-list 4
> Name      ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
> test       4     0    0   -b-      10.9  0 / 0-7
> test       4     1    1   -b-       7.6  1 / 0-7
> test       4     2    8   -b-       0.1  8 / 8-15
> test       4     3    9   -b-       0.1  9 / 8-15
>
> And yet, no joy:
>
> # ssh root@xxxxxxxxxxxxx "yes > /dev/null 2>&1 &"
> # ssh root@xxxxxxxxxxxxx "yes > /dev/null 2>&1 &"
> # ssh root@xxxxxxxxxxxxx "yes > /dev/null 2>&1 &"
> # ssh root@xxxxxxxxxxxxx "yes > /dev/null 2>&1 &"
> # xl vcpu-list 4
> Name      ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
> test       4     0    0   r--      16.4  0 / 0-7
> test       4     1    1   r--      12.5  1 / 0-7
> test       4     2    8   -b-       0.2  8 / 8-15
> test       4     3    9   -b-       0.1  9 / 8-15
>
> So, what am I doing wrong at "following the hw topology"?
>
>>> Besides, this is not necessarily a NUMA-only issue, it's a scheduling
>>> one (inside the guest) as well.
>>
>> Sure. That's what Jan said regarding SUSE's xen-kernel. No topology
>> info (or a trivial one) might be better than the wrong one...
>
> Yep. Exactly. As Boris says, this is a generic scheduling issue,
> although it's true that it's only (as far as I can tell) with vNUMA
> that it bites us so hard...

I am not sure that it's only vNUMA. It's just that with vNUMA we can see
a warning (on your system) that something goes wrong. In other cases
(like scheduling, or sizing objects based on discovered cache sizes) we
don't see anything in the log, but the system/programs are making wrong
decisions. (And your results above may well be an example of that.)

-boris

> I mean, performance is always going to be inconsistent, but it's only
> in that case that you basically _lose_ some of the vcpus! :-O
>
> Dario

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
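For reference, the "follow HW topology (pin vcpus, configure vnuma)" idea
discussed above can be prototyped outside the toolstack. The sketch below is
only an illustration, not code from this thread or from xl/libxl: it assumes
the textual cpu_topology table printed by `xl info -n` (as shown in the quoted
mail), and the helper names host_topology() and pinning_for_vnuma() are made
up for the example.

#!/usr/bin/env python3
# Sketch (not from the thread): derive a per-vnode pcpu pinning list from the
# host topology, in the spirit of "follow HW topology (pin vcpus, configure
# vnuma)". The parsing of `xl info -n` output below is an assumption about its
# textual format; adjust to whatever data source the toolstack actually uses.

import subprocess
from collections import defaultdict

def host_topology():
    """Return {node: {core: [pcpu, ...]}} parsed from `xl info -n`."""
    out = subprocess.check_output(["xl", "info", "-n"], text=True)
    nodes = defaultdict(lambda: defaultdict(list))
    in_table = False
    for line in out.splitlines():
        if line.startswith("cpu:"):          # header line: cpu: core socket node
            in_table = True
            continue
        if in_table:
            parts = line.split()
            if len(parts) != 4 or not parts[0].rstrip(":").isdigit():
                break                        # end of the cpu_topology table
            cpu, core, _socket, node = (int(p.rstrip(":")) for p in parts)
            nodes[node][core].append(cpu)
    return nodes

def pinning_for_vnuma(vnode_to_pnode, vcpus_per_vnode):
    """Pick sibling pcpus on each pnode for the vcpus of each vnode."""
    topo = host_topology()
    pinning = []                             # index = vcpu, value = pcpu
    for vnode, pnode in sorted(vnode_to_pnode.items()):
        need = vcpus_per_vnode[vnode]
        # Walk cores in order so that consecutive vcpus land on sibling threads.
        pool = [c for core in sorted(topo[pnode]) for c in topo[pnode][core]]
        pinning.extend(pool[:need])
    return pinning

if __name__ == "__main__":
    # Matches the example config above: vnode 0 -> pnode 0, vnode 1 -> pnode 1,
    # two vcpus per vnode. On the host topology quoted above this should print
    # [0, 1, 8, 9], i.e. the same pinning as the cpus= line in the guest config.
    print(pinning_for_vnuma({0: 0, 1: 1}, {0: 2, 1: 2}))

Whether such pinning helps, of course, depends on the guest seeing a topology
consistent with it, which is exactly the problem under discussion here.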