
Re: [xen-devel][vNUMA v2][PATCH 2/8] public interface

To: Dulloor <dulloor@xxxxxxxxx>
From: Andre Przywara <andre.przywara@xxxxxxx>
Date: Tue, 3 Aug 2010 23:55:29 +0200
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Tue, 03 Aug 2010 14:57:15 -0700
List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Dulloor wrote:

On Tue, Aug 3, 2010 at 8:52 AM, Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote:

On 03/08/2010 16:43, "Dulloor" <dulloor@xxxxxxxxx> wrote:

I would expect guest would see nodes 0 to nr_vnodes-1, and the mnode_id
could go away.

mnode_id maps the vnode to a particular physical node. This will be used
by the balloon driver in the VMs when the structure is passed as NUMA
enlightenment to PVs and PV on HVMs.
I have a patch ready for that (once we are done with this series).

So what happens when the guest is migrated to another system with different
physical node ids? Is that never to be supported? I'm not sure why you
wouldn't hide the vnode-to-mnode translation in the hypervisor.


Right now, migration is not supported when NUMA strategy is set.
This is in my TODO list (along with PoD support).

There are a few open questions wrt migration:
- What if the destination host is not NUMA, but the guest is NUMA? Do we fake
those nodes? Or should we not select such a destination host to begin with?

I don't see a problem with this situation. The guest has virtual nodes,
and these can be mapped in any way to actual physical nodes (but only by
the hypervisor/Dom0, not by the guest itself). A clear corner case would
be to map all guest nodes to one single host node. In terms of
performance this should be even better if the new host can satisfy the
requirement from one node, because there will be no remote accesses at all.
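
To make the idea concrete, here is a rough sketch of such a host-side mapping. It is not code from the patch series: the function name map_vnodes, the input shapes and the first-fit policy are all assumptions chosen for illustration. It prefers the single-host-node corner case when one node can hold the whole guest, and otherwise places each virtual node on some host node with enough free memory.

```python
def map_vnodes(vnode_mb, host_free_mb):
    """Hypothetical helper: choose a host node for every virtual node.

    vnode_mb:     list of memory sizes (MB), one entry per virtual node.
    host_free_mb: dict mapping host node id -> free memory (MB).
    Returns a list (index = vnode, value = host node id), or None if
    this simple policy finds no placement.
    """
    total = sum(vnode_mb)

    # Corner case: one host node can hold the whole guest, so every
    # vnode maps to it and there are no remote accesses at all.
    for node in sorted(host_free_mb):
        if host_free_mb[node] >= total:
            return [node] * len(vnode_mb)

    # Otherwise: first-fit, one host node per vnode (assumed policy).
    free = dict(host_free_mb)
    mapping = []
    for need in vnode_mb:
        for node in sorted(free):
            if free[node] >= need:
                free[node] -= need
                mapping.append(node)
                break
        else:
            return None  # no host node can take this vnode
    return mapping
```

The guest keeps seeing nodes 0 to nr_vnodes-1 in either case; only the returned list changes from host to host.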

- What if the destination host is not NUMA, but guest has asked to be
striped across a specific number of nodes (possibly for higher aggregate
memory bandwidth) ?

Most people deal with NUMA because they want to cure performance _drops_
caused by bad allocation policies. After all, NUMA awareness is a
performance optimization. If the user asks to migrate to another host,
then we shouldn't come up with a fussy argument like NUMA. In my eyes it
is a question of priorities; I don't want to deny migration because of this.

- What if the guest has asked for a particular memory strategy
(split/confined/striped), but the destination host can't guarantee that
(because of the distribution of free memory across the nodes) ?

I see. There is one case where the new host has more nodes than the old
one, but the memory on each node is not sufficient (like migrating from
a 2*8GB machine to an 8*4GB one). I think we should inform the user
about this and, if she persists in the migration, use some kind of
interleaving to join two (or more) nodes together. Looks like future
work, though.
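
A minimal sketch of what "joining" host nodes could look like for the 2*8GB to 8*4GB example. Again this is only an illustration under assumptions: join_nodes and its greedy fill order are hypothetical, and a real implementation would interleave pages rather than just compute sizes.

```python
def join_nodes(vnode_mb, host_free_mb):
    """Hypothetical helper: back each vnode with one or more host nodes.

    Returns, per vnode, a list of (host_node, mb) pieces, or None if
    the host does not have enough free memory in total.
    """
    free = dict(host_free_mb)
    plan = []
    for need in vnode_mb:
        parts = []
        for node in sorted(free):
            if need == 0:
                break
            take = min(need, free[node])  # use what this node can give
            if take > 0:
                parts.append((node, take))
                free[node] -= take
                need -= take
        if need > 0:
            return None  # the guest does not fit on this host
        plan.append(parts)
    return plan


# Guest with two 8GB vnodes, host with eight nodes of 4GB each:
# each vnode ends up joined from two host nodes.
plan = join_nodes([8192, 8192], {n: 4096 for n in range(8)})
```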

Once we answer these questions, we will know whether vnode-to-mnode
translation is better exposed or not. And, if exposed, could we just
renegotiate the vnode-to-mnode translation at the destination host.
I have started working on this. But, I have some other patches ready to go
which we might want to check-in first - PV/Dom0 NUMA patches,
Ballooning support (see below).

As such, the purpose of vnode-to-mnode translation is for the enlightened
guests to know where their underlying memory comes from, so that
over-provisioning features like ballooning are given a chance to maintain
this distribution.

I was afraid you were saying that ;-) I haven't thought about this in
detail, but maybe we can make an exception for Dom0 only, because this
is the most prominent and frequent user of ballooning. But I really
think that DomUs should not know about or deal with host NUMA nodes.


Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12


