
RE: [Xen-devel] [RFC] pv guest numa [RE: Host NUMA information in dom0]



Hi Dulloor --

> I am in the process of making other dynamic memory management
> operations NUMA-aware - tmem, memory exchange operations, etc.

I'd be interested in your thoughts on NUMA-aware tmem
as well as the other dynamic memory mechanisms in Xen 4.0.

Tmem is special in that it primarily uses full-page copies
from tmem-space to guest-space and vice versa, so, assuming
the interconnect can pipeline/stream a memcpy, the overhead
of off-node memory vs on-node memory should be less
noticeable.  However, tmem uses large data structures
(rbtrees and radix trees), and the lookup process might
benefit from being NUMA-aware.
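
Just thinking out loud, and purely as a sketch (not a description
of the current tmem code; all names below are made up): one way to
approach the lookup side would be to keep one object index per
machine node, so that the tree nodes and the page payloads they
reference stay node-local:

    #define MAX_NUMNODES 8    /* illustrative bound */

    struct tmem_obj_root {
        void *obj_rbtree;     /* per-node rbtree of objects           */
        void *page_radix;     /* per-node radix tree of page payloads */
    };

    struct tmem_pool_numa {
        /* one object index per machine node instead of one global one */
        struct tmem_obj_root node_root[MAX_NUMNODES];
    };

    /* Look up in the index of the node the requesting vcpu runs on,
     * so the rbtree/radix walks mostly touch node-local memory. */
    static struct tmem_obj_root *
    tmem_root_for(struct tmem_pool_numa *pool, unsigned int cur_node)
    {
        return &pool->node_root[cur_node % MAX_NUMNODES];
    }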

Also, I will be looking into adding some page-sharing
techniques into tmem in the near future.  This (and the
existing page sharing feature just added to 4.0) may
create some other interesting challenges for NUMA-awareness.

Dan

> -----Original Message-----
> From: Dulloor [mailto:dulloor@xxxxxxxxx]
> Sent: Friday, February 12, 2010 11:25 PM
> To: Ian Pratt
> Cc: Andre Przywara; xen-devel@xxxxxxxxxxxxxxxxxxx; Nakajima, Jun; Keir
> Fraser
> Subject: [Xen-devel] [RFC] pv guest numa [RE: Host NUMA information in
> dom0]
> 
> I am attaching (RFC) patches for NUMA-aware pv guests.
> 
> * The patch adds hypervisor interfaces to export minimal NUMA-related
> information about the memory of a pv domain, which can then be used
> to set up the node ranges, virtual cpu<->node maps, and virtual SLIT
> tables in the pv domain (a rough sketch of the exported data follows
> this list).
> * The guest domain also maintains a mapping between its vnodes and
> mnodes (actual machine nodes). These mappings can be used in memory
> operations, such as ballooning.
> * In the patch, dom0 is made NUMA-aware using these interfaces. Other
> domains should be simpler. I am in the process of adding python
> interfaces for this, and this would work with any node-selection
> policy.
> * The patch is tested only for 64-on-64 (on x86_64).
> 
> * Along with the following other patches, this could provide a good
> solution for NUMA-aware guests -
> - NUMA-aware ballooning (previously posted by me on xen-devel)
> - Andre's patch for HVM domains (posted by Andre recently)
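>
> Roughly, the kind of per-domain data the interface exports is
> sketched below. This is illustrative only; the actual layout is in
> the attached patches, and the names and array bounds here are made
> up for the example:
>
>   #include <stdint.h>
>
>   #define VNUMA_MAX_NODES  8
>   #define VNUMA_MAX_VCPUS 64
>
>   struct vnuma_memrange {
>       uint64_t start_pfn;   /* first guest pfn of this virtual node */
>       uint64_t end_pfn;     /* last guest pfn of this virtual node  */
>       uint32_t mnode;       /* machine node backing this vnode      */
>   };
>
>   struct vnuma_info {
>       uint32_t nr_vnodes;
>       struct vnuma_memrange range[VNUMA_MAX_NODES];
>       /* virtual cpu -> virtual node map */
>       uint32_t vcpu_to_vnode[VNUMA_MAX_VCPUS];
>       /* virtual SLIT: relative distances between vnodes */
>       uint8_t  vdistance[VNUMA_MAX_NODES][VNUMA_MAX_NODES];
>   };
>
> The mnode field is the vnode<->mnode mapping mentioned above, which
> is what ballooning and the other memory operations consult to request
> pages from the right machine node.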
> 
> I am in the process of making other dynamic memory management
> operations NUMA-aware - tmem, memory exchange operations, etc.
> 
> Please let me know your comments.
> 
> -dulloor
> 
> On Thu, Feb 11, 2010 at 10:21 AM, Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>
> wrote:
> >> > If guest NUMA is disabled, we just use a single node mask which
> >> > is the union of the per-VCPU node masks.
> >> >
> >> > Where allowed node masks span more than one physical node, we
> >> > should allocate memory to the guest's virtual node by
> >> > pseudo-randomly striping memory allocations (in 2MB chunks)
> >> > across the specified physical nodes. [pseudo-random is probably
> >> > better than round robin]
> >>
> >> Do we really want to support this? I don't think the allowed node
> >> masks should span more than one physical NUMA node. We also need
> >> to look at I/O devices as well.
> >
> > Given that we definitely need this striping code in the case where
> > the guest is non-NUMA, I'd be inclined to still allow it to be used
> > even if the guest has multiple NUMA nodes. It could come in handy
> > where there is a hierarchy between physical NUMA nodes, enabling,
> > for example, striping to be used between a pair of 'close' nodes
> > while the higher-level topology of the paired node sets is exposed
> > to the guest.
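> >
> > As a rough illustration of the striping idea (hypothetical helper
> > names, not actual Xen code), each 2MB chunk would pick its physical
> > node pseudo-randomly from the allowed mask, e.g.:
> >
> >   #include <stdint.h>
> >
> >   /* Cheap xorshift PRNG; any pseudo-random source would do. */
> >   static uint32_t stripe_rand(uint32_t *state)
> >   {
> >       uint32_t x = *state;
> >       x ^= x << 13;
> >       x ^= x >> 17;
> >       x ^= x << 5;
> >       return *state = x;
> >   }
> >
> >   /* Pick a physical node from 'mask' for the next 2MB chunk. */
> >   static unsigned int pick_stripe_node(uint64_t mask, uint32_t *prng)
> >   {
> >       unsigned int candidates[64], n = 0, i;
> >
> >       for (i = 0; i < 64; i++)
> >           if (mask & (1ULL << i))
> >               candidates[n++] = i;
> >
> >       return n ? candidates[stripe_rand(prng) % n] : 0;
> >   }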
> >
> > Ian
> >
> >
> >
