
Re: [Xen-devel] [PATCH v5 06/24] libxl: introduce vNUMA types

On Mon, 2015-02-16 at 15:17 +0000, Wei Liu wrote:
> On Mon, Feb 16, 2015 at 02:58:32PM +0000, Dario Faggioli wrote:

> > > +libxl_vnode_info = Struct("vnode_info", [
> > > +    ("memkb", MemKB),
> > > +    ("distances", Array(uint32, "num_distances")), # distances from this node to other nodes
> > > +    ("pnode", uint32), # physical node of this node
> > >
> > I am unsure whether we ever discussed this already or not (and sorry for
> > not recalling) but, in principle, one vnode can be mapped to more than
> > just one pnode.
> > 
> I don't recall either.
> > The semantics would be that the memory of the vnode is somehow split
> > (evenly, by default, I would say) between the specified pnodes. So,
> > pnode could be a bitmap too (and be called "pnodes" :-) ), although we
> > can put checks in place that --for now-- it always has only one bit set.
> > 
> > Reasons might be that the user just wants it, or that there is not
> > enough (free) memory on just one pnode, but we still want to achieve
> > some locality.
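For concreteness, the bitmap variant being suggested could look something
like this in the libxl IDL (purely a sketch; the `pnodes` field name and
the use of `libxl_bitmap` here are my assumptions, not part of the posted
patch):

```
libxl_vnode_info = Struct("vnode_info", [
    ("memkb", MemKB),
    ("distances", Array(uint32, "num_distances")), # distances from this node to other nodes
    ("pnodes", libxl_bitmap), # physical node(s) of this node; only one bit set for now
    ])
```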
> > 
> Wouldn't this cause unpredictable performance? 
A certain amount of it, yes, for sure, but always less than having the
memory striped across all nodes, I would say.

Well, of course, it depends on how it will be used, as usual with these
things.

> And there is no way to
> specify priority among the group of nodes you specify with a single
> bitmap.
Why would we need such a thing as a 'priority'? What I'm talking about is
making it possible, for each vnode, to specify the vnode-to-pnode mapping
as a bitmap of pnodes. What we'd do, in the presence of a bitmap, would be
to allocate the memory by striping it across _all_ the pnodes present in
the bitmap.
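The even-split semantics could be sketched like this (a minimal
illustration, not libxl code; the pnode bitmap is modelled as a Python
set of indices, where real libxl code would use libxl_bitmap):

```python
def stripe_vnode_memkb(memkb, pnode_bitmap):
    """Return a {pnode: memkb} dict splitting memkb evenly across the set bits."""
    pnodes = sorted(pnode_bitmap)
    if not pnodes:
        raise ValueError("a vnode must map to at least one pnode")
    share, rem = divmod(memkb, len(pnodes))
    # Hand the first `rem` pnodes one extra KiB so the shares sum exactly.
    return {p: share + (1 if i < rem else 0) for i, p in enumerate(pnodes)}
```

With a single bit set this degenerates to the behaviour in the current
patch: `stripe_vnode_memkb(2097152, {3})` gives `{3: 2097152}`.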

If there's only one bit set, you get the same behavior as in this patch.

> I can't say I fully understand the implication of the scenario you
> described.
Ok. Imagine you want to create a guest with 2 vnodes, 4GB RAM total, so
2GB on each vnode. On the host, you have 8 pnodes, but only 1GB free on
each of them.

If you can only associate a vnode with a single pnode, no node can
accommodate a full vnode, so we would have to give up trying to place
the domain and map the vnodes. We would end up with 0.5GB on each pnode,
unpredictable performance and, basically, no vNUMA at all (or at least
no vnode-to-pnode mapping)... Does this make sense?

If we allow the user (or the automatic placement algorithm) to specify a
bitmap of pnodes for each vnode, they could put, say, vnode #1 on pnodes
#0 and #2, which maybe are really close (in terms of NUMA distance) to
each other, and vnode #2 on pnodes #5 and #6 (close to each other too).
This would give worse performance than having each vnode on just one
pnode but, most likely, better performance than the scenario described
right above.
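The arithmetic of that scenario can be checked with a toy illustration
(assumed helpers, not libxl API: 8 pnodes with 1GB free each, two vnodes
of 2GB each):

```python
GB_IN_KB = 1024 * 1024

def fits_single(vnode_memkb, free_memkb):
    """Can any single pnode host the whole vnode?"""
    return any(free >= vnode_memkb for free in free_memkb)

def fits_bitmap(vnode_memkb, pnode_set, free_memkb):
    """Can the pnodes in the set host the vnode, striped evenly across them?"""
    share = vnode_memkb // len(pnode_set)
    return all(free_memkb[p] >= share for p in pnode_set)

free = [1 * GB_IN_KB] * 8   # 8 pnodes, 1GB free on each
vnode_memkb = 2 * GB_IN_KB  # each of the two vnodes wants 2GB

fits_single(vnode_memkb, free)          # False: no single pnode fits a vnode
fits_bitmap(vnode_memkb, {0, 2}, free)  # True: 1GB on pnode #0 + 1GB on #2
fits_bitmap(vnode_memkb, {5, 6}, free)  # True: same for the other vnode
```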

Hope I made myself clear enough :-)

