
Re: [Xen-devel] [PATCH 1 of 3 v5/leftover] libxl: enable automatic placement of guests on NUMA nodes [and 1 more messages]



On Wed, 2012-07-18 at 14:14 +0100, Ian Campbell wrote:
> On Wed, 2012-07-18 at 12:00 +0100, Ian Jackson wrote:
> > An upper bound on log(number_of_combinations) is
> > log(number_of_nodes_on_the_host) * number_of_nodes_for_the_guest.
> > This fact could be used to determine more accurately whether the
> > algorithm is going to terminate in a reasonable time.
> 
> This seems like something which could be done for 4.2.0 and is better
> than the more static limit and better than no NUMA at all. Dario can you
> do something for this soon, as in this week or not later than next?
> 
Ok, I think I can do something like the below.

Remember there is an outer loop (the while{}) with
i:=[min_nodes...max_nodes] (where, typically, min_nodes=1 and
max_nodes=nr_nodes). Nested inside it, there is an inner loop (the
for{}), performing exactly C(nr_nodes, i) steps. So:

 1) I'll kill the need for storing the full list of combinations and
    run the comparisons on-line, so that I always have the best
    placement candidate found so far and can return it at any time;

 2) at the beginning of each step of the for{} (the inner loop) I
    check whether the overall number of steps exceeds a given threshold
    (I was thinking of 2^18=262144) and, if it does, I just stop,
    whether or not I already have a solution (see the sketch right
    after this list).
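
Just to make it concrete, here is a rough standalone sketch of what I
mean (plain C, *not* the actual libxl code: candidate_score(),
MAX_STEPS and friends are all made-up names for illustration):

/* Minimal standalone sketch of the on-line search with a hard cutoff.
 * Not libxl code: candidate_score(), MAX_STEPS etc. are invented. */
#include <stdio.h>
#include <stdint.h>

#define NR_NODES  8             /* example host size */
#define MAX_STEPS (1UL << 18)   /* 2^18 = 262144: give up past this */

/* Made-up scoring function: higher is better (think "most free memory"). */
static double candidate_score(uint32_t nodemap)
{
    return (double)__builtin_popcount(nodemap);   /* placeholder only */
}

int main(void)
{
    unsigned long steps = 0;
    uint32_t best = 0;
    double best_score = -1.0;
    int min_nodes = 1, max_nodes = NR_NODES;

    for (int i = min_nodes; i <= max_nodes; i++) {      /* the while{} */
        /* the for{}: every combination of i nodes out of NR_NODES,
         * enumerated (naively, for the sketch) as bitmaps with i bits set */
        for (uint32_t map = 1; map < (1U << NR_NODES); map++) {
            if (__builtin_popcount(map) != i)
                continue;
            if (++steps > MAX_STEPS)
                goto out;       /* too many steps: keep what we have */
            if (candidate_score(map) > best_score) {  /* on-line compare */
                best_score = candidate_score(map);
                best = map;
            }
        }
    }
 out:
    printf("best nodemap 0x%x (score %.1f) after %lu steps\n",
           (unsigned)best, best_score, steps);
    return 0;
}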

The worst that can happen (unlikely, but possible) is that the
algorithm runs for a while and at some point terminates with a
suboptimal candidate, or with nothing at all.
However, please consider that placement on an 8 node host requires at
most 256 steps, and on a 16 node one it takes 65536 steps. Thus, bad
things would only happen on systems with nr_nodes>16, which we now
know will probably never exist! :-)
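
(For reference, the counting behind those figures is just the standard
binomial identity, nothing in the patch itself: the total number of
examined combinations is

    \sum_{i=1}^{n} \binom{n}{i} = 2^n - 1

so roughly 2^8=256 steps for an 8 node host, 2^16=65536 for 16 nodes,
and already more than a million for 20 nodes.)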

As an alternative, after thinking quite a bit about it, I found a way
to produce a reasonably good estimate of (an upper bound on) the total
number of steps (the log(n)*p bound suggested above is nice, but very
far from being precise enough, I'm afraid). So, using that to decide
in advance whether or not we should run at all is possible, but
computing it requires some fairly advanced math that I'm not very keen
on implementing here, as it would be ugly and not easy to understand
and maintain (although I'd do my best to provide good comments :-)).
OTOH, doing as described above makes it possible to find a (maybe
suboptimal) placement in a reasonable amount of time even on those
theoretical large systems, which I think is a good thing.

> > Also when this algorithm would be used, but would take too long, we
> > should print a warning which tells the user they should use cpupools
> > to assign nodes directly.
>
> Agreed.
> 
This is definitely possible, although doing it _in_advance_ and being
precise enough would involve the advanced math I was talking about
before.

I can print the warning on-line, as soon as the number of steps
performed reaches some threshold (lower than the one used to stop the
algorithm; I was thinking of 2^16=65536). Would that be reasonable? Of
course I could also print the warning in advance, based on some rough
estimate, but I fear it would not be that useful then...
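
Again, just to show what I have in mind, a standalone sketch of the two
thresholds (placement_keep_going() and the message text are invented
here, they're not what will land in libxl verbatim):

/* Standalone sketch of the two thresholds: warn once at 2^16 steps,
 * give up at 2^18.  placement_keep_going() is an invented helper. */
#include <stdio.h>
#include <stdbool.h>

#define WARN_STEPS (1UL << 16)  /* 2^16 = 65536: print the hint      */
#define MAX_STEPS  (1UL << 18)  /* 2^18 = 262144: abandon the search */

/* Called once per examined combination; returns false when it is
 * time to stop and fall back to whatever candidate we have so far. */
static bool placement_keep_going(unsigned long steps)
{
    if (steps == WARN_STEPS)
        fprintf(stderr, "libxl: NUMA placement is taking long on this "
                "host, consider using cpupools to assign nodes "
                "explicitly\n");
    return steps <= MAX_STEPS;
}

int main(void)
{
    /* toy driver: pretend we examine many candidate combinations */
    unsigned long steps;
    for (steps = 1; placement_keep_going(steps); steps++)
        ;
    printf("stopped after %lu steps\n", steps - 1);
    return 0;
}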

I'm implementing this right now. If you have any feedback, please
shout, and the sooner the better. :-D

Thanks a lot and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



 

