[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 08 of 10 v2] libxl: enable automatic placement of guests on NUMA nodes
On Thu, 2012-06-21 at 17:34 +0100, Dario Faggioli wrote: > On Thu, 2012-06-21 at 12:40 +0100, Ian Campbell wrote: > > > +/* > > > + * This function tries to figure out if the host has a consistent number > > > + * of cpus along all its NUMA nodes. In fact, if that is the case, we can > > > + * calculate the minimum number of nodes needed for a domain by just > > > + * dividing its total number of vcpus by this value computed here. > > > + * However, we are not allowed to assume that all the nodes have the > > > + * same number of cpus. Therefore, in case discrepancies among different > > > + * nodes are found, this function just returns 0 and the caller needs > > > + * to sort out things in some other way. > > > > If the caller has to deal with this anyway why bother with this check > > and/or the other algorithm? > > > Mmm... I'm not sure I get what you mean here. Perhaps my commenting > about the whole thing was not so clear in the first place. > > The (George's :-)) idea is, if we know each node has 8 CPUs, and our > domain wants more than that, it does not make sense to start looking for > placement solutions with just one node. Actually, if the domain wants > fewer than 16 CPUs, starting looking from solutions with 2 nodes in them > should be the right thing. > > That being said, I think it's important we do not assume we won't ever > find some architecture with different number of CPUs in different nodes, > and that is what a zero return from that function means. > > What was that you thought not to be worthwhile? The comment said "this function just returns 0 and the caller needs to sort out things in some other way" so I was wondering -- well if the caller has some algorithm which can cope with this why not always use it. I think I later observed that if it returns zero the caller in this case does something basic instead, so I see how this makes sense now. > > > > +/* Get all the placement candidates satisfying some specific conditions > > > */ > > > +int libxl__get_numa_candidates(libxl__gc *gc, > > > + uint32_t min_free_memkb, int min_cpus, > > > + int min_nodes, int max_nodes, > > > + libxl__numa_candidate *cndts[], int > > > *nr_cndts) > > > +{ > > > + libxl__numa_candidate *new_cndts = NULL; > > > + libxl_cputopology *tinfo = NULL; > > > + libxl_numainfo *ninfo = NULL; > > > + libxl_bitmap nodemap; > > > + int nr_nodes, nr_cpus; > > > + int array_size, rc; > > > + > > > + /* Get platform info and prepare the map for testing the > > > combinations */ > > > + ninfo = libxl_get_numainfo(CTX, &nr_nodes); > > > + if (ninfo == NULL) > > > + return ERROR_FAIL; > > > + /* If we don't have at least 2 nodes, it is useless to proceed */ > > > > You don't want to return whichever of those 2 nodes meets the > > constraints? > > > I actually was meaning something like "hey, there's only _1_ node, go > for it!" :-P I misread sorry! > > > ... > > > + *cndts = new_cndts; > > > + out: > > > + libxl_bitmap_dispose(&nodemap); > > > > nodemap might not have been initialised? > > > Mmm... Right. So I really think I need an init function. :-( In general all libxl datastructures are supposed to have one, most of them are just memset(.., 0) > > > > + libxl_cputopology_list_free(tinfo, nr_cpus); > > > + libxl_numainfo_list_free(ninfo, nr_nodes); > > > > It looks like the nr_* may also not have been initialised, but maybe > > list_free is safe in that case since the associated array must > > necessarily be NULL? > > > Well, probably, but as they both do a "for (i=0;i<nr_*;i++)" I think > it's safer if I "=0" those twos, is that ok? I think it would be a good idea. The alternative is that anything which returns a list has to set nr == 0. > > > + /* Bring the best candidate in front of the list --> candidates[0] */ > > > + if (nr_candidates > 1) > > > + libxl__sort_numa_candidates(candidates, nr_candidates, > > > numa_cmpf); > > > > Is the start up cost of libxl__sort_numa_candidates significant enough > > to make that if worthwhile? > > > I'm not sure what you mean here, I'm afraid ... :-( Is the cost of sorting the 1 element list so high it is worth avoiding. Since the sort itself would be trivial the startup costs are what would dominate. > > > > + > > > + /* > > > + * Check if the domain has any CPU affinity. If not, try to build up > > > one. > > > + * In case numa_place_domain() find at least a suitable candidate, > > > it will > > > + * affect info->cpumap accordingly; if it does not, it just leaves it > > > + * as it is. This means (unless some weird error manifests) the > > > subsequent > > > + * call to libxl_set_vcpuaffinity_all() will do the actual placement, > > > + * whatever that turns out to be. > > > + */ > > > + if (libxl_bitmap_is_full(&info->cpumap)) { > > > + int rc = numa_place_domain(gc, info); > > > + if (rc) > > > + return rc; > > > > I'm not sure about this, if numa_place_domain fails for any reason would > > we be not better off logging and continuing without placement? We > > already do that explicitly in a couple of cases. > > > Well, actually, if it "fails" in the sense it finds no candidates or > does not manage in placing the domain, it returns 0, so we do right what > you're suggesting. I'm sure this is stated in some comment but I guess I > can repeat it here. If !rc, it means we stepped into some ERROR_FAIL or > something, which I think has to be handled like this, hasn't it? That makes sense. > Perhaps I can also add some logging about "successfully failing", i.e., > not getting into any error but not being able to place the domain. If > yes, that will happen _inside_ numa_place_domain() rather than here. Logging in that case seems wise in any case since it will have performance implications I think. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |