
Re: [Xen-devel] [PATCH v4 2/3] libxl: print message how to recover from xl cpupool-cpu-remove errors



On Thu, 2016-04-14 at 17:06 +0100, Ian Jackson wrote:
> Juergen Gross writes ("[PATCH v4 2/3] libxl: print message how to
> recover from xl cpupool-cpu-remove errors"):
> > 
> > An error occurring when calling "xl cpupool-cpu-remove" might leave
> > the system in a state where a cpu is neither completely free nor in
> > a cpupool.
> Surely this is a bug.  Can it not be avoided ?
> 
Not easily (and in general not with any patch that I'd consider
appropriate for this phase of the release process), as it depends on
transient situations in the hypervisor, such as lock contention on
scheduling data structures.

> > This can easily be repaired by adding the cpu via
> > "xl cpupool-cpu-add" to the cpupool where it was removed from
> > before.
> > Print a message telling the user this in case of an error.
> ...
> > 
> > -    if (libxl_cpupool_cpuremove_cpumap(ctx, poolid, &cpumap))
> > -        fprintf(stderr, "some cpus may not have been removed from %s\n", pool);
> > +    if (libxl_cpupool_cpuremove_cpumap(ctx, poolid, &cpumap)) {
> > +        fprintf(stderr, "Some cpus may have not or only partially been removed from '%s'.\n", pool);
> > +        fprintf(stderr, "If a cpu can't be added to another cpupool, add it to '%s' again and retry.\n", pool);
> > +    }
> If it can't be avoided then I guess this will have to do but I remain
> to be convinced.
> 
And in fact, this is not something introduced by this series. This
patch just takes the chance to document things better (although the
series does introduce one more way for the issue to occur).

Doing some retries at levels lower than this would minimize the chance
of the user actually having to deal with the problem. For example,
what's done in libxc... but as you pointed out, that introduces other
problems, so I'm not sure. :-/
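
To make the retry idea a bit more concrete, here is a minimal sketch of
a bounded retry in C. It is not the libxc code: remove_cpu_once(),
remove_cpu_retry(), the busy_left counter and the choice of EBUSY as the
transient error are all assumptions made up for illustration.

```c
#include <errno.h>

/* Hypothetical stand-in for the low-level removal call. It fails with
 * EBUSY while busy_left > 0, mimicking transient contention. */
static int busy_left = 2;

static int remove_cpu_once(int cpu)
{
    (void)cpu;                  /* the stub ignores which cpu is meant */
    if (busy_left > 0) {
        busy_left--;
        errno = EBUSY;
        return -1;
    }
    return 0;
}

/* Call remove_cpu_once() up to max_tries times in total. Retry only on
 * EBUSY (assumed transient); stop at once on success or on any other
 * error. Returns 0 on success, -1 with errno set on failure. */
static int remove_cpu_retry(int cpu, int max_tries)
{
    int rc = -1;

    for (int i = 0; i < max_tries; i++) {
        rc = remove_cpu_once(cpu);
        if (rc == 0 || errno != EBUSY)
            break;
    }
    return rc;
}
```

With busy_left at 2, remove_cpu_retry(3, 5) succeeds on the third
attempt. If the contention outlasts max_tries, the caller still gets -1
with errno set to EBUSY, so a message like the one in this patch would
still be needed for that case.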

Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel