[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result



On Fri, 2016-04-15 at 07:35 +0200, Juergen Gross wrote:
> On 14/04/16 19:07, Ian Jackson wrote:
>
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -560,6 +560,34 @@ struct xen_sysctl_cpupool_op {
>
> > + * In this case the operation may have been partially carried out
> > and
> > + * the pcpu is left in an anomalous state.  In this state the pcpu
> > may
> > + * be used by some not readily predictable subset of the vcpus
> > + * (domains) whose vcpus are in the old cpupool.  (xxx is this
> > true?)
> Yes.
> 
Well, I think it's a little less bad than that. the pcpu is no longer a
valid member of its cpupool. This means the scheduler will not run
vcpus of the domains of the pool (it's domains that are in cpupools)
any longer. And even the vcpus that are temporarily stuck on the pcpu
(i.e., the ones that are actually preventing the operation to succeed)
will rather quickly get away from it.

So, I'd say rather that "In this state the pcpu will, potentially after
a short transitory, just not be used by any vcpu"

> > + *
> > + * This can be detected by seeing whether the pcpu can be added to
> > a
> > + * different cpupool.  (xxx this is a silly interface; the
> > situation
> > + * should be reported by a different errno value, at least.)  If
> > the
> I'm open for suggestions. :-)
> 
Well, changing the system behavior with anything that involves retry
loops is indeed non-trivial (or, at least, I don't out of the top of my
head have an alternative).

However, a different return value for the super special case of
temporary pinning override could maybe be selected. I'm having a look,
and although I don't find one that could be seen as a perfect fit (that
would be EBUSY, which is taken!), what about one of these:

  EEXIST        17      /* File exists */
  ENOTEMPTY     39      /* Directory not empty */
  EROFS         30      /* Read-only file system */
  ENOSPC        28      /* No space left on device */

After all, as ugly as the resulting error message would be:
 - if's a very rare special case
 - we need to document this very rare special case anyway
 - we can log more info about the actual error

> > + * pcpu can't be added to a different cpupool for this reason,
> > + * attempts to do so will returning (xxx what errno value?)
> You might have guessed: EBUSY
> 
And here too, if the pcpu is not in cpupool_free_cpus, maybe we can go
and check whether it is in any of the existing cpupool's cpu_valid. If
it is, then EBUSY is ok. If not, it means it's in the inconsistent
state, and we can pick another (same as above?) error value, use it and
document the situation.

> > + *
> > + * The anomalous situation can be recovered by adding the pcpu
> > back to
> > + * the cpupool it came from (xxx this permits a buggy or malicious
> > + * guest to prevent the cpu ever being removed from its cpupool).
> Only the hardware domain, which has other means to inject problems.
> 
Yep, indeed.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.