
Re: [Xen-devel] [PATCH RFC] libxc: Document xc_domain_resume



On 29/02/2016 19:59, Konrad Rzeszutek Wilk wrote:
> Document the save and suspend mechanism.
>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> ---
>  tools/libxc/include/xenctrl.h | 52 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 150d727..9778947 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -565,6 +565,58 @@ int xc_domain_destroy(xc_interface *xch,
>   * This function resumes a suspended domain. The domain should have
>   * been previously suspended.
>   *
> + * Note that there is no 'xc_domain_suspend' as suspending a domain
> + * is quite the endeavour. As such this long comment will describe the
> + * suspend and resume path.

I am not sure this second sentence is useful.

> + *
> + * For the purpose of this explanation there are three guests:
> + * PV (using hypercalls for privileged operations), HVM
> + * (fully hardware-virtualized guests using emulated devices for everything),
> + * and PVHVM (hardware-virtualized guests with PV drivers).

PV aware with hardware virtualisation.  It is perfectly possible to be
"PV aware" without having blkfront and netfront drivers.  I realise this
is a grey area, but "PV drivers" does tend to imply the blk/net
protocols rather than the full "PV awareness".

> + *
> + * HVM guests are the simplest - they suspend via S3 and resume from
> + * S3. Upon resume they have to re-negotiate with the emulated devices.

And S4.

> + *
> + * PV and PVHVM communicate via hypercalls for suspend (and resume).
> + * For suspend the toolstack initiates the process by writing the string
> + * "suspend" to the XenBus node "control/shutdown".

I feel it is worth commenting about the stupidity of this protocol
whereby the ack mechanism is to clear the key, and the only reject/fail
mechanism is to leave the key unmodified and wait for the toolstack to
timeout.  (Similarly memory/target for ballooning.)
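
For reference, the toolstack side of that protocol is roughly the sketch
below (given a handle from xs_open(); the path, timeout and polling loop are
purely illustrative - the real toolstack uses a xenstore watch rather than
polling):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <xenstore.h>      /* xs_write/xs_read; <xs.h> on older trees */

    /* Ask 'domid' to suspend, then wait for it to ack by clearing the key. */
    static int request_suspend(struct xs_handle *xsh, unsigned int domid)
    {
        char path[64];
        int i;

        snprintf(path, sizeof(path),
                 "/local/domain/%u/control/shutdown", domid);

        if ( !xs_write(xsh, XBT_NULL, path, "suspend", strlen("suspend")) )
            return -1;

        /* The only "ack" is the guest clearing the key; the only way to
         * notice a refusal is the key staying put until we give up. */
        for ( i = 0; i < 60; i++ )          /* illustrative 60s timeout */
        {
            unsigned int len;
            char *val = xs_read(xsh, XBT_NULL, path, &len);
            int acked = (val == NULL || len == 0);

            free(val);
            if ( acked )
                return 0;                   /* guest acked, suspend under way */
            sleep(1);
        }

        return -1;                          /* timed out - treat as a refusal */
    }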

> + *
> + * The PV guest stashes anything it deems necessary in 'struct start_info'
> + * in case of failure (PVHVM may ignore this) and calls the

What do you mean for the failure case here?

> + * SCHEDOP_shutdown::SHUTDOWN_suspend hypercall (for PV, as an argument it
> + * passes the MFN of 'struct start_info').
> + *
> + * And then the guest is suspended.
> + *
> + * At this point the guest may be resumed on the same host under the same
> + * domain (checkpointing or suspending failed), or on a different host.

Slightly misleading.

The guest may be resumed in the same domain (in which case domid is the
same and all gubbins are still in place), or in a new domain; likely a
different domid, possibly a different host (but not impossible to switch
host and retain the same numeric domid) at which point all gubbins are lost.

> + *
> + * Checkpointing, or notifying an guest that the suspend failed, is done by

"a guest"

> + * having the SCHEDOP_shutdown::SHUTDOWN_suspend hypercall return a non-zero
> + * value.

Do we have to document it as "suspend failed"?  In the case of a
checkpoint, it really isn't a failure.
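
To make the return value convention concrete, the guest side is roughly the
following (a sketch modelled on what Linux's HYPERVISOR_suspend() does;
hypervisor_sched_op() here is a made-up stand-in for the guest kernel's
hypercall stub):

    #include <xen/xen.h>
    #include <xen/sched.h>   /* SCHEDOP_shutdown, SHUTDOWN_suspend,
                                struct sched_shutdown */

    /* Hypothetical wrapper around the sched_op hypercall: in a real PV kernel
     * this is an asm stub through the hypercall page, not a C function. */
    extern long hypervisor_sched_op(int cmd, void *arg, unsigned long extra);

    static int guest_suspend(unsigned long start_info_mfn)
    {
        struct sched_shutdown shutdown = { .reason = SHUTDOWN_suspend };
        long rc;

        /* For PV the MFN of 'struct start_info' goes in the extra argument
         * (it ends up in %rdx/%edx); HVM/PVHVM guests pass 0. */
        rc = hypervisor_sched_op(SCHEDOP_shutdown, &shutdown, start_info_mfn);

        /* Execution continues here on resume.
         *   rc == 0: the domain was suspended and has now been resumed,
         *            possibly in a new domain and/or on a new host.
         *   rc != 0: the suspend was cancelled (or this was a checkpoint),
         *            so the pre-suspend state is still in place. */
        return rc;
    }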

> + *
> + * The PV and PVHVM resume paths are similar. For PV it would be similar to bootup
> + * - figure out where the 'struct start_info' is (or if the suspend was
> + * cancelled aka checkpointed - reuse the saved values).

PV isn't similar to boot.

On boot, PV guests get start_info in %rsi (or %esi) from the domain
builder.  In the case of suspend (failed or otherwise), start_info is in
%rdx (or %edx), mutated as applicable by the save/restore logic.

For HVM, there is no start info relevant for suspend/resume.

~Andrew

> + *
> + * From here on they differ in specifics depending on whether the guest is
> + * PV or PVHVM, but follow overall the same path:
> + *  - PV: Bring up the vCPUs,
> + *  - PVHVM: Set up the vector callback,
> + *  - Bring up vCPU runstates,
> + *  - Remap the grant tables if checkpointing, or set them up from scratch.
> + *
> + * If the resume was not a checkpoint (or if the suspend was successful) we
> + * would set up the PV timers and the different PV events. Lastly the PV
> + * drivers re-negotiate with the backend.
> + *
> + * This function would return before the guest has started resuming. That is,
> + * the guest would be in a non-running state and its vCPU context would be
> + * in the SCHEDOP_shutdown::SHUTDOWN_suspend hypercall return path
> + * (for PV and PVHVM). For HVM it would be in the QEMU-emulated BIOS
> + * handling S3 suspend.
> + *
>   * @parm xch a handle to an open hypervisor interface
>   * @parm domid the domain id to resume
>   * @parm fast use cooperative resume (guest must support this)
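
It might also be worth a short example of calling the function itself,
something like (a minimal sketch, with error handling mostly elided):

    #include <stdint.h>
    #include <stdio.h>
    #include <xenctrl.h>

    /* Minimal sketch: resume a previously suspended domain.  fast=1 would be
     * the cooperative path (guest support required, as the @parm above says);
     * fast=0 asks for the full, non-cooperative resume. */
    int resume_domain(uint32_t domid)
    {
        xc_interface *xch = xc_interface_open(NULL, NULL, 0);
        int rc;

        if ( !xch )
            return -1;

        rc = xc_domain_resume(xch, domid, 0 /* fast */);
        if ( rc )
            fprintf(stderr, "xc_domain_resume(%u) failed: rc=%d\n", domid, rc);

        xc_interface_close(xch);
        return rc;
    }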

