[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] introduce a cache options for PV disks



On Thu, 2013-06-27 at 17:09 +0100, Ian Jackson wrote:
> Ian Jackson writes ("Re: [PATCH v2] introduce a cache options for PV disks"):
> > To avoid these problems we intend to rename this option and change the
> > documentation accordingly.  Probably, to "direct-io-safe" (both in
> > Xenstore and the libxl API).  The documentation will say that this
> > option just disables any workaround where we avoid using direct IO
> > because of bugs.
> 
> Here's an RFC version of the libxl part of this patch.  It compiles
> but has not been tested.
> 
> For clarity the patch below doesn't include the rebuild of
> libxlu_disk_l.[ch].
> 
> Ian.
> 
> Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> 
> diff --git a/docs/misc/vbd-interface.txt b/docs/misc/vbd-interface.txt
> index 3952e73..f873db0 100644
> --- a/docs/misc/vbd-interface.txt
> +++ b/docs/misc/vbd-interface.txt
> @@ -125,3 +125,9 @@ because they directly map the bottom 8 bits of the 
> xenstore integer
>  directly to the Linux guest's device number and throw away the rest;
>  they can crash due to minor number clashes.  With these guests, the
>  workaround is not to supply problematic combinations of devices.
> +
> +
> +Other frontend and backend options
> +----------------------------------
> +
> +See xen/include/public/io/blkif.h for the full list of options.
> diff --git a/docs/misc/xl-disk-configuration.txt 
> b/docs/misc/xl-disk-configuration.txt
> index 5bd456d..9c2650b 100644
> --- a/docs/misc/xl-disk-configuration.txt
> +++ b/docs/misc/xl-disk-configuration.txt
> @@ -178,6 +178,44 @@ information to be interpreted by the executable program 
> <script>,
>  These scripts are normally called "block-<script>".
>  
> 
> +direct-io-safe
> +--------------
> +
> +Description:           Disables non-O_DIRECT workaround
> +Supported values:      absent, present
> +Mandatory:             No
> +Default value:         absent (workaround may be enabled)
> +
> +There is a memory lifetime bug in some driver domain (dom0) kernels
> +which can cause crashes when using O_DIRECT.  The bug occurs due to a
> +mismatch between the backend-visible lifetime of pages used for the
> +Xen PV network protocol and that expected by the backend kernel's
> +networking subsystem.  This can cause crashes when using certain
> +backends with certain underlying storage.
> +
> +See:
> + http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
> +
> +For this reason, (this version of) the Xen libxl toolstack disables
> +O_DIRECT when using the qemu-based Xen PV backend ("qdisk").
> +
> +However, this workaround has performance and scaling implications, and
> +it is only necessary if the underlying device is a network filesystem.
> +If the underlying device is not, then it is good to disable it; that
> +is what this option is for.
> +
> +This option simply requests that the workaround be disabled.  (However,
> +not all backends versions which use the workaround understand this
> +option, so this is on a best effort basis.)
> +
> +It's important to note that if you are storing the VM disk on a
> +network filesystem or a network block device (NFS or ISCSI) it might
> +not be safe to use this option.  Otherwise specifying it is safe and
> +can give better performances.
> +
> +If in the future the bug is fixed properly this option will then be
> +silently ignored.
> +
>  
>  ============================================
>  DEPRECATED PARAMETERS, PREFIXES AND SYNTAXES
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 3236aa9..541c258 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -2168,6 +2168,10 @@ static void device_disk_add(libxl__egc *egc, uint32_t 
> domid,
>          flexarray_append(back, disk->readwrite ? "w" : "r");
>          flexarray_append(back, "device-type");
>          flexarray_append(back, disk->is_cdrom ? "cdrom" : "disk");
> +        if (disk->direct_io_safe) {
> +            flexarray_append(back, "direct-io-safe");
> +            flexarray_append(back, "1");
> +        }
>  
>          flexarray_append(front, "backend-id");
>          flexarray_append(front, libxl__sprintf(gc, "%d", 
> disk->backend_domid));
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index d218a2d..892ab01 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -375,6 +375,7 @@ libxl_device_disk = Struct("device_disk", [
>      ("removable", integer),
>      ("readwrite", integer),
>      ("is_cdrom", integer),
> +    ("direct_io_safe", bool),

You'll want a #define LIBXL_HAVE_DEVICE_DISK_DIRECT_IO_SAFE (or
something more wieldy) in libxl.h to enable people to write code which
works against different versions of libxl.

Is this deliberately not a defbool?

>      ])
>  
>  libxl_device_nic = Struct("device_nic", [
> diff --git a/tools/libxl/libxlu_disk_l.l b/tools/libxl/libxlu_disk_l.l
> index 7c4e7f1..ba8577c 100644
> --- a/tools/libxl/libxlu_disk_l.l
> +++ b/tools/libxl/libxlu_disk_l.l
> @@ -173,6 +173,7 @@ backendtype=[^,]*,? { STRIP(','); 
> setbackendtype(DPC,FROMEQUALS); }
>  
>  vdev=[^,]*,? { STRIP(','); SAVESTRING("vdev", vdev, FROMEQUALS); }
>  script=[^,]*,?       { STRIP(','); SAVESTRING("script", script, FROMEQUALS); 
> }
> +direct-io-safe,? { DPC->disk->direct_io_safe = 1; }
>  
>   /* the target magic parameter, eats the rest of the string */
>  
> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> index b9b9d98..baf0f0a 100644
> --- a/xen/include/public/io/blkif.h
> +++ b/xen/include/public/io/blkif.h
> @@ -97,6 +97,22 @@
>   *
>   *      The type of the backing device/object.
>   *
> + *
> + * direct-io-safe
> + *      Values:         1

0 (==direct-io is unsafe) is explicitly not a valid value?

> + *      Optional
> + *
> + *      The underlying storage is not affected by the direct IO memory
> + *      lifetime bug.  See:
> + *        http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
> + *
> + *      So use of O_DIRECT is safe, in circumstances

           ^Some ?

Otherwise I can't parse this sentence.

> + *      where we would normally have avoided it as a workaround for
> + *      that bug.  This option is not relevant for all backends, and
> + *      even not necessarily supported for those for which it is
> + *      relevant.  A backend which knows that it is not affected by
> + *      the bug can ignore this option.
> + *
>   *--------------------------------- Features 
> ---------------------------------
>   *
>   * feature-barrier



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.