[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2] introduce a cache options for PV disks
On Thu, 2013-06-27 at 17:09 +0100, Ian Jackson wrote: > Ian Jackson writes ("Re: [PATCH v2] introduce a cache options for PV disks"): > > To avoid these problems we intend to rename this option and change the > > documentation accordingly. Probably, to "direct-io-safe" (both in > > Xenstore and the libxl API). The documentation will say that this > > option just disables any workaround where we avoid using direct IO > > because of bugs. > > Here's an RFC version of the libxl part of this patch. It compiles > but has not been tested. > > For clarity the patch below doesn't include the rebuild of > libxlu_disk_l.[ch]. > > Ian. > > Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx> > > diff --git a/docs/misc/vbd-interface.txt b/docs/misc/vbd-interface.txt > index 3952e73..f873db0 100644 > --- a/docs/misc/vbd-interface.txt > +++ b/docs/misc/vbd-interface.txt > @@ -125,3 +125,9 @@ because they directly map the bottom 8 bits of the > xenstore integer > directly to the Linux guest's device number and throw away the rest; > they can crash due to minor number clashes. With these guests, the > workaround is not to supply problematic combinations of devices. > + > + > +Other frontend and backend options > +---------------------------------- > + > +See xen/include/public/io/blkif.h for the full list of options. > diff --git a/docs/misc/xl-disk-configuration.txt > b/docs/misc/xl-disk-configuration.txt > index 5bd456d..9c2650b 100644 > --- a/docs/misc/xl-disk-configuration.txt > +++ b/docs/misc/xl-disk-configuration.txt > @@ -178,6 +178,44 @@ information to be interpreted by the executable program > <script>, > These scripts are normally called "block-<script>". > > > +direct-io-safe > +-------------- > + > +Description: Disables non-O_DIRECT workaround > +Supported values: absent, present > +Mandatory: No > +Default value: absent (workaround may be enabled) > + > +There is a memory lifetime bug in some driver domain (dom0) kernels > +which can cause crashes when using O_DIRECT. The bug occurs due to a > +mismatch between the backend-visible lifetime of pages used for the > +Xen PV network protocol and that expected by the backend kernel's > +networking subsystem. This can cause crashes when using certain > +backends with certain underlying storage. > + > +See: > + http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html > + > +For this reason, (this version of) the Xen libxl toolstack disables > +O_DIRECT when using the qemu-based Xen PV backend ("qdisk"). > + > +However, this workaround has performance and scaling implications, and > +it is only necessary if the underlying device is a network filesystem. > +If the underlying device is not, then it is good to disable it; that > +is what this option is for. > + > +This option simply requests that the workaround be disabled. (However, > +not all backends versions which use the workaround understand this > +option, so this is on a best effort basis.) > + > +It's important to note that if you are storing the VM disk on a > +network filesystem or a network block device (NFS or ISCSI) it might > +not be safe to use this option. Otherwise specifying it is safe and > +can give better performances. > + > +If in the future the bug is fixed properly this option will then be > +silently ignored. > + > > ============================================ > DEPRECATED PARAMETERS, PREFIXES AND SYNTAXES > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c > index 3236aa9..541c258 100644 > --- a/tools/libxl/libxl.c > +++ b/tools/libxl/libxl.c > @@ -2168,6 +2168,10 @@ static void device_disk_add(libxl__egc *egc, uint32_t > domid, > flexarray_append(back, disk->readwrite ? "w" : "r"); > flexarray_append(back, "device-type"); > flexarray_append(back, disk->is_cdrom ? "cdrom" : "disk"); > + if (disk->direct_io_safe) { > + flexarray_append(back, "direct-io-safe"); > + flexarray_append(back, "1"); > + } > > flexarray_append(front, "backend-id"); > flexarray_append(front, libxl__sprintf(gc, "%d", > disk->backend_domid)); > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > index d218a2d..892ab01 100644 > --- a/tools/libxl/libxl_types.idl > +++ b/tools/libxl/libxl_types.idl > @@ -375,6 +375,7 @@ libxl_device_disk = Struct("device_disk", [ > ("removable", integer), > ("readwrite", integer), > ("is_cdrom", integer), > + ("direct_io_safe", bool), You'll want a #define LIBXL_HAVE_DEVICE_DISK_DIRECT_IO_SAFE (or something more wieldy) in libxl.h to enable people to write code which works against different versions of libxl. Is this deliberately not a defbool? > ]) > > libxl_device_nic = Struct("device_nic", [ > diff --git a/tools/libxl/libxlu_disk_l.l b/tools/libxl/libxlu_disk_l.l > index 7c4e7f1..ba8577c 100644 > --- a/tools/libxl/libxlu_disk_l.l > +++ b/tools/libxl/libxlu_disk_l.l > @@ -173,6 +173,7 @@ backendtype=[^,]*,? { STRIP(','); > setbackendtype(DPC,FROMEQUALS); } > > vdev=[^,]*,? { STRIP(','); SAVESTRING("vdev", vdev, FROMEQUALS); } > script=[^,]*,? { STRIP(','); SAVESTRING("script", script, FROMEQUALS); > } > +direct-io-safe,? { DPC->disk->direct_io_safe = 1; } > > /* the target magic parameter, eats the rest of the string */ > > diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h > index b9b9d98..baf0f0a 100644 > --- a/xen/include/public/io/blkif.h > +++ b/xen/include/public/io/blkif.h > @@ -97,6 +97,22 @@ > * > * The type of the backing device/object. > * > + * > + * direct-io-safe > + * Values: 1 0 (==direct-io is unsafe) is explicitly not a valid value? > + * Optional > + * > + * The underlying storage is not affected by the direct IO memory > + * lifetime bug. See: > + * http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html > + * > + * So use of O_DIRECT is safe, in circumstances ^Some ? Otherwise I can't parse this sentence. > + * where we would normally have avoided it as a workaround for > + * that bug. This option is not relevant for all backends, and > + * even not necessarily supported for those for which it is > + * relevant. A backend which knows that it is not affected by > + * the bug can ignore this option. > + * > *--------------------------------- Features > --------------------------------- > * > * feature-barrier _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |