Re: [Xen-devel] Fatal crash on xen4.2 HVM + qemu-xen dm + NFS

Ian, Stefano,

--On 21 January 2013 16:51:13 +0000 Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:

> Not as far as I know, but Trond zero-copy == O_DIRECT so if you aren't
> using O_DIRECT then you aren't using zero copy -- and that agrees with
> my recollection. In that case your issue is something totally unrelated.

Further investigation suggests that Stefano's commit
(attached below) may have somewhat surprising results.

Firstly, changing the cache=writeback settings as passed to the QEMU
command line probably only affects emulated disks, as the parameters
for the PV disk appear to be hard coded per this commit, assuming I've
understood correctly. I am guessing my fiddling with the cache=
setting merely caused the emulated disk (used in HVM until the kernel
has loaded) to break.

Secondly, the chosen mode of cache operation is:

   BDRV_O_NOCACHE | BDRV_O_CACHE_WB

This appears to be the same as what "cache=none" produces (see the code
fragment from bdrv_parse_cache_flags below), which is somewhat
counterintuitive given the name of the second flag. "cache=writeback"
(as appears on the command line) uses BDRV_O_CACHE_WB only.

BDRV_O_NOCACHE appears to map on Linux to O_DIRECT, and BDRV_O_CACHE_WB
to writeback caching. This implies O_DIRECT will always be used. This
is somewhat surprising as qemu by default only uses O_DIRECT with
cache=none, and yet the emulated devices are set up with the
equivalent of cache=writeback.

But this would explain why I'm still seeing the crash with O_DIRECT
apparently off (cache=writeback), as the cache setting is being ignored.

This would also explain why Ian might not have seen it (it went in
late and without O_DIRECT we think this crash can't happen).

Is the BDRV_O_NOCACHE | BDRV_O_CACHE_WB combination intentional or
should BDRV_O_NOCACHE be removed? Why would the default be different
for emulated and PV disks?

Alex Bligh

commit 47982cb00584371928e44ab6dfc6865d605a52fd
Author: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Date:   Fri Mar 23 14:36:18 2012 +0000

   xen_disk: open disks with BDRV_O_NOCACHE | BDRV_O_NATIVE_AIO

   Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index 285a951..16c3e66 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -663,10 +663,10 @@ static int blk_init(struct XenDevice *xendev)

    /* read-only ? */
    if (strcmp(blkdev->mode, "w") == 0) {
-        qflags = BDRV_O_RDWR;
+        qflags |= BDRV_O_RDWR;
    } else {
-        qflags = 0;
        info  |= VDISK_READONLY;

Excerpt from qemu's block.c

int bdrv_parse_cache_flags(const char *mode, int *flags)
{
   *flags &= ~BDRV_O_CACHE_MASK;

   if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
       *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
   } else if (!strcmp(mode, "directsync")) {
       *flags |= BDRV_O_NOCACHE;
   } else if (!strcmp(mode, "writeback")) {
       *flags |= BDRV_O_CACHE_WB;
   } else if (!strcmp(mode, "unsafe")) {
       *flags |= BDRV_O_CACHE_WB;
       *flags |= BDRV_O_NO_FLUSH;
   } else if (!strcmp(mode, "writethrough")) {
       /* this is the default */
   } else {
       return -1;
   }

   return 0;
}
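
The flag comparison made in the message above can be checked with a small
standalone sketch. It is not qemu code: the SK_* names, the helper
functions and the numeric bit values are all assumptions chosen for
illustration, and only the mode-to-flag mapping is taken from the excerpt.
It assumes, as the message does, that the NOCACHE bit is what leads to
O_DIRECT on Linux.

```c
#include <string.h>

/* Illustrative bit values, assumed for this sketch only.
 * Check block.h in your own qemu tree for the real definitions. */
enum {
    SK_O_NOCACHE    = 0x0020,
    SK_O_CACHE_WB   = 0x0040,
    SK_O_NO_FLUSH   = 0x0200,
    SK_O_CACHE_MASK = SK_O_NOCACHE | SK_O_CACHE_WB | SK_O_NO_FLUSH
};

/* Same mode-to-flag mapping as the block.c excerpt, on the SK_* bits. */
static int sk_parse_cache_flags(const char *mode, int *flags)
{
    *flags &= ~SK_O_CACHE_MASK;

    if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
        *flags |= SK_O_NOCACHE | SK_O_CACHE_WB;
    } else if (!strcmp(mode, "directsync")) {
        *flags |= SK_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        *flags |= SK_O_CACHE_WB;
    } else if (!strcmp(mode, "unsafe")) {
        *flags |= SK_O_CACHE_WB | SK_O_NO_FLUSH;
    } else if (!strcmp(mode, "writethrough")) {
        /* default: no cache bits set */
    } else {
        return -1;
    }
    return 0;
}

/* Hypothetical helper: 1 if the flags would imply O_DIRECT on Linux,
 * under the assumption that the NOCACHE bit is what selects it. */
static int sk_uses_o_direct(int flags)
{
    return (flags & SK_O_NOCACHE) != 0;
}

/* Hypothetical helper: 1 if the cache bits in flags equal what the given
 * cache= mode produces, 0 if they differ, -1 if the mode is unknown. */
static int sk_same_cache_mode(int flags, const char *mode)
{
    int expected = 0;

    if (sk_parse_cache_flags(mode, &expected) != 0) {
        return -1;
    }
    return (flags & SK_O_CACHE_MASK) == expected;
}
```

With these helpers, sk_same_cache_mode(SK_O_NOCACHE | SK_O_CACHE_WB, "none")
returns 1 and the same call with "writeback" returns 0, which matches the
observation that the hard-coded combination behaves like cache=none
rather than cache=writeback.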
