Re: [Xen-devel] Fatal crash on xen4.2 HVM + qemu-xen dm + NFS
On Wed, 23 Jan 2013, Alex Bligh wrote:
> Ian, Stefano,
>
> First bit of good news: avoiding the use of O_DIRECT in xen_disk.c
> fixes the problem (see the patch below for what we tested).
>
> Therefore I think this problem will potentially occur:
>
> * when using qemu-xen or qemu-traditional (both of which use O_DIRECT);
>
> * with anything that uses a grant of another domX's memory (I suspect
>   that includes PV guests as well as HVM);
>
> * whenever dom0 uses TCP to reach the backing device, which includes
>   NFS and iSCSI.
>
> Disabling O_DIRECT is therefore a workaround, but I suspect not an
> attractive workaround performance-wise. Fixing it properly would
> appear to require Ian C's skbuff page tracking work.
>
> >> I suspect the best possible answer is a
> >>     cache=[qemu-cache-identifier]
> >> config key, which gets put in xenstore, and which xen_disk.c then
> >> interprets using the same routine QEMU does itself for cache= on the
> >> command line, then uses exactly those BDRV flags.
> >
> > that would be nice to have
>
> Good.
>
> >> For completeness one could also add an emulcache= option and just
> >> pass that straight through to the qemu command line for the emulated
> >> drives.
> >
> > In qemu-xen it should already be possible to change the cache setting
> > for the IDE disk by passing the right command line option to QEMU.
> > Not ideal, but it would work.
>
> It would indeed, but 'cache=writeback' is currently hard coded. I was
> planning to change that to 'cache=%s' and pass through the value of
> emulcache=.
>
> >> I had a quick look at this on the train and it appears that doing it
> >> properly requires fiddling with the lex file and possibly xenstore,
> >> i.e. it is not completely trivial.
> >>
> >> An alternative, more disgusting arrangement would be to overload one
> >> of the existing options.
> >>
> >> What's the right approach here, and have I any hope of getting a
> >> patch into 4.2.2? Not a disaster if not, as I already need to
> >> maintain a local tree due to the absence of HVM live migrate on
> >> qemu-upstream.
> >
> > Backports are only for bug fixes.
>
> Sure - and it would seem to me that dom0 crashing horribly is a bug!
> I have a reliable way to replicate it, but it can happen to others.
>
> In terms of a local fix, a one-line change to xen_disk is all I need.
>
> What I'm asking is what the right way is to fix this bug in 4.2.
> At one extreme, I can probably cook up a 'let's do it properly' patch
> which adds the proposed cache= option (most useful, most intrusive).
> At the other extreme, we could just apply the patch below, or a
> variant of it conditional on a configure flag (least useful, least
> intrusive).
>
> > However it might be possible to backport a simple patch that, in
> > case of an open failure with O_DIRECT, tries again without it (see
> > appended). I don't think that would work for you though, unless you
> > manage to mount NFS in a way that would return EINVAL on
> > open(O_DIRECT), like it used to.
>
> I don't think that will work for any of the problem cases I've
> identified above, as NFS always permits O_DIRECT opens (I'm not even
> sure whether CONFIG_NFS_DIRECTIO still exists), and so does iSCSI.
>
> --
> Alex Bligh
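As a rough illustration of the cache= mapping discussed above, a minimal
sketch of translating a cache= identifier into BDRV_O_*-style open flags
might look like the following. The flag values and the parse_cache_mode()
helper are placeholders invented for this sketch; the authoritative
mapping is the routine QEMU itself uses for its command-line cache=
option.

/*
 * Sketch only: map a cache= identifier to qemu-style open flags.
 * The numeric values are placeholders; in QEMU the flags live in
 * block.h and the real parsing is done by QEMU's own code.
 */
#include <stdio.h>
#include <string.h>

#define BDRV_O_NOCACHE   0x0020  /* bypass host page cache (O_DIRECT) */
#define BDRV_O_CACHE_WB  0x0040  /* complete writes before they hit disk */
#define BDRV_O_NO_FLUSH  0x0200  /* ignore guest flush requests */

static int parse_cache_mode(const char *mode, int *flags)
{
    *flags &= ~(BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH);

    if (!strcmp(mode, "none") || !strcmp(mode, "off")) {
        *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;  /* O_DIRECT, writeback */
    } else if (!strcmp(mode, "directsync")) {
        *flags |= BDRV_O_NOCACHE;                    /* O_DIRECT, writethrough */
    } else if (!strcmp(mode, "writeback")) {
        *flags |= BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "unsafe")) {
        *flags |= BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH;
    } else if (strcmp(mode, "writethrough") != 0) {
        return -1;                                   /* unknown mode */
    }
    return 0;
}

int main(void)
{
    /* Example: a cache= value that could come from a xenstore key. */
    const char *cache = "writeback";
    int qflags = 0;

    if (parse_cache_mode(cache, &qflags) < 0) {
        fprintf(stderr, "unknown cache mode '%s'\n", cache);
        return 1;
    }
    printf("cache=%s -> qflags=0x%x\n", cache, qflags);
    return 0;
}

The patch appended below simply drops BDRV_O_NOCACHE and keeps
BDRV_O_CACHE_WB, which corresponds to the 'writeback' case in the sketch.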
> diff --git a/hw/xen_disk.c b/hw/xen_disk.c
> index a402ac8..1c3a6f5 100644
> --- a/hw/xen_disk.c
> +++ b/hw/xen_disk.c
> @@ -603,7 +603,7 @@ static int blk_init(struct XenDevice *xendev)
>      }
>
>      /* read-only ? */
> -    qflags = BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NATIVE_AIO;
> +    qflags = /* BDRV_O_NOCACHE | */ BDRV_O_CACHE_WB | BDRV_O_NATIVE_AIO;
>      if (strcmp(blkdev->mode, "w") == 0) {
>          qflags |= BDRV_O_RDWR;
>      } else {

Before going for something like that I would like confirmation from
Konrad about blkfront's behaviour regarding barriers and
BLKIF_OP_FLUSH_DISKCACHE. I certainly don't want to risk data corruption.
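The fallback described above, retrying the open without O_DIRECT when the
first attempt fails, might look roughly like the sketch below. It is an
illustration of the idea only, not the patch referenced in the thread
(open_backing_file() and the example path are made up for this sketch),
and as Alex points out it does not help when NFS simply accepts the
O_DIRECT open.

/*
 * Sketch only: try O_DIRECT first and fall back to a buffered open if
 * the filesystem rejects it (historically EINVAL).
 */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int open_backing_file(const char *path, int base_flags)
{
    int fd = open(path, base_flags | O_DIRECT);

    if (fd < 0 && errno == EINVAL) {
        /* Filesystem refused O_DIRECT; retry through the page cache. */
        fprintf(stderr, "O_DIRECT not supported on %s, retrying buffered\n",
                path);
        fd = open(path, base_flags);
    }
    return fd;
}

int main(void)
{
    /* "/path/to/disk.img" is a stand-in for the real backing file. */
    int fd = open_backing_file("/path/to/disk.img", O_RDWR);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    close(fd);
    return 0;
}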