
Re: [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.



On 18/03/2013 15:30, Alex Bligh wrote:
> Paolo,
> 
> --On 18 March 2013 15:05:08 +0100 Paolo Bonzini <pbonzini@xxxxxxxxxx>
> wrote:
> 
>>> Presumably the same way as if writeback caching is selected. I presume
>>> that must fsync() / fdatasync() all the data to disk, and a barrier will
>>> produce one of those.
>>
>> No, that's done already.  The source does an fsync/fdatasync before
>> terminating the migration.
>>
>> The problem is that the target's page cache might host image data from a
>> previous run.  If you do not use O_DIRECT, it will not see the changes
>> made on the source.
> 
> I was under the impression that with cache=writeback, qemu doesn't
> use O_DIRECT, in which case why isn't there the same danger under
> kvm, i.e. that the target page cache contains data from a previous
> run?

In fact, KVM only supports migration with cache=none.  Of course, that
restriction does not apply if you're using cache-coherent storage, such
as rbd or gluster, or one of the userspace backends that bypass the
page cache, like NBD or libiscsi.
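
To make the rule above concrete, here is a minimal sketch in C of the decision it implies. It is not QEMU or libvirt code: the SKETCH_* flags and both function names are hypothetical, and the mapping from cache= modes to "uses O_DIRECT" reflects my understanding of the modes rather than anything stated in this thread. Treating cache=directsync as safe is likewise an inference from the fact that it also bypasses the page cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag names for this sketch; they are not QEMU identifiers. */
enum {
    SKETCH_NOCACHE  = 1 << 0, /* image opened with O_DIRECT, bypassing the host page cache */
    SKETCH_CACHE_WB = 1 << 1, /* writes may complete before reaching stable storage */
    SKETCH_NO_FLUSH = 1 << 2  /* flush requests from the guest are ignored */
};

/* Map a cache= mode name to sketch flags. Returns -1 for an unknown mode. */
int sketch_cache_flags(const char *mode)
{
    assert(mode != NULL);
    if (strcmp(mode, "none") == 0)         return SKETCH_NOCACHE | SKETCH_CACHE_WB;
    if (strcmp(mode, "directsync") == 0)   return SKETCH_NOCACHE;
    if (strcmp(mode, "writeback") == 0)    return SKETCH_CACHE_WB;
    if (strcmp(mode, "unsafe") == 0)       return SKETCH_CACHE_WB | SKETCH_NO_FLUSH;
    if (strcmp(mode, "writethrough") == 0) return 0;
    return -1;
}

/*
 * Can the migration target read stale image data left in its page cache
 * by a previous run? Coherent storage (rbd, gluster) and userspace
 * backends (NBD, libiscsi) never go through the host page cache, so the
 * cache mode does not matter for them.
 */
bool sketch_target_may_read_stale_data(const char *mode,
                                       bool coherent_or_userspace_backend)
{
    int flags = sketch_cache_flags(mode);

    if (flags < 0) {
        return true; /* unknown mode: assume the worst */
    }
    if (coherent_or_userspace_backend) {
        return false;
    }
    /* Without O_DIRECT, reads on the target may be served from old cached pages. */
    return (flags & SKETCH_NOCACHE) == 0;
}
```

With this sketch, sketch_target_may_read_stale_data("writeback", false) reports a risk, while the same call with "none", or with the second argument set to true, does not.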

> Disabling migration seems a bit excessive when migration isn't disabled
> with cache=unsafe (AFAIK)

It is not QEMU's task.  There are cases where the cache= options are
unnecessary or ignored.  But indeed libvirt warns (or blocks, I don't
remember) in this case.

> , and the alternative (using O_DIRECT)
> is far far more unsafe (one tcp retransmit and your system is dead).
> 
>> 1) why does blkback not have the bug?
>>
>> 2) does it also affect virtio disks (or perhaps AHCI too)?  I think
>> Stefano experimented with virtio in Xen.  If it does, then you're
>> working around the problem in the wrong place.
> 
> I believe it affects PV disks and not emulated disks as emulated disks
> under Xen do not use O_DIRECT (despite migration apparently working
> notwithstanding your comment above).

If libxl does migration without O_DIRECT, then that's a bug in libxl.
What about blkback?  IIRC it submits bios (block-layer I/O requests)
directly, so it also bypasses the page cache.

> Stefano did ack the patch, and for a one line change it's been
> through a pretty extensive discussion on xen-devel ...

It may be a one-line change, but it completely changes the paths that
I/O goes through.  Apparently the discussion was not enough.

> I've no idea what else it affects. I'd suggest it also affects kvm,
> save that the kvm 'bad' will be writing the wrong data, not hosing
> the whole machine.
> 

