Re: [Xen-devel] block backend issues
On 14.06.2012 14:27, Marek Marczykowski wrote:
> On 08.06.2012 15:11, Marek Marczykowski wrote:
>> Hey,
>>
>> I've faced a strange problem with block devices. When trying to read some
>> file (from a read-only ext3), everything looks good, except that the file
>> content is corrupted! But this can be a coincidence (that the "failed" reads
>> don't hit filesystem metadata).
>> fsck in dom0 on the filesystem image returns no errors.
>> fsck (with -nf flags) in domU on the device causes the kernel to output
>> "blkfront: flush disk cache: empty write xvdd op failed", "blkfront: xvdd:
>> barrier or flush: disable". And it returns no filesystem errors. From that
>> point, file reads return correct file content. In most cases dropping the
>> block cache (echo 3 > /proc/sys/vm/drop_caches) or remounting the device
>> also "fixes" the problem.
>>
>> On a RW device (with different size, filesystem and content), the domU
>> kernel complains about EXT4 errors.
>> I haven't observed such strange issues on device-mapper backed devices.
>>
>> On 3.2.7 it worked; the problem is observed on 3.3.5 and 3.4 in dom0,
>> regardless of the domU kernel (tried 3.2.7, 3.3.5, 3.4.0).
>>
>> I suspected feature-flush-cache/feature-barrier, but when I disabled
>> advertising them in the blkback code, the problem still occurred.
>>
>> Some details:
>> dom0: 3.4.0-1.pvops.qubes.x86_64 (vanilla 3.4 + Konrad's patches for ACPI S3)
>> domU: 3.3.5-1.pvops.qubes.x86_64 (vanilla 3.3.5 + Konrad's patches for ACPI
>> S3)
>
> (...)
> Still the case on 3.4.1 with patches applied from Konrad's for-jens-3.5
> branch.
> I've compared file contents and they differ in (multiples of) 1024 bytes - the
> same as the filesystem block size. And only if the block wasn't in the
> pagecache in dom0.
> When I flush the VM pagecache (echo 1 > /proc/.../drop_caches) after trying to
> read some files (actually md5sum -c), but not the dom0 pagecache - the problem
> vanishes. But if I also clean the dom0 pagecache - the problem returns.
>
> Any clues welcomed...

Ok, found the reason.
It wasn't blkback's fault; even on bare metal, a loopback-mounted image had the
same problem. It was caused by the "0fc9d104 radix-tree: use iterators in
find_get_pages* functions" commit, introduced somewhere between 3.3 and 3.4.
It is already fixed in 3.4.2.

--
Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
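
The block-level comparison mentioned in the thread (corrupted reads differing from the good copy in multiples of 1024 bytes, the filesystem block size) could be reproduced with something like the sketch below. It is only an illustration, not the script used by the poster: the function name `differing_blocks` and the demo file names are hypothetical, and the 1024-byte block size is taken from the message above.

```python
import os
import tempfile

BLOCK_SIZE = 1024  # filesystem block size reported in the thread


def differing_blocks(path_a, path_b, block_size=BLOCK_SIZE):
    """Return the indices of block_size-sized blocks that differ between two files.

    Hypothetical helper: reads both files in lockstep and compares each block.
    A trailing block present in only one file also counts as different.
    """
    diffs = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                diffs.append(index)
            index += 1
    return diffs


if __name__ == "__main__":
    # Demo with synthetic data: corrupt exactly one 1024-byte block.
    good = bytes(4 * BLOCK_SIZE)
    bad = bytearray(good)
    bad[2 * BLOCK_SIZE:3 * BLOCK_SIZE] = b"\xff" * BLOCK_SIZE

    with tempfile.TemporaryDirectory() as tmp:
        good_path = os.path.join(tmp, "good.img")  # hypothetical file names
        bad_path = os.path.join(tmp, "bad.img")
        with open(good_path, "wb") as f:
            f.write(good)
        with open(bad_path, "wb") as f:
            f.write(bad)
        print(differing_blocks(good_path, bad_path))
```

In practice one would compare a copy read in dom0 (or from a known-good kernel) against the copy read through the affected path; if every differing index corresponds to a whole block, that points at page- or block-granular corruption rather than random bit errors.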