Re: [Xen-devel] LVM Checksum error when using persistent grants (#linux-next + stable/for-jens-3.8)
On 07/12/12 15:22, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 05, 2012 at 10:14:55PM -0500, Konrad Rzeszutek Wilk wrote:
>> Hey Roger,
>>
>> I am seeing this weird behavior when using #linux-next + stable/for-jens-3.8
>> tree.
>
> To make it easier I just used v3.7-rc8 and merged the stable/for-jens-3.8
> tree.
>
>> Basically I can do 'pvscan' on the xvd* disks and quite often I get checksum
>> errors:
>>
>> # pvscan /dev/xvdf
>>   PV /dev/xvdf2   VG VolGroup00        lvm2 [18.88 GiB / 0 free]
>>   PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
>>   Total: 5 [962.38 GiB] / in use: 5 [962.38 GiB] / in no VG: 0 [0 ]
>> # pvscan /dev/xvdf
>>   /dev/xvdf2: Checksum error
>>   Couldn't read volume group metadata.
>>   /dev/xvdf2: Checksum error
>>   Couldn't read volume group metadata.
>>   PV /dev/dm-14   VG vg_x86_64-pvhvm   lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/dm-12   VG vg_i386-pvhvm     lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/dm-11   VG vg_i386           lvm2 [4.00 GiB / 68.00 MiB free]
>>   PV /dev/sda     VG guests            lvm2 [931.51 GiB / 220.51 GiB free]
>>   Total: 4 [943.50 GiB] / in use: 4 [943.50 GiB] / in no VG: 0 [0 ]
>>
>> This is with an i386 dom0, a 64-bit Xen 4.1.3 hypervisor, and with either
>> a 64-bit or 32-bit PV or PVHVM guest.
>
> And it does not matter if dom0 is 64-bit.
>
>> Have you seen something like this?
>
> More interesting is that the failure is in the frontend. I ran the "new"
> guests that do persistent grants with the old backends (so v3.7-rc8
> virgin) and still got the same failure.
>
>> Note, the other LV disks are over iSCSI and are working fine.

I've found the problem: it happens when you copy only a part of the
shared data in blkif_completion. This is an example of the problem:

1st loop in rq_for_each_segment:

 * bv_offset: 3584
 * bv_len: 512
 * offset += bv_len
 * i: 0

2nd loop:

 * bv_offset: 0
 * bv_len: 512
 * i: 0

As you can see, in the second loop i is still 0 (because offset is only
512, so 512 >> PAGE_SHIFT is 0) when it should be 1.

This problem made me realize another corner case, which I don't know
whether it can happen; AFAIK I've never seen it:

1st loop in rq_for_each_segment:

 * bv_offset: 1024
 * bv_len: 512
 * offset += bv_len
 * i: 0

2nd loop:

 * bv_offset: 0
 * bv_len: 512
 * i: 0

In this second case, should i be 1? Can this really happen? I can't see
any way to get a "global offset" or something similar that's not
relative to the bvec being handled right now.

A quick fix for the problem you described follows, but it doesn't cover
the second case exposed above:
---
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index df21b05..6e155d0 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -869,7 +871,7 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 				bvec->bv_len);
 			bvec_kunmap_irq(bvec_data, &flags);
 			kunmap_atomic(shared_data);
-			offset += bvec->bv_len;
+			offset = (i * PAGE_SIZE) + (bvec->bv_offset + bvec->bv_len);
 		}
 	}
 	/* Add the persistent grant into the list of free grants */
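
To make the index arithmetic above easier to check, here is a minimal
standalone userspace sketch (an illustration only, not the frontend code;
the PAGE_SIZE/PAGE_SHIFT values of 4096/12 and the two-element segment list
are assumed, taken from the first walkthrough above). It walks the two
segments and prints the page index i that the current update
(offset += bv_len) and the proposed update would each compute:

#include <stdio.h>

/* Assumed for this sketch: x86 page size. */
#define PAGE_SIZE  4096UL
#define PAGE_SHIFT 12

struct bvec_example {
	unsigned long bv_offset;
	unsigned long bv_len;
};

int main(void)
{
	/* The two segments from the first walkthrough above. */
	struct bvec_example segs[] = { { 3584, 512 }, { 0, 512 } };
	unsigned long offset_old = 0, offset_fix = 0;
	int n;

	for (n = 0; n < 2; n++) {
		/* Page index as derived in blkif_completion: i = offset >> PAGE_SHIFT */
		unsigned long i_old = offset_old >> PAGE_SHIFT;
		unsigned long i_fix = offset_fix >> PAGE_SHIFT;

		printf("loop %d: i_old = %lu, i_fix = %lu\n", n + 1, i_old, i_fix);

		/* Current update: only advances by the copied length. */
		offset_old += segs[n].bv_len;

		/* Proposed update: advance past the end of the current bvec. */
		offset_fix = (i_fix * PAGE_SIZE) +
			     (segs[n].bv_offset + segs[n].bv_len);
	}
	return 0;
}

With these numbers the first loop gives i = 0 for both variants; in the
second loop the current update still yields 0 while the proposed one yields
1, matching the walkthrough above.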