[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] [PATCH] xen-blk(front|back): Handle large physical sector disks
I accidentally realized today that any domU's using the paravirt disk driver potentially suffer from poor performance when they get handed in a physical volume and partitioning is done inside the guest. The physical volume passed in has to be one that has the compat 512 logical sector size but hints its real sector size (eg. 4096) as physical sector size. In dom0 handling is correct and partitions or logical volumes there would be aligned to 4k. But the blkfront driver just gets one sector size (the logical) passed in from netback. And so inside the guest the drive appears to be an old style 512/512 drive. Alignment of partitions created in the guest very likely go wrong somewhere (they do for me). I tried to fix this in a way that works with all four combinations of dom0 and domU drivers. Tested those with a PVM guest that was set up to have a logical volume passed in as xvdb). Sidenote: PVM guests that map files or volume directly to partitions may be accidentally ok as ext4 uses 4k blocks by default). What I am not sure about is whether this also is sufficient for handling migration (possible to another host with other sectors). But I think that the units of tables is still 512, only alignment is changed. So it should more or less work. How does this look to you? -Stefan From 6c200e6666cd4d632e2234d267e387b72d69a95c Mon Sep 17 00:00:00 2001 From: Stefan Bader <stefan.bader@xxxxxxxxxxxxx> Date: Mon, 13 May 2013 16:28:15 +0200 Subject: [PATCH] xen/blk: Use physical sector size for setup Currently xen-blkback passes the logical sector size over xenbus and xen-blkfront sets up the paravirt disk with that logical block size. But newer drives usually have the logical sector size set to 512 for compatibility reasons and would show the actual sector size only in physical sector size. This results in the device being partitioned and accessed in dom0 with the correct sector size, but the guest thinks 512 bytes is the correct block size. And that results in poor performance. To fix this, blkback gets modified to pass also physical-sector-size over xenbus and blkfront to use both values to set up the paravirt disk. I did not just change the passed in sector-size because I am not sure having a bigger logical sector size than the physical one is valid (and that would happen if a newer dom0 kernel hits an older domU kernel). Also this way a domU set up before should still be accessible (just some tools might detect the unaligned setup). Signed-off-by: Stefan Bader <stefan.bader@xxxxxxxxxxxxx> --- drivers/block/xen-blkback/xenbus.c | 7 +++++++ drivers/block/xen-blkfront.c | 24 ++++++++++++++++++++---- 2 files changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 8bfd1bc..1404204 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -704,6 +704,13 @@ again: dev->nodename); goto abort; } + err = xenbus_printf(xbt, dev->nodename, "physical-sector-size", "%u", + bdev_physical_block_size(be->blkif->vbd.bdev)); + if (err) { + xenbus_dev_fatal(dev, err, "writing %s/physical-sector-size", + dev->nodename); + goto abort; + } err = xenbus_transaction_end(xbt, 0); if (err == -EAGAIN) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index d89ef86..60ec315 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -542,7 +542,8 @@ wait: flush_requests(info); } -static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) +static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size, + unsigned int physical_sector_size) { struct request_queue *rq; struct blkfront_info *info = gd->private_data; @@ -564,6 +565,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) /* Hard sector size and max sectors impersonate the equiv. hardware. */ blk_queue_logical_block_size(rq, sector_size); + blk_queue_physical_block_size(rq, physical_sector_size); blk_queue_max_hw_sectors(rq, 512); /* Each segment in a request is up to an aligned page in size. */ @@ -667,7 +669,8 @@ static char *encode_disk_name(char *ptr, unsigned int n) static int xlvbd_alloc_gendisk(blkif_sector_t capacity, struct blkfront_info *info, - u16 vdisk_info, u16 sector_size) + u16 vdisk_info, u16 sector_size, + unsigned int physical_sector_size) { struct gendisk *gd; int nr_minors = 1; @@ -734,7 +737,7 @@ static int xlvbd_alloc_gendisk(blkif_sector_t capacity, gd->driverfs_dev = &(info->xbdev->dev); set_capacity(gd, capacity); - if (xlvbd_init_blk_queue(gd, sector_size)) { + if (xlvbd_init_blk_queue(gd, sector_size, physical_sector_size)) { del_gendisk(gd); goto release; } @@ -1395,6 +1398,7 @@ static void blkfront_connect(struct blkfront_info *info) { unsigned long long sectors; unsigned long sector_size; + unsigned int physical_sector_size; unsigned int binfo; int err; int barrier, flush, discard, persistent; @@ -1437,6 +1441,17 @@ static void blkfront_connect(struct blkfront_info *info) return; } + /* + * physcial-sector-size is a newer field, so old backends may not + * provide this. Assume physical sector size to be the same as + * sector_size in that case. + */ + err = xenbus_gather(XBT_NIL, info->xbdev->otherend, + "physical-sector-size", "%u", &physical_sector_size, + NULL); + if (err) + physical_sector_size = sector_size; + info->feature_flush = 0; info->flush_op = 0; @@ -1483,7 +1498,8 @@ static void blkfront_connect(struct blkfront_info *info) else info->feature_persistent = persistent; - err = xlvbd_alloc_gendisk(sectors, info, binfo, sector_size); + err = xlvbd_alloc_gendisk(sectors, info, binfo, sector_size, + physical_sector_size); if (err) { xenbus_dev_fatal(info->xbdev, err, "xlvbd_add at %s", info->xbdev->otherend); -- 1.7.9.5 Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |