Re: [Xen-devel] [Qemu-devel] [PATCH 1/3] xen-disk: only advertize feature-persistent if grant copy is not available
On Wed, Jun 21, 2017 at 11:40:00AM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Qemu-devel [mailto:qemu-devel-bounces+paul.durrant=citrix.com@xxxxxxxxxx]
> > On Behalf Of Paul Durrant
> > Sent: 21 June 2017 10:36
> > To: Roger Pau Monne <roger.pau@xxxxxxxxxx>; Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > Cc: Kevin Wolf <kwolf@xxxxxxxxxx>; qemu-block@xxxxxxxxxx; qemu-devel@xxxxxxxxxx;
> > Max Reitz <mreitz@xxxxxxxxxx>; Anthony Perard <anthony.perard@xxxxxxxxxx>;
> > xen-devel@xxxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Qemu-devel] [PATCH 1/3] xen-disk: only advertize feature-persistent
> > if grant copy is not available
> >
> > > -----Original Message-----
> > > From: Roger Pau Monne
> > > Sent: 21 June 2017 10:18
> > > To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > > Cc: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx;
> > > qemu-devel@xxxxxxxxxx; qemu-block@xxxxxxxxxx; Anthony Perard <anthony.perard@xxxxxxxxxx>;
> > > Kevin Wolf <kwolf@xxxxxxxxxx>; Max Reitz <mreitz@xxxxxxxxxx>
> > > Subject: Re: [PATCH 1/3] xen-disk: only advertize feature-persistent if grant
> > > copy is not available
> > >
> > > On Tue, Jun 20, 2017 at 03:19:33PM -0700, Stefano Stabellini wrote:
> > > > On Tue, 20 Jun 2017, Paul Durrant wrote:
> > > > > If grant copy is available then it will always be used in preference to
> > > > > persistent maps. In this case feature-persistent should not be advertized
> > > > > to the frontend, otherwise it may needlessly copy data into persistently
> > > > > granted buffers.
> > > > >
> > > > > Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> > > >
> > > > CC'ing Roger.
> > > >
> > > > It is true that using feature-persistent together with grant copies is
> > > > a very bad idea.
> > > >
> > > > But this change establishes an explicit preference of
> > > > feature_grant_copy over feature-persistent in the xen_disk backend. It
> > > > is not obvious to me that it should be the case.
> > > >
> > > > Why is feature_grant_copy (without feature-persistent) better than
> > > > feature-persistent (without feature_grant_copy)? Shouldn't we simply
> > > > avoid grant copies to copy data to persistent grants?
> > >
> > > When using persistent grants the frontend must always copy data from
> > > the buffer to the persistent grant; there's no way to avoid this.
> > >
> > > Using grant_copy we move the copy from the frontend to the backend,
> > > which means the CPU time of the copy is accounted to the backend. This
> > > is not ideal, but IMHO it's better than persistent grants because it
> > > avoids keeping a pool of mapped grants that consume memory and make
> > > the code more complex.
> > >
> > > Do you have some performance data showing the difference between
> > > persistent grants vs grant copy?
> > >
> > No, but I can get some :-)
> >
> > For a little background... I've been trying to push the throughput of fio
> > running in a debian stretch guest on my skull canyon NUC. When I started
> > out, I was getting ~100Mbps. When I finished, with this patch, the IOThreads
> > one, the multi-page ring one and a bit of hackery to turn off all the aio
> > flushes that seem to occur even if the image is opened with O_DIRECT, I was
> > getting ~960Mbps... which is about line rate for the SSD in the NUC.
> >
> > So, I'll force use of persistent grants on and see what sort of throughput I
> > get.
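(The behaviour discussed above, i.e. only offering feature-persistent when grant copy is not available, reduces to a decision like the minimal sketch below. This is an illustration rather than the actual xen_disk.c hunk; the variable and function names are invented for the example.)

#include <stdbool.h>
#include <stdio.h>

/* Stand-in for whatever the backend learns at init time about whether
 * the grant-table interface supports grant copy. */
static bool grant_copy_available = true;

/* Value the backend would publish as its "feature-persistent" node. */
static int feature_persistent_value(void)
{
    /* If every data transfer will go through grant copy anyway,
     * advertising feature-persistent only makes the frontend copy each
     * request into persistently granted buffers for no benefit, so
     * advertise it only when grant copy is NOT available. */
    return grant_copy_available ? 0 : 1;
}

int main(void)
{
    printf("feature-persistent = %d\n", feature_persistent_value());
    return 0;
}

In the real backend the corresponding value ends up in the backend's xenstore directory, which is what blkfront reads before deciding whether to set up persistent grants at all.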
>
> A quick test with grant copy forced off (causing persistent grants to be
> used)... My VM is debian stretch using a 16 page shared ring from blkfront.
> The image backing xvdb is a fully inflated 10G qcow2.
>
> root@dhcp-237-70:~# fio --randrepeat=1 --ioengine=libaio --direct=0
> --gtod_reduce=1 --name=test --filename=/dev/xvdb --bs=512k --iodepth=64
> --size=10G --readwrite=randwrite --ramp_time=4
> test: (g=0): rw=randwrite, bs=512K-512K/512K-512K/512K-512K, ioengine=libaio,
> iodepth=64
> fio-2.16
> Starting 1 process
> Jobs: 1 (f=1): [w(1)] [70.6% done] [0KB/539.4MB/0KB /s] [0/1078/0 iops] [eta
> 00m:05s]
> test: (groupid=0, jobs=1): err= 0: pid=633: Wed Jun 21 06:26:06 2017
>   write: io=6146.6MB, bw=795905KB/s, iops=1546, runt=7908msec
>   cpu          : usr=2.07%, sys=34.00%, ctx=4490, majf=0, minf=1
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.3%, >=64=166.9%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
>      issued    : total=r=0/w=12230/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> Run status group 0 (all jobs):
>   WRITE: io=6146.6MB, aggrb=795904KB/s, minb=795904KB/s, maxb=795904KB/s,
>   mint=7908msec, maxt=7908msec
>
> Disk stats (read/write):
>   xvdb: ios=54/228860, merge=0/2230616, ticks=16/5403048, in_queue=5409068,
>   util=98.26%
>
> The dom0 cpu usage for the relevant IOThread was ~60%.
>
> The same test with grant copy...
>
> root@dhcp-237-70:~# fio --randrepeat=1 --ioengine=libaio --direct=0
> --gtod_reduce=1 --name=test --filename=/dev/xvdb --bs=512k --iodepth=64
> --size=10G --readwrite=randwrite --ramp_time=4
> test: (g=0): rw=randwrite, bs=512K-512K/512K-512K/512K-512K, ioengine=libaio,
> iodepth=64
> fio-2.16
> Starting 1 process
> Jobs: 1 (f=1): [w(1)] [70.6% done] [0KB/607.7MB/0KB /s] [0/1215/0 iops] [eta
> 00m:05s]
> test: (groupid=0, jobs=1): err= 0: pid=483: Wed Jun 21 06:35:14 2017
>   write: io=6232.0MB, bw=810976KB/s, iops=1575, runt=7869msec
>   cpu          : usr=2.44%, sys=37.42%, ctx=3570, majf=0, minf=1
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.3%, >=64=164.6%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
>      issued    : total=r=0/w=12401/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> Run status group 0 (all jobs):
>   WRITE: io=6232.0MB, aggrb=810975KB/s, minb=810975KB/s, maxb=810975KB/s,
>   mint=7869msec, maxt=7869msec
>
> Disk stats (read/write):
>   xvdb: ios=54/229583, merge=0/2235879, ticks=16/5409500, in_queue=5415080,
>   util=98.27%
>
> So, higher throughput and iops. The dom0 cpu usage was running at ~70%, so
> there is definitely more dom0 overhead by using grant copy. The usage of
> grant copy could probably be improved though, since the current code issues
> a copy ioctl per ioreq. With some batching I suspect some, if not all, of
> the extra overhead could be recovered.

There's almost always going to be more CPU overhead with grant-copy, since
when using persistent grants QEMU can avoid all (or almost all) of the
ioctls to the grant device.

For the persistent-grants benchmark, did you warm up the grant cache first?
(ie: are those results from a first run of fio?)

In any case, I'm happy to use something different than persistent grants as
long as the performance is similar.

Roger.
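(Paul's remark about batching, i.e. that the current code issues one grant-copy ioctl per ioreq, points at a pattern like the sketch below: accumulate copy segments across requests and submit them in a single call. The segment struct and the submit function here are stand-ins invented for the example, not the libxengnttab interface, although the real xengnttab_grant_copy() call likewise takes a count and an array of segments.)

#include <stddef.h>
#include <stdio.h>

/* Illustrative batching of grant-copy segments: instead of one copy call
 * per request, segments from several requests are queued and submitted
 * together.  All types and functions below are hypothetical. */

#define MAX_BATCH_SEGS 256

struct copy_seg {
    unsigned int gref;     /* frontend grant reference */
    unsigned int offset;   /* offset within the granted page */
    unsigned int len;      /* bytes to copy */
    void *local;           /* backend-local buffer */
};

static struct copy_seg batch[MAX_BATCH_SEGS];
static size_t batch_used;

/* Stand-in for a single grant-copy call covering many segments. */
static void grant_copy_submit(const struct copy_seg *segs, size_t count)
{
    (void)segs;
    printf("submitting one grant-copy call covering %zu segments\n", count);
}

static void batch_flush(void)
{
    if (batch_used > 0) {
        grant_copy_submit(batch, batch_used);
        batch_used = 0;
    }
}

/* Queue one segment; only flush when the batch is full. */
static void batch_add(unsigned int gref, unsigned int offset,
                      unsigned int len, void *local)
{
    if (batch_used == MAX_BATCH_SEGS) {
        batch_flush();
    }
    batch[batch_used++] = (struct copy_seg){ gref, offset, len, local };
}

int main(void)
{
    char buf[4096];

    /* Pretend three ioreqs each contribute two segments... */
    for (unsigned int req = 0; req < 3; req++) {
        batch_add(req * 2, 0, 2048, buf);
        batch_add(req * 2 + 1, 2048, 2048, buf);
    }
    batch_flush();  /* ...and all of them go down in one call. */
    return 0;
}

Batching along these lines amortises the per-call overhead across all segments in the batch, which is where one would expect to claw back some of the extra dom0 CPU measured above.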
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel