
Re: [win-pv-devel] Windows on Xen bad IO performance



OK, so I did some more tests.

The testbed:
* dom0 Debian Stretch with Xen 4.8.4
* 4 cores at 2.66 GHz, 20 GB RAM
* 4 spinning disks in RAID 5 on a hardware controller; dd measures reads at
about 77 MB/s and writes at about 58 MB/s
* domU Windows 2016, domU config with qemu logging enabled:
https://pastebin.com/g8ddMVbV
* gnttab_max_frames left at default
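
As a rough aid for reading the gnttab_max_frames numbers in this thread, the
sketch below converts a frame count into a number of grant entries. It assumes
4 KiB frames and 8-byte (grant table v1) or 16-byte (v2) entries, and it
ignores any entries reserved by Xen; these sizes are assumptions, not
something measured on this testbed. The frame counts are the ones mentioned in
the quoted thread below.

```python
PAGE_SIZE = 4096      # bytes per grant table frame (assumed x86 page size)
V1_ENTRY_SIZE = 8     # bytes per grant entry, assuming grant table v1
V2_ENTRY_SIZE = 16    # bytes per grant entry, assuming grant table v2


def grant_entries(frames, entry_size=V1_ENTRY_SIZE):
    """Number of grant entries that fit in `frames` grant table frames."""
    return frames * (PAGE_SIZE // entry_size)


# Frame counts taken from the values discussed in this thread.
for frames in (4, 32, 64, 256):
    print(
        f"gnttab_max_frames={frames:4d}: "
        f"{grant_entries(frames):6d} v1 entries, "
        f"{grant_entries(frames, V2_ENTRY_SIZE):6d} v2 entries"
    )
```

Under those assumptions a 32-frame table holds 16384 v1 entries, which gives
a feel for how many grants the frontend can have outstanding at once.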

Test procedure:
* install Windows 2016
* bcdedit /set testsigning on
* reboot (and create a snapshot; the drivers are installed on the snapshot
version of Windows)
* install the PV drivers
* reboot
* get Atto 3.05
* Atto all on default, except for testing drive "d" (plain LVM, no
snapshot) and setting the queue length to 10.

* qemu log covering everything from the install up to running Atto below
(drivers installed: the latest): https://pastebin.com/C1TasWtn

I think that Atto is quite a good indicator of how a heavily used server
will behave, as we see the same symptoms on another host running Windows
2016 in a domU with a heavily used MSSQL database.

== testing the latest drivers as of 2018-09-27 from
http://xenbits.xen.org/pvdrivers/win/

Atto test run in qemu log: https://pastebin.com/saq3N6PH
screenshot: https://imgur.com/gallery/ouTQo7b
The test takes a few minutes.

What is wrong:
* Notice the flat areas on the HDD graphs? This is when the system
becomes unresponsive. It recovers quite quickly, but the problem is
there.
* Reads and writes should not fall so low on 128KB packets. 128KB
should be at the same level as 16, 32 and 64KB, and larger sizes should
stay at that level.

What is better than in earlier experiments:
* the latest drivers do not make the system go nuts for minutes after
Atto is finished, and the system is more or less usable during the test.

== testing pv drivers 8.2.0 (latest signed)

For this I created another snapshot of the system, so that I could install
the drivers on a fresh Windows that had no previous version of the
drivers.

Atto test run in qemu log: https://pastebin.com/9PauBcUK
screenshot with results: https://imgur.com/gallery/HC2aSiW
The test takes about an hour (!), plus some 20-30 minutes for the system to
settle down.

What is wrong:
* system responsiveness is way worse than with the latest drivers; the system
is unusable. SQL Server would refuse to serve queries with such IO waits.

What is different in the qemu logs is this:

27388@1538082446.673267:xen_platform_log xen platform:
XENVBD|__BufferReaperThread:Reaping Buffers (8346 > 32)
27388@1538082447.752598:xen_platform_log xen platform:
XENVBD|__BufferReaperThread:Reaping Buffers (1061 > 32)
27388@1538082449.768223:xen_platform_log xen platform:
XENVBD|__BufferReaperThread:Reaping Buffers (1700 > 32)
27388@1538082462.879887:xen_platform_log xen platform:
XENVBD|__BufferReaperThread:Reaping Buffers (2898 > 32)
27388@1538082464.009918:xen_platform_log xen platform:
XENVBD|__BufferReaperThread:Reaping Buffers (5157 > 32)
27388@1538082465.066077:xen_platform_log xen platform:
XENVBD|__BufferReaperThread:Reaping Buffers (966 > 32)

Reaping buffers does not happen with the latest drivers.
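
To compare runs more easily, the reaper messages can be pulled out of a qemu
log with a short script. This is a minimal sketch that assumes the log lines
look exactly like the excerpt above; the function names are made up for
illustration, and the pattern tolerates the line wrap added by the mail
client.

```python
import re
from statistics import mean

# Matches entries such as:
#   27388@1538082446.673267:xen_platform_log xen platform:
#   XENVBD|__BufferReaperThread:Reaping Buffers (8346 > 32)
# \s+ accepts either a space or a wrapped line between the two halves.
REAP_RE = re.compile(
    r"(?P<pid>\d+)@(?P<ts>\d+\.\d+):xen_platform_log xen platform:\s+"
    r"XENVBD\|__BufferReaperThread:Reaping Buffers "
    r"\((?P<count>\d+) > (?P<limit>\d+)\)"
)


def parse_reaper_events(log_text):
    """Return a list of (timestamp, buffer_count, limit) tuples."""
    return [
        (float(m["ts"]), int(m["count"]), int(m["limit"]))
        for m in REAP_RE.finditer(log_text)
    ]


def summarize(events):
    """Summarize reaper activity; returns None when there are no events."""
    if not events:
        return None
    counts = [count for _, count, _ in events]
    return {
        "events": len(events),
        "max_buffers": max(counts),
        "mean_buffers": mean(counts),
        "span_seconds": events[-1][0] - events[0][0],
    }
```

Calling summarize(parse_reaper_events(text)) on the contents of a saved log
gives the number of reaper messages, the largest and mean buffer counts, and
the time span they cover. A log from the latest drivers should yield None if
no reaping happens.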

== questions:

* so you guys must have done something in the right direction since
8.2.0. BRAVO.
* what is the expected write and read speed on hardware that can
deliver (measured with dd) reads at about 77MB/s and writes at about 58MB/s?
* do you guys plan to improve something more? How can I help to test
and debug it?
* when are you planning to have the next signed release?
* how come Atto in a domU is getting better reads and writes than the
hardware for some packet sizes? Wouldn't it be wise to disable these
caches and let Linux in dom0 (and its kernel) handle the I/O of all
VMs?


Best regards, Jakub Kulesza

On Tue, 31 Jul 2018 at 11:44, Paul Durrant <Paul.Durrant@xxxxxxxxxx> wrote:
>
> > -----Original Message-----
> > From: win-pv-devel [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On
> > Behalf Of Jakub Kulesza
> > Sent: 31 July 2018 10:02
> > To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> > Subject: Re: [win-pv-devel] Windows on Xen bad IO performance
> >
> > 2018-07-31 9:51 GMT+02:00 Paul Durrant <Paul.Durrant@xxxxxxxxxx>:
> > >
> > > De-htmling... Responses below...
> > >
> > > -----
> > > From: win-pv-devel [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx]
> > On Behalf Of Jakub Kulesza
> > > Sent: 30 July 2018 16:08
> > > To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> > > Subject: [win-pv-devel] Windows on Xen bad IO performance
> > >
> > > I have a number of different hosts with different Xen and Windows
> > > versions, but they all share the same thing. Each time I install Xen
> > > Windows PV drivers 8.2.0 from here:
> > > https://www.xenproject.org/developer...v-drivers.html
> > > I'm getting worse IO performance than before, on standard Windows
> > > drivers.
> > >
> > [cut]
> > >
> > > I found out that I need to modify the gnttab_max_frames parameter to
> > > the Xen hypervisor at boot time. A lot of links and reading starts here:
> > > https://wiki.gentoo.org/wiki/Xen#Xen..._kernel_4.3.2B
> > >
> > > I did some testing and I am very confused right now.
> > > gnttab_max_frames is 32 by default (increased to 64 in some Xen version),
> > > and to solve the issues I would need to set it higher, to 256. The
> > > results I get seem to show something totally different.
> > >
> > > New test rig:
> > > • Ubuntu 18.04 LTS with everything from the normal repositories, updated,
> > >   Xen 4.9
> > > • i5-8500, 16GB RAM, Samsung 850 EVO SSD,
> > > • Windows 2016 installed on an LVM volume,
> > > • Xen PV drivers 8.2.0 installed on Windows,
> > > • logged in to the VM using VNC from a laptop in the same local network.
> > >
> > > I've tested this at a number of values of gnttab_max_frames from 4 to
> > > 4096.
> > >
> > > Passmark provides consistent results at around 510 MB/s READ, 305 MB/s
> > > WRITE, 330 MB/s Random ReadWrite, regardless of the setting of
> > > gnttab_max_frames. I guess that it does not saturate the grant tables
> > > mechanism of Xen that much. But with ATTO, the situation is very
> > > different.
> > > • gnttab_max_frames = 4
> > > o Windows is very snappy, responsive, even under heavy load from ATTO.
> > > o Atto shows good results, with some signs of saturation with packets
> > >   bigger than 512KB.
> > > • gnttab_max_frames = 10
> > > o Windows is very snappy but stops being responsive under heavy load
> > >   from ATTO.
> > > o Atto shows mediocre results, saturation is very high with packets
> > >   bigger than 512KB.
> > > • gnttab_max_frames = 64
> > > o You can feel that windows in Windows open a little bit slower; the
> > >   system feels dead under high load from ATTO.
> > > o Atto shows bad results, saturation kills the system with packets
> > >   bigger than 512KB. The system recovers after ATTO finishes.
> > > • gnttab_max_frames = 256
> > > o Even worse than 64; the results are similar to 64, but the system
> > >   just did not react. I got fed up with waiting.
> > > • gnttab_max_frames = 4096
> > > o Windows did not boot. I just got fed up with waiting.
> > [cut]
> >
> > >
> > > As discussed on IRC, it would be useful if you tried the 8.2.2 drivers,
> > > and also highly useful if you could capture logging from QEMU.
> > >
> > > One other thing that occurs to me is that XENVBD implements indirect
> > > granting, but this is relatively under-tested because the only backend
> > > that implements it is blkback, and we don't use that in XenServer.
> > > Whilst it may be slower overall, you might get more stability using QEMU
> > > qdisk. (We have a couple of performance fixes for this in the pipeline
> > > in Citrix as we are now starting to use it as our default backend, but
> > > it should be reasonable as-is).
> > >
> > >   Paul
> >
> > I did test the 8.2.2 PV drivers. I did not manage to get QEMU logging,
> > though. Will read more and retry.
> >
> > Results on the i5-8500 rig - everything set the same as in the tests
> > mentioned above:
> >
> > https://imgur.com/gallery/PTm5f4G
> >
> > gnttab_max_frames = 4:
> > no or very few signs of saturation, everything is flying,
> > scores are better than with 8.2.0
> >
> > gnttab_max_frames = default for ubuntu 18.04 (so 32 or 64)
> > saturation, system goes unresponsive, as bad as before
> >
> > gnttab_max_frames = 256
> > saturation, system goes unresponsive, as bad as before
> >
> > Passmark shows better results on all gnttab_max_frames settings:
> > Read: 514-515 (same as 8.2.0)
> > Write: 477 (better!)
> > Random ReadWrite: 300-360 (same as 8.2.0)
> >
> > Is this behaviour (lowering max frames to get better results) working
> > as expected?
> >
> > How low should I NOT go with max_frames?
>
> In general you should not be lowering it from the default. The only thing
> that will achieve is starving the guest frontend of grants. If it is having
> a positive impact then that indicates a problem with the frontend.
>
> >
> > Does XenServer recommend any windows guest drivers if used with qemu
> > backend?
> >
>
> XenServer is basically using 8.2.1 plus some branding and workaround patches. 
> We're likely to move to an 8.2.2 XENVBD though.
>
>   Paul
>
> >
> > --
> > Regards
> > Jakub Kulesza
> >
> > _______________________________________________
> > win-pv-devel mailing list
> > win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> > https://lists.xenproject.org/mailman/listinfo/win-pv-devel



-- 
Regards
Jakub Kulesza

_______________________________________________
win-pv-devel mailing list
win-pv-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/win-pv-devel

 

