
Re: [win-pv-devel] Windows on Xen bad IO performance



Here is a shorter summary, as I guess my English was not precise enough
last night:

* 8.2.0 drivers:
  - Atto test took over an hour to complete
  - the VM lost keyboard and responsiveness after Atto finished
  - VM was unresponsive during the test, the results were bad
  - configs, logs, results given below
  - the logs contain lines like the following, which are absent in the
test of the latest drivers:
XENVBD|__BufferReaperThread:Reaping Buffers (1700 > 32)
* latest 2018-09-27 drivers:
  - Atto test took a few minutes to complete
  - VM was responsive. It had its hiccups and was not as good as a bare
metal system, but I guess that right now this is on par with VMware.
KVM with virtio behaves better.
  - configs, logs, results given below
  - results were much better than non-PV-drivers version. Atto did
show that something saturates in the pipeline, but not very heavily.
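
To compare the qemu logs of the two driver versions, the Reaping
Buffers lines can be counted with a short script. This is only a sketch,
assuming the line format from the log excerpt quoted below; the function
names are made up for illustration and are not part of any Xen tooling.

```python
import re

# Matches lines such as:
#   XENVBD|__BufferReaperThread:Reaping Buffers (1700 > 32)
# The first number is the buffer count, the second the threshold.
REAP_RE = re.compile(r"__BufferReaperThread:Reaping Buffers \((\d+) > (\d+)\)")


def reaping_events(lines):
    """Return (buffer_count, threshold) pairs for every matching log line."""
    events = []
    for line in lines:
        match = REAP_RE.search(line)
        if match:
            events.append((int(match.group(1)), int(match.group(2))))
    return events


def summarize(events):
    """Return (number of events, largest buffer count), or (0, 0) if none."""
    if not events:
        return (0, 0)
    return (len(events), max(count for count, _ in events))
```

Any iterable of lines works as input, so an open log file can be passed
directly: summarize(reaping_events(open(path))), where path is wherever
the qemu log of the domU is stored.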

So how can I help to get the latest patches released as a new signed
version? Or are more code changes planned first? I can definitely help
with testing.

Does it still make sense to change gnttab_max_frames with the latest
changes in the PV drivers?
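
For context on what gnttab_max_frames bounds, here is a rough
back-of-the-envelope sketch. It assumes 4 KiB grant-table frames,
version-1 grant entries of 8 bytes each, and the blkif limit of 11
segments per non-indirect request. Whether the guest uses v1 entries is
an assumption here, and reserved entries and grants used by other
frontends (e.g. network) are ignored, so these are upper bounds only.

```python
PAGE_SIZE = 4096           # bytes per grant-table frame (x86 page)
V1_ENTRY_SIZE = 8          # bytes per version-1 grant entry (assumed in use)
SEGMENTS_PER_REQUEST = 11  # max segments in a non-indirect blkif request


def max_grants(frames, entry_size=V1_ENTRY_SIZE):
    """Upper bound on grant entries available with this many frames."""
    return frames * PAGE_SIZE // entry_size


def max_inflight_requests(frames, segments=SEGMENTS_PER_REQUEST):
    """Upper bound on full-size block requests that can be granted at once."""
    return max_grants(frames) // segments


# Print the bounds for the gnttab_max_frames values tried in this thread.
for frames in (4, 32, 64, 256):
    print(frames, max_grants(frames), max_inflight_requests(frames))
```

Under these assumptions, 4 frames allow at most 2048 grants and 32
frames allow 16384, which is the sense in which a lower setting starves
the frontend of grants.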

Fri, 28 Sep 2018 at 00:07 Jakub Kulesza <jakkul@xxxxxxxxx> wrote:
>
> OK, so I did some more tests.
>
> The testbed:
> * dom0 Debian Stretch with Xen 4.8.4
> * 4 cores at 2.66 GHz, 20 GB RAM
> * 4 spinning disks in RAID5 on a hardware controller; dd measured reads
> at about 77MB/s, writes at 58MB/s
> * domU windows2016, domU config with qemu logging enabled:
> https://pastebin.com/g8ddMVbV
> * gnttab_max_frames left at default
>
> Test procedure:
> * install windows 2016
> * bcdedit /set testsigning on
> * reboot (and create a snapshot, drivers installed on snapshot version
> of windows)
> * install pv drivers
> * reboot
> * get Atto 3.05
> * Atto all on default, except testing drive "d" (plain LVM, no
> snapshot) and setting queue length to 10.
>
> * qemu log from install to running the atto below (drivers installed:
> the latest): https://pastebin.com/C1TasWtn
>
> I think that Atto is quite a good indicator of how a heavily used server
> will behave, as we have the same symptoms on another host with windows
> 2016 on a domU with a heavily used MSSQL database.
>
> == testing the latest drivers as of 2018-09-27 from
> http://xenbits.xen.org/pvdrivers/win/
>
> Atto test run in qemu log: https://pastebin.com/saq3N6PH
> screenshot: https://imgur.com/gallery/ouTQo7b
> The test takes a few minutes
>
> What is wrong:
> * notice the flat areas on the HDD graphs? This is when the system
> becomes unresponsive. It recovers quite quickly, but the problem is
> there.
> * Reads and writes should not fall so low on 128KB packets. 128KB
> should be at the level of 16, 32 and 64KB and continue onwards on the
> same level.
>
> What is better than in earlier experiments:
> * the latest drivers do not make the system go nuts for minutes after
> Atto is finished, and it is more or less usable during the test.
>
> == testing pv drivers 8.2.0 (latest signed)
>
> For this I did create another snapshot of the system, so I can install
> the drivers on a fresh windows, that had no previous version of the
> drivers.
>
> Atto test run in qemu log: https://pastebin.com/9PauBcUK
> screenshot with results: https://imgur.com/gallery/HC2aSiW
> the test takes about an hour (!) and some 20-30 minutes to settle down.
>
> What is wrong:
> * system responsiveness is way worse than with the latest ones,
> unusable. SQL server would refuse to serve queries with such IO waits.
>
> What is different in the qemu logs is this:
>
> 27388@1538082446.673267:xen_platform_log xen platform:
> XENVBD|__BufferReaperThread:Reaping Buffers (8346 > 32)
> 27388@1538082447.752598:xen_platform_log xen platform:
> XENVBD|__BufferReaperThread:Reaping Buffers (1061 > 32)
> 27388@1538082449.768223:xen_platform_log xen platform:
> XENVBD|__BufferReaperThread:Reaping Buffers (1700 > 32)
> 27388@1538082462.879887:xen_platform_log xen platform:
> XENVBD|__BufferReaperThread:Reaping Buffers (2898 > 32)
> 27388@1538082464.009918:xen_platform_log xen platform:
> XENVBD|__BufferReaperThread:Reaping Buffers (5157 > 32)
> 27388@1538082465.066077:xen_platform_log xen platform:
> XENVBD|__BufferReaperThread:Reaping Buffers (966 > 32)
>
> Reaping buffers does not happen with the latest drivers.
>
> == questions:
>
> * so you guys must have done something in the right direction since
> 8.2.0. BRAVO.
> * what is the expected write and read speed on hardware that can
> deliver (measured with dd) reads at about 77MB/s and writes at 58MB/s?
> * do you guys plan to improve something more? How can I help to test
> and debug it?
> * when are you planning to have a next signed release?
> * how come Atto in a domU is getting better reads and writes than
> hardware for some packet sizes? Wouldn't it be wise to disable these
> caches and allow linux in dom0 (and its kernel) to handle I/O of all
> VMs?
>
>
> Best regards, Jakub Kulesza
>
> Tue, 31 Jul 2018 at 11:44 Paul Durrant <Paul.Durrant@xxxxxxxxxx> wrote:
> >
> > > -----Original Message-----
> > > From: win-pv-devel [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On
> > > Behalf Of Jakub Kulesza
> > > Sent: 31 July 2018 10:02
> > > To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> > > Subject: Re: [win-pv-devel] Windows on Xen bad IO performance
> > >
> > > 2018-07-31 9:51 GMT+02:00 Paul Durrant <Paul.Durrant@xxxxxxxxxx>:
> > > >
> > > > De-htmling... Responses below...
> > > >
> > > > -----
> > > > From: win-pv-devel [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx]
> > > On Behalf Of Jakub Kulesza
> > > > Sent: 30 July 2018 16:08
> > > > To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> > > > Subject: [win-pv-devel] Windows on Xen bad IO performance
> > > >
> > > > > I have a number of different hosts with different xen and windows
> > > > > versions, but they all share the same thing. Each time I install xen
> > > > > windows pv drivers 8.2.0 from here:
> > > > > https://www.xenproject.org/developer...v-drivers.html
> > > > > I'm getting worse IO performance than before, on standard Windows
> > > > > drivers.
> > > >
> > > [cut]
> > > >
> > > > > I found out that I need to modify the gnttab_max_frames parameter to
> > > > > the xen hypervisor at boottime. A lot of links and reading starts here:
> > > > > https://wiki.gentoo.org/wiki/Xen#Xen..._kernel_4.3.2B
> > > >
> > > > > I did some testing and I am very confused right now. The
> > > > > gnttab_max_frames is by default 32 (increased to 64 in some xen
> > > > > version), and to solve the issues I would need to set it higher, to
> > > > > 256. The results I get seem to show something totally different.
> > > >
> > > > > New test rig:
> > > > > • ubuntu 18.04 LTS with everything from normal repositories, updated,
> > > > > xen 4.9
> > > > > • i5-8500, 16GB ram, Samsung 850 evo SSD,
> > > > > • windows 2016 installed on a LVM volume,
> > > > > • xen pv drivers 8.2.0 installed on Windows,
> > > > > • logged to the VM using VNC from a laptop in the same local network.
> > > > >
> > > > > I've tested this at a number of values of gnttab_max_frames from 4 to
> > > > > 4096.
> > > >
> > > > > Passmark provides consistent results at around 510 MB/s READ, 305 MB/s
> > > > > WRITE, 330 MB/s Random ReadWrite, regardless of the setting of
> > > > > gnttab_max_frames. I guess that it does not saturate the grant tables
> > > > > mechanism of XEN that much. But with ATTO, the situation is so
> > > > > different.
> > > > > • gnttab_max_frames = 4
> > > > > o Windows is very snappy, responsive, even under heavy load from ATTO.
> > > > > o Atto shows good results, with some signs of saturation with packets
> > > > > bigger than 512KB.
> > > > > • gnttab_max_frames = 10
> > > > > o Windows is very snappy, but stops being responsive under heavy
> > > > > load from ATTO.
> > > > > o Atto shows mediocre results, saturation is very high with packets
> > > > > bigger than 512KB.
> > > > > • gnttab_max_frames = 64
> > > > > o You can feel that windows in Windows open a little bit slower; the
> > > > > system feels dead with high load from ATTO.
> > > > > o Atto shows bad results, saturation kills the system with packets
> > > > > bigger than 512KB. The system recovers OK after ATTO finishes.
> > > > > • gnttab_max_frames = 256
> > > > > o Even worse than 64. The results are similar to 64, but the system
> > > > > just did not react. I got fed up with waiting.
> > > > > • gnttab_max_frames = 4096
> > > > > o Windows did not boot. I just got fed up with waiting.
> > > [cut]
> > >
> > > >
> > > > > As discussed on IRC, it would be useful if you tried the 8.2.2 drivers
> > > > > and also highly useful if you could capture logging from QEMU.
> > > > >
> > > > > One other thing that occurs to me is that XENVBD implements indirect
> > > > > granting but this is relatively under tested because the only backend
> > > > > that implements it is blkback, and we don't use that in XenServer.
> > > > > Whilst it may be slower overall, you might get more stability using
> > > > > QEMU qdisk. (We have a couple of performance fixes for this in the
> > > > > pipeline in Citrix as we are now starting to use it as our default
> > > > > backend, but it should be reasonable as-is).
> > > >
> > > >   Paul
> > >
> > > > I did test the 8.2.2 PV drivers. I did not manage to get QEMU logging
> > > > though. Will read more and retry.
> > >
> > > Results on the i5-8500 rig - everything set the same as in the tests
> > > mentioned above:
> > >
> > > https://imgur.com/gallery/PTm5f4G
> > >
> > > gnttab_max_frames = 4:
> > > > no or very few signs of saturation, everything is flying,
> > > > scores are better than with 8.2.0
> > >
> > > gnttab_max_frames = default for ubuntu 18.04 (so 32 or 64)
> > > saturation, system goes unresponsive, as bad as before
> > >
> > > gnttab_max_frames = 256
> > > saturation, system goes unresponsive, as bad as before
> > >
> > > Passmark shows better results on all gnttab_max_frames settings:
> > > Read: 514-515 (same as 8.2.0)
> > > Write: 477 (better!)
> > > Random ReadWrite: 300-360 (same as 8.2.0)
> > >
> > > Is this behaviour (lowering max frames to get better results) working
> > > as expected?
> > >
> > > How low should I NOT go with max_frames?
> >
> > In general you should not be lowering it from the default. The only thing
> > that will achieve is starving the guest frontend of grants. If it is
> > having a positive impact then that indicates a problem with the frontend.
> >
> > >
> > > Does XenServer recommend any windows guest drivers if used with qemu
> > > backend?
> > >
> >
> > XenServer is basically using 8.2.1 plus some branding and workaround 
> > patches. We're likely to move to an 8.2.2 XENVBD though.
> >
> >   Paul
> >
> > >
> > > --
> > > > Best regards
> > > Jakub Kulesza
> > >
> > > _______________________________________________
> > > win-pv-devel mailing list
> > > win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> > > https://lists.xenproject.org/mailman/listinfo/win-pv-devel
>
>
>
> --
> Best regards
> Jakub Kulesza



-- 
Best regards
Jakub Kulesza

_______________________________________________
win-pv-devel mailing list
win-pv-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/win-pv-devel
