
Re: [win-pv-devel] Windows on Xen bad IO performance



On Fri, 28 Sep 2018 at 10:46 Paul Durrant <Paul.Durrant@xxxxxxxxxx> wrote:
[cut]
>   Thanks for the very detailed analysis!
>
>   Actually 8.2.1 are the latest signed drivers.

Retested this again on the same testbed. The results are exactly the
same as with 8.2.0.

[cut]


>   I notice from your QEMU log that you are suffering grant table exhaustion. 
> See line 142 onwards. This will *severely* affect the performance so I suggest 
> you expand your grant table. You'll still see the buffer reaping, but the 
> perf. should be better.
>

I have compared gnttab_max_frames 32 and 128. Results:

== pv drivers 8.2.1, gnttab_max_frames=32 (Debian 9 default, same
testbed as the last tests)
Atto results: https://imgur.com/gallery/ElSwBqM
Responsiveness: a tad better than 8.2.0, and the graph for the big
packet sizes shows this. The IO saturation and dead IO parts of the
graph are still there, but the guest is noticeably more responsive
than with 8.2.0, and responsiveness recovers instantly after Atto is
done. Still bad, but better.
After Atto is done, Xen's VNC has lost its mouse. The keyboard still works. Funny.
XENVBD|__BufferReaperThread:Reaping Buffers is there in the logs.

== pv drivers 8.2.1, gnttab_max_frames=128 (same testbed as the last tests)
Atto results: https://imgur.com/gallery/7x8k2RS
Responsiveness: up to Atto transfer sizes of 12MB I cannot say whether
it is any different; the IO saturation and dead IO parts of the graph
are still there. When it started the 16MB read test, suddenly
everything got unblocked like magic. I need to do more testing; this
looks unreal.
After Atto is done, the mouse did not get lost :)

XENVBD|__BufferReaperThread:Reaping Buffers (2305 > 32) is there in the logs.

# xl dmesg | grep mem | head -n 1
(XEN) Command line: placeholder dom0_mem=4096M gnttab_max_frames=128
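
For reference, this is roughly how the option was set on the Debian 9
dom0 (the grub variable name may differ on other distros). If I
understand grant table v1 correctly, each frame is one 4k page holding
512 eight-byte entries, so 128 frames should allow roughly 65536 grant
references:

# grep ^GRUB_CMDLINE_XEN /etc/default/grub
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=4096M gnttab_max_frames=128"
# update-grub && reboot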

I would say that in the case of Atto (which is REALLY IO heavy) the
impact is very marginal. On the other hand, I do see SQL Server
workloads benefit from raising gnttab_max_frames.

Side note: what do these messages actually mean?
2679@1538131510.689960:xen_platform_log xen platform:
XENBUS|GnttabExpand: added references [00003a00 - 00003bff]
2679@1538131512.359271:xen_platform_log xen platform:
XENBUS|RangeSetPop: fail1 (c000009a)


[cut]
> > XENVBD|__BufferReaperThread:Reaping Buffers (966 > 32)
> >
> > Reaping buffers does not happen with the latest drivers.
> >
>
>   The fact that you are clearly seeing a lot of buffer is interesting in 
> itself. The buffer code is there to provide memory for bouncing SRBs when the 
> storage stack fails to honour the minimum 512 byte sector alignment needed by 
> the blkif protocol. These messages indicate that atto is not honouring that 
> alignment.

Maybe Atto is not honouring it, but then neither is MS SQL, it seems.
The reaping is visible when testing with Atto on both 8.2.1 and 8.2.0,
and not visible on 9.0-dev-20180927. The 9.0-dev drivers get lower
results at the smaller packet sizes, but stay stable and responsive
throughout the Atto test.

>
> > == questions:
> >
> > * so you guys must have done something in the right direction since
> > 8.2.0. BRAVO.
>
>   The master branch has a lot of re-work and the buffering code is one of the 
> places that was modified. It now uses a XENBUS_CACHE to acquire bounce 
> buffers and these caches do not reap in the same way. The cache code uses a 
> slab allocator and this simply frees slabs when all the contained objects 
> become unreferenced. The bounce objects are quite small and thus, with enough 
> alloc/free interleaving, it's probably quite likely that the cache will 
> remain hot so little slab freeing or allocation will actually be happening so 
> the bounce buffer allocation and freeing overhead will be very small.
>   Also the master branch should default to a single (or maybe 2?) page ring, 
> even if the backend can do 16 whereas all the 8.2.X drivers will use all 16 
> pages (which is why you need a heap more grant entries).
>

Can this be tweaked somehow on the current 8.2.X drivers to get a
single-page ring? Perhaps via max_ring_page_order on xen_blkback in
dom0?
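
Something like the below is what I had in mind on the dom0 side,
assuming xen-blkback really does expose max_ring_page_order as a
module parameter there (not verified on my Debian 9 kernel; if blkback
is built in rather than a module, I guess it would have to go on the
dom0 kernel command line instead):

# modinfo xen-blkback | grep -i ring_page_order
# echo "options xen-blkback max_ring_page_order=0" > /etc/modprobe.d/xen-blkback.conf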

> > * what is the expected write and read speed on a harware that can
> > deliver (measured with dd) reads at about 77MB/s, and writes 58MB/s.
> > * do you guys plan to improve something more? How can I help to test
> > and debug it?
> > * when are you planning to have a next signed release?
>
>   All the real improvements are all in master (not even in the 
> as-yet-unsigned 8.2.2), so maybe we're nearing the point where a 9.0.0 
> release makes sense. This means we need to start doing full logo kit runs on 
> all the drivers to shake out any weird bugs or compatibility problems, which 
> takes quite a bit of effort so I'm not sure how soon we'll get to that. 
> Hopefully within a few months though.
>   You could try setting up a logo kit yourself and try testing XENVBD to see 
> if it passes... that would be useful knowledge.

Seems fun. Where can I read up on how to set up the logo kit?

Is there an acceptance test plan that should be run?

Is there a list of issues that you want to get fixed for 9.0? Is
Citrix interested right now in making their customers' Windows VMs run
better :)? Testing Windows VMs on VMware the same way (with VMware's
paravirtual IO) is not stellar either; it looks poor compared to
virtio on KVM. I'd say 9.0-dev would be on par with the big
competitor.

Funny story: I've tried getting virtio QEMU devices running within a
Xen VM, but that is not stable enough. I managed to get the device to
show up in Windows, but did not manage to put a filesystem on it.

>
> > * how come Atto in a domU is getting better reads and writes than
> > hardware for some packet sizes? Wouldn't it be wise to disable these
> > caches and allow linux in dom0 (and it's kernel) to handle I/O of all
> > VMs?
> >
>
>   We have no caching internally in XENVBD. The use of the XENBUS_CACHE 
> objects is merely for bouncing so any real caching of data will be going on 
> in the Windows storage stack, over which we don't have much control, or in 
> your dom0 kernel.

ACK.


[cut]


--
Regards
Jakub Kulesza

_______________________________________________
win-pv-devel mailing list
win-pv-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/win-pv-devel

 

