
Re: [Xen-devel] Interesting observation with network event notification and batching

To: Wei Liu <wei.liu2@xxxxxxxxxx>
From: Andrew Bennieston <andrew.bennieston@xxxxxxxxxx>
Date: Mon, 17 Jun 2013 11:56:09 +0100
Cc: annie.li@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxx, Ian Campbell <Ian.Campbell@xxxxxxxxxx>, stefano.stabellini@xxxxxxxxxxxxx, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Delivery-date: Mon, 17 Jun 2013 10:56:32 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 17/06/13 11:46, Wei Liu wrote:

On Mon, Jun 17, 2013 at 10:56:12AM +0100, Andrew Bennieston wrote:

On 17/06/13 10:38, Ian Campbell wrote:

On Sun, 2013-06-16 at 10:54 +0100, Wei Liu wrote:

Konrad, IIRC you once mentioned you discovered something with event
notification, what's that?


They were bizarre. I naively expected the number of physical NIC
interrupts to be around the same as the VIF's, or less. And I figured
that the number of interrupts would be constant regardless of the
size of the packets. In other words #packets == #interrupts.


It could be that the frontend notifies the backend for every packet it
sends. This is not desirable and I don't expect the ring to behave that
way.


I have observed this kind of behaviour during network performance
tests in which I periodically checked the ring state during an iperf
session. It looked to me like the frontend was sending notifications
far too often, but that the backend was sending them very
infrequently, so the Tx (from guest) ring was mostly empty and the
Rx (to guest) ring was mostly full. This has the effect of both
front and backend having to block occasionally waiting for the other
end to clear or fill a ring, even though there is more data
available.

My initial theory was that this was caused in part by the shared
event channel, however I expect that Wei is testing on top of a
kernel with his split event channel features?


Yes, with split event channels.

And during tests, the frontend TX interrupt count runs to six figures
while the frontend RX interrupt count is only two figures.


It is probably worth checking that things are working how we think they
should. i.e. that netback's calls to RING_FINAL_CHECK_FOR_.. and
netfront's calls to RING_PUSH_..._AND_CHECK_NOTIFY are placed at
suitable points to maximise batching.

Is the RING_FINAL_CHECK_FOR_REQUESTS inside the xen_netbk_tx_build_gops
loop right? This would push the req_event pointer to just after the last
request, meaning the next request enqueued by the frontend would cause a
notification -- even though the backend is actually still continuing to
process requests and would have picked up that packet without further
notification. In this case there is a fair bit of work left in the
backend for this iteration, i.e. plenty of opportunity for the frontend
to queue more requests.

The comments in ring.h say:
  *  These macros will set the req_event/rsp_event field to trigger a
  *  notification on the very next message that is enqueued. If you want to
  *  create batches of work (i.e., only receive a notification after several
  *  messages have been enqueued) then you will need to create a customised
  *  version of the FINAL_CHECK macro in your own code, which sets the event
  *  field appropriately.

Perhaps we want to just use RING_HAS_UNCONSUMED_REQUESTS in that loop
(and other similar loops) and add a FINAL check at the very end?
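
To make the difference concrete, here is a rough toy model of the two loop shapes. It is only a sketch: the toy_* names and the struct are invented for illustration and are not the ring.h definitions, memory barriers are left out, and it assumes the frontend queues exactly one request while the backend is busy with the rest of each pass (a deliberately favourable case for the late check).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these are NOT the real ring.h structures or macros. */
typedef uint32_t ring_idx;

struct toy_ring {
    ring_idx req_prod;  /* shared: requests published by the frontend */
    ring_idx req_event; /* shared: frontend notifies when req_prod reaches this */
    ring_idx req_cons;  /* backend-private: next request to consume */
};

/* Frontend: publish n requests; returns true if a notification is due.
 * Modelled on the comparison in RING_PUSH_REQUESTS_AND_CHECK_NOTIFY. */
static bool toy_push(struct toy_ring *r, ring_idx n)
{
    assert(n > 0);
    ring_idx old_prod = r->req_prod;
    ring_idx new_prod = old_prod + n;
    r->req_prod = new_prod;
    return (ring_idx)(new_prod - r->req_event) < (ring_idx)(new_prod - old_prod);
}

/* Backend: plain "is there work?" test, like RING_HAS_UNCONSUMED_REQUESTS. */
static bool toy_has_unconsumed(const struct toy_ring *r)
{
    return r->req_prod != r->req_cons;
}

/* Backend: like RING_FINAL_CHECK_FOR_REQUESTS. If the ring looks empty,
 * re-arm req_event for the very next request, then check once more. */
static bool toy_final_check(struct toy_ring *r)
{
    if (toy_has_unconsumed(r))
        return true;
    r->req_event = r->req_cons + 1;
    return toy_has_unconsumed(r);
}

/* Run `passes` backend passes and count frontend notifications.
 * Assumption: one new request arrives during the rest of each pass. */
static unsigned toy_simulate(bool final_check_in_loop, unsigned passes)
{
    struct toy_ring r = { .req_prod = 0, .req_event = 1, .req_cons = 0 };
    unsigned notifications = 0;

    /* The first request always has to wake the idle backend. */
    if (toy_push(&r, 1))
        notifications++;

    for (unsigned i = 0; i < passes; i++) {
        /* Consume loop: drain whatever is on the ring. */
        if (final_check_in_loop) {
            while (toy_final_check(&r))
                r.req_cons++;
        } else {
            while (toy_has_unconsumed(&r))
                r.req_cons++;
        }

        /* Rest of the pass: the frontend queues one more request. */
        if (toy_push(&r, 1))
            notifications++;

        /* Late variant: only now decide whether to re-arm req_event.
         * It finds the new request, so it loops again without arming. */
        if (!final_check_in_loop)
            (void)toy_final_check(&r);
    }
    return notifications;
}
```

Under those assumptions toy_simulate(true, 3) counts 4 notifications and toy_simulate(false, 3) counts 1, because the late check picks up the newly queued request itself instead of arming req_event for it. Real traffic will not be this regular, so treat the numbers as an illustration of the mechanism rather than a prediction.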

But it was odd and I didn't go deeper into it to figure out what
was happening, or to figure out whether for the VIF we could
arrange for #packets != #interrupts. Hopefully there is some
mechanism to adjust things so that there are fewer interrupts
per packet (hand waving here).


I'm trying to do this now.


What scheme do you have in mind?


As I mentioned above, filling a ring completely appears to be almost
as bad as sending too many notifications. The ideal scheme may
involve trying to balance the ring at some "half-full" state,
depending on the capacity for the front- and backends to process
requests and responses.


I don't think filling the ring causes any problem; that's
conceptually the same as the "half-full" state if you need to throttle
the ring.

My understanding was that filling the ring will cause the producer to
sleep until slots become available (i.e. until the consumer notifies
it that it has removed something from the ring).

I'm just concerned that overly aggressive batching may lead to a
situation where the consumer is sitting idle, waiting for a notification
that the producer hasn't yet sent because it can still fill more slots
on the ring. When the ring is completely full, the producer would have
to wait for the ring to partially empty. At this point, the consumer
would hold off notifying because it can still batch more processing, so
the producer is left waiting. (Repeat as required). It would be better
to have both producer and consumer running concurrently.

I mention this mainly so that we don't end up with a swing to the polar
opposite of what we have now, which (to my mind) is just as bad. Clearly
this is an edge case, but if there's a reason I'm missing that this
can't happen (e.g. after a period of inactivity) then don't hesitate to
point it out :)

(Perhaps "half-full" was misleading... the optimal state may be "just
enough room for one more packet", or something along those lines...)
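
As a rough illustration of the trade-off, here is a toy model of a customised final check that arms req_event `batch` requests ahead instead of one. Everything named toy_* is made up for this sketch and is not part of ring.h; barriers, ring capacity and timers are not modelled, and the backend is assumed to run only when it is notified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these are NOT the real ring.h structures or macros. */
typedef uint32_t ring_idx;

struct toy_ring {
    ring_idx req_prod;  /* shared: requests published by the producer */
    ring_idx req_event; /* shared: producer notifies when req_prod reaches this */
    ring_idx req_cons;  /* consumer-private: next request to consume */
};

struct toy_result {
    unsigned notifications; /* notifications the producer sent */
    ring_idx stranded;      /* requests still unconsumed at the end */
};

/* Producer: publish n requests; returns true if a notification is due.
 * Modelled on the comparison in RING_PUSH_REQUESTS_AND_CHECK_NOTIFY. */
static bool toy_push(struct toy_ring *r, ring_idx n)
{
    ring_idx old_prod = r->req_prod;
    ring_idx new_prod = old_prod + n;
    r->req_prod = new_prod;
    return (ring_idx)(new_prod - r->req_event) < (ring_idx)(new_prod - old_prod);
}

/* Hypothetical customised final check: if the ring looks empty, ask to be
 * notified only once `batch` further requests are pending, then re-check. */
static bool toy_final_check_batched(struct toy_ring *r, ring_idx batch)
{
    assert(batch >= 1);
    if (r->req_prod != r->req_cons)
        return true;
    r->req_event = r->req_cons + batch;
    return r->req_prod != r->req_cons;
}

/* Producer queues `total` requests one at a time. The consumer runs only
 * when notified, drains the ring, and re-arms with the batched check. */
static struct toy_result toy_run(ring_idx total, ring_idx batch)
{
    struct toy_ring r = { 0, 0, 0 };
    struct toy_result res = { 0, 0 };

    /* Consumer starts idle and arms its first event threshold. */
    (void)toy_final_check_batched(&r, batch);

    for (ring_idx i = 0; i < total; i++) {
        if (toy_push(&r, 1)) {
            res.notifications++;
            /* Consumer wakes and drains until the final check finds nothing. */
            while (toy_final_check_batched(&r, batch))
                r.req_cons = r.req_prod;
        }
    }

    /* Anything left was queued but never reached the threshold. */
    res.stranded = r.req_prod - r.req_cons;
    return res;
}
```

In this model toy_run(10, 1) sends 10 notifications and leaves nothing stranded, while toy_run(10, 4) sends only 2 but leaves 2 requests sitting on the ring with the consumer idle. That tail is the situation described above: a larger batch saves notifications, but without a timeout or some other fallback the consumer can wait indefinitely on a notification the producer has no reason to send yet.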


Andrew



