
Re: [Xen-devel] [PATCH 0/4] mitigate the per-pCPU blocking list may be too long



On 02/05/17 06:45, Chao Gao wrote:
> On Wed, Apr 26, 2017 at 05:39:57PM +0100, George Dunlap wrote:
>> On 26/04/17 01:52, Chao Gao wrote:
>>> I compared the maximum number of entries in one list and the number of
>>> events (adding an entry to the PI blocking list) with and without the
>>> latter three patches. Here is the result:
>>> -------------------------------------------------------------
>>> |               |                      |                    |
>>> |    Items      |   Maximum of #entry  |      #event        |
>>> |               |                      |                    |
>>> -------------------------------------------------------------
>>> |               |                      |                    |
>>> |W/ the patches |         6            |       22740        |
>>> |               |                      |                    |
>>> -------------------------------------------------------------
>>> |               |                      |                    |
>>> |W/O the patches|        128           |       46481        |
>>> |               |                      |                    |
>>> -------------------------------------------------------------
>>
>> Any chance you could trace how long the list traversal took?  It would
>> be good for future reference to have an idea what kinds of timescales
>> we're talking about.
> 
> Hi.
> 
> I made a simple test to measure the time consumed by the list traversal.
> I applied the patch below and created one HVM guest with 128 vCPUs and a
> passthrough 40G NIC. All guest vCPUs are pinned to one pCPU. Data were
> collected with 'xentrace -D -e 0x82000 -T 300 trace.bin' and decoded
> with xentrace_format. When the list length is about 128, the traversal
> time is in the range of 1750 to 39330 cycles. The physical CPU's
> frequency is 1795.788MHz, so the time consumed is in the range of 1us
> to 22us. If 0.5ms is the upper bound the system can tolerate, at most
> about 2900 vCPUs can be added to the list.

Great, thanks Chao Gao, that's useful.  I'm not sure a fixed latency --
say 500us -- is the right thing to look at; if all 2900 vcpus arranged
to have their interrupts staggered at 500us intervals, they could easily
lock up the cpu for nearly a full second.  But I'm having trouble
formulating a good limit scenario.
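
Roughly, the arithmetic behind that worry (a back-of-the-envelope
sketch, not a measurement: the per-entry cost is just Chao's
22us / 128 entries, and the assumption that every wake-up rescans the
remaining list is mine):

    /* Rough cost model for draining a 2900-entry PI blocking list. */
    #include <stdio.h>

    int main(void)
    {
        double cost_per_entry_us = 22.0 / 128;   /* ~0.17us, from Chao */
        int nr_vcpus = 2900;

        /* Worst case: every wake-up scans the full list. */
        double worst_us = (double)nr_vcpus * nr_vcpus * cost_per_entry_us;

        /* If each wake-up removes one entry, the list averages
         * nr_vcpus / 2 entries while draining. */
        double drain_us = (double)nr_vcpus * nr_vcpus / 2.0
                          * cost_per_entry_us;

        printf("worst ~%.2fs, draining ~%.2fs\n",
               worst_us / 1e6, drain_us / 1e6);  /* ~1.45s / ~0.72s */
        return 0;
    }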

In any case, 22us should be safe from a security standpoint*, and 128
should be pretty safe from a "make the common case fast" standpoint:
i.e., if you have 128 vcpus on a single runqueue, the IPI wake-up
traffic will be the least of your performance problems, I should think.
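
For future reference, Chao's conversion is easy to re-derive (a quick
sketch, assuming the cycle counts come from a constant-rate TSC at the
quoted 1795.788MHz):

    /* Convert the measured cycle counts to microseconds, and back
     * out the vCPU bound implied by a 0.5ms tolerance. */
    #include <stdio.h>

    int main(void)
    {
        double tsc_mhz = 1795.788;

        printf("min: %.2fus\n", 1750 / tsc_mhz);   /* ~0.97us */
        printf("max: %.2fus\n", 39330 / tsc_mhz);  /* ~21.9us */
        printf("vcpus in 500us: %.0f\n",
               500 / (39330 / tsc_mhz / 128));     /* ~2900   */
        return 0;
    }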

 -George

* Waiting for Jan to contradict me on this one. :-)

