
Re: [Xen-devel] [PATCH] xen/domctl: lower loglevel of XEN_DOMCTL_memory_mapping



On 11/09/15 12:11, Jan Beulich wrote:
>>>> On 11.09.15 at 12:28, <malcolm.crossley@xxxxxxxxxx> wrote:
>> On 11/09/15 10:17, Jan Beulich wrote:
>>>>>> On 11.09.15 at 02:59, <konrad.wilk@xxxxxxxxxx> wrote:
>>>> If you want a formula I would do:
>>>>
>>>> #define MAX_SOCKETS 8
>>>>
>>>>  max_pfns = pow(2,(MAX_SOCKETS - (max(nr_iommus(), MAX_SOCKETS)))) * 64;
>>>>
>>>> Where nr_iommus would have to be somehow implemented, ditto for pow.
>>>>
>>>> This should give you:
>>>>  8 -> 64
>>>>  7 -> 128
>>>>  6 -> 256
>>>>  5 -> 512
>>>>  4 -> 1024
>>>>  3 -> 2048
>>>>  2 -> 4096
>>>>  1 -> 16384
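
(As an aside, the expression above needs min() rather than max() to match
that table. An untested sketch, assuming some nr_iommus() helper exists;
note that the single-IOMMU row then comes out as 8192 rather than 16384:)

/* Untested sketch of the scaling proposed above; nr_iommus() is assumed. */
#define MAX_SOCKETS 8

static unsigned long calc_max_pfns(unsigned int iommus)
{
    unsigned int n = iommus < MAX_SOCKETS ? iommus : MAX_SOCKETS; /* min() */

    /* 64 pages with 8 IOMMUs, doubling each time the IOMMU count drops. */
    return 64UL << (MAX_SOCKETS - n);
}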
>>>
>>> 16k seems excessive as a default. Also - why would this be related
>>> to the number of sockets? I don't think there's a one-IOMMU-per-
>>> socket rule; fixed-number-of-IOMMUs-per-node might come closer,
>>> but there we'd have the problem of what "fixed number" is. Wouldn't
>>> something as simple as 1024 / nr_iommus() do?
>>>
>>> I also don't follow what cache flushes you talked about earlier: I
>>> don't think the IOMMUs drive any global cache flushes, and I
>>> would have thought the size limited IOTLB and (CPU side) cache
>>> ones should be pretty limited in terms of bus load (unless the TLB
>>> ones would get converted to global ones due to lacking IOMMU
>>> capabilities). Is that not the case?
>>
>> The data cache flushes are caused by the memory_type_changed()
>> call at the bottom of the XEN_DOMCTL_memory_mapping hypercall,
>> not by the IOMMU code itself.
> 
> In which case - contrary to what Konrad said he measured - their
> impact on overall throughput shouldn't scale with socket (or node)
> count (unless the hardware implementation of the flushes is bad).
> 
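To be concrete about the path: the domctl handler finishes by notifying the
memory-type code, which is where the full cache flush comes from (a rough
paraphrase, not verbatim source; the function name below is made up for
illustration):

/* Rough paraphrase of the call path, not verbatim Xen code. */
static void memory_mapping_domctl_tail(struct domain *d)
{
    /* ... the requested mfn<->gfn mappings have been added or removed ... */

    /* Tell the memory-type code the effective types may have changed; this
     * is where the flush_all(FLUSH_CACHE) in mtrr.c ends up being triggered. */
    memory_type_changed(d);
}
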

The flush_all(FLUSH_CACHE) in mtrr.c results in a flush_area_mask() covering
every CPU in the host. The more logical cores there are, the longer it takes
to issue the IPIs to all of them. I admit that x2apic_cluster mode may speed
this up, but not all hosts will have that enabled.
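
Schematically the broadcast looks like this (a simplified illustration, not
the actual flushtlb.c implementation; the names are illustrative):

/* Simplified illustration of a flush-everything broadcast; names and
 * structure are illustrative, not the real Xen flushtlb.c code. */
static void cache_flush_fn(void *unused)
{
    wbinvd();                  /* write back and invalidate local caches */
}

static void flush_cache_on_all_cpus(void)
{
    /* One IPI per online logical CPU, waiting for completion, so the cost
     * grows with the core count of the host. */
    on_each_cpu(cache_flush_fn, NULL, 1);
}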

The data flush forces all dirty cache lines out to the memory controllers,
and it is possible that CPUs in different packages have cached data that all
corresponds to a particular memory controller, which then becomes a
bottleneck.

In the worst case, with a large delay between XEN_DOMCTL_memory_mapping
hypercalls and on an 8-socket system, you may end up writing out
45MB (L3 cache) * 8 = 360MB to a single memory controller for every 64 pages
(256KiB) of domU p2m updated.
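
Or, restating that arithmetic in one place:

/* Back-of-the-envelope restatement of the worst case above. */
#define L3_PER_SOCKET_MB  45
#define NR_SOCKETS         8
#define PAGES_PER_FLUSH   64
#define KIB_PER_FLUSH     (PAGES_PER_FLUSH * 4)            /* 256KiB of p2m */

#define WRITEBACK_MB      (L3_PER_SOCKET_MB * NR_SOCKETS)  /* 360MB */

/* Roughly 1440 bytes of write-back traffic per byte of domU range mapped. */
#define WRITEBACK_RATIO   ((WRITEBACK_MB * 1024UL) / KIB_PER_FLUSH)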

Malcolm


> Jan
> 

