
Re: [Xen-devel] [PATCH RFC 06/12] xen/x86: populate PVHv2 Dom0 physical memory map



On 05/08/16 10:40, Roger Pau Monne wrote:
> On Thu, Aug 04, 2016 at 07:43:39PM +0100, Andrew Cooper wrote:
>> On 02/08/16 10:19, Roger Pau Monne wrote:
>>> On Fri, Jul 29, 2016 at 08:04:12PM +0100, Andrew Cooper wrote:
>>>> On 29/07/16 17:29, Roger Pau Monne wrote:
>>>>> +/* Calculate the biggest usable order given a size in bytes. */
>>>>> +static inline unsigned int get_order(uint64_t size)
>>>>> +{
>>>>> +    unsigned int order;
>>>>> +    uint64_t pg;
>>>>> +
>>>>> +    ASSERT((size & ~PAGE_MASK) == 0);
>>>>> +    pg = PFN_DOWN(size);
>>>>> +    for ( order = 0; pg >= (1ULL << (order + 1)); order++ );
>>>>> +
>>>>> +    return order;
>>>>> +}
>>>> We already have get_order_from_bytes() and get_order_from_pages(), the
>>>> latter of which looks like it will suit your usecase.
>>> Not really, or at least they don't do the same as get_order. This function
>>> calculates the maximum order you can use so that there are no pages left
>>> over (i.e. if you have a size of 3145728 bytes (3MiB), this function will
>>> return order 9 (2MiB), while the other ones will return order 10 (4MiB)). I
>>> don't really understand why other places in the code request bigger orders
>>> and then free the leftovers; isn't this also causing memory shattering?
>> Sounds like we want something like get_order_{floor,ceil}() which makes
>> it obvious which way non-power-of-two sizes get rounded.
> Right, that makes sense, will rename the current one to ceil, and add the 
> floor variant.
>
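
For illustration, a minimal standalone sketch of the floor/ceil distinction
discussed above. The names order_floor() and order_ceil() are hypothetical
(they are not the helpers in the Xen tree), and 4KiB pages are assumed.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4KiB pages, as on x86 */

/* Largest order such that (1 << order) pages fit in nr_pages (round down). */
static unsigned int order_floor(uint64_t nr_pages)
{
    unsigned int order = 0;

    /* Stop at 63 so the shift count never reaches the width of the type. */
    while ( order < 63 && (nr_pages >> (order + 1)) != 0 )
        order++;

    return order;
}

/* Smallest order such that (1 << order) pages cover nr_pages (round up). */
static unsigned int order_ceil(uint64_t nr_pages)
{
    unsigned int order = order_floor(nr_pages);

    /* Not an exact power of two: one more order is needed to cover it. */
    if ( nr_pages > (UINT64_C(1) << order) )
        order++;

    return order;
}
```

With the 3MiB example from the thread (3145728 >> PAGE_SHIFT = 768 pages),
order_floor() gives 9 and order_ceil() gives 10; for an exact power of two
such as 512 pages both give 9.
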
>>>>> +            if ( order == 0 && memflags )
>>>>> +            {
>>>>> +                /* Try again without any memflags. */
>>>>> +                memflags = 0;
>>>>> +                order = MAX_ORDER;
>>>>> +                continue;
>>>>> +            }
>>>>> +            if ( order == 0 )
>>>>> +                panic("Unable to allocate memory with order 0!\n");
>>>>> +            order--;
>>>>> +            continue;
>>>>> +        }
>>>> It would be far more efficient to try and allocate only 1G and 2M
>>>> blocks.  Most of memory is free at this point, and it would allow the
>>>> use of HAP superpage mappings, which will be a massive performance boost
>>>> if they aren't shattered.
>>> That's what I'm trying to do, but we might have to use pages of lower order 
>>> when filling the smaller gaps.
>> As a general principle, we should try not to have any gaps.  This also
>> extends to guests using more intelligence when deciding how to mutate
>> their physmap.
> Yes, but in this case we are limited by the original e820 from the host.
> A DomU (without passthrough) will have all its memory contiguous.

Ah yes - that is a legitimate restriction.

>  
>>>  As an example, these are the stats when
>>> building a domain with 6048M of RAM:
>>>
>>> (XEN) Memory allocation stats:
>>> (XEN) Order 18: 5GB
>>> (XEN) Order 17: 512MB
>>> (XEN) Order 15: 256MB
>>> (XEN) Order 14: 128MB
>>> (XEN) Order 12: 16MB
>>> (XEN) Order 10: 8MB
>>> (XEN) Order  9: 4MB
>>> (XEN) Order  8: 2MB
>>> (XEN) Order  7: 1MB
>>> (XEN) Order  6: 512KB
>>> (XEN) Order  5: 256KB
>>> (XEN) Order  4: 128KB
>>> (XEN) Order  3: 64KB
>>> (XEN) Order  2: 32KB
>>> (XEN) Order  1: 16KB
>>> (XEN) Order  0: 4KB
>>>
>>> IMHO, they are quite good.
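
To make the shape of such a table easier to reason about, here is a rough
sketch of a largest-first split of a page count into power-of-two blocks.
It is not the code from the patch: split_by_order() and SKETCH_MAX_ORDER are
hypothetical names, the cap of order 18 is only chosen to match the largest
entry in the stats above, and the sketch ignores holes in the host e820, so
it will not reproduce those stats exactly.

```c
#include <stdint.h>

#define SKETCH_MAX_ORDER 18 /* assumption: cap at 1GiB blocks of 4KiB pages */

/*
 * Split nr_pages into power-of-two blocks, largest first, and record how
 * many pages end up at each order in pages_at[].
 */
static void split_by_order(uint64_t nr_pages,
                           uint64_t pages_at[SKETCH_MAX_ORDER + 1])
{
    unsigned int order;

    for ( order = 0; order <= SKETCH_MAX_ORDER; order++ )
        pages_at[order] = 0;

    while ( nr_pages != 0 )
    {
        /* Biggest order that still fits entirely in what is left. */
        order = SKETCH_MAX_ORDER;
        while ( (UINT64_C(1) << order) > nr_pages )
            order--;

        pages_at[order] += UINT64_C(1) << order;
        nr_pages -= UINT64_C(1) << order;
    }
}
```

For 768 pages (3MiB) this yields one order-9 block and one order-8 block,
with nothing left over. In a real memory map each usable e820 region would
have to be split separately, which is where the smaller orders come from.
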
>> What are the RAM characteristics of the host?  Do you have any idea what
>> the hap superpage characteristics are like after the guest has booted?
> This is the host RAM map:
>
> (XEN)  0000000000000000 - 000000000009c800 (usable)
> (XEN)  000000000009c800 - 00000000000a0000 (reserved)
> (XEN)  00000000000e0000 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 00000000ad662000 (usable)
> (XEN)  00000000ad662000 - 00000000adb1f000 (reserved)
> (XEN)  00000000adb1f000 - 00000000b228b000 (usable)
> (XEN)  00000000b228b000 - 00000000b2345000 (reserved)
> (XEN)  00000000b2345000 - 00000000b236a000 (ACPI data)
> (XEN)  00000000b236a000 - 00000000b2c9a000 (ACPI NVS)
> (XEN)  00000000b2c9a000 - 00000000b2fff000 (reserved)
> (XEN)  00000000b2fff000 - 00000000b3000000 (usable)
> (XEN)  00000000b3800000 - 00000000b8000000 (reserved)
> (XEN)  00000000f8000000 - 00000000fc000000 (reserved)
> (XEN)  00000000fec00000 - 00000000fec01000 (reserved)
> (XEN)  00000000fed00000 - 00000000fed04000 (reserved)
> (XEN)  00000000fed1c000 - 00000000fed20000 (reserved)
> (XEN)  00000000fee00000 - 00000000fee01000 (reserved)
> (XEN)  00000000ff000000 - 0000000100000000 (reserved)
> (XEN)  0000000100000000 - 0000000247000000 (usable)
>
> No idea about the HAP superpage characteristics, how can I fetch this 
> information? (I know I can dump the guest EPT tables, but that just 
> saturates the console).

Not easily, and also not the first time I have run into this problem. 
We really should have at least a debug way of identifying this.

For now, you can count how many times ept_split_super_page() gets called
at each level.
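
A very rough sketch of what such a counter could look like, assuming the
level of the entry being split is available at the call site. The names
split_count, note_superpage_split() and SKETCH_NR_LEVELS are hypothetical
debugging aids, not existing Xen code.

```c
#define SKETCH_NR_LEVELS 4 /* assumption: levels 0..3 */

/* Number of superpage splits observed at each level. */
static unsigned long split_count[SKETCH_NR_LEVELS];

/* Hypothetical hook: call once per split, passing the level being split. */
static void note_superpage_split(unsigned int level)
{
    /* Ignore out-of-range levels rather than writing past the array. */
    if ( level < SKETCH_NR_LEVELS )
        split_count[level]++;
}
```

The counters could then be dumped once the guest has booted, which avoids
walking the whole EPT and flooding the console.
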

~Andrew
