
Re: [Xen-devel] [PATCH 2/4] xen: introduce an "scrub" free page list



On 06/17/2014 08:36 PM, Jan Beulich wrote:
>>>> On 17.06.14 at 13:49, <lliubbo@xxxxxxxxx> wrote:
>> Because of page scrubbing, it's very slow to destroy a domain with large 
>> memory.
>> It takes around 10 minutes when destroy a guest of nearly 1 TB of memory.
>>
>> This patch introduced a "scrub" free page list, pages on this list need to be
>> scrubbed before use. During domain destory just mark pages "PGC_need_scrub"
>> and then add them to this list, so that xl can return quickly.
>>
>> In alloc_domheap_pages():
>>   - If there are pages on the normal "heap" freelist, allocate them.
>>   - Try to allocate from the "scrub" free list with the same order and do
>>     scrubbing synchronously.
>>
>> In order to not fail high order allocate request, merge page trunks for
>> PGC_need_scrub pages.
>> Note: PCG_need_scrub pages and normal pages are not mergeable
> 
> The whole series needs sync-ing logically both with Konrad's recent
> boot time scrubbing changes (this change here really makes his
> change unnecessary - the affected pages could quite well be simply
> marked as needing scrubbing, getting dealt with by the logic added
> here) and with XSA-100's change.
> 

I agree. Konrad?

>> @@ -629,8 +632,24 @@ static struct page_info *alloc_heap_pages(
>>  
>>              /* Find smallest order which can satisfy the request. */
>>              for ( j = order; j <= MAX_ORDER; j++ )
>> +            {
>>                  if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
>>                      goto found;
>> +                
>> +                /* Try to scrub a page directly */
>> +                if ( (pg = page_list_remove_head(&scrub(node, zone, j))) )
>> +                {
>> +                    for ( i = 0; i < (1 << j); i++ )
> 
> Definitely not: You may have found a 4Tb chunk for a request of a
> single page. You don't want to scrub all 4Tb in that case. I think this
> needs to be put after the "found" label.
> 

Thank you for pointing that out; I'll move the scrubbing after the "found" label.


> Which raises the question on the need for separate _heap[] and
> _scrub[] - if chunks with different PGC_need_scrub flags aren't
> mergeable, why do you need separate arrays?
> 

I'd also prefer to use _heap[] only; it would be much simpler.

The main reason for the new _scrub[] list is that we need a way for idle
vcpus to find all pages that need scrubbing.

Adding a new page_list_head to struct page_info is too costly, and I
couldn't find an existing member of struct page_info it could be unioned with.

If we used _heap[] only, the idle vcpus would have to iterate over every
free page to check whether it needs scrubbing, which I don't think is efficient.

Better ideas are welcome.

>> +                    {
>> +                        if ( test_bit(_PGC_need_scrub, &(pg[i].count_info)) 
>> )
> 
> Please no pointless and inconsistent (with surrounding code)
> parentheses.
> 

Sorry for all of these issues; I wasn't sure whether the direction of
this series was right, so I didn't check it carefully.

>> +                        {
>> +                            scrub_one_page(&pg[i]);
>> +                            pg[i].count_info &= ~(PGC_need_scrub);
> 
> Again.
> 
>> +                    }
> 
> Hard tabs (also further down).
> 
>> @@ -859,6 +878,22 @@ static void free_heap_pages(
>>          midsize_alloc_zone_pages = max(
>>              midsize_alloc_zone_pages, total_avail_pages / 
>> MIDSIZE_ALLOC_FRAC);
>>  
>> +    if ( need_scrub )
>> +    {
>> +        /* Specail for tainted case */
>> +        if ( tainted )
>> +        {
>> +            for ( i = 0; i < (1 << order); i++ )
>> +                scrub_one_page(&pg[i]);
> 
> Are you trying to provoke another #MC for pages that got offlined?
> Just avoid setting PGC_need_scrub in this case and you're done.
> 

You are right; this will be fixed.

>> @@ -1250,6 +1320,11 @@ void __init end_boot_allocator(void)
>>  #endif
>>      }
>>  
>> +    for ( i = 0; i < MAX_NUMNODES; i++ )
>> +        for ( j = 0; j < NR_ZONES; j++ )
>> +            for ( order = 0; order <= MAX_ORDER; order++ )
>> +                INIT_PAGE_LIST_HEAD(&scrub(i, j, order));
>> +
> 
> Why is this not being done alongside the setting up of _heap[]?
> 
>> @@ -1564,22 +1639,21 @@ void free_domheap_pages(struct page_info *pg, 
>> unsigned int order)
>>           * domain has died we assume responsibility for erasure.
>>           */
>>          if ( unlikely(d->is_dying) )
>> -            for ( i = 0; i < (1 << order); i++ )
>> -                scrub_one_page(&pg[i]);
>> -
>> -        free_heap_pages(pg, order);
>> +            free_heap_pages(pg, order, 1);
>> +        else
>> +            free_heap_pages(pg, order, 0);
>>      }
>>      else if ( unlikely(d == dom_cow) )
>>      {
>>          ASSERT(order == 0); 
>>          scrub_one_page(pg);
>> -        free_heap_pages(pg, 0);
>> +        free_heap_pages(pg, 0, 0);
> 
> Is there a particular reason why you don't defer the scrubbing here?
> 

Will be fixed.

Thanks,
-Bob

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
