
Re: [Xen-devel] Converting heap page_infos to contiguous virtual



On 07/14/2016 06:34 AM, Andrew Cooper wrote:
> On 14/07/16 11:25, George Dunlap wrote:
>> On 13/07/16 21:57, Boris Ostrovsky wrote:
>>> On 07/13/2016 04:34 PM, Andrew Cooper wrote:
>>>> On 13/07/2016 21:17, Boris Ostrovsky wrote:
>>>>> On 07/13/2016 04:02 PM, Andrew Cooper wrote:
>>>>>> On 13/07/16 20:44, Boris Ostrovsky wrote:
>>>>>>> I would like to clear a bunch of Xen heap pages at once (i.e. not
>>>>>>> page-by-page).
>>>>>>>
>>>>>>> Greatly simplifying things, let's say I grab (in common/page_alloc.c)
>>>>>>>     pg = page_list_remove_head(&heap(node, zone, order));
>>>>>>>
>>>>>>> and then
>>>>>>>
>>>>>>>     mfn_t mfn = _mfn(page_to_mfn(pg));
>>>>>>>     char *va = mfn_to_virt(mfn_x(mfn));
>>>>>>>     memset(va, 0, 4096 * (1 << order));
>>>>>>>
>>>>>>>
>>>>>>> Would it be valid to do this?
>>>>>> In principle, yes.  The frame_table is in order.
>>>>>>
>>>>>> However, mfn_to_virt() will blow up for RAM above the 5TB boundary.  You
>>>>>> need to map_domain_page() to get a mapping.
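
A minimal sketch of the page-by-page approach being suggested here, built on
map_domain_page() and clear_page(); the wrapper function name is made up:

    /* Clear 2^order pages one at a time; works for any MFN. */
    static void clear_heap_pages(struct page_info *pg, unsigned int order)
    {
        unsigned long i, mfn = page_to_mfn(pg);

        for ( i = 0; i < (1UL << order); i++ )
        {
            void *va = map_domain_page(_mfn(mfn + i));

            clear_page(va);
            unmap_domain_page(va);
        }
    }
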
>>>>> Right, but that would mean going page-by-page, which I want to avoid.
>>>>>
>>>>> Now, DIRECTMAP_SIZE is ~128TB (if my math is correct) --- doesn't it
>>>>> imply that it maps this big a range contiguously (modulo PDX hole)?
>>>> Your maths is correct, and yet you will end up with problems if you
>>>> trust it.
>>>>
>>>> That is the magic mode for the idle and monitor pagetables.  In the
>>>> context of a 64bit PV guest, the cutoff is at 5TB, at which point you
>>>> venture into the virtual address space reserved for guest kernel use. 
>>>> (It is rather depressing that the 64bit PV guest ABI is the factor
>>>> limiting Xen's maximum RAM usage.)
>>> I don't know whether it would make any difference, but the pages that I am
>>> talking about are not in use by any guest; they are free. (This question is
>>> for the scrubbing rewrite that I am working on, which apparently you figured
>>> out, judging by what you are saying below.)
>> Is this start-of-day scrubbing (when there are no guests), or scrubbing
>> on guest destruction?
>>
>> If the former, it seems like it might not be too difficult to arrange
>> that we're in a context that has all the RAM mapped.
> This will be runtime scrubbing of pages.  

Actually, both. My prototype (apparently mistakenly) assumed that the whole
of RAM is mapped, so I used the same clearing code during both system boot
and guest destruction.

In the former case, on a 6TB box, scrubbing time went from minutes to
seconds. This was a while ago, so I don't remember the exact numbers (or
whether the system had 6TB or less). The clearing used AVX instructions.
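
Roughly, that fast path plus a per-page fallback might look like the sketch
below. mfns_in_directmap() is only a placeholder for whatever "is this range
covered by the always-mapped direct map" check turns out to be right; it is
not an existing helper:

    static void clear_pages(struct page_info *pg, unsigned int order)
    {
        unsigned long i, mfn = page_to_mfn(pg);

        /* Fast path: the whole range is covered by the direct map. */
        if ( mfns_in_directmap(mfn, 1UL << order) )    /* placeholder */
        {
            memset(mfn_to_virt(mfn), 0, PAGE_SIZE << order);
            return;
        }

        /* Slow path: map, clear and unmap one page at a time. */
        for ( i = 0; i < (1UL << order); i++ )
        {
            void *va = map_domain_page(_mfn(mfn + i));

            clear_page(va);
            unmap_domain_page(va);
        }
    }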


> This topic has come up at
> several hackathons.
>
> Currently, domain destroy on a 1TB VM takes ~10 mins of synchronously
> scrubbing RAM in continuations of the domain_kill() hypercall (and those
> database VMs really like their RAM).
>
> ISTR the plan was to have a page_info "dirty" flag and a dirty page list
> which is scrubbed while idle (or per-node, more likely). 
> alloc_{dom/xen}_heap_pages() can pull off the dirty or free list, doing
> a small synchronous scrub if it was dirty and needs to be clean. 
> domain_kill() can just do a page_list_splice() to move all memory onto
> the dirty list, and save 10 minutes per TB.  The boot time memory scrub
> can then be implemented in terms of setting the dirty flag by default,
> rather than being an explicit step.
>
> (Although I really shouldn't be second-guessing what Boris is planning
> to implement ;p)

Not exactly this, but something along those lines.

-boris
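
For reference, a rough sketch of the dirty-list scheme Andrew outlines above.
PGC_need_scrub, node_dirty_list[] and the function names are invented for the
illustration, and locking plus free-list bookkeeping are elided:

    /* domain_kill(): hand all memory back at once, marked as needing a scrub. */
    static void free_pages_dirty(struct domain *d, unsigned int node)
    {
        page_list_splice(&d->page_list, &node_dirty_list[node]);
    }

    /* Idle loop / per-node worker: scrub dirty pages in the background. */
    static void scrub_dirty_pages(unsigned int node)
    {
        struct page_info *pg;

        while ( (pg = page_list_remove_head(&node_dirty_list[node])) != NULL )
        {
            scrub_one_page(pg);
            pg->count_info &= ~PGC_need_scrub;
            /* ... then return pg to the appropriate free list. */
        }
    }

    /* Allocation path: scrub synchronously only if a dirty page was taken. */
    static void scrub_if_dirty(struct page_info *pg)
    {
        if ( pg->count_info & PGC_need_scrub )
        {
            scrub_one_page(pg);
            pg->count_info &= ~PGC_need_scrub;
        }
    }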


