
Re: [Xen-devel] [PATCH v2 3/3] xen: use idle vcpus to scrub pages

On 07/25/2014 03:36 PM, Jan Beulich wrote:
>>>> On 25.07.14 at 09:28, <bob.liu@xxxxxxxxxx> wrote:
>> On 07/25/2014 02:51 PM, Jan Beulich wrote:
>>>>>> On 25.07.14 at 02:42, <bob.liu@xxxxxxxxxx> wrote:
>>>> On 07/24/2014 02:24 PM, Jan Beulich wrote:
>>>>>>>> On 24.07.14 at 04:08, <bob.liu@xxxxxxxxxx> wrote:
>>>>>> On 07/23/2014 03:28 PM, Jan Beulich wrote:
>>>>>>>>>> On 15.07.14 at 11:16, <bob.liu@xxxxxxxxxx> wrote:
>>>>>>>> After many days I still haven't been able to come up with a workable
>>>>>>>> solution that doesn't remove pages temporarily. The hardest part is
>>>>>>>> iterating the heap free list without holding heap_lock, because
>>>>>>>> holding the lock might cause heavy lock contention.
>>>>>>>> So do you think this patch is acceptable if all the other concerns
>>>>>>>> about it are fixed?
>>>>>>> No, I don't think so. Instead I'm of the opinion that you may have
>>>>>>> worked in the wrong direction: Rather than not taking the heap lock
>>>>>>> at all, it may also be sufficient to shrink the lock holding time (i.e.
>>>>>>> avoid long loops with the lock held).
>>>>>> But I still think we have to drop pages from the heap list
>>>>>> temporarily, else the heap lock must be held for a long time to get
>>>>>> rid of e.g. the race condition below.
>>>>>> A: alloc path            B: idle loop
>>>>>>                          spin_lock(&heap_lock)
>>>>>>                          page_list_for_each( pg, &heap(node, zone, order) )
>>>>>>                            if _PGC_need_scrub is set, break;
>>>>>>                          spin_unlock(&heap_lock)
>>>>>>                          if ( test_bit(_PGC_need_scrub, pg) )
>>>>>>                          ^^^^ window: A runs here
>>>>>> spin_lock(&heap_lock)
>>>>>> delist page
>>>>>> spin_unlock(&heap_lock)
>>>>>> write data to this page
>>>>>>                          scrub_one_page(pg)
>>>>>>                          ^^^ will clear useful data
>>>>> No (and I'm sure I said so before): The only problem is with the
>>>>> linked list itself; the page contents are not a problem - the
>>>>> allocation path can simply wait for the already suggested
>>>>> _PGC_scrubbing flag to clear before returning. And as already
>>>> The page contents are a problem if the race condition I mentioned in
>>>> my previous email happens.
>>>> Because there is a time window between checking the PGC_need_scrub
>>>> flag and doing the real scrub in the idle thread, the idle thread may
>>>> still scrub a page after that page has been allocated by the
>>>> allocation path and used (i.e. useful data has been written to it).
>>> Did you really read all of my previous reply?
>> Sure, maybe I misunderstood your reply.
>> If the allocation path can wait for the flag there is no problem, but I
>> remember you suggested doing the scrubbing in the allocation path as
>> well, in which case I think this race condition will happen.
> Two CPUs scrubbing the same page simultaneously is not a
> correctness problem, only a performance one: The ultimate result
> is a scrubbed page, but there may be unnecessary bus contention.

Thank you for your patience!

If CPU0 scrubs a page in the allocation path, the page is then allocated
successfully and has data written to it, but CPU1 doesn't recognize this
situation and scrubs the same page again, clearing the useful data by
mistake.

The race condition is like this:

CPU0:                   CPU1:
allocation path         idle vcpu thread

                        find a need_scrub page in the heap list
                        if ( test_bit(_PGC_need_scrub) )
delist the same page
scrub the page

write data to this
page when a user
gets this page

                        scrub the page
                        clear the need_scrub flag

When CPU1 finally does the scrubbing, the page has already been allocated
and written to by CPU0; that's a seriously bad situation.
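To make the interleaving concrete, here is a minimal single-threaded C model that replays the steps above in order (the struct fields and function names are made up for illustration; this is not the actual Xen code, which uses page_info and the real _PGC_need_scrub bit):

```c
#include <stdbool.h>

/* Illustrative stand-in for a heap page; not Xen's struct page_info. */
struct page {
    bool need_scrub; /* stale-data flag */
    int  data;       /* 0 == scrubbed, anything else == useful guest data */
};

/* Replays the bad interleaving step by step and returns the page
 * contents the user of the page ends up with. */
static int replay_race(struct page *pg)
{
    /* CPU1 (idle vcpu): checks the flag, then loses the race window. */
    bool cpu1_saw_need_scrub = pg->need_scrub;

    /* CPU0 (alloc path): delists the page, scrubs it, hands it out,
     * and the new owner writes useful data. */
    pg->need_scrub = false;
    pg->data = 42;

    /* CPU1 resumes: its stale flag check still says "scrub", so it
     * wipes a page that is already in use. */
    if (cpu1_saw_need_scrub)
        pg->data = 0; /* scrub_one_page(pg) */

    return pg->data;
}
```

Replaying this interleaving always ends with the user's data wiped, which is the correctness problem (not just a performance one) being argued here.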

Currently I still don't have a better idea for avoiding all of these
problems other than dropping pages from the heap list temporarily.
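A minimal single-threaded sketch of that "drop pages temporarily" scheme (all names are hypothetical and heap_lock is only modelled by comments; this is not the patch's actual code): because the idle vcpu delists the page under the lock before scrubbing, the allocation path can never see, and therefore never hand out, a page that is mid-scrub.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for a heap page; not Xen's struct page_info. */
struct page {
    bool need_scrub;   /* page still holds stale data */
    bool on_heap_list; /* page is visible to the allocation path */
};

/* Idle vcpu, step 1: take the dirty page off the heap list while
 * holding the (modelled) heap_lock. */
static void delist_for_scrub(struct page *pg)
{
    /* spin_lock(&heap_lock); */
    pg->on_heap_list = false;
    /* spin_unlock(&heap_lock); */
}

/* Idle vcpu, step 2: scrub outside the lock, then relist the page. */
static void scrub_and_relist(struct page *pg)
{
    pg->need_scrub = false;      /* scrub_one_page(pg), lock not held */
    /* spin_lock(&heap_lock); */
    pg->on_heap_list = true;
    /* spin_unlock(&heap_lock); */
}

/* Allocation path: only pages still on the heap list are candidates,
 * so a page currently being scrubbed cannot be allocated. */
static struct page *try_alloc(struct page *pg)
{
    return pg->on_heap_list ? pg : NULL;
}
```

The lock is only held for the constant-time delist and relist steps, which is what keeps the holding time short while still closing the window exploited by the race above.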


Xen-devel mailing list