
Re: [Xen-devel] [PATCH v2] x86/mm: also flush TLB when putting writable foreign page reference

On 03/05/17 08:21, Jan Beulich wrote:
>>>> On 02.05.17 at 19:37, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 02/05/17 10:43, Tim Deegan wrote:
>>> At 02:50 -0600 on 02 May (1493693403), Jan Beulich wrote:
>>>>>>> On 02.05.17 at 10:32, <tim@xxxxxxx> wrote:
>>>>> At 04:52 -0600 on 28 Apr (1493355160), Jan Beulich wrote:
>>>>>>>>> On 27.04.17 at 11:51, <tim@xxxxxxx> wrote:
>>>>>>> At 03:23 -0600 on 27 Apr (1493263380), Jan Beulich wrote:
>>>>>>>> ... it wouldn't better be the other way around: We use the patch
>>>>>>>> in its current (or even v1) form, and try to do something about
>>>>>>>> performance only if we really find a case where it matters. To be
>>>>>>>> honest, I'm not even sure how I could meaningfully measure the
>>>>>>>> impact here: Simply counting how many extra flushes there would
>>>>>>>> end up being wouldn't seem all that useful, and whether there
>>>>>>>> would be any measurable difference in the overall execution time
>>>>>>>> of e.g. domain creation I would highly doubt (but if that's what
>>>>>>>> you're after, I could certainly collect a few numbers).
>>>>>>> I think that would be a good idea, just as a sanity-check.
>>>>>> As it turns out there is a measurable effect: xc_dom_boot_image()
>>>>>> for a 4Gb PV guest takes about 70% longer now. Otoh it is itself
>>>>>> responsible for less than 10% of the overall time libxl__build_dom()
>>>>>> takes, and that in turn is only a pretty small portion of the overall
>>>>>> "xl create".
>>>>> Do you think that slowdown is OK?  I'm not sure -- I'd be inclined to
>>>>> avoid it, but could be persuaded, and it's not me doing the work. :)
>>>> Well, if there was a way to avoid it in a clean way without too much
>>>> code churn, I'd be all for avoiding it. The avenues we've explored so
>>>> far either didn't work (using pg_owner's dirty mask) or didn't promise
>>>> to actually reduce the flush overhead in a meaningful way (adding a
>>>> separate mask to be merged into the mask used for the flush in
>>>> __get_page_type()), unless - as has been the case before - I didn't
>>>> fully understand your thoughts there.
>>> Quoting your earlier response:
>>>> Wouldn't it suffice to set bits in this mask in put_page_from_l1e()
>>>> and consume/clear them in __get_page_type()? Right now I can't
>>>> see it being necessary for correctness to fiddle with any of the
>>>> other flushes using the domain dirty mask.
>>>> But then again this may not be much of a win, unless the put
>>>> operations come through in meaningful batches, not interleaved
>>>> by any type changes (the latter ought to be guaranteed during
>>>> domain construction and teardown at least, as the guest itself
>>>> can't do anything at that time to effect type changes).
>>> I'm not sure how much batching there needs to be.  I agree that the
>>> domain creation case should work well though.  Let me think about the
>>> scenarios when dom B is live:
>>> 1. Dom A drops its foreign map of page X; dom B immediately changes the
>>> type of page X.  This case isn't helped at all, but I don't see any
>>> way to improve it -- dom A's TLBs need to be flushed right away.
>>> 2. Dom A drops its foreign map of page X; dom B immediately changes
>>> the type of page Y.  Now dom A's dirty CPUs are in the new map, but B
>>> may not need to flush them right away.  B can filter by page Y's
>>> timestamp, and flush (and clear) only some of the cpus in the map.
>>> So that seems good, but then there's a risk that cpus never get
>>> cleared from the map, and __get_page_type() ends up doing a lot of
>>> unnecessary work filtering timestamps.  When is it safe to remove a CPU
>>> from that map?
>>>  - obvs safe if we IPI it to flush the TLB (though may need memory
>>>    barriers -- need to think about a race with CPU C putting A _into_
>>>    the map at the same time...)
>>>  - we could track the timestamp of the most recent addition to the
>>>    map, and drop any CPU whose TLB has been flushed since that,
>>>    but that still lets unrelated unmaps keep CPUs alive in the map...
>>>  - we could double-buffer the map: always add CPUs to the active map;
>>>    from time to time, swap maps and flush everything in the non-active
>>>    map (filtered by the TLB timestamp when we last swapped over).
>>> Bah, this is turning into a tar pit.  Let's stick to the v2 patch as
>>> being (relatively) simple and correct, and revisit this if it causes
>>> trouble. :)
>> :(
>> A 70% performance hit for guest creation is certainly going to cause
>> problems, but we obviously need to prioritise correctness in this case.
> Hmm, you did understand that the 70% hit is on a specific sub-part
> of the overall process, not guest creation as a whole? Anyway,
> your reply is neither an ack nor a nak nor an indication of what needs
> to change ...

Yes - I realise it isn't all of domain creation, but this performance
hit will also hit migration, qemu DMA mappings, etc.

XenServer has started a side-by-side performance work-up of this change,
as presented at the root of this thread.  We should hopefully have some
numbers in the next day or two.

