[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH for-4.14 v4] x86/tlb: fix assisted flush usage
On 26.06.2020 17:57, Roger Pau Monne wrote: > Commit e9aca9470ed86 introduced a regression when avoiding sending > IPIs for certain flush operations. Xen page fault handler > (spurious_page_fault) relies on blocking interrupts in order to > prevent handling TLB flush IPIs and thus preventing other CPUs from > removing page tables pages. Switching to assisted flushing avoided such > IPIs, and thus can result in pages belonging to the page tables being > removed (and possibly re-used) while __page_fault_type is being > executed. > > Force some of the TLB flushes to use IPIs, thus avoiding the assisted > TLB flush. Those selected flushes are the page type change (when > switching from a page table type to a different one, ie: a page that > has been removed as a page table) and page allocation. This sadly has > a negative performance impact on the pvshim, as less assisted flushes > can be used. Note the flush in grant-table code is also switched to > use an IPI even when not strictly needed. This is done so that a > common arch_flush_tlb_mask can be introduced and always used in common > code. > > Introduce a new flag (FLUSH_FORCE_IPI) and helper to force a TLB flush > using an IPI (flush_tlb_mask_sync, x86 only). Note that the flag is > only meaningfully defined when the hypervisor supports PV or shadow > paging mode, as otherwise hardware assisted paging domains are in > charge of their page tables and won't share page tables with Xen, thus > not influencing the result of page walks performed by the spurious > fault handler. > > Just passing this new flag when calling flush_area_mask prevents the > usage of the assisted flush without any other side effects. > > Note the flag is not defined on Arm. > > Fixes: e9aca9470ed86 ('x86/tlb: use Xen L0 assisted TLB flush when available') > Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> In principle Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> A few cosmetic remarks though: > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -2894,7 +2894,17 @@ static int _get_page_type(struct page_info *page, > unsigned long type, > ((nx & PGT_type_mask) == PGT_writable_page)) ) > { > perfc_incr(need_flush_tlb_flush); > - flush_tlb_mask(mask); > + if ( (x & PGT_type_mask) && > + (x & PGT_type_mask) <= PGT_root_page_table ) > + /* > + * If page was a page table make sure the flush is > + * performed using an IPI in order to avoid changing > + * the type of a page table page under the feet of > + * spurious_page_fault. > + */ > + flush_tlb_mask_sync(mask); > + else > + flush_tlb_mask(mask); Effectively this now is the only user of the new macro. I'd prefer avoiding its introduction (and hence avoiding the questionable "_sync" suffix), doing flush_mask(mask, FLUSH_TLB | (... ? FLUSH_FORCE_IPI : 0)); here and ... > @@ -148,9 +158,24 @@ void flush_area_mask(const cpumask_t *, const void *va, > unsigned int flags); > /* Flush specified CPUs' TLBs */ > #define flush_tlb_mask(mask) \ > flush_mask(mask, FLUSH_TLB) > +/* > + * Flush specified CPUs' TLBs and force the usage of an IPI to do so. This is > + * required for certain operations that rely on page tables themselves not > + * being freed and reused when interrupts are blocked, as the flush IPI won't > + * be fulfilled until exiting from that critical region. > + */ > +#define flush_tlb_mask_sync(mask) \ > + flush_mask(mask, FLUSH_TLB | FLUSH_FORCE_IPI) > #define flush_tlb_one_mask(mask,v) \ > flush_area_mask(mask, (const void *)(v), FLUSH_TLB|FLUSH_ORDER(0)) > > +/* > + * Alias the common code TLB flush helper to the sync one in order to be on > the > + * safe side. Note that not all calls from common code strictly require the > + * _sync variant. > + */ > +#define arch_flush_tlb_mask flush_tlb_mask_sync ... #define arch_flush_tlb_mask(mask) \ flush_mask(mask, FLUSH_TLB | FLUSH_FORCE_IPI) here. I'd be okay making these adjustments while committing, if you and others don't object. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |