[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 5/7] xen/x86: disable global pages for domains with XPTI active
On 23/03/18 09:14, Jan Beulich wrote: >>>> On 23.03.18 at 08:58, <jgross@xxxxxxxx> wrote: >> On 23/03/18 08:46, Jan Beulich wrote: >>>>>> On 22.03.18 at 19:18, <jgross@xxxxxxxx> wrote: >>>> On 22/03/18 17:30, Jan Beulich wrote: >>>>>>>> On 21.03.18 at 13:51, <jgross@xxxxxxxx> wrote: >>>>>> Instead of flushing the TLB from global pages when switching address >>>>>> spaces with XPTI being active just disable global pages via %cr4 >>>>>> completely when a domain subject to XPTI is active. This avoids the >>>>>> need for extra TLB flushes as loading %cr3 will remove all TLB >>>>>> entries. >>>>> >>>>> I continue to be not entirely convinced of this move. I had an >>>>> alternative in mind: Since retaining global pages is particularly >>>>> relevant for switches between guest user and guest kernel >>>>> modes, what if we made a shortcut from e.g. lstar_enter through >>>>> switch_to_kernel to restore_all_guest without ever switching to >>>>> the full page Xen tables? >>>> >>>> With patch 7 of this series in mind I'm not convinced the extra effort >>>> is really making sense. Today most processors do have PCID support so >>>> for that old hardware I don't think we need to make the handling even >>>> more complex. >>> >>> PCID yes, but INVPCID (and we're talking about Intel hardware >>> here only anyway)? But yes, the extra complexity is what has kept >>> me so far from investing time here. >> >> As PCID seems to speed up XPTI only and XPTI is necessary for INTEL cpus >> only I don't see a problem here. :-) > > Well, yes as far as AMD is unaffected, but not entirely yes to the > rest, as I intentionally pointed out the difference of availability of > PCID (which even Westmere's have) and INVPCID (which only my > Haswell has). Haswells are out for nearly 5 years now. I think in case we want to do something else here we should delay that to 4.12. Even without PCID this patch is speeding up XPTI handling significantly (parallel hypervisor compilation: elapsed time -6%, system time -12%), so I'm not making anything worse. BTW: while Westmere is supporting PCID I remember having done some PCID testing with Westmere not showing any performance gains at all. > >>>>>> --- a/xen/arch/x86/mm.c >>>>>> +++ b/xen/arch/x86/mm.c >>>>>> @@ -508,18 +508,23 @@ void make_cr3(struct vcpu *v, mfn_t mfn) >>>>>> void write_ptbase(struct vcpu *v) >>>>>> { >>>>>> struct cpu_info *cpu_info = get_cpu_info(); >>>>>> + unsigned long new_cr4; >>>>>> + >>>>>> + new_cr4 = (is_pv_vcpu(v) && !is_idle_vcpu(v)) >>>>>> + ? pv_guest_cr4_to_real_cr4(v) : mmu_cr4_features; >>>>> >>>>> I'm not overly happy to see any new uses of mmu_cr4_features. >>>>> This should really only be used for priming certain values imo, >>>>> which isn't the case here (otoh pv_guest_cr4_to_real_cr4() does >>>>> so too, and perhaps better wouldn't). Hence I wonder whether >>>>> this shouldn't be read_cr4() | X86_CR4_PGE, not the least >>>>> because we've just got rid of the blanket reversion to >>>>> mmu_cr4_features in VMX code. >>>> >>>> I do understand that using mmu_cr4_features isn't the best way to set >>>> cr4. But I think it is a good idea to have a default value which should >>>> normally be used instead of only switching various bits on and off. >>>> >>>> In case cr4 is loaded with a strange value in some corner case that >>>> value might be used from then on instead of being repaired by loading a >>>> dedicated value at certain points in time, e.g. when doing a context >>>> switch. >>> >>> But that would make it even more difficult to notice and debug a >>> possible problem. The more we play with CR4 bits, the more >>> important it is that we keep an accurate record of what is currently >>> loaded into it, and that we have a clear understanding of what we >>> mean to be loaded into the register at any given point in time. >> >> What about adding an appropriate ASSERT() for that case? >> >> So I would use read_cr4() | X86_CR4_PGE and ASSERT() that cr4 matches >> the default value. > > That's an option; later we may want to sprinkle around a few more > such assertions. Okay, I'll go that route then. Do you want me to use a new default_cr4 variable for doing the assertion (and as initial cr4 value for secondary cpus), or are you fine with using mmu_cr4_features for now? Juergen _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |