[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3] x86/x2apic: introduce a mixed physical/cluster mode



On Fri, Nov 03, 2023 at 03:44:30PM +0000, Andrew Cooper wrote:
> On 03/11/2023 3:34 pm, Roger Pau Monné wrote:
> > On Fri, Nov 03, 2023 at 03:10:18PM +0000, Andrew Cooper wrote:
> >> On 03/11/2023 2:45 pm, Roger Pau Monne wrote:
> >>> when sending those, as multiple CPUs can be targeted with a single ICR 
> >>> register
> >>> write.
> >>>
> >>> A simple test calling flush_tlb_all() 10000 times in a tight loop on a 96 
> >>> CPU
> >>> box gives the following average figures:
> >>>
> >>> Physical mode: 26617931ns
> >>> Mixed mode:    23865337ns
> >>>
> >>> So ~10% improvement versus plain Physical mode.  Note that Xen uses 
> >>> Cluster
> >>> mode by default, and hence is already using the fastest way for IPI 
> >>> delivery at
> >>> the cost of reducing the amount of vectors available system-wide.
> >> 96 looks suspiciously like an Intel number.  In nothing else, you ought
> >> to say which CPU is it, because microarchitecture matters.  By any
> >> chance can we try this on one of the Bergamos, to give us a datapoint at
> >> 512?
> > Let me see if I can grab the only one that's not broken.
> >
> > Those figures are from an Intel IceLake IIRC.  Cluster mode is the
> > default, so this change shouldn't effect the performance of builds
> > that use the default settings.
> 
> "shouldn't" being the operative word.
> 
> You're presenting evidence to try and convince the reader that the
> reasoning is correct.
> 
> Therefore, it is important to confirm your assumptions, and that means
> measuring and reporting all 3.
> 
> Part of the analysis should say "we expected mixed and cluster to be the
> same, and the results show that".
> 
> And obviously, if mixed and cluster are wildly different, then we take a
> step back and think harder.

If they are different I'm definitely not sending the patch :).

> >>> +};
> >>> +
> >>>  static int cf_check update_clusterinfo(
> >>>      struct notifier_block *nfb, unsigned long action, void *hcpu)
> >>>  {
> >>> @@ -220,38 +248,56 @@ static struct notifier_block x2apic_cpu_nfb = {
> >>>  static int8_t __initdata x2apic_phys = -1;
> >>>  boolean_param("x2apic_phys", x2apic_phys);
> >>>  
> >>> +enum {
> >>> +   unset, physical, cluster, mixed
> >>> +} static __initdata x2apic_mode = unset;
> >>> +
> >>> +static int __init parse_x2apic_mode(const char *s)
> >> cf_check
> > I'm probably confused, but other users of custom_param() do have the
> > cf_check attribute, see parse_spec_ctrl() for example.
> 
> Yes, and this function needs one but is missing it.
> 
> cf_check equates to "This function needs an ENDBR", which it does
> because it's invoked by function pointer.
> 
> It likely doesn't expode on a CET-active machine because this logic runs
> prior to turning CET-IBT on, and you'll only get a build error in the
> buster-ibt pipeline which has a still-unupstreamed GCC patch.

That was my guess, that CET wasn't yet active by the time this is
called.

For consistency we should fix all handlers of custom_param() (and
other command line parsing helpers) so they uniformly use cf_check,
otherwise it's likely I will make the same mistake when copy&paste to
introduce a new option.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.