[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 02/70] xen/sort: Switch to an extern inline implementation
Hi, > On 16 Feb 2022, at 11:46, Julien Grall <julien@xxxxxxx> wrote: > > Hi, > > On 16/02/2022 10:44, Andrew Cooper wrote: >> On 16/02/2022 03:46, Stefano Stabellini wrote: >>> On Mon, 14 Feb 2022, Julien Grall wrote: >>>> On 14/02/2022 12:50, Andrew Cooper wrote: >>>>> There are exactly 3 callers of sort() in the hypervisor. Callbacks in a >>>>> tight >>>>> loop like this are problematic for performance, especially with Spectre v2 >>>>> protections, which is why extern inline is used commonly by libraries. >>>>> >>>>> Both ARM callers pass in NULL for the swap function, and while this might >>>>> seem >>>>> like an attractive option at first, it causes generic_swap() to be used, >>>>> which >>>>> forced a byte-wise copy. Provide real swap functions so the compiler can >>>>> optimise properly, which is very important for ARM downstreams where >>>>> milliseconds until the system is up matters. >>>> Did you actually benchmark it? Both those lists will have < 128 elements in >>>> them. So I would be extremely surprised if you save more than a few >>>> hundreds >>>> microseconds with this approach. >>>> >>>> So, my opinion on this approach hasn't changed. On v1, we discussed an >>>> approach that would suit both Stefano and I. Jan seemed to confirm that >>>> would >>>> also suit x86. >>> This patch series has become 70 patches and for the sake of helping >>> Andrew move forward in the quickest and most painless way possible, I >>> append the following using generic_swap as static inline. >>> >>> Julien, Bertrand, is that acceptable to you? >>> >>> Andrew, I know this is not your favorite approach but you have quite a >>> lot of changes to handle -- probably not worth focussing on one detail >>> which is pretty minor? >> It's not pretty minor. My version really is the best thing for ARM. > > >> The perf improvement alone, marginal as it may be in practice, is > > Our take on Arm has always been to avoid micro-optimization when the > resulting code is more difficult to maintain. > >> justification alone for the patch, and Bertrand's R-by is testament to this. > > I am not sure why calling out that Bertrand agreed means that everyone else > should accept your approach. > > This reminds me other series that have been blocked for a long time by you. > Yet you made no effort to compromise... How ironic. > >> But the reasons why getting rid the swap functions is important for >> CET-IBT on x86 are exactly the same as why getting rid of them on ARM >> will be important for BTI support. A tagged function doing an arbitrary >> bytewise swap from two parameters controlled by the third is far more >> valuable to an attackers gadget library than a typical function. > > This is a more compelling reason. However, I am a bit puzzled why it took you > so long to come up with this reason. > >> i.e. this proposed intermediary, if it compiles, is just busywork which >> someone else is going to have to revert in the future, along with having >> this argument again. > > Well, this argument would have never happened if your commit message > contained information about BTI. I agree that this would be nice to have in the commit message as a justification for the change. Cheers Bertrand > > Instead you decided to just mention the performance part despite me objecting > it and requesting for a different reason in v1 (see [1]). > > Anyway, I will wait for a reword of the commit message before lifting my > Nacked-by. > > Cheers, > > [1] > https://lore.kernel.org/xen-devel/f7bb7a08-4721-f2a8-8602-0cd1baf1f083@xxxxxxx/ > > -- > Julien Grall
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |