Re: [PATCH v2 7/8] lib: move bsearch code
On 24.11.2020 01:40, Andrew Cooper wrote:
> On 23/11/2020 22:49, Julien Grall wrote:
>> On 19/11/2020 10:27, Jan Beulich wrote:
>>> On 18.11.2020 19:09, Julien Grall wrote:
>>>> On 23/10/2020 11:19, Jan Beulich wrote:
>>>>> --- a/xen/include/xen/compiler.h
>>>>> +++ b/xen/include/xen/compiler.h
>>>>> @@ -12,6 +12,7 @@
>>>>> #define inline __inline__
>>>>> #define always_inline __inline__ __attribute__
>>>>> ((__always_inline__))
>>>>> +#define gnu_inline __inline__ __attribute__ ((__gnu_inline__))
>>>>
>>>> bsearch() is only used by Arm and I haven't seen anyone so far
>>>> complaining about the perf of I/O emulation.
>>>>
>>>> Therefore, I am not convinced that there is enough justification to
>>>> introduce a GNU attribute just for this patch.
>>>
>>> Please settle this with Andrew: He had asked for the function to
>>> become inline. I don't view making it static inline in the header
>>> as an option here - if the compiler decides to not inline it, we
>>> should not end up with multiple instances in different CUs.
>>
>> That's the cons of static inline... but then why is it suddenly a
>> problem with this helper?
>>
>>> And
>>> without making it static inline the attribute needs adding; at
>>> least I'm unaware of an alternative which works with the various
>>> compiler versions.
>>
>> The question we have to answer is: What is the gain with this approach?
>
> Substantial.
>
>>
>> If it is not quantifiable, then introducing compiler specific
>> attribute is not an option.
>>
>> IIRC, there are only two callers (all in Arm code) of this function.
>> Even inlined, I don't believe you would drastically reduce the number
>> of instructions compare to a full blown version. To be generous, I
>> would say you may save ~20 instructions per copy.
>>
>> Therefore, so far, the compiler specific attribute doesn't look
>> justified to me. As usual, I am happy to be proven wrong
>
> There is a very good reason why this is the classic example used for
> extern inline's in various libc's.
>
> The gains are from the compiler being able to optimise away the function
> pointer(s) entirely. Instead of working on opaque objects, it can see
> the accesses directly, implement compares as straight array reads, (for
> sorting, the swap() call turns into memcpy()) and because it can see all
> the memory accesses, doesn't have to assume that every call to cmp()
> modifies arbitrary data in the array (i.e. doesn't have to reload the
> objects from memory every iteration).
>
> extern inline allows the compiler full flexibility to judge whether
> inlining is a net win, based on optimisation settings and observing what
> the practical memory access pattern would be from not inlining.
>
> extern inline is the appropriate thing to use here, except for the big
> note in the GCC manual saying "always use gnu_inline in this case" which
> appears to be working around a change in the C99 standard which forces
> any non-static inline to emit a body even when its not called, due to
> rules about global symbols.
>
> Therefore, Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Thanks Andrew. Julien - please clarify whether you're okay with
Andrew's response, or whether you continue to object the conversion
to inline.

> On a totally separate point, I wonder if we'd be better off compiling
> with -fgnu89-inline because I can't see any case we're we'd want the C99
> inline semantics anywhere in Xen.

I'm not sure about this, i.e. I wouldn't want to exclude such a
case appearing. I think using attributes is better in general,
as it allows fine grained control.

Jan
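
For readers following the thread, the extern inline arrangement under discussion can be sketched in one C file. This is a minimal illustration and not the code from the patch: demo_bsearch, cmp_uint, find_index and demo_table are hypothetical names, only the gnu_inline macro is taken from the quoted diff, and the sketch assumes GCC or Clang, both of which accept the gnu_inline attribute and allow such a definition to be followed by an ordinary one.

```c
#include <assert.h>
#include <stddef.h>

/* Macro as added to xen/include/xen/compiler.h by the quoted diff. */
#define gnu_inline __inline__ __attribute__ ((__gnu_inline__))

/*
 * "Header" part. With gnu_inline, an extern inline definition is used only
 * for inlining; it never emits an out-of-line body by itself.
 * demo_bsearch is a hypothetical name chosen to avoid clashing with libc.
 */
extern gnu_inline
void *demo_bsearch(const void *key, const void *base, size_t num, size_t size,
                   int (*cmp)(const void *key, const void *elt))
{
    size_t lo = 0, hi = num;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const char *elt = (const char *)base + mid * size;
        int res = cmp(key, elt);

        if (res == 0)
            return (void *)elt;
        if (res < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}

/* Hypothetical comparator: orders two unsigned ints. */
static int cmp_uint(const void *key, const void *elt)
{
    unsigned int k = *(const unsigned int *)key;
    unsigned int e = *(const unsigned int *)elt;

    return (k > e) - (k < e);
}

/* Hypothetical sorted table used by the caller below. */
static const unsigned int demo_table[] = { 1, 3, 5, 7, 9 };

/*
 * Caller. If the compiler chooses to inline demo_bsearch here, it can see
 * that cmp is cmp_uint and may replace the indirect call with direct
 * array reads. If it chooses not to, the call binds to the single
 * out-of-line copy defined further down.
 */
static int find_index(const unsigned int *tbl, size_t n, unsigned int key)
{
    const unsigned int *hit;

    assert(tbl != NULL);
    hit = demo_bsearch(&key, tbl, n, sizeof(*tbl), cmp_uint);
    return hit ? (int)(hit - tbl) : -1;
}

/*
 * ".c file" part. This is the one out-of-line copy of the function. In a
 * real tree it would live in its own translation unit; it is placed here
 * only so that the sketch links as a single file at any optimisation level.
 */
void *demo_bsearch(const void *key, const void *base, size_t num, size_t size,
                   int (*cmp)(const void *key, const void *elt))
{
    size_t lo = 0, hi = num;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const char *elt = (const char *)base + mid * size;
        int res = cmp(key, elt);

        if (res == 0)
            return (void *)elt;
        if (res < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}
```

Unlike static inline, this layout leaves at most one out-of-line instance in the final binary regardless of how many compilation units include the header, which is the property Jan asks for above. Whether the indirect call actually disappears depends on the compiler and optimisation settings; comparing the generated assembly for find_index at -O0 and -O2 is one way to check.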