[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 11/11] x86/bitops: Use the POPCNT instruction when available
On 30.08.2024 22:03, Andrew Cooper wrote: > On 29/08/2024 3:36 pm, Jan Beulich wrote: >> On 29.08.2024 00:03, Andrew Cooper wrote: >>> It has existed in x86 CPUs since 2008, so we're only 16 years late adding >>> support. With all the other scafolding in place, implement arch_hweightl() >>> for x86. >>> >>> The only complication is that the call to arch_generic_hweightl() is behind >>> the compilers back. Address this by writing it in ASM and ensure that it >>> preserves all registers. >>> >>> Copy the code generation from generic_hweightl(). It's not a complicated >>> algorithm, and is easy to regenerate if needs be, but cover it with the same >>> unit tests as test_generic_hweightl() just for piece of mind. >>> >>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >>> --- >>> CC: Jan Beulich <JBeulich@xxxxxxxx> >>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx> >>> >>> v2: >>> * Fix MISRA 8.2 (parameter name) and 8.5 (single declaration) regressions. >>> * Rename {arch->x86}-generic-hweightl.{S->c} >>> * Adjust ASM formating >>> >>> The __constructor trick to cause any reference to $foo() to pull in >>> test_$foo() only works when both are in the same TU. i.e. what I did in >>> v1 (putting test_arch_generic_hweightl() in the regular generic-hweightl.c) >>> didn't work. >> I'm afraid I don't understand this. What exactly didn't work, breaking in >> which >> way? Presumably as much as you, I don't really like the global asm() in a C >> file, when ideally the same could be written with less clutter in an assembly >> one. > > We have lib-y because we wish to not include arch_generic_hweightl() if > it isn't referenced. > > But, test_arch_generic_hweightl() unconditionally references > arch_generic_hweightl() (in CONFIG_SELF_TESTS builds). Which I assumed was intentional: Always have the tests, to make sure the code doesn't go stale (or even break). Looking at the v2 code again I notice that while now you're running the tests only when the to-be-tested function is referenced from elsewhere, the testing has become independent of CONFIG_SELF_TESTS. I don't think that was intended? > Both {test_,}arch_generic_hweightl() do need to be in the same TU for > this to work (nicely), and I'm very uninterested in writing > test_arch_generic_hweightl() in asm. Well - surely absolutely no question about this last aspect. >>> This in turn means that arch_generic_hweightl() needs writing in a global >>> asm >>> block, and also that we can't use FUNC()/END(). While we could adjust it to >>> work for GCC/binutils, we can't have CPP macros in Clang-IAS strings. >> What does Clang different from gcc there? I was hoping that at least their >> pre- >> processors would work in (sufficiently) similar ways. > > It's inside a string, not outside, so CPP on the C file does nothing. > > Passing CPP over the intermediate .S would work, but we don't have an > intermediate .S with Clang-IAS. There's never an intermediate .S; it's always .s (not subject to C pre- processing). Therefore I continue to struggle with how CPP macros would come into (inconsistent) play here, unless they were used in construction of the string literal (and then work similarly for gcc and Clang). But as you say, we want to avoid playing with the intermediate file in any event. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |