|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 11/11] x86/bitops: Use the POPCNT instruction when available
On 29/08/2024 3:36 pm, Jan Beulich wrote:
> On 29.08.2024 00:03, Andrew Cooper wrote:
>> It has existed in x86 CPUs since 2008, so we're only 16 years late adding
>> support. With all the other scafolding in place, implement arch_hweightl()
>> for x86.
>>
>> The only complication is that the call to arch_generic_hweightl() is behind
>> the compilers back. Address this by writing it in ASM and ensure that it
>> preserves all registers.
>>
>> Copy the code generation from generic_hweightl(). It's not a complicated
>> algorithm, and is easy to regenerate if needs be, but cover it with the same
>> unit tests as test_generic_hweightl() just for piece of mind.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>
>> v2:
>> * Fix MISRA 8.2 (parameter name) and 8.5 (single declaration) regressions.
>> * Rename {arch->x86}-generic-hweightl.{S->c}
>> * Adjust ASM formating
>>
>> The __constructor trick to cause any reference to $foo() to pull in
>> test_$foo() only works when both are in the same TU. i.e. what I did in
>> v1 (putting test_arch_generic_hweightl() in the regular generic-hweightl.c)
>> didn't work.
> I'm afraid I don't understand this. What exactly didn't work, breaking in
> which
> way? Presumably as much as you, I don't really like the global asm() in a C
> file, when ideally the same could be written with less clutter in an assembly
> one.
We have lib-y because we wish to not include arch_generic_hweightl() if
it isn't referenced.
But, test_arch_generic_hweightl() unconditionally references
arch_generic_hweightl() (in CONFIG_SELF_TESTS builds).
So, the trick is to have both {test_,}arch_generic_hweightl() together
in the same object file, with test_* being entirely self contained as a
constructor.
That way, if and only if something else references
arch_generic_hweightl() do we pull in this object file, and pick up the
tests as side effect.
v1 of this patch had the test in a separate object file, causing
arch_generic_hweightl() to be referenced based on the inclusion criteria
for regular generic-hweightl.c
This patch causes the x86 build of Xen to no longer reference
generic_hweightl(), so it was excluded along with its own tests, and
test_arch_generic_hweightl().
Therefore, we had arch_generic_hweightl() in use, but the selftests not
included.
Both {test_,}arch_generic_hweightl() do need to be in the same TU for
this to work (nicely), and I'm very uninterested in writing
test_arch_generic_hweightl() in asm.
>
>> This in turn means that arch_generic_hweightl() needs writing in a global asm
>> block, and also that we can't use FUNC()/END(). While we could adjust it to
>> work for GCC/binutils, we can't have CPP macros in Clang-IAS strings.
> What does Clang different from gcc there? I was hoping that at least their
> pre-
> processors would work in (sufficiently) similar ways.
It's inside a string, not outside, so CPP on the C file does nothing.
Passing CPP over the intermediate .S would work, but we don't have an
intermediate .S with Clang-IAS.
I'm also not particularly interested in doubling up xen/linkage.h with
different quoting in !__ASSEMBLY__ case.
One other option which comes to mind is to have:
.macro func name
FUNC(\name)
.endm
which (I think) could be used as:
asm (
"FUNC foo\n\t"
...
"ret\n\t"
"END foo"
);
It reads almost the same, and avoids opencoding contents of FUNC, but
even this requires changing the __ASSEMBLY__-ness of linkage.h and I
can't see a nice way of making it happen.
Or we finally bump our minimum toolchain version... GCC 7 is already
old enough to be out of the recent LTS's...
~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |