[xen master] x86/bitops: Account for POPCNT errata on earlier Intel CPUs
commit 38adc2d7879c9a68b21a1dddb09d9c9f34d15ee4
Author: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
AuthorDate: Tue Mar 25 18:02:03 2025 +0000
Commit: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
CommitDate: Wed Mar 26 11:54:59 2025 +0000
x86/bitops: Account for POPCNT errata on earlier Intel CPUs
Manually break the false dependency for the benefit of cases such as
bitmap_weight() which is a reasonable hotpath.
Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
---
xen/arch/x86/include/asm/bitops.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/include/asm/bitops.h b/xen/arch/x86/include/asm/bitops.h
index bb9d756460..87eac7782f 100644
--- a/xen/arch/x86/include/asm/bitops.h
+++ b/xen/arch/x86/include/asm/bitops.h
@@ -488,10 +488,16 @@ static always_inline unsigned int arch_hweightl(unsigned long x)
*
* This limits the POPCNT instruction to using the same ABI as a function
* call (input in %rdi, output in %eax) but that's fine.
+ *
+ * On Intel CPUs prior to Cannon Lake, the POPCNT instruction has a false
+ * input dependency on it's destination register (errata HSD146, SKL029
+ * amongst others), impacting loops such as bitmap_weight(). Insert an
+ * XOR to manually break the dependency.
*/
alternative_io("call arch_generic_hweightl",
+ "xor %k[res], %k[res]\n\t"
"popcnt %[val], %q[res]", X86_FEATURE_POPCNT,
- ASM_OUTPUT2([res] "=a" (r) ASM_CALL_CONSTRAINT),
+ ASM_OUTPUT2([res] "=&a" (r) ASM_CALL_CONSTRAINT),
[val] "D" (x));
return r;
--
generated by git-patchbot for /home/xen/git/xen.git#master