[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [PATCH 3/9] xen/bitops: Convert 'hweight(x) > 1' to new multiple_bits_set()
Using hweight() is an especially expensive way of determining simply if multiple bits are set in a value. Worse, 4 of the 10 hweight() calls in Xen are of this form. Switch to the new multiple_bits_set() helper. This is far more efficient than the longhand hweight() algorithm and, owing to its simplicity, likely more efficient than even a dedicated instruction on a superscalar processor. No functional change. Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> --- CC: Jan Beulich <JBeulich@xxxxxxxx> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx> CC: Wei Liu <wl@xxxxxxx> CC: Stefano Stabellini <sstabellini@xxxxxxxxxx> CC: Julien Grall <julien@xxxxxxx> CC: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx> CC: Bertrand Marquis <bertrand.marquis@xxxxxxx> CC: Michal Orzel <michal.orzel@xxxxxxx> CC: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx> CC: Shawn Anastasio <sanastasio@xxxxxxxxxxxxxxxxxxxxx> On x86, the code reduction speaks for itself: add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-240 (-240) Function old new delta vlapic_ipi 722 650 -72 numa_emulation 577 497 -80 do_xenpmu_op 1665 1577 -88 That's an aweful lot of wasted calculation for the same answer. I can't find a way of enabling CONFIG_NUMA on ARM (yet?) so right now there's no change in any other architecture. However, a synthetic helper shows the following on ARM32: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-128 (-128) Function old new delta test_mbs 176 48 -128 and on ARM64: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-44 (-44) Function old new delta test_mbs 60 16 -44 PPC64 has POPCNTD in a default build of Xen and by chance both algorithms compile to the same number of instructions. RISC-V doesn't have hweight() wired up yet at all. --- xen/arch/x86/cpu/vpmu.c | 2 +- xen/arch/x86/hvm/vlapic.c | 10 ++++++---- xen/common/numa.c | 2 +- 3 files changed, 8 insertions(+), 6 deletions(-) diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c index b2ba9994129b..a5bb1689c7d5 100644 --- a/xen/arch/x86/cpu/vpmu.c +++ b/xen/arch/x86/cpu/vpmu.c @@ -673,7 +673,7 @@ long do_xenpmu_op( { if ( (pmu_params.val & ~(XENPMU_MODE_SELF | XENPMU_MODE_HV | XENPMU_MODE_ALL)) || - (hweight64(pmu_params.val) > 1) ) + multiple_bits_set(pmu_params.val) ) return -EINVAL; /* 32-bit dom0 can only sample itself. */ diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c index 2ec95942713e..4a3e21a65f9d 100644 --- a/xen/arch/x86/hvm/vlapic.c +++ b/xen/arch/x86/hvm/vlapic.c @@ -467,12 +467,14 @@ static bool is_multicast_dest(struct vlapic *vlapic, unsigned int short_hand, return short_hand != APIC_DEST_SELF; if ( vlapic_x2apic_mode(vlapic) ) - return dest_mode ? hweight16(dest) > 1 : dest == 0xffffffffU; + return dest_mode ? multiple_bits_set((uint16_t)dest) + : dest == 0xffffffffU; if ( dest_mode ) - return hweight8(dest & - GET_xAPIC_DEST_FIELD(vlapic_get_reg(vlapic, - APIC_DFR))) > 1; + { + dest &= GET_xAPIC_DEST_FIELD(vlapic_get_reg(vlapic, APIC_DFR)); + return multiple_bits_set((uint8_t)dest); + } return dest == 0xff; } diff --git a/xen/common/numa.c b/xen/common/numa.c index 28a09766fabc..ce3991929ce5 100644 --- a/xen/common/numa.c +++ b/xen/common/numa.c @@ -546,7 +546,7 @@ static int __init numa_emulation(unsigned long start_pfn, uint64_t sz = pfn_to_paddr(end_pfn - start_pfn) / numa_fake; /* Kludge needed for the hash function */ - if ( hweight64(sz) > 1 ) + if ( multiple_bits_set(sz) ) { uint64_t x = 1; -- 2.39.2
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |