[Minios-devel] [UNIKRAFT PATCH v4 7/7] arch/arm64: Reimplement ukarch_ffsl with gcc builtin
Using the builtin is better because it lets the GCC folks maintain the
function.

I've compared the performance of the original ffsl and the gcc builtin
on a ThunderX2 host (gcc version 7.4.0) with the default gcc
optimization options.

for (x=0; x<0xfffffff; x++)

original:
real 0m1.723s
user 0m1.723s
sys 0m0.000s

gcc builtin:
real 0m1.550s
user 0m1.546s
sys 0m0.004s

Signed-off-by: Jia He <justin.he@xxxxxxx>
---
 arch/arm/arm64/include/uk/asm/atomic.h | 25 ++-----------------------
 1 file changed, 2 insertions(+), 23 deletions(-)

diff --git a/arch/arm/arm64/include/uk/asm/atomic.h b/arch/arm/arm64/include/uk/asm/atomic.h
index fb7d3bc..029fc12 100644
--- a/arch/arm/arm64/include/uk/asm/atomic.h
+++ b/arch/arm/arm64/include/uk/asm/atomic.h
@@ -66,30 +66,9 @@ static inline unsigned int ukarch_fls(unsigned int x)
  *
  * Undefined if no bit exists, so code should check against 0 first.
  */
-static inline unsigned long ukarch_ffsl(unsigned long word)
+static inline unsigned long ukarch_ffsl(unsigned long x)
 {
-	int clz;
-
-	/* xxxxx10000 = word
-	 * xxxxx01111 = word - 1
-	 * 0000011111 = word ^ (word - 1)
-	 *          4 = 63 - clz(word ^ (word - 1))
-	 */
-
-	__asm__("sub x0, %[word], #1\n"
-		"eor x0, x0, %[word]\n"
-		"clz %[clz], x0\n"
-		:
-		/* Outputs: */
-		[clz] "=r"(clz)
-		:
-		/* Inputs: */
-		[word] "r"(word)
-		:
-		/* Clobbers: */
-		"x0");
-
-	return 63 - clz;
+	return __builtin_ffsl(x);
 }
 
 /**
-- 
2.17.1

_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/minios-devel
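
One point about the replacement in the patch above that may be worth checking: the removed code returns a 0-based bit index (the comment works out 4 for xxxxx10000), while GCC documents __builtin_ffsl as returning one plus the index of the least significant set bit, or 0 when the argument is 0. The sketch below is only an illustration of that offset. The helper names ffsl_index0, ffsl_builtin and check_ffsl_offset are hypothetical and not part of Unikraft, and the first helper restates the removed arithmetic with __builtin_clzl instead of the arm64 inline asm so that it builds on any GCC target.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: same arithmetic as the removed code,
 * (bits - 1) - clz(word ^ (word - 1)), written with __builtin_clzl
 * instead of inline asm. Yields a 0-based index of the lowest set bit.
 * As in the original, callers are expected to pass a non-zero word.
 */
static inline unsigned long ffsl_index0(unsigned long word)
{
	const unsigned long bits = sizeof(unsigned long) * CHAR_BIT;

	return (bits - 1) - (unsigned long)__builtin_clzl(word ^ (word - 1));
}

/* Hypothetical helper: what the patched function body returns.
 * __builtin_ffsl yields a 1-based index, or 0 for a zero argument.
 */
static inline unsigned long ffsl_builtin(unsigned long x)
{
	return (unsigned long)__builtin_ffsl((long)x);
}

/* For every single-bit value tried here, the builtin result is
 * exactly one more than the 0-based index.
 */
static inline void check_ffsl_offset(void)
{
	unsigned long bit;

	for (bit = 0; bit <= 20; bit++) {
		unsigned long x = 1UL << bit;

		assert(ffsl_index0(x) == bit);
		assert(ffsl_builtin(x) == bit + 1);
	}
}
```

If callers rely on the old 0-based result, __builtin_ctzl(x) or __builtin_ffsl(x) - 1 would keep it. Whether any caller does is not something this message settles.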