Re: [PATCH] xen/arm: p2m_set_entry duplicate calculation.
Hi,

On 26/04/2022 16:37, Paran Lee wrote:
> Thank you, I agree! It made me think once more about what my patch could improve.
> The patches I sent have been reviewed in various ways. It was a good opportunity to analyze my patch from various perspectives. :)
>
> I checked the objdump output at -O2 (the default optimization level in the Xen Makefile) to make sure CSE (common subexpression elimination) works well with the latest arm64 cross compiler for x86_64 hosts from the Arm GNU Toolchain.
>
> $ ~/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-gcc -v
> ...
> A-profile Architecture 10.3-2021.07 (arm-10.29)'
> Thread model: posix
> Supported LTO compression algorithms: zlib
> gcc version 10.3.1 20210621 (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)
>
> I compared the output before and after my patches.
>
> This time, without adding a "pages" variable, I used the local variable mask with the order operation.
>
> I was able to confirm that it does one less operation.

Well... I don't think the one less operation is because of the introduction of the local variable (see more below).

1 << order is a 32-bit value but the second parameter is a 64-bit value (assuming arm64). So...

... this instruction is extending the 32-bit value to a 64-bit value.

This code is not only using a local variable but also using "1UL". So, I suspect that if you were using 1 << order, the instruction would re-appear.
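
To illustrate the type difference described above, here is a minimal C sketch. It is not the Xen code: advance(), step_int() and step_ulong() are hypothetical names, and the 64-bit second parameter is an assumption that matches an LP64 target such as arm64. The result of a shift has the type of its promoted left operand, so 1 << order is an int while 1UL << order is an unsigned long. Passing the int form to a 64-bit parameter requires a widening conversion, which a compiler may emit as an extra instruction depending on the compiler and flags.

```c
#include <stddef.h>

/* Hypothetical helper standing in for a function whose second
 * parameter is 64 bits wide on arm64 (assumption for illustration). */
static unsigned long advance(unsigned long base, unsigned long count)
{
    return base + count;
}

/* The shift is evaluated as an int, then converted to unsigned long
 * when it is passed as the second argument. */
static unsigned long step_int(unsigned long base, unsigned int order)
{
    return advance(base, 1 << order);
}

/* The shift is already evaluated as an unsigned long, so no
 * conversion is needed at the call. */
static unsigned long step_ulong(unsigned long base, unsigned int order)
{
    return advance(base, 1UL << order);
}

/* Size in bytes of each shift expression: the type follows the
 * promoted left operand, not the shift count. */
static size_t width_int_shift(void)
{
    return sizeof(1 << 1);
}

static size_t width_ulong_shift(void)
{
    return sizeof(1UL << 1);
}
```

For small values of order both forms give the same result. Only the type of the intermediate value differs, which is why the comparison needs to keep "1UL" out of the picture to isolate the effect of the local variable.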

Cheers,

--
Julien Grall