[PATCH v4 4/6] x86: control memset() and memcpy() inlining
Stop the compiler from inlining non-trivial memset() and memcpy() (for
memset() see e.g. map_vcpu_info() or kimage_load_segments() for
examples). This way we even keep the compiler from using REP STOSQ /
REP MOVSQ when we'd prefer REP STOSB / REP MOVSB (when ERMS is
available).

With gcc10 this yields a modest .text size reduction (release build) of
around 2k.

Unfortunately these options aren't understood by the clang versions I
have readily available for testing with; I'm unaware of equivalents.

Note also that using cc-option-add is not an option here, or at least I
couldn't make things work with it (in case the option was not supported
by the compiler): The embedded comma in the option looks to be getting
in the way.

Requested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
v3: Re-base.
v2: New.
---
The boundary values are of course up for discussion - I wasn't really
certain whether to use 16 or 32; I'd be less certain about using yet
larger values.

Similarly whether to permit the compiler to emit REP STOSQ / REP MOVSQ
for known size, properly aligned blocks is up for discussion.

--- a/xen/arch/x86/arch.mk
+++ b/xen/arch/x86/arch.mk
@@ -65,6 +65,9 @@ endif
 $(call cc-option-add,CFLAGS_stack_boundary,CC,-mpreferred-stack-boundary=3)
 export CFLAGS_stack_boundary

+CFLAGS += $(call cc-option,$(CC),-mmemcpy-strategy=unrolled_loop:16:noalign$(comma)libcall:-1:noalign)
+CFLAGS += $(call cc-option,$(CC),-mmemset-strategy=unrolled_loop:16:noalign$(comma)libcall:-1:noalign)
+
 ifeq ($(CONFIG_UBSAN),y)
 # Don't enable alignment sanitisation.  x86 has efficient unaligned accesses,
 # and various things (ACPI tables, hypercall pages, stubs, etc) are wont-fix.
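
On the cc-option-add remark in the description: the fragment below is a minimal, stand-alone sketch of one way an embedded comma can get lost in GNU make. It assumes a wrapper that re-expands its argument through $(eval). The names `show` and `wrap` are hypothetical and are not Xen's macros, so this illustrates the general mechanism and may not be the exact failure seen with cc-option-add.

```make
comma := ,

# Hypothetical helper: reports the first argument it receives.
show = $(info arg1=[$(1)])

# Hypothetical wrapper: pastes its argument into text handed to $(eval),
# which then parses that text a second time.
wrap = $(eval $$(call show,$(1)))

# Direct call: make splits the arguments on commas before expanding them,
# so $(comma) survives and the whole strategy string arrives as $(1).
$(call show,unrolled_loop:16:noalign$(comma)libcall:-1:noalign)

# Via the wrapper: $(1) has already become text with a literal comma by the
# time $(eval) re-parses it, so the inner call sees only the first triplet
# as $(1) and the remainder as a separate argument.
$(call wrap,unrolled_loop:16:noalign$(comma)libcall:-1:noalign)

# Dummy target so that "make -f <this file>" has something to build.
all: ; @:
```

Running this with GNU make should report the full two-triplet string for the direct call and only `unrolled_loop:16:noalign` for the wrapped call. That is consistent with calling cc-option directly and passing the comma as $(comma), as the patch does.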