
Re: [PATCH v5 4/6] x86: control memset() and memcpy() inlining



On 06/06/2025 at 11:21, Jan Beulich wrote:
> On 05.06.2025 19:34, Teddy Astie wrote:
>> On 05/06/2025 at 12:28, Jan Beulich wrote:
>>> Stop the compiler from inlining non-trivial memset() and memcpy() (for
>>> memset() see e.g. map_vcpu_info() or kimage_load_segments() for
>>> examples). This way we even keep the compiler from using REP STOSQ /
>>> REP MOVSQ when we'd prefer REP STOSB / REP MOVSB (when ERMS is
>>> available).
>>>
>>
>> If the size is known and constant, and the compiler is able to generate
>> a trivial rep movs/stos (usually preceded by a mov $x, %ecx), I don't
>> see a reason to prevent that or to force a function call. I suppose
>> it is very likely that the plain inline rep movs/stos will perform
>> better than a function call (even if it is not the preferred rep
>> movsb/stosb), and it may also end up smaller.
>>
>> I wonder if it is possible to only generate inline rep movs/stos for
>> "trivial cases" (i.e. preceded by a plain mov $x, %ecx), and rely on
>> either inline movs or a function call in other cases (non-trivial ones).
>
> Note how the description starts with "Stop the compiler from inlining
> non-trivial ...", which indeed remain unaffected according to my
> observations (back at the time).

Yes, at least that is what appears to happen when testing with GCC 14.2.0:
non-trivial memset/memcpy are replaced with explicit function calls, while
some trivial ones still use rep movsb/l/q.

Though,

> unrolled_loop:16:noalign,libcall:-1:noalign

to me sounds like:
- use an inline unrolled loop for memcpy/memset of 0-16 bytes
- call memset/memcpy in all other cases
(thus no rep prefix should be used at all)

The meaning of align/noalign is only vaguely documented in the GCC
documentation, so it's unclear whether it only affects "non-aligned" copies
or potentially all of them.
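
One way to check the reading above is a small stand-alone test file. This
is only a sketch: it assumes the quoted string is the value given to GCC's
-mmemcpy-strategy= and -mmemset-strategy= options (a comma-separated list
of alg:max_size:dest_align triplets), and the file, struct and function
names below are hypothetical, not taken from the patch or from Xen.

```c
/* strategy_demo.c: hypothetical test file, not part of the patch. */
#include <assert.h>
#include <string.h>

struct small { unsigned char b[16]; };   /* falls in the 0-16 byte range */
struct large { unsigned char b[256]; };  /* falls in the "-1" (rest) range */

/* Constant 16-byte copy: candidate for the inline unrolled loop. */
void copy_small(struct small *dst, const struct small *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Constant 256-byte copy: expected to become a call under "libcall". */
void copy_large(struct large *dst, const struct large *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Constant 256-byte clear: same question for memset(). */
void clear_large(struct large *p)
{
    memset(p, 0, sizeof(*p));
}

/* Functional check only: results must not depend on the chosen strategy. */
int self_check(void)
{
    struct small s1 = { { 1, 2, 3 } }, s2 = { { 0 } };
    struct large l1, l2;

    memset(&l1, 0xab, sizeof(l1));
    clear_large(&l2);
    assert(l2.b[0] == 0 && l2.b[255] == 0);

    copy_small(&s2, &s1);
    copy_large(&l2, &l1);
    assert(memcmp(&s1, &s2, sizeof(s1)) == 0);
    assert(memcmp(&l1, &l2, sizeof(l1)) == 0);
    return 0;
}
```

Building it with something like

  gcc -O2 -S -mmemcpy-strategy=unrolled_loop:16:noalign,libcall:-1:noalign \
      -mmemset-strategy=unrolled_loop:16:noalign,libcall:-1:noalign \
      strategy_demo.c

and then searching the generated assembly for "rep" and for calls to
memcpy/memset should show which of the two readings holds. What gets
emitted may well differ between GCC versions and tuning options, so this
is a way to observe the behaviour, not a statement of what it is.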

>
> Jan
>

Teddy


Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech





 

