Re: [PATCH v5 4/6] x86: control memset() and memcpy() inlining
On 06/06/2025 at 11:21, Jan Beulich wrote:
> On 05.06.2025 19:34, Teddy Astie wrote:
>> On 05/06/2025 at 12:28, Jan Beulich wrote:
>>> Stop the compiler from inlining non-trivial memset() and memcpy() (for
>>> memset() see e.g. map_vcpu_info() or kimage_load_segments() for
>>> examples). This way we even keep the compiler from using REP STOSQ /
>>> REP MOVSQ when we'd prefer REP STOSB / REP MOVSB (when ERMS is
>>> available).
>>>
>>
>> If the size is known and constant, and the compiler is able to generate
>> a trivial rep movs/stos (usually with a mov $x, %ecx before), I don't
>> see a reason to prevent it or to force it to make a function call, as I
>> suppose it is very likely that the plain inline rep movs/stos will
>> perform better than a function call (even if it is not the preferred rep
>> movsb/stosb), possibly also being smaller.
>>
>> I wonder if it is possible to only generate inline rep movs/stos for
>> "trivial cases" (i.e. preceded by a plain mov $x, %ecx), and rely on
>> either inline movs or a function call in other cases (non-trivial ones).
>
> Note how the description starts with "Stop the compiler from inlining
> non-trivial ...", which indeed remain unaffected according to my
> observations (back at the time).

Yes, at least that is what appears to happen when testing with GCC 14.2.0,
where non-trivial memset/memcpy are replaced with explicit function calls,
and some trivial ones still use rep movsb/l/q.

Though,

> unrolled_loop:16:noalign,libcall:-1:noalign

to me sounds like:
- use an inline unrolled loop for memcpy/memset of 0-16 bytes
- call memset/memcpy for other cases (thus no rep prefix shall be used)

The meaning of align/noalign is only vaguely documented in the GCC
documentation, so it's unclear whether it only affects "non-aligned"
copies, or potentially all of them.

>
> Jan
>

Teddy


Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
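
For reference, a minimal sketch of a test file that could be used to check
the behaviour discussed above. It is not taken from the patch: the
structure, the function names and the file name are hypothetical, and
which of these calls the compiler expands inline is precisely what is to
be observed, not something the code guarantees.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical structure, 16 bytes on x86-64 (illustrative only). */
struct small_info {
    unsigned long a;
    unsigned long b;
};

/* Small constant size: falls in the 0-16 byte range of the strategy. */
void clear_small(struct small_info *info)
{
    memset(info, 0, sizeof(*info));
}

/* Size only known at run time. */
void clear_buffer(void *buf, size_t len)
{
    memset(buf, 0, len);
}

/* Constant but large size (one 4 KiB page). */
void copy_page(void *dst, const void *src)
{
    memcpy(dst, src, 4096);
}
```

Assuming the file is saved as test.c, it can be compiled to assembly with
something like
gcc -O2 -S -mmemset-strategy=unrolled_loop:16:noalign,libcall:-1:noalign -mmemcpy-strategy=unrolled_loop:16:noalign,libcall:-1:noalign test.c
and the resulting test.s inspected for rep stos / rep movs versus calls to
memset / memcpy in each function. Results may differ between GCC versions.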