|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3 3/7] x86: re-work memcpy()
On 26.11.2024 20:16, Andrew Cooper wrote:
> On 25/11/2024 2:28 pm, Jan Beulich wrote:
>> Move the function to its own assembly file. Having it in C just for the
>> entire body to be an asm() isn't really helpful. Then have two flavors:
>> A "basic" version using qword steps for the bulk of the operation, and an
>> ERMS version for modern hardware, to be substituted in via alternatives
>> patching.
>>
>> Alternatives patching, however, requires an extra precaution: It uses
>> memcpy() itself, and hence the function may patch itself. Luckily the
>> patched-in code only replaces the prolog of the original function. Make
>> sure this remains this way.
>>
>> Additionally alternatives patching, while supposedly safe via enforcing
>> a control flow change when modifying already prefetched code, may not
>> really be. Afaict a request is pending to drop the first of the two
>> options in the SDM's "Handling Self- and Cross-Modifying Code" section.
>> Insert a serializing instruction there.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> We may want to consider branching over the REP MOVSQ as well, if the
>> number of qwords turns out to be zero.
>> We may also want to consider using non-REP MOVS{L,W,B} for the tail.
>
> My feedback for patch 2 is largely applicable here too.
Sure, and I'll apply here whatever we decide to do there.
>> --- a/xen/arch/x86/alternative.c
>> +++ b/xen/arch/x86/alternative.c
>> @@ -153,12 +153,14 @@ void init_or_livepatch add_nops(void *in
>> * executing.
>> *
>> * "noinline" to cause control flow change and thus invalidate I$ and
>> - * cause refetch after modification.
>> + * cause refetch after modification. While the SDM continues to suggest
>> this
>> + * is sufficient, it may not be - issue a serializing insn afterwards as
>> well.
>
> Did you find a problem in practice, or is this just in case?
It's been too long, so I can now only guess that it's just in case. The
comment change, otoh, suggests otherwise.
> I suspect if you are seeing problems, then it's non-atomicity of the
> stores into memcpy() rather than serialisation.
How would atomicity (or not) matter here? There shouldn't be any difference
between a single and any number of stores into the (previously executed)
insn stream.
>> */
>> static void init_or_livepatch noinline
>> text_poke(void *addr, const void *opcode, size_t len)
>> {
>> memcpy(addr, opcode, len);
>> + cpuid_eax(0);
>
> This whole function is buggy in a couple of ways, starting with the
> comments.
>
> The comment about noinline and control flow changes is only really
> relevant to 32bit processors; we inherited that comment from Linux, and
> they're not applicable to Xen.
>
> AMD64 (both the APM, and SDM) guarantee that Self Modifying Code will be
> dealt with on your behalf, with no serialisation needed.
>
> Cross-modifying code needs far more severe serialisation than given
> here. We get away with it because alternative_{instructions,branches}()
> are pre-SMP, and apply_alternatives() is on livepatches prior to them
> becoming live.
>
>
> I happen to know there's an AMD CPU which has an erratum regarding Self
> Modifying Code and genuinely does need a serialising instruction, but I
> don't know which exact CPU it is.
Maybe I ran into that on one of the two older AMD systems I routinely
test on every once in a while?
> If we're going to put a serialising instruction, it should be a write to
> CR2. We don't care about 486 compatibility, and it's faster than CPUID
> and much much faster if virtualised because it's unlikely to be
> intercepted even under shadow paging.
>
> But, it would be nice not to put serialisation in the general case to
> begin with, especially not into the livepatching case.
If you're aware of an erratum there, how can we get away without any
serialization? I can surely switch to a CR2 write, and I can also make
this dependent upon system_state (thus excluding the LP case).
I notice that arch_livepatch_{apply,revert}() indeed use plain memcpy()
with just the noinline "protection". I wonder how well that works if a
livepatch actually touched the tail of either of these functions
(however unlikely that may be).
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |