[PATCH v7 4/7] x86: re-work memcpy()
Move the function to its own assembly file. Having it in C just for the
entire body to be an asm() isn't really helpful. Then have two flavors:
A "basic" version using qword steps for the bulk of the operation, and an
ERMS version for modern hardware, to be substituted in via alternatives
patching.
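
For illustration (a sketch, not part of the patch itself), the two effective
code paths - the original one, and the one with the ERMS alternative patched
in - come out roughly as:

    /* "Basic" flavor (no ERMS): qword bulk copy plus byte tail. */
    memcpy:
        mov     %rdx, %rcx      /* byte count */
        mov     %rdi, %rax      /* return value: original dest */
        and     $7, %edx        /* tail byte count */
        shr     $3, %rcx        /* qword count */
        rep movsq
        or      %edx, %ecx
        jz      1f
        rep movsb
    1:  ret

    /* ERMS flavor: a single REP MOVSB covers the entire copy; the
     * qword path beyond the patch site becomes dead code. */
    memcpy:
        mov     %rdx, %rcx
        mov     %rdi, %rax
        rep movsb
        ret
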
Alternatives patching, however, requires an extra precaution: it uses
memcpy() itself, and hence the function may patch itself. Luckily the
patched-in code only replaces the prolog of the original function. Make
sure it stays that way.
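
To make the constraint concrete, here is the self-patching sequence as a
sketch (layout illustrative only):

    memcpy:                     /* invoked by alternatives patching */
        mov     %rdx, %rcx      /* already executed by then */
        mov     %rdi, %rax      /* already executed by then */
        and     $7, %edx        /* <- patch site: these bytes get      */
        shr     $3, %rcx        /*    overwritten with "rep movsb; ret" */
        rep movsq               /* performs the overwrite itself, and
                                 * sits safely past the patch site */
        ...

By the time the REP MOVSQ stores into the patch site, execution has already
moved beyond it, so the running copy never executes partially patched
instructions - provided the copying instructions stay past the prolog.
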
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
Reviewed-by: Teddy Astie <teddy.astie@xxxxxxxxxx>
Reviewed-by: Jason Andryuk <jason.andryuk@xxxxxxx>
---
We may want to consider branching over the REP MOVSQ as well, if the
number of qwords turns out to be zero.
We may also want to consider using non-REP MOVS{L,W,B} for the tail.
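
One possible shape for both ideas, as a sketch (labels and instruction
choices are illustrative, not part of this patch):

        jrcxz   1f              /* no full qwords - skip REP MOVSQ */
        rep movsq
    1:  test    $4, %edx        /* non-REP tail: 0-7 bytes */
        jz      2f
        movsl
    2:  test    $2, %edx
        jz      3f
        movsw
    3:  test    $1, %edx
        jz      4f
        movsb
    4:  ret
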
TBD: We may further need a workaround similar to Linux's 8ca97812c3c8
("x86/mce: Work around an erratum on fast string copy
instructions").
TBD: Some older AMD CPUs have an issue with REP MOVS when source and
destination are misaligned with one another (modulo 32?), which may
require a separate memcpy() flavor.
---
v6: Move alternatives.c change to a separate patch.
v5: Re-base.
v4: Use CR2 write as serializing insn, and limit its use to boot time.
v3: Re-base.
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -47,6 +47,7 @@ obj-$(CONFIG_RETURN_THUNK) += indirect-t
obj-$(CONFIG_PV) += ioport_emulate.o
obj-y += irq.o
obj-$(CONFIG_KEXEC) += machine_kexec.o
+obj-y += memcpy.o
obj-y += memset.o
obj-y += mm.o x86_64/mm.o
obj-$(CONFIG_VM_EVENT) += monitor.o
--- /dev/null
+++ b/xen/arch/x86/memcpy.S
@@ -0,0 +1,20 @@
+#include <asm/asm_defns.h>
+
+FUNC(memcpy)
+ mov %rdx, %rcx /* byte count, for REP */
+ mov %rdi, %rax /* return value: original dest */
+ /*
+ * We need to be careful here: memcpy() is involved in alternatives
+ * patching, so the code doing the actual copying (i.e. past setting
+ * up registers) may not be subject to patching (unless further
+ * precautions were taken).
+ */
+ ALTERNATIVE "and $7, %edx; shr $3, %rcx", \
+ STR(rep movsb; RET), X86_FEATURE_ERMS
+ rep movsq /* copy the bulk in qword steps */
+ or %edx, %ecx /* tail byte count; %ecx is 0 after REP MOVSQ */
+ jz 1f
+ rep movsb /* copy the remaining 0-7 bytes */
+1:
+ RET
+END(memcpy)
--- a/xen/arch/x86/string.c
+++ b/xen/arch/x86/string.c
@@ -7,21 +7,6 @@
#include <xen/lib.h>
-void *(memcpy)(void *dest, const void *src, size_t n)
-{
- long d0, d1, d2;
-
- asm volatile (
- " rep ; movs"__OS" ; "
- " mov %k4,%k3 ; "
- " rep ; movsb "
- : "=&c" (d0), "=&D" (d1), "=&S" (d2)
- : "0" (n/BYTES_PER_LONG), "r" (n%BYTES_PER_LONG), "1" (dest), "2" (src)
- : "memory" );
-
- return dest;
-}
-
void *(memmove)(void *dest, const void *src, size_t n)
{
long d0, d1, d2;