[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/arm32: Get rid of __memzero()
Hi Jan, On 27/11/2024 11:09, Jan Beulich wrote: On 27.11.2024 11:55, Julien Grall wrote:From: Julien Grall <jgrall@xxxxxxxxxx> All the code in arch/arm32/lib/ where copied from Linux 3.16 and never re-synced since then. A few years ago, Linux got rid of __memzero() because the implementation is very similar to memset(p,0,n) and the current use of __memzero() interferes with optimization. See full commit message from Linux below. So it makes sense to get rid of __memzero in Xen as well. From ff5fdafc9e9702846480e0cea55ba861f72140a2 Mon Sep 17 00:00:00 2001 From: Nicolas Pitre <nicolas.pitre@xxxxxxxxxx> Date: Fri, 19 Jan 2018 18:17:46 +0100 Subject: [PATCH] ARM: 8745/1: get rid of __memzero() The __memzero assembly code is almost identical to memset's except for two orr instructions. The runtime performance of __memset(p, n) and memset(p, 0, n) is accordingly almost identical. However, the memset() macro used to guard against a zero length and to call __memzero at compile time when the fill value is a constant zero interferes with compiler optimizations. Arnd found tha the test against a zero length brings up some new warnings with gcc v8: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82103 And successively rremoving the test against a zero length and the call to __memzero optimization produces the following kernel sizes for defconfig with gcc 6: text data bss dec hex filename 12248142 6278960 413588 18940690 1210312 vmlinux.orig 12244474 6278960 413588 18937022 120f4be vmlinux.no_zero_test 12239160 6278960 413588 18931708 120dffc vmlinux.no_memzero So it is probably not worth keeping __memzero around given that the compiler can do a better job at inlining trivial memset(p,0,n) on its own. And the memset code already handles a zero length just fine. Suggested-by: Arnd Bergmann <arnd@xxxxxxxx> Signed-off-by: Nicolas Pitre <nico@xxxxxxxxxx> Acked-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> Acked-by: Arnd Bergmann <arnd@xxxxxxxx> Signed-off-by: Russell King <rmk+kernel@xxxxxxxxxxxxxxx> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ff5fdafc9e97 Signed-off-by: Julien Grall <jgrall@xxxxxxxxxx>Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> Thanks! with a suggestion:--- a/xen/arch/arm/README.LinuxPrimitives +++ b/xen/arch/arm/README.LinuxPrimitives @@ -108,10 +108,9 @@ linux/arch/arm/lib/memchr.S xen/arch/arm/arm32/lib/memchr.S linux/arch/arm/lib/memcpy.S xen/arch/arm/arm32/lib/memcpy.S linux/arch/arm/lib/memmove.S xen/arch/arm/arm32/lib/memmove.S linux/arch/arm/lib/memset.S xen/arch/arm/arm32/lib/memset.S -linux/arch/arm/lib/memzero.S xen/arch/arm/arm32/lib/memzero.Sfor i in copy_template.S memchr.S memcpy.S memmove.S memset.S \- memzero.S ; do + ; do diff -u linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i doneAlso do away with the line continuation at the same time? E.g. for i in copy_template.S memchr.S memcpy.S memmove.S memset.S; do diff -u linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i done I will go with this version because I don't expect the number of files to increase. Cheers, -- Julien Grall
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |