[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm32: Get rid of __memzero()



Hi Jan,

On 27/11/2024 11:09, Jan Beulich wrote:
On 27.11.2024 11:55, Julien Grall wrote:
From: Julien Grall <jgrall@xxxxxxxxxx>

All the code in arch/arm32/lib/ where copied from Linux 3.16
and never re-synced since then.

A few years ago, Linux got rid of __memzero() because the implementation
is very similar to memset(p,0,n) and the current use of __memzero()
interferes with optimization. See full commit message from Linux below.

So it makes sense to get rid of __memzero in Xen as well.

     From ff5fdafc9e9702846480e0cea55ba861f72140a2 Mon Sep 17 00:00:00 2001
     From: Nicolas Pitre <nicolas.pitre@xxxxxxxxxx>
     Date: Fri, 19 Jan 2018 18:17:46 +0100
     Subject: [PATCH] ARM: 8745/1: get rid of __memzero()

     The __memzero assembly code is almost identical to memset's except for
     two orr instructions. The runtime performance of __memset(p, n) and
     memset(p, 0, n) is accordingly almost identical.

     However, the memset() macro used to guard against a zero length and to
     call __memzero at compile time when the fill value is a constant zero
     interferes with compiler optimizations.

     Arnd found tha the test against a zero length brings up some new
     warnings with gcc v8:

       https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82103

     And successively rremoving the test against a zero length and the call
     to __memzero optimization produces the following kernel sizes for
     defconfig with gcc 6:

         text     data     bss       dec       hex  filename
     12248142  6278960  413588  18940690   1210312  vmlinux.orig
     12244474  6278960  413588  18937022   120f4be  vmlinux.no_zero_test
     12239160  6278960  413588  18931708   120dffc  vmlinux.no_memzero

     So it is probably not worth keeping __memzero around given that the
     compiler can do a better job at inlining trivial memset(p,0,n) on its
     own. And the memset code already handles a zero length just fine.

     Suggested-by: Arnd Bergmann <arnd@xxxxxxxx>
     Signed-off-by: Nicolas Pitre <nico@xxxxxxxxxx>
     Acked-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
     Acked-by: Arnd Bergmann <arnd@xxxxxxxx>
     Signed-off-by: Russell King <rmk+kernel@xxxxxxxxxxxxxxx>

Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
ff5fdafc9e97
Signed-off-by: Julien Grall <jgrall@xxxxxxxxxx>

Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

Thanks!

with a suggestion:

--- a/xen/arch/arm/README.LinuxPrimitives
+++ b/xen/arch/arm/README.LinuxPrimitives
@@ -108,10 +108,9 @@ linux/arch/arm/lib/memchr.S             
xen/arch/arm/arm32/lib/memchr.S
  linux/arch/arm/lib/memcpy.S             xen/arch/arm/arm32/lib/memcpy.S
  linux/arch/arm/lib/memmove.S            xen/arch/arm/arm32/lib/memmove.S
  linux/arch/arm/lib/memset.S             xen/arch/arm/arm32/lib/memset.S
-linux/arch/arm/lib/memzero.S            xen/arch/arm/arm32/lib/memzero.S
for i in copy_template.S memchr.S memcpy.S memmove.S memset.S \
-         memzero.S ; do
+        ; do
      diff -u linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i
  done

Also do away with the line continuation at the same time? E.g.

for i in copy_template.S memchr.S memcpy.S memmove.S memset.S; do
     diff -u linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i
done

I will go with this version because I don't expect the number of files to increase.

Cheers,

--
Julien Grall




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.