[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: lib32 followup (visibility issue and/or toolchain bug?)
On Wed, Oct 16, 2024 at 3:53 PM Frediano Ziglio <frediano.ziglio@xxxxxxxxx> wrote: > > On Wed, Oct 16, 2024 at 2:30 PM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > wrote: > > > > Hello, > > > > Preempting some future work I'm expecting to arrive, I had a go at using > > __builtin_*() in obj32. > > > > This is formed of 2 patches on top of this series: > > https://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/xen-lib32 > > > > You are confident we'll have a lot of shared code to need an > additional "lib32" in the Makefile! > I would personally stick with obj32. > Note that file should be strlen.c, not strlen.32.c, otherwise you are > possibly going to pick up the general rule and not the one in the > Makefile (but maybe is what you wanted). > > > Patch 1 introduces lib32 beside obj32, with strlen() being the first > > broken-out function, and patch 2 swaps to __builtin_strlen(). > > > > Both compile, but the difference that patch 2 introduces was unexpected. > > > > With just lib32, and taking strsubcmp() as an example, we get: > > > > 00000000 <strsubcmp>: > > 0: 83 ec 0c sub $0xc,%esp > > 3: 89 5c 24 04 mov %ebx,0x4(%esp) > > 7: 89 74 24 08 mov %esi,0x8(%esp) > > b: 89 c6 mov %eax,%esi > > d: 89 d3 mov %edx,%ebx > > f: 89 d0 mov %edx,%eax > > 11: /-- e8 fc ff ff ff call 12 <strsubcmp+0x12> > > 12: R_386_PC32 strlen > > 16: 89 c1 mov %eax,%ecx > > 18: 89 da mov %ebx,%edx > > 1a: 89 f0 mov %esi,%eax > > 1c: /-- e8 fc ff ff ff call 1d <strsubcmp+0x1d> > > 1d: R_386_PC32 .text.strncmp > > 21: 8b 5c 24 04 mov 0x4(%esp),%ebx > > 25: 8b 74 24 08 mov 0x8(%esp),%esi > > 29: 83 c4 0c add $0xc,%esp > > 2c: c3 ret > > > > which all seems fine. We get a plain PC32 relocation to strlen (which > > is now in the separate library). > > > > However, with patch 2 in place (simply swapping the plain extern for > > __builtin_strlen(), we now get: > > > > 00000000 <strsubcmp>: > > 0: 83 ec 0c sub $0xc,%esp > > 3: 89 1c 24 mov %ebx,(%esp) > > 6: 89 74 24 04 mov %esi,0x4(%esp) > > a: 89 7c 24 08 mov %edi,0x8(%esp) > > e: /-- e8 fc ff ff ff call f <strsubcmp+0xf> > > f: R_386_PC32 __x86.get_pc_thunk.bx > > 13: 81 c3 02 00 00 00 add $0x2,%ebx > > 15: R_386_GOTPC _GLOBAL_OFFSET_TABLE_ > > 19: 89 c7 mov %eax,%edi > > 1b: 89 d6 mov %edx,%esi > > 1d: 89 d0 mov %edx,%eax > > 1f: /-- e8 fc ff ff ff call 20 <strsubcmp+0x20> > > 20: R_386_PLT32 strlen > > PLT means it not declared hidden, otherwise it would have used the > relative relocation. > Maybe > > size_t strlen(const char *s); > #define strlen(s) __builtin_strlen(s) > > xen/compiler.h is included, so all declaration should get the hidden > by default ? Or add __attribute__((visibility("hidden"))) explicitly. > > > 24: 89 c1 mov %eax,%ecx > > 26: 89 f2 mov %esi,%edx > > 28: 89 f8 mov %edi,%eax > > 2a: /-- e8 fc ff ff ff call 2b <strsubcmp+0x2b> > > 2b: R_386_PC32 .text.strncmp > > 2f: 8b 1c 24 mov (%esp),%ebx > > 32: 8b 74 24 04 mov 0x4(%esp),%esi > > 36: 8b 7c 24 08 mov 0x8(%esp),%edi > > 3a: 83 c4 0c add $0xc,%esp > > 3d: c3 ret > > > > > > The builtin hasn't managed to optimise away the call to strlen (that's > > fine). But, we've ended up spilling %ebx to the stack, calculating the > > location of the GOT and not using it. > > > > Maybe the ABI for PLT is to have %ebx set to the GOT ? Not sure about it. > > > So, as it stands, trying to use __builtin_strlen() results in worse code > > generation. One thing I noticed was that we're not passing > > -fvisibility=hidden into CFLAGS_x86_32, but fixing that doesn't help > > either. We do have the pragma from compiler.h, so I'm out of visibility > > ideas. > > > > The -fvisibility=hidden should be overridden by the xen/compiler.h; > but should be overridden with hidden! > Maybe strlen is defined by default with another visibility? > If you generate the assembly, you should see if the strlen symbol gets > the .hidden bless or not. > I did some more experiments, but I didn't manage to fix the issue. It looks like when __builtin_strlen fallback to calling strlen it uses a private declaration of strlen. -mregparam argument is taken into account, but not visibility. I tried to declare strlen as hidden (checking also preprocessor output to see if other declaration were present, none found), still the @PLT. I tried to add a "static inline strlen"... and it was not used. I found this workaround: diff --git a/xen/arch/x86/boot/cmdline.c b/xen/arch/x86/boot/cmdline.c index 80ffd0885e..ac2b0b0a4d 100644 --- a/xen/arch/x86/boot/cmdline.c +++ b/xen/arch/x86/boot/cmdline.c @@ -51,7 +51,12 @@ static const char delim_chars_comma[] = ", \n\r\t"; #define delim_chars (delim_chars_comma + 1) -#define strlen(s) __builtin_strlen(s) +size_t strlen(const char *s); +#define strlen(str) \ + (__extension__ (__builtin_constant_p(str) \ + ? __builtin_strlen(str) \ + : strlen(str))) + static int strncmp(const char *cs, const char *ct, size_t count) { diff --git a/xen/arch/x86/boot/strlen.32.c b/xen/arch/x86/boot/strlen.c similarity index 100% rename from xen/arch/x86/boot/strlen.32.c rename to xen/arch/x86/boot/strlen.c Frediano
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |