[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: lib32 followup (visibility issue and/or toolchain bug?)
On Wed, Oct 16, 2024 at 2:30 PM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote: > > Hello, > > Preempting some future work I'm expecting to arrive, I had a go at using > __builtin_*() in obj32. > > This is formed of 2 patches on top of this series: > https://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/xen-lib32 > You are confident we'll have a lot of shared code to need an additional "lib32" in the Makefile! I would personally stick with obj32. Note that file should be strlen.c, not strlen.32.c, otherwise you are possibly going to pick up the general rule and not the one in the Makefile (but maybe is what you wanted). > Patch 1 introduces lib32 beside obj32, with strlen() being the first > broken-out function, and patch 2 swaps to __builtin_strlen(). > > Both compile, but the difference that patch 2 introduces was unexpected. > > With just lib32, and taking strsubcmp() as an example, we get: > > 00000000 <strsubcmp>: > 0: 83 ec 0c sub $0xc,%esp > 3: 89 5c 24 04 mov %ebx,0x4(%esp) > 7: 89 74 24 08 mov %esi,0x8(%esp) > b: 89 c6 mov %eax,%esi > d: 89 d3 mov %edx,%ebx > f: 89 d0 mov %edx,%eax > 11: /-- e8 fc ff ff ff call 12 <strsubcmp+0x12> > 12: R_386_PC32 strlen > 16: 89 c1 mov %eax,%ecx > 18: 89 da mov %ebx,%edx > 1a: 89 f0 mov %esi,%eax > 1c: /-- e8 fc ff ff ff call 1d <strsubcmp+0x1d> > 1d: R_386_PC32 .text.strncmp > 21: 8b 5c 24 04 mov 0x4(%esp),%ebx > 25: 8b 74 24 08 mov 0x8(%esp),%esi > 29: 83 c4 0c add $0xc,%esp > 2c: c3 ret > > which all seems fine. We get a plain PC32 relocation to strlen (which > is now in the separate library). > > However, with patch 2 in place (simply swapping the plain extern for > __builtin_strlen(), we now get: > > 00000000 <strsubcmp>: > 0: 83 ec 0c sub $0xc,%esp > 3: 89 1c 24 mov %ebx,(%esp) > 6: 89 74 24 04 mov %esi,0x4(%esp) > a: 89 7c 24 08 mov %edi,0x8(%esp) > e: /-- e8 fc ff ff ff call f <strsubcmp+0xf> > f: R_386_PC32 __x86.get_pc_thunk.bx > 13: 81 c3 02 00 00 00 add $0x2,%ebx > 15: R_386_GOTPC _GLOBAL_OFFSET_TABLE_ > 19: 89 c7 mov %eax,%edi > 1b: 89 d6 mov %edx,%esi > 1d: 89 d0 mov %edx,%eax > 1f: /-- e8 fc ff ff ff call 20 <strsubcmp+0x20> > 20: R_386_PLT32 strlen PLT means it not declared hidden, otherwise it would have used the relative relocation. Maybe size_t strlen(const char *s); #define strlen(s) __builtin_strlen(s) xen/compiler.h is included, so all declaration should get the hidden by default ? Or add __attribute__((visibility("hidden"))) explicitly. > 24: 89 c1 mov %eax,%ecx > 26: 89 f2 mov %esi,%edx > 28: 89 f8 mov %edi,%eax > 2a: /-- e8 fc ff ff ff call 2b <strsubcmp+0x2b> > 2b: R_386_PC32 .text.strncmp > 2f: 8b 1c 24 mov (%esp),%ebx > 32: 8b 74 24 04 mov 0x4(%esp),%esi > 36: 8b 7c 24 08 mov 0x8(%esp),%edi > 3a: 83 c4 0c add $0xc,%esp > 3d: c3 ret > > > The builtin hasn't managed to optimise away the call to strlen (that's > fine). But, we've ended up spilling %ebx to the stack, calculating the > location of the GOT and not using it. > Maybe the ABI for PLT is to have %ebx set to the GOT ? Not sure about it. > So, as it stands, trying to use __builtin_strlen() results in worse code > generation. One thing I noticed was that we're not passing > -fvisibility=hidden into CFLAGS_x86_32, but fixing that doesn't help > either. We do have the pragma from compiler.h, so I'm out of visibility > ideas. > The -fvisibility=hidden should be overridden by the xen/compiler.h; but should be overridden with hidden! Maybe strlen is defined by default with another visibility? If you generate the assembly, you should see if the strlen symbol gets the .hidden bless or not. > Anything else I've missed? > Coffee :-) Frediano
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |