[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH for-4.19] x86/altcall: fix clang code-gen when using altcall in loop constructs
On Wed, Jul 24, 2024 at 08:39:05AM +0200, Jan Beulich wrote: > On 23.07.2024 18:24, Alejandro Vallejo wrote: > > On Tue Jul 23, 2024 at 5:09 PM BST, Roger Pau Monné wrote: > >> On Tue, Jul 23, 2024 at 04:37:12PM +0100, Alejandro Vallejo wrote: > >>> On Tue Jul 23, 2024 at 10:31 AM BST, Roger Pau Monne wrote: > >>>> --- a/xen/arch/x86/include/asm/alternative.h > >>>> +++ b/xen/arch/x86/include/asm/alternative.h > >>>> @@ -185,10 +185,10 @@ extern void alternative_branches(void); > >>>> */ > >>>> #define ALT_CALL_ARG(arg, n) > >>>> \ > >>>> register union { > >>>> \ > >>>> - typeof(arg) e; > >>>> \ > >>>> + typeof(arg) e[sizeof(long) / sizeof(arg)]; > >>>> \ > >>>> unsigned long r; > >>>> \ > >>>> } a ## n ## _ asm ( ALT_CALL_arg ## n ) = { > >>>> \ > >>>> - .e = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); }) > >>>> \ > >>>> + .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); > >>>> })\ > >>>> } > >>>> #else > >>>> #define ALT_CALL_ARG(arg, n) \ > >>> > >>> Don't we want BUILD_BUG_ON(sizeof(long) % sizeof(arg) == 0) instead? > >> > >> I think you meant BUILD_BUG_ON(sizeof(long) % sizeof(arg) != 0)? > > > > Bah, yes. I wrote it as a COMPILE_ASSERT(). > > > >> > >>> Otherwise > >>> odd sizes will cause the wrong union size to prevail, and while I can't > >>> see > >>> today how those might come to happen there's Murphy's law. > >> > >> The overall union size would still be fine, because it has the > >> unsigned long element, it's just that the array won't cover all the > >> space assigned to the long member? > > > > I explained myself poorly. If the current BUILD_BUG_ON() stays as-is that's > > right, but... > > > >> > >> IOW if sizeof(arg) == 7, then we would define an array with only 1 > >> element, which won't make the size of the union change, but won't > >> cover the same space that's used by the long member. > > > > ... I thought the point of the patch was to cover the full union with the > > array, and not just a subset. My proposed alternative merely tries to ensure > > the argument is always a submultiple in size of a long so the array is > > always a > > perfect match. I would be fine with the adjusted BUILD_BUG_ON(), but if we change the instance for clang we should also update the BUILD_BUG_ON() used on the gcc counterpart so they both match. That might be undesirable however since gcc doesn't exhibit any of those bugs, and hence shouldn't have such constraints. > Question is whether there's an issue with odd sized values in Clang. I > wouldn't want to exclude such (admittedly somewhat exotic) uses "just > in case". My understanding here is that the issue the patch addresses > is not merely the treatment of the union by Clang, but the combination > thereof with it violating the psABI when it comes to passing bool > around. So there are at least two issues, one is the lack of truncation of register value done by the callee, which is a psABI violation. The other is clang not resetting the high bits of the register when the altcall is inside a loop construct. In the godbolt example provided the high bits of %rdi are not cleared between loops. This last issue would also be 'fixed' if clang implemented the psABI properly and truncated the register at the callee. I've given a try in godbolt, and odd structure sizes seem to be affected: https://godbolt.org/z/778YsoWsn We have no usages of such structures in Xen so far. The only way I've found to cope with this is to use something like: #define ALT_CALL_ARG(arg, n) \ union { \ typeof(arg) e[(sizeof(long) + sizeof(arg) - 1) / sizeof(arg)]; \ unsigned long r; \ } a ## n ## __ = { \ .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })\ }; \ register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = \ a ## n ## __.r An oversized array that ensures all the space of the long is covered by the array, but then we need an extra variable, as we would otherwise require modifying ALT_CALL{0..6}_OUT to use aX_.r instead of aX_. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |