|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2-ish] x86/boot: Factor move_xen() out of __start_xen()
On 04.01.2023 21:04, Andrew Cooper wrote:
> On 21/12/2022 7:40 am, Jan Beulich wrote:
>> On 20.12.2022 21:56, Andrew Cooper wrote:
>>> On 20/12/2022 1:51 pm, Jan Beulich wrote:
>>>> On 16.12.2022 21:17, Andrew Cooper wrote:
>>>>> + "mov %[cr4], %%cr4\n\t" /* CR4.PGE = 1 */
>>>>> + : [cr4] "=&a" (tmp) /* Could be "r", but "a" makes better asm */
>>>>> + : [cr3] "r" (__pa(idle_pg_table)),
>>>>> + [pge] "i" (X86_CR4_PGE)
>>>>> + : "memory" );
>>>> The removed stack copying worries me, to be honest. Yes, for local
>>>> variables of ours it doesn't matter because they are about to go out
>>>> of scope. But what if the compiler instantiates any for its own use,
>>>> e.g. ...
>>>>
>>>>> + /*
>>>>> + * End of the critical region. Updates to locals and globals now
>>>>> work as
>>>>> + * expected.
>>>>> + *
>>>>> + * Updates to local variables which have been spilled to the stack
>>>>> since
>>>>> + * the memcpy() have been lost. But we don't care, because they're
>>>>> all
>>>>> + * going out of scope imminently.
>>>>> + */
>>>>> +
>>>>> + printk("New Xen image base address: %#lx\n", xen_phys_start);
>>>> ... the result of the address calculation for the string literal
>>>> here? Such auxiliary calculations can happen at any point in the
>>>> function, and won't necessarily (hence the example chosen) become
>>>> impossible for the compiler to do with the memory clobber in the
>>>> asm(). And while the string literal's address is likely cheap
>>>> enough to calculate right in the function invocation sequence (and
>>>> an option would also be to retain the printk() in the caller),
>>>> other instrumentation options could be undermined by this as well.
>>> Right now, the compiler is free to calculate the address of the string
>>> literal in a register, and move it the other side of the TLB flush.
>>> This will work just fine.
>>>
>>> However, the compiler cannot now, or ever in the future, spill such a
>>> calculation to the stack.
>> I'm afraid the compiler's view at things is different: Whatever it puts
>> on the stack is viewed as virtual registers, unaffected by a memory
>> clobber (of course there can be effects resulting from uses of named
>> variables). Look at -O3 output of gcc12 (which is what I happened to
>> play with; I don't think it's overly version dependent) for this
>> (clearly contrived) piece of code:
>>
>> int __attribute__((const)) calc(int);
>>
>> int test(int i) {
>> int j = calc(i);
>>
>> asm("nopl %0" : "+m" (j));
>> asm("nopq %%rsp" ::: "memory", "ax", "cx", "dx", "bx", "bp", "si", "di",
>> "r8", "r9", "r10", "r11", "r12", "r13", "r14",
>> "r15");
>> j = calc(i);
>> asm("nopl %0" :: "m" (j));
>>
>> return j;
>> }
>>
>> It instantiates a local on the stack for the result of calc(); it does
>> not re-invoke calc() a 2nd time. Which means the memory clobber in the
>> middle asm() does not affect that (and by implication: any) stack slot.
>>
>> Using godbolt I can also see that clang15 agrees with gcc12 in this
>> regard. I didn't bother trying other versions.
>
> Well this is problematic, because it contradicts what we depend on
> asm("":::"memory") doing...
Does it? I'm unaware of instances where we use "memory" to deal with
local variables.
> https://godbolt.org/z/xeGMc3sM9
>
> But I don't fully agree with the conclusions drawn by this example.
>
> It only instantiates a local on the stack because you force a memory
> operand to satisfy the "m" constraints, not to satisfy the "memory"
> constraint.
Sure, all I wanted to show is that the compiler may make such
transformations. As said, the example is clearly contrived, but I
also didn't want to spend too much time on finding a yet better one.
> By declaring calc as const, you are permitting the compiler to make an
> explicit transformation to delete one of the calls, irrespective of
> anything else in the function.
>
> It is weird that 'j' ends up taking two stack slots when would be
> absolutely fine for it to only have 1, and indeed this is what happens
> when you remove the first and third asm()'s. It is these which force
> 'j' to be on the stack, not the memory clobber in the middle.
>
> Observe that after commenting those two out, Clang transforms things to
> spill 'i' onto the stack, rather than 'j', and then tail-call calc() on
> the way out. This is actually deleting the first calc() call, rather
> than the second.
Which would still demonstrate what I wanted to demonstrate - we can't
assume that "memory" guards against the use of stack locals by the
compiler.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |