
Re: [Xen-devel] [PATCH] x86/boot: Clean up the trampoline transition into Long mode



On 03.01.2020 15:25, Andrew Cooper wrote:
> On 03/01/2020 13:52, Jan Beulich wrote:
>> On 03.01.2020 14:44, Andrew Cooper wrote:
>>> On 03/01/2020 13:36, Jan Beulich wrote:
>>>> On 02.01.2020 15:59, Andrew Cooper wrote:
>>>>> @@ -111,26 +109,6 @@ trampoline_protmode_entry:
>>>>>  start64:
>>>>>          /* Jump to high mappings. */
>>>>>          movabs  $__high_start, %rdi
>>>>> -
>>>>> -#ifdef CONFIG_INDIRECT_THUNK
>>>>> -        /*
>>>>> -         * If booting virtualised, or hot-onlining a CPU, sibling threads can
>>>>> -         * attempt Branch Target Injection against this jmp.
>>>>> -         *
>>>>> -         * We've got no usable stack so can't use a RETPOLINE thunk, and are
>>>>> -         * further than disp32 from the high mappings so couldn't use
>>>>> -         * JUMP_THUNK even if it was a non-RETPOLINE thunk.  Furthermore, an
>>>>> -         * LFENCE isn't necessarily safe to use at this point.
>>>>> -         *
>>>>> -         * As this isn't a hotpath, use a fully serialising event to reduce
>>>>> -         * the speculation window as much as possible.  %ebx needs preserving
>>>>> -         * for __high_start.
>>>>> -         */
>>>>> -        mov     %ebx, %esi
>>>>> -        cpuid
>>>>> -        mov     %esi, %ebx
>>>>> -#endif
>>>>> -
>>>>>          jmpq    *%rdi
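(For reference, a minimal sketch of what the start64 path reduces to once
this hunk is applied, reconstructed purely from the context lines quoted
above:

    start64:
            /* Jump to high mappings. */
            movabs  $__high_start, %rdi
            jmpq    *%rdi

i.e. the indirect jmp to __high_start is left with no serialising event in
front of it.)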
>>>> I can see this being unneeded when running virtualized, as you said
>>>> in reply to Wei. However, for hot-onlining (when other CPUs may run
>>>> random vCPU-s) I don't see how this can safely be dropped. There's
>>>> no similar concern for S3 resume, as thaw_domains() happens only
>>>> after enable_nonboot_cpus().
>>> I covered that in the same reply.  Any guest which can use branch target
>>> injection against this jmp can also poison the regular branch predictor
>>> and get at data that way.
>> Aren't you implying then that retpolines could also be dropped?
> 
> No.  It is a simple risk vs complexity tradeoff.
> 
> Guests running on a sibling *can already* attack this branch with BTI,
> because CPUID isn't a fix to bad BTB speculation, and the leakage gadget
> need only be a single instruction.
> 
> Such a guest can also attack Xen in general with Spectre v1.
> 
> As I said - this was introduced because of paranoia, back while the few
> people who knew about the issues (only several hundred at the time) were
> attempting to figure out what exactly a speculative attack looked like,
> and were applying duct tape to everything suspicious because we had 0
> time to rewrite several core pieces of system handling.

Well, okay then:
Acked-by: Jan Beulich <jbeulich@xxxxxxxx>

>>> Once again, we get to CPU Hotplug being an unused feature in practice,
>>> which is completely evident now with Intel MCE behaviour.
>> What does Intel's MCE behavior have to do with whether CPU hotplug
>> (or hot-onlining) is (un)used in practice?
> 
> The logical consequence of hotplug breaking MCEs.
> 
> If hotplug had been used in practice, the MCE behaviour would have come
> to light much sooner, when MCEs didn't work in practice.
> 
> Given that MCEs really did work in practice even before the L1TF days,
> hotplug wasn't in common-enough use for anyone to notice the MCE behaviour.

Or the systems where CPU hotplug was actually used were of good
enough quality to never surface #MC (personally I don't think
I've seen more than a handful of non-reproducible #MC instances)?
Or people who ran into the bad behavior simply didn't have the
resources to investigate why their system shut down silently
(which would make the behavior appear entirely random)?

Jan
