[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2 of 2 V3] x86/kexec: Change NMI and MCE handling on kexec path



On 07/12/2012 11:52, Jan Beulich wrote:
>>>> On 06.12.12 at 22:42, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> +
>> +    /* Change NMI trap handlers.  Non-crashing pcpus get nmi_crash which
>> +     * invokes do_nmi_crash (above), which cause them to write state and
>> +     * fall into a loop.  The crashing pcpu gets the nop handler to
>> +     * cause it to return to this function ASAP.
>> +     */
>> +    for ( i = 0; i < nr_cpu_ids; ++i )
>> +        if ( idt_tables[i] )
>> +        {
>> +
>> +            if ( i == cpu )
>> +            {
>> +                /* Disable the interrupt stack tables for this MCE and
>> +                 * NMI handler (shortly to become a nop) as there is a 1
>> +                 * instruction race window where NMIs could be
>> +                 * re-enabled and corrupt the exception frame, leaving
>> +                 * us unable to continue on this crash path (which half
>> +                 * defeats the point of using the nop handler in the
>> +                 * first place).
>> +                 *
>> +                 * This update is safe from a security point of view, as
>> +                 * this pcpu is never going to try to sysret back to a
>> +                 * PV vcpu.
>> +                 */
>> +                set_ist(&idt_tables[i][TRAP_nmi],           IST_NONE);
>> +                set_ist(&idt_tables[i][TRAP_machine_check], IST_NONE);
>> +
>> +                _set_gate(&idt_tables[i][TRAP_nmi], 14, 0, &trap_nop);
> This makes the first set_ist() above pointless, doesn't it?

No.  The set_ist() is to remove the possibility of stack corruption from
reentrant NMIs, while the trap_nop handler is so we don't get diverted
into the regular NMI handler.  There is nothing the NMI handler would do
which could alter the outcome, and there are many cases where the
regular NMI handler would try to panic, starting us reentrantly on the
kexec path again (where we would deadlock on the one_cpu_only() check).

Given that we are going to crash, any time spent in the NMI handler is
time not spent trying to kill the other pcpus, which themselves might be
heading towards a cascade failure.

>
>> +            }
>> +            else
>> +                _set_gate(&idt_tables[i][TRAP_nmi], 14, 0, &nmi_crash);
>> +        }
>> +
>>      /* Ensure the new callback function is set before sending out the NMI. 
>> */
>>      wmb();
>>  ...
>> +/* Enable NMIs.  No special register assumptions, and all preserved. */
>> +ENTRY(enable_nmis)
>> +        pushq %rax
> What's the point of saving %rax here, btw?
>
> Jan

Because at the moment I believe I might need to call it from asm
context, when doing some of the later fixes.  I figured that it was
better to make it safe now, rather than patch it up later.

~Andrew

>
>> +        movq  %rsp, %rax /* Grab RSP before pushing */
>> +
>> +        /* Set up stack frame */
>> +        pushq $0               /* SS */
>> +        pushq %rax             /* RSP */
>> +        pushfq                 /* RFLAGS */
>> +        pushq $__HYPERVISOR_CS /* CS */
>> +        leaq  1f(%rip),%rax
>> +        pushq %rax             /* RIP */
>> +
>> +/* No op trap handler.  Required for kexec crash path.
>> + * It is not used in performance critical code, and saves having a single
>> + * lone aligned iret. It does not use ENTRY to declare the symbol to avoid 
>> the
>> + * explicit alignment. */
>> +.globl trap_nop;
>> +trap_nop:
>> +
>> +        iretq /* Disable the hardware NMI latch */
>> +1:
>> +        popq %rax
>> +        retq
>> +
>>  .section .rodata, "a", @progbits
>>  
>>  ENTRY(exception_table)
>


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.