[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2 of 2 V4] x86/kexec: Change NMI and MCE handling on kexec path



On 11/12/12 15:58, Jan Beulich wrote:
On 11.12.12 at 16:34, Andrew Cooper<andrew.cooper3@xxxxxxxxxx>  wrote:
-    /* Would it be better to replace the trap vector here? */
-    set_nmi_callback(crash_nmi_callback);
+
+    /* Change NMI trap handlers.  Non-crashing pcpus get nmi_crash which
+     * invokes do_nmi_crash (above), which cause them to write state and
+     * fall into a loop.  The crashing pcpu gets the nop handler to
+     * cause it to return to this function ASAP.
+     */
+    for ( i = 0; i<  nr_cpu_ids; ++i )
+        if ( idt_tables[i] )
+        {
+
+            if ( i == cpu )
+            {
+                /* Disable the interrupt stack tables for this MCE and
+                 * NMI handler (shortly to become a nop) as there is a 1
+                 * instruction race window where NMIs could be
+                 * re-enabled and corrupt the exception frame, leaving
+                 * us unable to continue on this crash path (which half
+                 * defeats the point of using the nop handler in the
+                 * first place).
+                 *
+                 * This update is safe from a security point of view, as
+                 * this pcpu is never going to try to sysret back to a
+                 * PV vcpu.
+                 */
This comment appears to have become stale with the latest
changes.

Ok


+                _set_gate_lower(&idt_tables[i][TRAP_nmi], 14, 0,&trap_nop);
No need for the extra&  on functions and arrays.

I was continuing the prevailing style from traps.c Personally, I prefer the & notation for function pointers, to be consistent with regular pointers, even though I am aware that it is not strictly needed. I am not fussed if you wish to insist on one style, but we do have mixed styles across the codebase.


+                set_ist(&idt_tables[i][TRAP_machine_check], IST_NONE);
+            }
+            else
+                /* Do not update stack table for other pcpus. */
+                _update_gate_addr_lower(&idt_tables[i][TRAP_nmi],&nmi_crash);
+        }
+
...
+/* Write the lower 64 bits of an IDT Entry. This relies on the upper 32
+ * bits of the address not changing, which is a safe aumption until our
assumption

+ * code size exceeds 4GB.
1Gb.

+ *
+ * Ideally, we would use cmpxchg16b, but this is not supported on some
+ * old AMD 64bit capable processors, and has no safe equivelent.
equivalent

+ */
+static inline void _write_gate_lower(idt_entry_t * gate, idt_entry_t * new)
static inline void _write_gate_lower(idt_entry_t *gate, idt_entry_t *new)

(similar extra blanks elsewhere)

Also, to make clear which of the two is the entry written, const-
qualifying the other one might be a good idea.

Ok


+{
+    ASSERT(gate->b == new->b);
+    *(volatile unsigned long *)&gate->a = new->a;
volatile? And if so, why not volatilie-qualify the function parameter?

I was looking to avoid having the compiler inline this function and decide that it can merge *gate_addr and idte together, resulting in multiple writes to gate_addr. Without the volatile, the compiler is free to make this optimization, which puts us back with the racy case we are trying to avoid. The reason for avoiding the volatile function parameter is so the assertion equality can be optimized where possible.


+}
+
  #define _set_gate(gate_addr,type,dpl,addr)               \
  do {                                                     \
      (gate_addr)->a = 0;                                  \
@@ -122,6 +135,35 @@ do {
          (1UL<<  47);                                     \
  } while (0)

+#define _set_gate_lower(gate_addr,type,dpl,addr)         \
+    do {                                                 \
+    idt_entry_t idte;                                    \
+    idte.b = (gate_addr)->b;                             \
+    idte.a =                                            \
+        (((unsigned long)(addr)&  0xFFFF0000UL)<<  32) | \
+        ((unsigned long)(dpl)<<  45) |                   \
+        ((unsigned long)(type)<<  40) |                  \
+        ((unsigned long)(addr)&  0xFFFFUL) |             \
+        ((unsigned long)__HYPERVISOR_CS64<<  16) |       \
+        (1UL<<  47);                                     \
+    _write_gate_lower((gate_addr),&idte);               \
No need for extra inner parentheses.

+} while (0)
+
+/* Update the lower half handler of an IDT Entry, without changing any
+ * other configuration. */
+static inline void _update_gate_addr_lower(idt_entry_t * gate, void * addr)
Any reason for this being an inline function and the other being
a macro?

Jan

Where possible, I prefer static inline in preference to macros because of the added type checking etc.

_set_gate_lower is based on _set_gate, so I used the _set_gate style.

Again, I am not overly fussed if you have a strong preference for style. (Both of these styles are mixed across the codebase and have no indication of preference in CODING_STYLE)

~Andrew


+{
+    idt_entry_t idte;
+    idte.a = gate->a;
+
+    idte.b = ((unsigned long)(addr)>>  32);
+    idte.a&= 0x0000FFFFFFFF0000ULL;
+    idte.a |= (((unsigned long)(addr)&  0xFFFF0000UL)<<  32) |
+        ((unsigned long)(addr)&  0xFFFFUL);
+
+    _write_gate_lower(gate,&idte);
+}
+
  #define _set_tssldt_desc(desc,addr,limit,type)           \
  do {                                                     \
      (desc)[0].b = (desc)[1].b = 0;                       \


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.