[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/emul: only emulate possibly operand sizes for POPA

>>> On 08.11.12 at 08:48, Keir Fraser <keir.xen@xxxxxxxxx> wrote:
> On 08/11/2012 07:34, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>>> Would prefer:
>>>  if ( op_bytes == 2 )
>>>      *(uint16_t *)regs[i] = (uint16_t)dst.val;
>>>  else
>>>      *regs[i] = dst.val;
>>> Handles the exceptional case immediately after its predicate.
>> I had it that way first, but compilers tend to prefer (in terms of
>> static branch prediction) the if() body over the else one. Doesn't
>> matter that much here of course, but I'm generally trying to
>> follow such guidelines even in non performance critical paths so
>> that in case code gets cloned elsewhere it doesn't require extra
>> reviewing or adjustment.
> Should follow such guidelines where the optimisation matters. I think
> shaping code to follow such guidelines all the time, is misguided. I'd
> rather have the fractionally more readable version than the
> possibly-fractionally faster version.

Okay, will adjust accordingly then.

>>> And the cast
>>> from uint32_t, and 64b-related comment, are pointless and in fact misleading
>>> in the default case, since as you say the instruction is invalid in 64-bit
>>> mode.
>> And I considered that aspect too: Even if invalid in 64-bit mode, it
>> is valid in compatibility mode, and in that case the zero-extension
>> makes sense (as does the comment).
> I did wonder. The top halves of 64b registers are not used in compatibility
> mode. Are their contents at all guaranteed to be
> maintained/updated/preserved in any meaningful way across transitions into
> and out of compatibility mode? I wasn't aware they were, and in that case
> the cast and comment are indeed pointless.

There's no architectural guarantee, but that's how CPUs work.
The important aspect (from an information leak perspective) is
that upper halves don't get zeroed explicitly when switching
between compatibility and 64-bit modes.

Now we can of course utilize that read_ulong() already does the
zero extension (but if it didn't, we would leak stack contents here,
so it may still be worth a comment), to the net effect of

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1996,13 +1996,11 @@ x86_emulate(
             if ( (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes),
                                   &dst.val, op_bytes, ctxt, ops)) != 0 )
                 goto done;
-            switch ( op_bytes )
-            {
-            case 1: *(uint8_t  *)regs[i] = (uint8_t)dst.val; break;
-            case 2: *(uint16_t *)regs[i] = (uint16_t)dst.val; break;
-            case 4: *regs[i] = (uint32_t)dst.val; break; /* 64b: zero-ext */
-            case 8: *regs[i] = dst.val; break;
-            }
+            if ( op_bytes == 2 )
+                *(uint16_t *)regs[i] = (uint16_t)dst.val;
+            else
+                /* No need for zero-extension - read_ulong() already did so. */
+                *regs[i] = dst.val;


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.