
Re: [Xen-devel] [V5] x86/xsaves: fix overwriting between non-lazy/lazy xsaves



>>> On 18.03.16 at 04:01, <shuai.ruan@xxxxxxxxxxxxxxx> wrote:
> v5: Address comments from Jan
> 1. Add XSTATE_XSAVES_ONLY and use xsaves depending on whether these bits are
>    set in xcr0_accum
> 2. Change the compression logic in compress_xsave_states() to depend on
>    !(v->arch.xcr0_accum & XSTATE_XSAVES_ONLY) && !xsave_area_compressed(src).
> 3. XSTATE_COMPACTION_ENABLED only set in xrstor().
> 4. Rebase the code on
>    [V4] x86/xsaves: calculate the xstate_comp_offsets base on xstate_bv
>    (already sent out), since they both change the same code.
>    (I am not sure whether this rebase is ok or not).

Such re-basing is okay, but the dependency on the other patch
shouldn't get hidden in the revision log. It would really be best if
this were a series (consisting of both patches).

> @@ -222,22 +222,21 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
>      u64 xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv;
>      u64 valid;
>  
> -    if ( !cpu_has_xsaves && !cpu_has_xsavec )
> +    if ( !(v->arch.xcr0_accum & XSTATE_XSAVES_ONLY) &&
> +         !xsave_area_compressed(src) )
>      {
>          memcpy(xsave, src, size);
>          return;
>      }
>  
> -    ASSERT(!xsave_area_compressed(src));

This is bogus: The function here gets called only after
validate_xstate() already succeeded. Hence the ASSERT()
should imo simply get moved ahead of the if().
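
I.e. something like this (untested sketch; with the ASSERT() moved up,
the !xsave_area_compressed() part of the new condition would seem to
become redundant):

    ASSERT(!xsave_area_compressed(src));

    if ( !(v->arch.xcr0_accum & XSTATE_XSAVES_ONLY) )
    {
        memcpy(xsave, src, size);
        return;
    }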

>      /*
>       * Copy legacy XSAVE area, to avoid complications with CPUID
>       * leaves 0 and 1 in the loop below.
>       */
>      memcpy(xsave, src, FXSAVE_SIZE);
>  
> -    /* Set XSTATE_BV and XCOMP_BV.  */
> +    /* Set XSTATE_BV.  */
>      xsave->xsave_hdr.xstate_bv = xstate_bv;
> -    xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum | XSTATE_COMPACTION_ENABLED;
>      setup_xstate_comp(xstate_comp_offsets, xstate_bv);

I see you set xcomp_bv (and hence the compaction bit) in xrstor()
now, but afaict that doesn't allow you to completely drop initializing
the field here, as the code there looks at the compaction bit.
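
In other words I'd expect the initialization to stay, along the lines of
the line being removed above (untested, value unchanged):

    /* Set XSTATE_BV and XCOMP_BV.  */
    xsave->xsave_hdr.xstate_bv = xstate_bv;
    xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum | XSTATE_COMPACTION_ENABLED;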

> @@ -267,31 +266,35 @@ void xsave(struct vcpu *v, uint64_t mask)
>      uint32_t hmask = mask >> 32;
>      uint32_t lmask = mask;
>      unsigned int fip_width = v->domain->arch.x87_fip_width;
> -#define XSAVE(pfx) \
> -        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
> -                         ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \
> -                         X86_FEATURE_XSAVEOPT, \
> -                         ".byte " pfx "0x0f,0xc7,0x27\n", /* xsavec */ \
> -                         X86_FEATURE_XSAVEC, \
> -                         ".byte " pfx "0x0f,0xc7,0x2f\n", /* xsaves */ \
> -                         X86_FEATURE_XSAVES, \
> -                         "=m" (*ptr), \
> -                         "a" (lmask), "d" (hmask), "D" (ptr))
> +#define XSAVE(pfx, xsave_ins) \
> +        asm volatile ( ".byte " pfx xsave_ins \
> +                       : "=m" (*ptr) \
> +                       : "a" (lmask), "d" (hmask), "D" (ptr) )
>  
>      if ( fip_width == 8 || !(mask & XSTATE_FP) )
>      {
> -        XSAVE("0x48,");
> +        if ( v->arch.xcr0_accum & XSTATE_XSAVES_ONLY )
> +            XSAVE("0x48,", "0x0f,0xc7,0x2f"); /* xsaves */
> +        else if ( cpu_has_xsaveopt )
> +            XSAVE("0x48,", "0x0f,0xae,0x37"); /* xsaveopt */
> +        else
> +            XSAVE("0x48,", "0x0f,0xae,0x27"); /* xsave */

The latter two should still use alternative insn patching.
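
I.e. for those two cases the two-alternative form of alternative_io(),
with the same operands as the alternative_io_3() you're deleting, e.g.
(untested):

    alternative_io(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */
                   ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */
                   X86_FEATURE_XSAVEOPT,
                   "=m" (*ptr),
                   "a" (lmask), "d" (hmask), "D" (ptr));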

>      }
>      else if ( fip_width == 4 )
>      {
> -        XSAVE("");
> +        if ( v->arch.xcr0_accum & XSTATE_XSAVES_ONLY )
> +            XSAVE("", "0x0f,0xc7,0x2f");
> +        else if ( cpu_has_xsaveopt )
> +            XSAVE("", "0x0f,0xae,0x37");
> +        else
> +            XSAVE("", "0x0f,0xae,0x27");

And this logic being repeated here (and another time below) should
have made it obvious that you want to leave the code here untouched
and do all of your changes just to the XSAVE() macro definition.
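
That is, something along these lines (untested sketch only; folding the
XSAVES check and the alternative patching into the macro, so the call
sites can remain plain XSAVE("0x48,") / XSAVE("")):

#define XSAVE(pfx) \
        if ( v->arch.xcr0_accum & XSTATE_XSAVES_ONLY ) \
        { \
            asm volatile ( ".byte " pfx "0x0f,0xc7,0x2f\n" /* xsaves */ \
                           : "=m" (*ptr) \
                           : "a" (lmask), "d" (hmask), "D" (ptr) ); \
        } \
        else \
        { \
            alternative_io(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
                           ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \
                           X86_FEATURE_XSAVEOPT, \
                           "=m" (*ptr), \
                           "a" (lmask), "d" (hmask), "D" (ptr)); \
        }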

> @@ -378,25 +387,42 @@ void xrstor(struct vcpu *v, uint64_t mask)
>          switch ( __builtin_expect(ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET], 8) )
>          {
>              BUILD_BUG_ON(sizeof(faults) != 4); /* Clang doesn't support %z in asm. */
> -#define XRSTOR(pfx) \
> -        alternative_io("1: .byte " pfx "0x0f,0xae,0x2f\n" \
> +#define XRSTOR(pfx, xrstor_ins) \
> +        asm volatile ( "1: .byte " pfx xrstor_ins"\n" \
>                         "3:\n" \
>                         "   .section .fixup,\"ax\"\n" \
>                         "2: incl %[faults]\n" \
>                         "   jmp 3b\n" \
>                         "   .previous\n" \
> -                       _ASM_EXTABLE(1b, 2b), \
> -                       ".byte " pfx "0x0f,0xc7,0x1f\n", \
> -                       X86_FEATURE_XSAVES, \
> -                       ASM_OUTPUT2([mem] "+m" (*ptr), [faults] "+g" (faults)), \
> -                       [lmask] "a" (lmask), [hmask] "d" (hmask), \
> -                       [ptr] "D" (ptr))
> +                       _ASM_EXTABLE(1b, 2b) \
> +                       : [mem] "+m" (*ptr), [faults] "+g" (faults) \
> +                       : [lmask] "a" (lmask), [hmask] "d" (hmask), \
> +                         [ptr] "D" (ptr) )
>  
>          default:
> -            XRSTOR("0x48,");
> +            if ( v->arch.xcr0_accum & XSTATE_XSAVES_ONLY )
> +            {
> +                if ( unlikely(!(ptr->xsave_hdr.xcomp_bv
> +                                & XSTATE_COMPACTION_ENABLED)) )
> +                    ptr->xsave_hdr.xcomp_bv = ptr->xsave_hdr.xstate_bv
> +                                              | XSTATE_COMPACTION_ENABLED;
> +
> +                XRSTOR("0x48,","0x0f,0xc7,0x1f"); /* xrstors */
> +            }
> +            else
> +                XRSTOR("0x48,","0x0f,0xae,0x2f"); /* xrstor */

At this point, what guarantees that xcomp_bv is zero, no matter
where the state to be loaded originates from? I think at least in
arch_set_info_guest(), hvm_load_cpu_ctxt(), and
hvm_vcpu_reset_state() you went too far in deleting code, and you
really need to keep the stores of zero there. Did you draw, just
for yourself, mentally or on a sheet of paper, a diagram illustrating
the various state transitions?
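
Concretely, at those three places I'd expect something like

    v->arch.xsave_area->xsave_hdr.xcomp_bv = 0;

to be retained (field path as used elsewhere in the xstate handling),
since plain xrstor faults on a non-zero xcomp_bv.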

>              break;
>          case 4: case 2:
> -            XRSTOR("");
> +            if ( v->arch.xcr0_accum & XSTATE_XSAVES_ONLY )
> +            {
> +                if ( unlikely(!(ptr->xsave_hdr.xcomp_bv
> +                                & XSTATE_COMPACTION_ENABLED)) )
> +                    ptr->xsave_hdr.xcomp_bv = ptr->xsave_hdr.xstate_bv
> +                                              | XSTATE_COMPACTION_ENABLED;
> +                XRSTOR("","0x0f,0xc7,0x1f");
> +            }
> +            else
> +                XRSTOR("","0x0f,0xae,0x2f");
>              break;

Since the same logic is again repeated twice, this should once more
have been a signal that all of your changes belong in the XRSTOR()
macro. Alternatively, since the exception fixup also differs, you may
want to split the whole thing into an XSAVES path and an XSAVE path.
My only really firm request here is: as little redundancy as possible,
since having to change the same thing in more than one place is
always asking for trouble.
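
If you go the macro route, an untested sketch of what I have in mind
(_xrstor() is just a made-up helper name here), keeping the call sites
at XRSTOR("0x48,") / XRSTOR(""):

#define _xrstor(xrstor_ins) \
        asm volatile ( "1: .byte " xrstor_ins "\n" \
                       "3:\n" \
                       "   .section .fixup,\"ax\"\n" \
                       "2: incl %[faults]\n" \
                       "   jmp 3b\n" \
                       "   .previous\n" \
                       _ASM_EXTABLE(1b, 2b) \
                       : [mem] "+m" (*ptr), [faults] "+g" (faults) \
                       : [lmask] "a" (lmask), [hmask] "d" (hmask), \
                         [ptr] "D" (ptr) )

#define XRSTOR(pfx) \
        if ( v->arch.xcr0_accum & XSTATE_XSAVES_ONLY ) \
        { \
            if ( unlikely(!(ptr->xsave_hdr.xcomp_bv & \
                            XSTATE_COMPACTION_ENABLED)) ) \
                ptr->xsave_hdr.xcomp_bv = ptr->xsave_hdr.xstate_bv | \
                                          XSTATE_COMPACTION_ENABLED; \
            _xrstor(pfx "0x0f,0xc7,0x1f"); /* xrstors */ \
        } \
        else \
        { \
            _xrstor(pfx "0x0f,0xae,0x2f"); /* xrstor */ \
        }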

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

