Re: [Xen-devel] [PATCH v2 03/11] x86emul: support most memory accessing MMX/SSE/SSE2 insns
>>> On 01.02.17 at 12:14, <JBeulich@xxxxxxxx> wrote:
> + CASE_SIMD_SCALAR_FP(, 0x0f, 0x2b): /* movnts{s,d} xmm,mem */
> + host_and_vcpu_must_have(sse4a);
> + /* fall through */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x2b): /* movntp{s,d} xmm,m128 */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x2b): /* vmovntp{s,d} {x,y}mm,mem */
> + generate_exception_if(ea.type != OP_MEM, EXC_UD);
> + sfence = true;
> + /* fall through */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x10): /* mov{up,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x10): /* vmovup{s,d} {x,y}mm/mem,{x,y}mm */
> + CASE_SIMD_SCALAR_FP(_VEX, 0x0f, 0x10): /* vmovs{s,d} mem,xmm */
> + /* vmovs{s,d} xmm,xmm,xmm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x11): /* mov{up,s}{s,d} xmm,xmm/mem */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x11): /* vmovup{s,d} {x,y}mm,{x,y}mm/mem */
> + CASE_SIMD_SCALAR_FP(_VEX, 0x0f, 0x11): /* vmovs{s,d} xmm,mem */
> + /* vmovs{s,d} xmm,xmm,xmm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x14): /* unpcklp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x14): /* vunpcklp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x15): /* unpckhp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x15): /* vunpckhp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x28): /* movap{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x28): /* vmovap{s,d} {x,y}mm/mem,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x29): /* movap{s,d} xmm,xmm/m128 */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x29): /* vmovap{s,d} {x,y}mm,{x,y}mm/mem */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x51): /* sqrt{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x51): /* vsqrtp{s,d} {x,y}mm/mem,{x,y}mm */
> + /* vsqrts{s,d} xmm/m32,xmm,xmm */
> + CASE_SIMD_SINGLE_FP(, 0x0f, 0x52): /* rsqrt{p,s}s xmm/mem,xmm */
> + CASE_SIMD_SINGLE_FP(_VEX, 0x0f, 0x52): /* vrsqrtps {x,y}mm/mem,{x,y}mm */
> + /* vrsqrtss xmm/m32,xmm,xmm */
> + CASE_SIMD_SINGLE_FP(, 0x0f, 0x53): /* rcp{p,s}s xmm/mem,xmm */
> + CASE_SIMD_SINGLE_FP(_VEX, 0x0f, 0x53): /* vrcpps {x,y}mm/mem,{x,y}mm */
> + /* vrcpss xmm/m32,xmm,xmm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x54): /* andp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x54): /* vandp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x55): /* andnp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x55): /* vandnp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x56): /* orp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x56): /* vorp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x57): /* xorp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x57): /* vxorp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x58): /* add{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x58): /* vadd{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x59): /* mul{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x59): /* vmul{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x5c): /* sub{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x5c): /* vsub{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x5d): /* min{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x5d): /* vmin{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x5e): /* div{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x5e): /* vdiv{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x5f): /* max{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x5f): /* vmax{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> if ( vex.opcx == vex_none )
> {
> if ( vex.pfx & VEX_PREFIX_DOUBLE_MASK )
> vcpu_must_have(sse2);
> else
> vcpu_must_have(sse);
> - ea.bytes = 16;
> - SET_SSE_PREFIX(buf[0], vex.pfx);
> get_fpu(X86EMUL_FPU_xmm, &fic);
> }
> else
> {
> - fail_if((vex.reg != 0xf) &&
> - ((ea.type == OP_MEM) ||
> - !(vex.pfx & VEX_PREFIX_SCALAR_MASK)));
> host_and_vcpu_must_have(avx);
> + fail_if((vex.pfx & VEX_PREFIX_SCALAR_MASK) && vex.l);
While I've changed this to raise #UD in v3, there's a bigger issue
here: over the weekend I stumbled across
https://github.com/intelxed/xed/commit/fb5f8d5aaa2b356bb824e61c666224201c23b984
which raises more questions than it answers:
- VCMPSS and VCMPSD are in no way special, i.e. other scalar
  operations are documented in exactly the same way (and while the
  commit mentions that the SDM is going to be fixed, it leaves open
  what exactly that change will look like)
- most other scalar instructions in that same file already have no
  VL128 attribute, yet some VCVTS* ones still carry it, despite
  being documented no differently in the SDM
- VRCPSS and VRSQRTSS are exceptions to the general SDM pattern,
  in that they are documented as LIG
- AMD uniformly defines VEX.L to be an ignored bit for scalar
  operations.
Assuming it will take a while for Intel to indicate the intended
behavior here, I tend to think we should follow AMD's model and
ignore VEX.L uniformly, hoping that the specified undefined
behavior won't extend beyond undefined changes to the {X,Y,Z}MM
register files or the raising of #UD (against the latter of which
we'll be guarded by the earlier patch adding exception recovery to
stub invocations).
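
To make the two options concrete, below is a small standalone C
sketch (not part of the patch - the struct, the mask value, and the
helper names are simplified stand-ins for the emulator's real decode
state) contrasting the AMD-style "ignore VEX.L" handling with the
strict #UD variant that v3 currently uses:

#include <stdio.h>

/*
 * Simplified stand-ins for the emulator's decode state.  The pfx
 * encoding (0 = none, 1 = 66, 2 = F3, 3 = F2) and the scalar mask
 * mirror what the quoted hunk relies on, but everything here is
 * illustrative only.
 */
#define VEX_PREFIX_SCALAR_MASK 2 /* set for the F3/F2 (scalar) forms */

struct vex_bits {
    unsigned int pfx:2; /* 0 = none, 1 = 66, 2 = F3, 3 = F2 */
    unsigned int l:1;   /* VEX.L */
};

enum vexl_action { VEXL_IGNORE, VEXL_RAISE_UD };

/* Option A - AMD model: VEX.L is simply an ignored bit for scalar insns. */
static enum vexl_action scalar_vexl_amd(struct vex_bits vex)
{
    (void)vex;
    return VEXL_IGNORE;
}

/* Option B - strict model (what v3 does): #UD when a scalar insn sets VEX.L. */
static enum vexl_action scalar_vexl_strict(struct vex_bits vex)
{
    return (vex.pfx & VEX_PREFIX_SCALAR_MASK) && vex.l ? VEXL_RAISE_UD
                                                       : VEXL_IGNORE;
}

int main(void)
{
    /* e.g. a VMULSS encoded with VEX.L = 1 */
    struct vex_bits v = { .pfx = 2, .l = 1 };

    printf("AMD model:    %s\n",
           scalar_vexl_amd(v) == VEXL_RAISE_UD ? "#UD" : "L ignored");
    printf("strict model: %s\n",
           scalar_vexl_strict(v) == VEXL_RAISE_UD ? "#UD" : "L ignored");
    return 0;
}
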
Thoughts?
Jan