Re: [Xen-devel] [PATCH v2 03/11] x86emul: support most memory accessing MMX/SSE/SSE2 insns
>>> On 01.02.17 at 12:14, <JBeulich@xxxxxxxx> wrote:
> + CASE_SIMD_SCALAR_FP(, 0x0f, 0x2b): /* movnts{s,d} xmm,mem */
> + host_and_vcpu_must_have(sse4a);
> + /* fall through */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x2b): /* movntp{s,d} xmm,m128 */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x2b): /* vmovntp{s,d} {x,y}mm,mem */
> + generate_exception_if(ea.type != OP_MEM, EXC_UD);
> + sfence = true;
> + /* fall through */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x10): /* mov{up,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x10): /* vmovup{s,d} {x,y}mm/mem,{x,y}mm */
> + CASE_SIMD_SCALAR_FP(_VEX, 0x0f, 0x10): /* vmovs{s,d} mem,xmm */
> + /* vmovs{s,d} xmm,xmm,xmm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x11): /* mov{up,s}{s,d} xmm,xmm/mem */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x11): /* vmovup{s,d} {x,y}mm,{x,y}mm/mem */
> + CASE_SIMD_SCALAR_FP(_VEX, 0x0f, 0x11): /* vmovs{s,d} xmm,mem */
> + /* vmovs{s,d} xmm,xmm,xmm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x14): /* unpcklp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x14): /* vunpcklp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x15): /* unpckhp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x15): /* vunpckhp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x28): /* movap{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x28): /* vmovap{s,d} {x,y}mm/mem,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x29): /* movap{s,d} xmm,xmm/m128 */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x29): /* vmovap{s,d} {x,y}mm,{x,y}mm/mem */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x51): /* sqrt{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x51): /* vsqrtp{s,d} {x,y}mm/mem,{x,y}mm */
> + /* vsqrts{s,d} xmm/m32,xmm,xmm */
> + CASE_SIMD_SINGLE_FP(, 0x0f, 0x52): /* rsqrt{p,s}s xmm/mem,xmm */
> + CASE_SIMD_SINGLE_FP(_VEX, 0x0f, 0x52): /* vrsqrtps {x,y}mm/mem,{x,y}mm */
> + /* vrsqrtss xmm/m32,xmm,xmm */
> + CASE_SIMD_SINGLE_FP(, 0x0f, 0x53): /* rcp{p,s}s xmm/mem,xmm */
> + CASE_SIMD_SINGLE_FP(_VEX, 0x0f, 0x53): /* vrcpps {x,y}mm/mem,{x,y}mm */
> + /* vrcpss xmm/m32,xmm,xmm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x54): /* andp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x54): /* vandp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x55): /* andnp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x55): /* vandnp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x56): /* orp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x56): /* vorp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_PACKED_FP(, 0x0f, 0x57): /* xorp{s,d} xmm/m128,xmm */
> + CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x57): /* vxorp{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x58): /* add{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x58): /* vadd{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x59): /* mul{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x59): /* vmul{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x5c): /* sub{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x5c): /* vsub{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x5d): /* min{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x5d): /* vmin{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x5e): /* div{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x5e): /* vdiv{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> + CASE_SIMD_ALL_FP(, 0x0f, 0x5f): /* max{p,s}{s,d} xmm/mem,xmm */
> + CASE_SIMD_ALL_FP(_VEX, 0x0f, 0x5f): /* vmax{p,s}{s,d} {x,y}mm/mem,{x,y}mm,{x,y}mm */
> if ( vex.opcx == vex_none )
> {
> if ( vex.pfx & VEX_PREFIX_DOUBLE_MASK )
> vcpu_must_have(sse2);
> else
> vcpu_must_have(sse);
> - ea.bytes = 16;
> - SET_SSE_PREFIX(buf[0], vex.pfx);
> get_fpu(X86EMUL_FPU_xmm, &fic);
> }
> else
> {
> - fail_if((vex.reg != 0xf) &&
> - ((ea.type == OP_MEM) ||
> - !(vex.pfx & VEX_PREFIX_SCALAR_MASK)));
> host_and_vcpu_must_have(avx);
> + fail_if((vex.pfx & VEX_PREFIX_SCALAR_MASK) && vex.l);
While I've changed this to raise #UD in v3, there's a bigger issue
here: over the weekend I stumbled across
https://github.com/intelxed/xed/commit/fb5f8d5aaa2b356bb824e61c666224201c23b984
which raises more questions than it answers:
- VCMPSS and VCMPSD are in no way special, i.e. other scalar
  operations are documented in exactly the same way (and while the
  commit mentions that the SDM is going to be fixed, it leaves open
  what exactly that change will look like)
- most other scalar instructions in that same file already have no
  VL128 attribute, yet some VCVTS* ones still carry it, despite
  being documented no differently in the SDM
- VRCPSS and VRSQRTSS are exceptions to the general SDM pattern,
  in that they are documented as LIG
- AMD uniformly defines VEX.L to be an ignored bit for scalar
  operations.
Assuming it will take a while for Intel to indicate the intended
behavior here, I tend to think we should follow AMD's model and
ignore VEX.L uniformly, hoping that the specified undefined
behavior won't extend beyond undefined changes to the {X,Y,Z}MM
register files or the raising of #UD (against the latter of which
we'll be guarded by the earlier patch adding exception recovery to
stub invocations).
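
To make the two options concrete, below is a small standalone C
sketch (not part of the patch - the struct, the mask value, and the
helper names are simplified stand-ins for the emulator's real decode
state) contrasting the AMD-style "ignore VEX.L" handling with the
strict #UD variant that v3 currently uses:

#include <stdio.h>

/*
 * Simplified stand-ins for the emulator's decode state.  The pfx
 * encoding (0 = none, 1 = 66, 2 = F3, 3 = F2) and the scalar mask
 * mirror what the quoted hunk relies on, but everything here is
 * illustrative only.
 */
#define VEX_PREFIX_SCALAR_MASK 2 /* set for the F3/F2 (scalar) forms */

struct vex_bits {
    unsigned int pfx:2; /* 0 = none, 1 = 66, 2 = F3, 3 = F2 */
    unsigned int l:1;   /* VEX.L */
};

enum vexl_action { VEXL_IGNORE, VEXL_RAISE_UD };

/* Option A - AMD model: VEX.L is simply an ignored bit for scalar insns. */
static enum vexl_action scalar_vexl_amd(struct vex_bits vex)
{
    (void)vex;
    return VEXL_IGNORE;
}

/* Option B - strict model (what v3 does): #UD when a scalar insn sets VEX.L. */
static enum vexl_action scalar_vexl_strict(struct vex_bits vex)
{
    return (vex.pfx & VEX_PREFIX_SCALAR_MASK) && vex.l ? VEXL_RAISE_UD
                                                       : VEXL_IGNORE;
}

int main(void)
{
    /* e.g. a VMULSS encoded with VEX.L = 1 */
    struct vex_bits v = { .pfx = 2, .l = 1 };

    printf("AMD model:    %s\n",
           scalar_vexl_amd(v) == VEXL_RAISE_UD ? "#UD" : "L ignored");
    printf("strict model: %s\n",
           scalar_vexl_strict(v) == VEXL_RAISE_UD ? "#UD" : "L ignored");
    return 0;
}
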
Thoughts?
Jan