Re: [Xen-devel] [PATCH v3 2/3] x86/emulate: add support of emulating SSE2 instruction {, v}movd mm, r32/m32 and {, v}movq mm, r64
On 01/08/16 13:53, Mihai Donțu wrote:
> On Monday 01 August 2016 10:52:12 Andrew Cooper wrote:
>> On 01/08/16 03:52, Mihai Donțu wrote:
>>> Found that a Windows driver was using the SSE2 instruction MOVD.
>>>
>>> Signed-off-by: Zhi Wang <zhi.a.wang@xxxxxxxxx>
>>> Signed-off-by: Mihai Donțu <mdontu@xxxxxxxxxxxxxxx>
>>> ---
>>> Picked from the XenServer 7 patch queue, as suggested by Andrew Cooper
>>>
>>> Changed since v2:
>>>  * handle the case where the destination is a GPR
>>> ---
>>>  xen/arch/x86/x86_emulate/x86_emulate.c | 38 +++++++++++++++++++++++++++++++---
>>>  1 file changed, 35 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> index 44de3b6..9f89ada 100644
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -204,7 +204,7 @@ static uint8_t twobyte_table[256] = {
>>>      /* 0x60 - 0x6F */
>>>      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM,
>>>      /* 0x70 - 0x7F */
>>> -    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM,
>>> +    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps|ModRM, ImplicitOps|ModRM,
>>>      /* 0x80 - 0x87 */
>>>      ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
>>>      ImplicitOps, ImplicitOps, ImplicitOps, ImplicitOps,
>>> @@ -4409,6 +4409,10 @@ x86_emulate(
>>>      case 0x6f: /* movq mm/m64,mm */
>>>                 /* {,v}movdq{a,u} xmm/m128,xmm */
>>>                 /* vmovdq{a,u} ymm/m256,ymm */
>>> +    case 0x7e: /* movd mm,r/m32 */
>>> +               /* movq mm,r/m64 */
>>> +               /* {,v}movd xmm,r/m32 */
>>> +               /* {,v}movq xmm,r/m64 */
>>>      case 0x7f: /* movq mm,mm/m64 */
>>>                 /* {,v}movdq{a,u} xmm,xmm/m128 */
>>>                 /* vmovdq{a,u} ymm,ymm/m256 */
>>> @@ -4432,7 +4436,17 @@ x86_emulate(
>>>              host_and_vcpu_must_have(sse2);
>>>              buf[0] = 0x66; /* SSE */
>>>              get_fpu(X86EMUL_FPU_xmm, &fic);
>>> -            ea.bytes = (b == 0xd6 ? 8 : 16);
>>> +            switch ( b )
>>> +            {
>>> +            case 0x7e:
>>> +                ea.bytes = 4;
>>> +                break;
>>> +            case 0xd6:
>>> +                ea.bytes = 8;
>>> +                break;
>>> +            default:
>>> +                ea.bytes = 16;
>>> +            }
>>>              break;
>>>          case vex_none:
>>>              if ( b != 0xe7 )
>>> @@ -4452,7 +4466,17 @@ x86_emulate(
>>>                                   ((vex.pfx != vex_66) && (vex.pfx != vex_f3)));
>>>              host_and_vcpu_must_have(avx);
>>>              get_fpu(X86EMUL_FPU_ymm, &fic);
>>> -            ea.bytes = (b == 0xd6 ? 8 : (16 << vex.l));
>>> +            switch ( b )
>>> +            {
>>> +            case 0x7e:
>>> +                ea.bytes = 4;
>>> +                break;
>>> +            case 0xd6:
>>> +                ea.bytes = 8;
>>> +                break;
>>> +            default:
>>> +                ea.bytes = 16 << vex.l;
>>> +            }
>>>          }
>>>          if ( ea.type == OP_MEM )
>>>          {
>>> @@ -4468,6 +4492,14 @@ x86_emulate(
>>>              vex.b = 1;
>>>              buf[4] &= 0x38;
>>>          }
>>> +        else if ( b == 0x7e )
>>> +        {
>>> +            /* convert the GPR destination to (%rAX) */
>>> +            *((unsigned long *)&mmvalp) = (unsigned long)ea.reg;
>>> +            rex_prefix &= ~REX_B;
>>> +            vex.b = 1;
>>> +            buf[4] &= 0x38;
>>> +        }
>>
>> Thank you for doing this. However, looking at it, it has some code in
>> common with the "ea.type == OP_MEM" clause.
>>
>> Would this work?
>>
>> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
>> index fe594ba..90db067 100644
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -4453,16 +4453,25 @@ x86_emulate(
>>              get_fpu(X86EMUL_FPU_ymm, &fic);
>>              ea.bytes = 16 << vex.l;
>>          }
>> -        if ( ea.type == OP_MEM )
>> +        if ( ea.type == OP_MEM || ea.type == OP_REG )
>>          {
>> -            /* XXX enable once there is ops->ea() or equivalent
>> -            generate_exception_if((vex.pfx == vex_66) &&
>> -                                  (ops->ea(ea.mem.seg, ea.mem.off)
>> -                                   & (ea.bytes - 1)), EXC_GP, 0); */
>> -            if ( b == 0x6f )
>> -                rc = ops->read(ea.mem.seg, ea.mem.off+0, mmvalp,
>> -                               ea.bytes, ctxt);
>>              /* convert memory operand to (%rAX) */
>> +
>> +            if ( ea.type == OP_MEM )
>> +            {
>> +                /* XXX enable once there is ops->ea() or equivalent
>> +                generate_exception_if((vex.pfx == vex_66) &&
>> +                                      (ops->ea(ea.mem.seg, ea.mem.off)
>> +                                       & (ea.bytes - 1)), EXC_GP, 0); */
>> +                if ( b == 0x6f )
>> +                    rc = ops->read(ea.mem.seg, ea.mem.off+0, mmvalp,
>> +                                   ea.bytes, ctxt);
>> +            }
>> +            else if ( ea.type == OP_REG )
>> +            {
>> +                *((unsigned long *)&mmvalp) = (unsigned long)ea.reg;
>> +            }
>> +
>>              rex_prefix &= ~REX_B;
>>              vex.b = 1;
>>              buf[4] &= 0x38;
>>
>> This is untested, but avoids duplicating this bit of state manipulation.

> Your suggestion makes sense, but I'm starting to doubt my initial
> patch. :-) I'm testing "movq xmm1, xmm1" and noticing that it takes the
> GPR-handling route, and I can't seem to easily prevent that with
> !(rex_prefix & REX_B), as rex_prefix == 0 and vex.b == 1. I need to
> take a harder look at how that class of instructions is coded.

That is a very good point - my suggestion doesn't differentiate GPR
register destinations from non-GPR register destinations, and it is only
the GPR register destinations we want to turn into memory accesses.

What about this?
    else if ( ea.type == OP_REG && b == 0x7e )
    {
        /* GPR register destination - point into the register block. */
        *((unsigned long *)&mmvalp) = (unsigned long)ea.reg;
    }

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel