[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 1/5] x86/32on64: use generic instruction decoding for call gate emulation



>>> On 06.12.16 at 12:56, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 06/12/16 11:12, Jan Beulich wrote:
>> +static int gate_op_read(
>> +    enum x86_segment seg,
>> +    unsigned long offset,
>> +    void *p_data,
>> +    unsigned int bytes,
>> +    struct x86_emulate_ctxt *ctxt)
>> +{
>> +    const struct gate_op_ctxt *goc =
>> +        container_of(ctxt, struct gate_op_ctxt, ctxt);
>> +    unsigned int rc = bytes, sel = 0;
>> +    unsigned long addr = offset, limit = 0;
>> +
>> +    switch ( seg )
>> +    {
>> +    case x86_seg_cs:
>> +        addr += goc->cs.base;
>> +        limit = goc->cs.limit;
>> +        break;
>> +    case x86_seg_ds:
>> +        sel = read_sreg(ds);
>> +        break;
>> +    case x86_seg_es:
>> +        sel = read_sreg(es);
>> +        break;
>> +    case x86_seg_fs:
>> +        sel = read_sreg(fs);
>> +        break;
>> +    case x86_seg_gs:
>> +        sel = read_sreg(gs);
>> +        break;
>> +    case x86_seg_ss:
>> +        sel = ctxt->regs->ss;
>> +        break;
>> +    default:
>> +        return X86EMUL_UNHANDLEABLE;
>> +    }
>> +    if ( sel )
>> +    {
>> +        unsigned int ar;
>> +
>> +        ASSERT(!goc->insn_fetch);
>> +        if ( !read_descriptor(sel, current, &addr, &limit, &ar, 0) ||
>> +             !(ar & _SEGMENT_S) ||
>> +             !(ar & _SEGMENT_P) ||
>> +             ((ar & _SEGMENT_CODE) && !(ar & _SEGMENT_WR)) )
>> +            return X86EMUL_UNHANDLEABLE;
>> +        addr += offset;
>> +    }
>> +    else if ( seg != x86_seg_cs )
>> +        return X86EMUL_UNHANDLEABLE;
>> +
>> +    if ( limit < bytes - 1 || offset > limit - bytes + 1 )
> 
> Is this correct for the zero-length instruction fetches which the
> emulator emits?  It would be better to make it safe now, than to
> introduce a latent bug in the future.

The left side of the || unconditionally fails such fetches, and that's
precisely what we want here (as tight a condition as possible)
considering that this gets used only for decoding, not for executing
instructions.

>> +        return X86EMUL_UNHANDLEABLE;
>> +
>> +    if ( is_pv_32bit_vcpu(current) )
>> +        addr = (uint32_t)addr;
> 
> This should be based on %cs attributes.  What about a 64bit PV guest in
> a compatability segment?

No - I now realize the conditional is pointless. We only do gate op
emulation for 32-bit guests.

>> +
>> +    if ( !__addr_ok(addr) ||
>> +         (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
>> +    {
>> +        x86_emul_pagefault(goc->insn_fetch && cpu_has_nx
>> +                           ? PFEC_insn_fetch : 0,
>> +                           addr + bytes - rc, ctxt);
> 
> Independently to the point below, the correct predicate for setting
> PFEC_insn_fetch is ((EFER & EFER_NX) || (CR4 & SMEP))

Remember that here we're dealing with PV guests only - those don't
see any SMEP, and always run with NX enabled as long as we do
ourselves.

> However, this is subtly wrong, but I can't think of a sensible way of
> fixing it (i.e. without doing independent pagewalks).
> 
> __copy_from_user() does a data fetch, not an instruction fetch.
> 
> With the advent of PKU, processors support pages which can't be read,
> but can be executed.  Then again, our chances of ever supporting PKU for
> PV guests is slim, so maybe this point isn't very important.
> 
> As we actually do a data read, the error code should remain 0.  This
> (getting a data fetch failure for something which should have been an
> instruction fetch) will be less confusing for the guest than claiming an
> instruction fetch failure when we actually did a data fetch, as at least
> the error code will be correct for the access rights in the translation.

With the above I think this should remain as is.

>> @@ -3350,179 +3431,59 @@ static void emulate_gate_op(struct cpu_u
>>          return;
>>      }
>>  
>> -    op_bytes = op_default = ar & _SEGMENT_DB ? 4 : 2;
>> -    ad_default = ad_bytes = op_default;
>> -    opnd_sel = opnd_off = 0;
>> -    jump = -1;
>> -    for ( eip = regs->eip; eip - regs->_eip < 10; )
>> +    ctxt.ctxt.addr_size = ar & _SEGMENT_DB ? 32 : 16;
>> +    /* Leave zero in ctxt.ctxt.sp_size, as it's not needed for decoding. */
>> +    state = x86_decode_insn(&ctxt.ctxt, gate_op_read);
>> +    ctxt.insn_fetch = false;
>> +    if ( IS_ERR_OR_NULL(state) )
>> +    {
>> +        if ( PTR_ERR(state) != -X86EMUL_EXCEPTION )
>> +            do_guest_trap(TRAP_gp_fault, regs);
> 
> Here, and...
> 
>> -    if ( (opnd_sel != regs->cs &&
>> -          !read_descriptor(opnd_sel, v, &base, &limit, &ar, 0)) ||
>> -         !(ar & _SEGMENT_S) ||
>> -         !(ar & _SEGMENT_P) ||
>> -         ((ar & _SEGMENT_CODE) && !(ar & _SEGMENT_WR)) )
>> -    {
>> -        do_guest_trap(TRAP_gp_fault, regs);
>> -        return;
>> -    }
>> +    if ( rc == X86EMUL_EXCEPTION )
> 
> .. and here must unconditionally inject a fault, or the guest will
> livelock by repeatedly taking #GP faults.

The first one needs an else (to deliver the exception indicated by
X86EMUL_EXCEPTION) - this was an oversight during rebase over
your recent changes. And the second one is similar, just that it's
not a missing else (i.e. it's not entirely unconditional to deliver an
exception here; other failure cases are being handled right after
this).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.