|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v10 1/9] x86emul: address x86_insn_is_mem_{access, write}() omissions
On 29.05.2020 17:03, Andrew Cooper wrote:
> On 29/05/2020 14:29, Jan Beulich wrote:
>> On 29.05.2020 14:18, Andrew Cooper wrote:
>>> On 25/05/2020 15:26, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> @@ -11474,25 +11474,87 @@ x86_insn_operand_ea(const struct x86_emu
>>>> return state->ea.mem.off;
>>>> }
>>>>
>>>> +/*
>>>> + * This function means to return 'true' for all supported insns with
>>>> explicit
>>>> + * accesses to memory. This means also insns which don't have an explicit
>>>> + * memory operand (like POP), but it does not mean e.g. segment selector
>>>> + * loads, where the descriptor table access is considered an implicit one.
>>>> + */
>>>> bool
>>>> x86_insn_is_mem_access(const struct x86_emulate_state *state,
>>>> const struct x86_emulate_ctxt *ctxt)
>>>> {
>>>> + if ( mode_64bit() && state->not_64bit )
>>>> + return false;
>>> Is this path actually used?
>> Yes, it is. It's only x86_emulate() which has
>>
>> generate_exception_if(state->not_64bit && mode_64bit(), EXC_UD);
>>
>> right now.
>
> Oh. That is a bit awkward.
>
>>> state->not_64bit ought to fail instruction
>>> decode, at which point we wouldn't have a valid state to be used here.
>> x86_decode() currently doesn't have much raising of #UD at all, I
>> think. If it wasn't like this, the not_64bit field wouldn't be
>> needed - it's used only to communicate from decode to execute.
>> We're not entirely consistent with this though, seeing in
>> x86_decode_onebyte(), a few lines below the block of case labels
>> setting that field,
>>
>> case 0x9a: /* call (far, absolute) */
>> case 0xea: /* jmp (far, absolute) */
>> generate_exception_if(mode_64bit(), EXC_UD);
>
> This is because there is no legitimate way to determine the end of the
> instruction.
>
> Most of the not_64bit instructions do have a well-defined end, even if
> they aren't permitted for use.
I wouldn't bet on that. Just look at the re-use of opcode D6
(salc in 32-bit) by L1OM for an extreme example. There's nothing
we can say on how any of the reserved opcodes may get re-used.
Prior to AVX / AVX512 we'd have estimated C4, C5, and 62 wrongly
as well (but that's true independent of execution mode).
>> We could certainly drop the field and raise #UD during decode
>> already, but don't you think we then should do so for all
>> encodings that ultimately lead to #UD, e.g. also for AVX insns
>> without AVX available to the guest? This would make
>> x86_decode() quite a bit larger, as it would then also need to
>> have a giant switch() (or something else that's suitable to
>> cover all cases).
>
> I think there is a semantic difference between "we can't parse the
> instruction", and "we can parse it, but it's not legitimate in this
> context".
>
> I don't think our exact split is correct, but I don't think moving all
> #UD checking into x86_decode() is correct either.
Do you have any clear, sufficiently simple rule in mind for where
we could draw the boundary?
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |