|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 04/16] xen: Add is_vmware_port_enabled
On 09/18/14 18:53, Boris Ostrovsky wrote:
> On 09/17/2014 02:23 PM, Slutz, Donald Christopher wrote:
>> On 09/17/14 11:56, Boris Ostrovsky wrote:
>>> On 09/16/2014 08:08 AM, Slutz, Donald Christopher wrote:
>>>> On 09/12/14 09:08, Boris Ostrovsky wrote:
>>>>> On 09/11/2014 02:36 PM, Don Slutz wrote:
>>>>>> int __get_instruction_length_from_list(struct vcpu *v,
>>>>>> - const enum instruction_index *list, unsigned int
>>>>>> list_count)
>>>>>> + const enum instruction_index
>>>>>> *list,
>>>>>> + unsigned int list_count,
>>>>>> + bool_t err_rpt)
>>>>>> {
>>>>>> struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
>>>>>> unsigned int i, j, inst_len = 0;
>>>>>> @@ -211,10 +222,13 @@ int __get_instruction_length_from_list(struct
>>>>>> vcpu *v,
>>>>>> mismatch: ;
>>>>>> }
>>>>>> - gdprintk(XENLOG_WARNING,
>>>>>> - "%s: Mismatch between expected and actual instruction
>>>>>> bytes: "
>>>>>> - "eip = %lx\n", __func__, (unsigned long)vmcb->rip);
>>>>>> - hvm_inject_hw_exception(TRAP_gp_fault, 0);
>>>>>> + if ( err_rpt )
>>>>>> + {
>>>>>> + gdprintk(XENLOG_WARNING,
>>>>>> + "%s: Mismatch between expected and actual
>>>>>> instruction bytes: "
>>>>>> + "eip = %lx\n", __func__, (unsigned
>>>>>> long)vmcb->rip);
>>>>>> + hvm_inject_hw_exception(TRAP_gp_fault, 0);
>>>>>> + }
>>>>>> return 0;
>>>>>> done:
>>>>>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
>>>>>> index b5188e6..9e14d2a 100644
>>>>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>>>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>>>>> @@ -59,6 +59,7 @@
>>>>>> #include <public/sched.h>
>>>>>> #include <asm/hvm/vpt.h>
>>>>>> #include <asm/hvm/trace.h>
>>>>>> +#include <asm/hvm/vmport.h>
>>>>>> #include <asm/hap.h>
>>>>>> #include <asm/apic.h>
>>>>>> #include <asm/debugger.h>
>>>>>> @@ -2065,6 +2066,38 @@ svm_vmexit_do_vmsave(struct vmcb_struct
>>>>>> *vmcb,
>>>>>> return;
>>>>>> }
>>>>>> +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
>>>>>> + struct vcpu *v)
>>>>>> +{
>>>>>> + struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
>>>>>> + unsigned long inst_len;
>>>>>> + unsigned long inst_addr = svm_rip2pointer(v);
>>>>>> + int rc;
>>>>>> + static const enum instruction_index list[] = {
>>>>>> + INSTR_INL_DX, INSTR_INB_DX, INSTR_OUTL_DX, INSTR_OUTB_DX
>>>>>> + };
>>>>>> +
>>>>>> + inst_len = __get_instruction_length_from_list(
>>>>>> + v, list, ARRAY_SIZE(list), 0);
>>>>> I should have asked earlier but I don't think I understand why the
>>>>> last argument here is 0 (and therefore why you have this last
>>>>> argument
>>>>> at all).
>>>>>
>>>>> Because whether or not you are warning in
>>>>> __get_instruction_length_from_list() it will still return 0. And
>>>>> that,
>>>>> in turn, will cause vmport_gp_check() to return an error. And then
>>>>> you
>>>>> will print another warning in VMPORT_LOG. So there is a warning
>>>>> anyway.
>>>>>
>>>> A key part that you appear to have missed is that
>>>> __get_instruction_length_from_list() uses gdprintk(XENLOG_WARNING,...
>>>> but VMPORT_DBG_LOG is only available in debug=y builds. So the new
>>>> last argument is used to control this. Since this change includes
>>>> enabling #GP vmexits, it is now possible for ring 3 users to
>>>> generate at
>>>> large volume of these which with gdprintk() can flood the console.
>>> Would it be possible to decide where and whether to print the warning
>>> inside __get_instruction_length_from_list() as opposed to passing a
>>> new parameter? E.g. if vmware_port_enabled is set and list includes
>>> IN/OUT and possibly something else?
>>>
>> It could be. However the test is complex and not something I am willing
>> to be part of this patch. So the only thing I have come up with is
>> using
>> #ifndef NDEBUG around the gdprintk(). This leaves the
>>
>>
>> hvm_inject_hw_exception(TRAP_gp_fault, 0);
>>
>> Which is not correct in this case. Now are far as I can tell,
>> calling this
>> twice does the wrong thing for this case. I.E. turns it into a double
>> fault.
>> For #GP I need to call it with vmcb->exitinfo1 instead of 0.
>>
>> (Note: v5 posted before I got this.)
>>
>> I just spent some time checking and found out that even with the cpu
>> reporting:
>>
>> (XEN) - Next-RIP Saved on #VMEXIT
>>
>> svm_nextrip_insn_length(v) is 0.
>
> As I see it, this could happen for one of three reasons:
> * !cpu_has_svm_nrips which can't be the case since (1) family 21
> supports it and (2) you actually see in the log that it does have it
> * next RIP is before current RIP which I think can't be the case
> neither because we are not looking at branch instruction or something
> like that.
> * nextrip == rip. Which I don't see how it can be true.
>
> Can you check why svm_nextrip_insn_length(v) is 0?
>
What I have found out is that vmcb->nextrip is 0 in my testing. So
next RIP is "before current RIP" by a very large amount.
> But regardless of that, how do you expect your code to work on CPUs
> that don't support NRIP? On those processors you *will* be decoding
> the instruction twice.
>
The code "works" because the only info passed between the 2 decodes is
the instruction length. This is used to limit the amount of the 2nd read.
And because svm_nextrip_insn_length(v) is 0, all the testing I have done
on the AMD server has been doing the decode of the instruction twice.
If another CPU is changing the instructions as I read them (which is the
security issue as I understand it), all I see happing is that "wrong"
direction
or size of request could happen, or it is a vmport request or not. All of
which either report a #GP or do the vmport action. Since you can do the
vmport action without changing the instruction bytes, I do not see how
the double decode opens any security issue.
I am not trying to say this is good. And as I replied to Jan, I am
looking into
a way to only do the single decode. The simplest of these is to just not
use __get_instruction_length_from_list() and just state that on AMD the
instruction length is 2. This is safe because I am only using this to
decide
much many bytes on the page to read. The 2nd page read read depends
on finding a 0x66 prefix byte. I am just noticing that this also has the
side
effect of not injecting a #PF if the instruction is no longer readable
(a side
effect of using fetch() which uses hvm_fetch_from_guest_virt()).
>>
>> (hyper-0-21-54:~>cat /proc/cpuinfo
>> processor : 0
>> vendor_id : AuthenticAMD
>> cpu family : 21
>> model : 2
>> model name : AMD Opteron(tm) Processor 4365 EE
>> stepping : 0
>> microcode : 0x600081c
>> ...)
>>
>> Which means to me that by using __get_instruction_length_from_list()
>> I am reading the instruction bytes 2 times. Once in
>> __get_instruction_length_from_list() and once in vmport_gp_check().
>> This is not too bad since the rate of these is low. But this does
>> suggest to
>> me that the change to using __get_instruction_length_from_list() was
>> not the best. Having a routine that both checks the instruction and
>> reports on which one was found would be much better.
>>
>> So do I go with v5, or dropping the use of
>> __get_instruction_length_from_list()
>> in a v6? (The coding of the new routine will take some time.)
>
> Given what Jan said, it sounds like v5 is not going to work since
> you'd be decoding instruction twice, right?
>
My take is that v5 "works" because the 2nd decode does not use any info
from the 1st decode.
Sigh, I have to take that back. The check "fetch_len != inst_len" in:
if ( bytes[0] == 0x66 ) /* operand size prefix */
{
byte_cnt = 2;
i = 1;
if ( fetch_len != inst_len )
{
does violate Jan's statement. The simple change of inst_len to 2 would
fix it. As I replied to Jan, I am looking into how to not read and decode
more then one time.
-Don Slutz
> -boris
>
>>
>> -Don Slutz
>>
>>
>>> -boris
>>>
>>>>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |