Xen project Mailing List

Re: [Xen-devel] [PATCH v4 04/16] xen: Add is_vmware_port_enabled

To: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>

From: "Slutz, Donald Christopher" <dslutz@xxxxxxxxxxx>

Date: Fri, 19 Sep 2014 13:24:22 +0000

Accept-language: en-US

Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Keir Fraser <keir@xxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Eddie Dong <eddie.dong@xxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Aravind Gopalakrishnan <Aravind.Gopalakrishnan@xxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>

Delivery-date: Fri, 19 Sep 2014 13:25:43 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Thread-index: AQHPze+FHwBxIXPgm026BWWcnh1OO5v9vIeAgAY4RICAAdJVgIAAKREAgAHdy4CAAPM/AA==

Thread-topic: [Xen-devel] [PATCH v4 04/16] xen: Add is_vmware_port_enabled

On 09/18/14 18:53, Boris Ostrovsky wrote: > On 09/17/2014 02:23 PM, Slutz, Donald Christopher wrote: >> On 09/17/14 11:56, Boris Ostrovsky wrote: >>> On 09/16/2014 08:08 AM, Slutz, Donald Christopher wrote: >>>> On 09/12/14 09:08, Boris Ostrovsky wrote: >>>>> On 09/11/2014 02:36 PM, Don Slutz wrote: >>>>>> int __get_instruction_length_from_list(struct vcpu *v, >>>>>> - const enum instruction_index *list, unsigned int >>>>>> list_count) >>>>>> + const enum instruction_index >>>>>> *list, >>>>>> + unsigned int list_count, >>>>>> + bool_t err_rpt) >>>>>> { >>>>>> struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb; >>>>>> unsigned int i, j, inst_len = 0; >>>>>> @@ -211,10 +222,13 @@ int __get_instruction_length_from_list(struct >>>>>> vcpu *v, >>>>>> mismatch: ; >>>>>> } >>>>>> - gdprintk(XENLOG_WARNING, >>>>>> - "%s: Mismatch between expected and actual instruction >>>>>> bytes: " >>>>>> - "eip = %lx\n", __func__, (unsigned long)vmcb->rip); >>>>>> - hvm_inject_hw_exception(TRAP_gp_fault, 0); >>>>>> + if ( err_rpt ) >>>>>> + { >>>>>> + gdprintk(XENLOG_WARNING, >>>>>> + "%s: Mismatch between expected and actual >>>>>> instruction bytes: " >>>>>> + "eip = %lx\n", __func__, (unsigned >>>>>> long)vmcb->rip); >>>>>> + hvm_inject_hw_exception(TRAP_gp_fault, 0); >>>>>> + } >>>>>> return 0; >>>>>> done: >>>>>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c >>>>>> index b5188e6..9e14d2a 100644 >>>>>> --- a/xen/arch/x86/hvm/svm/svm.c >>>>>> +++ b/xen/arch/x86/hvm/svm/svm.c >>>>>> @@ -59,6 +59,7 @@ >>>>>> #include <public/sched.h> >>>>>> #include <asm/hvm/vpt.h> >>>>>> #include <asm/hvm/trace.h> >>>>>> +#include <asm/hvm/vmport.h> >>>>>> #include <asm/hap.h> >>>>>> #include <asm/apic.h> >>>>>> #include <asm/debugger.h> >>>>>> @@ -2065,6 +2066,38 @@ svm_vmexit_do_vmsave(struct vmcb_struct >>>>>> *vmcb, >>>>>> return; >>>>>> } >>>>>> +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs, >>>>>> + struct vcpu *v) >>>>>> +{ >>>>>> + struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb; >>>>>> + unsigned long inst_len; >>>>>> + unsigned long inst_addr = svm_rip2pointer(v); >>>>>> + int rc; >>>>>> + static const enum instruction_index list[] = { >>>>>> + INSTR_INL_DX, INSTR_INB_DX, INSTR_OUTL_DX, INSTR_OUTB_DX >>>>>> + }; >>>>>> + >>>>>> + inst_len = __get_instruction_length_from_list( >>>>>> + v, list, ARRAY_SIZE(list), 0); >>>>> I should have asked earlier but I don't think I understand why the >>>>> last argument here is 0 (and therefore why you have this last >>>>> argument >>>>> at all). >>>>> >>>>> Because whether or not you are warning in >>>>> __get_instruction_length_from_list() it will still return 0. And >>>>> that, >>>>> in turn, will cause vmport_gp_check() to return an error. And then >>>>> you >>>>> will print another warning in VMPORT_LOG. So there is a warning >>>>> anyway. >>>>> >>>> A key part that you appear to have missed is that >>>> __get_instruction_length_from_list() uses gdprintk(XENLOG_WARNING,... >>>> but VMPORT_DBG_LOG is only available in debug=y builds. So the new >>>> last argument is used to control this. Since this change includes >>>> enabling #GP vmexits, it is now possible for ring 3 users to >>>> generate at >>>> large volume of these which with gdprintk() can flood the console. >>> Would it be possible to decide where and whether to print the warning >>> inside __get_instruction_length_from_list() as opposed to passing a >>> new parameter? E.g. if vmware_port_enabled is set and list includes >>> IN/OUT and possibly something else? >>> >> It could be. However the test is complex and not something I am willing >> to be part of this patch. So the only thing I have come up with is >> using >> #ifndef NDEBUG around the gdprintk(). This leaves the >> >> >> hvm_inject_hw_exception(TRAP_gp_fault, 0); >> >> Which is not correct in this case. Now are far as I can tell, >> calling this >> twice does the wrong thing for this case. I.E. turns it into a double >> fault. >> For #GP I need to call it with vmcb->exitinfo1 instead of 0. >> >> (Note: v5 posted before I got this.) >> >> I just spent some time checking and found out that even with the cpu >> reporting: >> >> (XEN) - Next-RIP Saved on #VMEXIT >> >> svm_nextrip_insn_length(v) is 0. > > As I see it, this could happen for one of three reasons: > * !cpu_has_svm_nrips which can't be the case since (1) family 21 > supports it and (2) you actually see in the log that it does have it > * next RIP is before current RIP which I think can't be the case > neither because we are not looking at branch instruction or something > like that. > * nextrip == rip. Which I don't see how it can be true. > > Can you check why svm_nextrip_insn_length(v) is 0? > What I have found out is that vmcb->nextrip is 0 in my testing. So next RIP is "before current RIP" by a very large amount. > But regardless of that, how do you expect your code to work on CPUs > that don't support NRIP? On those processors you *will* be decoding > the instruction twice. > The code "works" because the only info passed between the 2 decodes is the instruction length. This is used to limit the amount of the 2nd read. And because svm_nextrip_insn_length(v) is 0, all the testing I have done on the AMD server has been doing the decode of the instruction twice. If another CPU is changing the instructions as I read them (which is the security issue as I understand it), all I see happing is that "wrong" direction or size of request could happen, or it is a vmport request or not. All of which either report a #GP or do the vmport action. Since you can do the vmport action without changing the instruction bytes, I do not see how the double decode opens any security issue. I am not trying to say this is good. And as I replied to Jan, I am looking into a way to only do the single decode. The simplest of these is to just not use __get_instruction_length_from_list() and just state that on AMD the instruction length is 2. This is safe because I am only using this to decide much many bytes on the page to read. The 2nd page read read depends on finding a 0x66 prefix byte. I am just noticing that this also has the side effect of not injecting a #PF if the instruction is no longer readable (a side effect of using fetch() which uses hvm_fetch_from_guest_virt()). >> >> (hyper-0-21-54:~>cat /proc/cpuinfo >> processor : 0 >> vendor_id : AuthenticAMD >> cpu family : 21 >> model : 2 >> model name : AMD Opteron(tm) Processor 4365 EE >> stepping : 0 >> microcode : 0x600081c >> ...) >> >> Which means to me that by using __get_instruction_length_from_list() >> I am reading the instruction bytes 2 times. Once in >> __get_instruction_length_from_list() and once in vmport_gp_check(). >> This is not too bad since the rate of these is low. But this does >> suggest to >> me that the change to using __get_instruction_length_from_list() was >> not the best. Having a routine that both checks the instruction and >> reports on which one was found would be much better. >> >> So do I go with v5, or dropping the use of >> __get_instruction_length_from_list() >> in a v6? (The coding of the new routine will take some time.) > > Given what Jan said, it sounds like v5 is not going to work since > you'd be decoding instruction twice, right? > My take is that v5 "works" because the 2nd decode does not use any info from the 1st decode. Sigh, I have to take that back. The check "fetch_len != inst_len" in: if ( bytes[0] == 0x66 ) /* operand size prefix */ { byte_cnt = 2; i = 1; if ( fetch_len != inst_len ) { does violate Jan's statement. The simple change of inst_len to 2 would fix it. As I replied to Jan, I am looking into how to not read and decode more then one time. -Don Slutz > -boris > >> >> -Don Slutz >> >> >>> -boris >>> >>>> _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.