[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 3/3] Add limited support of VMware's hyper-call
On 09/03/14 04:25, Jan Beulich wrote: On 03.09.14 at 02:55, <dslutz@xxxxxxxxxxx> wrote:On 09/02/14 04:16, Jan Beulich wrote:As I think was said on v1 already - this should be split into smaller pieces if at all possible. I'm therefore not going to do a full review at this time, just give a couple of remarks.The last time (v1's) split was much worse (and so I think that "split into smaller" was not said). I could check be most were "combine this patch with that patch"; and more comment message. Having been over this code many times in the last month, a possible quick split (which would not delay v3 too long) would be: 1) Add simple vmport commands like BDOOR_CMD_GETVERSION 2) Add the hypervisor side of VMware guest info. 3) Add the hvm params that come from VMware guest info. 4) Add the hyper calls for libxc to handle VMware guest info.One thing I'd certainly like to see split out is the save/restore code. And the second - general - consideration would be to split hypervisor from tools side changes as much as possible. I will take that into account. And should I add the unit tests also (as optional)?No idea - that's a tools side thing iirc, and hence a question to the tools maintainers. Will see if I get a response before v3 code changes have been tested and are ready to post. And in any event I'm rather uncomfortable about this getting enabled unconditionally, also due to (but not limited to this) the up front allocation of various memory blocks that may end up never being needed. The main issue I see with this approach is that guests could end up being misguided into thinking they're running on VMware when in fact they have code in place to optimize when running on Xen. We had such detection ordering issues with Linux checking for both Hyper-V and Xen.Ok, I do feel strongly that this would need to be independent on vmware_hw "version". And so I will add a new xl.cfg item to control this. QEMU on 5/19/2014 added a way to turn it's version off: +@item vmport=on|off +Enables emulation of VMWare IO port, for vmmouse etc. (enabled by default) So should I name it vmware_vmport or just vmport ( I lean to just vmport )? and unlike QEMU I am fine with the default of off.I'd prefer the former as being more explicit (vmport alone doesn't really provide a connection to VMware), but this again is something to discuss with the tools maintainers. Ok. @@ -1532,6 +1561,17 @@ int hvm_domain_initialise(struct domain *d) stdvga_deinit(d); vioapic_deinit(d); fail1: + if ( d->arch.hvm_domain.vmport_data ) + { + struct vmport_state *vs = d->arch.hvm_domain.vmport_data; + int idx; + + for (idx = 0; idx < vs->used_guestinfo; idx++)You'll have to go through and fix coding style issues.Do to the many coding styles I have used in the last 20 years, I can have style "blindness". I do not find a style issue here based on CODING_STYLE. All hints are welcome.Just look at the difference between you if() and your for() - the latter is lacking blanks inside the parentheses. And that difference appeared to be a pretty consistent thing throughout the code I looked a little more closely at. like I said I do not see the difference until pointed out. Thank you for this. Will fix all my "for"(s) to this style. @@ -106,6 +107,7 @@ MAKE_INSTR(VMSAVE, 3, 0x0f, 0x01, 0xdb); MAKE_INSTR(STGI, 3, 0x0f, 0x01, 0xdc); MAKE_INSTR(CLGI, 3, 0x0f, 0x01, 0xdd); MAKE_INSTR(INVLPGA,3, 0x0f, 0x01, 0xdf); +MAKE_INSTR(IN, 1, 0xed);This name is ambiguous.There are 4 opcodes and 6 instructions. Not at all sure how to name the one I need. Here is the info about the IN instruction.I'd name it INL_DX, but one of the secondary problems you may need to solve is that unlike for other opcodes you do not want the operand size override ignored here. Ok, I will go with INL_DX. So far I have not found any code that has the operand size prefix. I expect that since all code is in 32 or 64 bit segments and it is expected that 32 bits are used, so the form "IN AX,DX" (which would have the operand size prefix) is not used. In any case, I will do some checking on what I get for this case. I will agree that handling of the operand size prefix is not clear. And may not be needed to be solved at this time. +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs, struct vcpu*v)+{ + struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb; + unsigned long inst_len = __get_instruction_length(v, INSTR_IN); + + regs->error_code = vmcb->exitinfo1; + if ( hvm_long_mode_enabled(v) ) + HVMTRACE_LONG_C4D(TRAP_GP, TRAP_gp_fault, inst_len, + regs->error_code, + TRC_PAR_LONG(vmcb->exitinfo2)); + else + HVMTRACE_C4D(TRAP_GP, TRAP_gp_fault, inst_len, + regs->error_code, vmcb->exitinfo2); + + if ( inst_len <= 1 && (regs->rdx & 0xffff) == VMPORT_PORT &&What would inst_len being zero mean here?It use to mean that AMD was missing the feature: (XEN) [2014-08-14 20:17:28] - Next-RIP Saved on #VMEXIT and so I needed to look at the opcode.__get_instruction_length_from_list() returns using the value resulting from that feature only when the length is non-zero.Now that I am using __get_instruction_length() I think it could be changed to a 1. However I can only test on an AMD server that does have this feature. Will switch to == 1 and hope for the best. As far as I can tell __get_instruction_length() does not insure that the opcode list passed is found, just that it might be.Right, and zero is an indication that it wasn't found. Also I just noticed there's a gdprintk() in that event, which for all other cases the function gets used for is quite fine. #GP, however, can result from various instructions, and hence you don't want a message printed each time it wasn't due to "inl %dx, %eax". Will look into how to fix this. + vmcb->exitinfo2 == 0 && regs->error_code == 0 ) + { + uint32_t magic = regs->rax; + + if ( magic == VMPORT_MAGIC )This ought to be folded with the previous if(), at once reducing code redundancy further down in the function.OK. But that does make the debug log message more complex to code.Maybe, albeit (as hinted at before) I question the use of these anyway. Ok. + { + unsigned char bytes[1] = { 0 };Pointless initializer (and it being an array looks suspicious too).Will drop the initializer. hvm_fetch_from_guest_virt_nofault() does take an array. So it is better to say "&tmp, regs->rip, 1" then "bytes, regs->rip, 1" ?Yes, I think so. Fine, will change to that. Also by dropping the initializer, you are stating that hvm_fetch_from_guest_virt_nofault() does initialize it's output on error ( because the debug log output does include bytes[0]).Existing debug log output, or one you add? It's a mistake in any case, I'm just trying to understand whether a fix might be one order independent of your patch. It was in code I added. Speaking of which - why can't you just define a new DBG_LEVEL_VMPORT and use the existing HVM_DBG_LOG()?I did define 23 bits to control various part of the debug. A lot of this is because there are many times you go through the code. Basicly a string is 1st passed a length, and the 4 bytes at a time until the length is handled, followed by state changes...Which again raises the question of the value of all this debugging output. Whether or not a niche feature, I don't think it should be significantly more verbose than the base code we have. I will spend some time to make this have less options. -Don Slutz Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |