
Re: [Xen-devel] [PATCH 2/4] x86/pv: Introduce pv_create_exception_frame()



>>> On 09.05.17 at 19:09, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 09/05/17 16:58, Jan Beulich wrote:
>>>>> On 08.05.17 at 17:48, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> +void pv_create_exception_frame(void)
>>> +{
>>> +    struct vcpu *curr = current;
>>> +    struct trap_bounce *tb = &curr->arch.pv_vcpu.trap_bounce;
>> const (twice)?
>>
>>> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
>>> +    const bool user_mode_frame = !guest_kernel_mode(curr, regs);
>>> +    uint8_t *evt_mask = &vcpu_info(curr, evtchn_upcall_mask);
>>> +    unsigned long rflags;
>> Does this really need to be "long"?
> 
> The answer to several of these questions is "probably not, but that's
> how load_segments() did it".

Since you're overhauling it, I would think there's no reason not to
make sensible adjustments.

>>> +    unsigned int bytes, missing;
>>> +
>>> +    ASSERT_NOT_IN_ATOMIC();
>>> +
>>> +    if ( unlikely(null_trap_bounce(curr, tb)) )
>>> +    {
>>> +        gprintk(XENLOG_ERR, "Fatal: Attempting to inject null trap bounce\n");
>>> +        __domain_crash_synchronous();
>> Why not domain_crash() followed by "return"?
> 
> Because the existing code uses synchronous crashes.
> 
> Looking again at the callsites of pv_create_exception_frame(), we
> immediately jump back to {compat_,}test_all_events, which proceeds to
> run softirqs again.
> 
> Therefore, domain_crash() and a return should work.  (I think?)

I think so too.
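
(For concreteness, a minimal sketch of how that path might then look -
assuming the plain domain_crash(d) form used elsewhere, and that returning
here really does unwind cleanly back to {compat_,}test_all_events:)

    if ( unlikely(null_trap_bounce(curr, tb)) )
    {
        gprintk(XENLOG_ERR, "Fatal: Attempting to inject null trap bounce\n");
        domain_crash(curr->domain);   /* asynchronous crash ... */
        return;                       /* ... softirqs run again on the way out */
    }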

>>> +    }
>>> +
>>> +    /* Fold the upcall mask and architectural IOPL into the guests rflags. */
>>> +    rflags  = regs->rflags & ~(X86_EFLAGS_IF | X86_EFLAGS_IOPL);
>>> +    rflags |= ((*evt_mask ? 0 : X86_EFLAGS_IF) |
>>> +               (VM_ASSIST(curr->domain, architectural_iopl)
>>> +                ? curr->arch.pv_vcpu.iopl : 0));
>>> +
>>> +    if ( is_pv_32bit_vcpu(curr) )
>>> +    {
>>> +        /* { [ERRCODE,] EIP, CS/MASK , EFLAGS, [ESP, SS] } */
>>> +        unsigned int frame[6], *ptr = frame, ksp =
>>> +            (user_mode_frame ? curr->arch.pv_vcpu.kernel_sp : regs->esp);
>>> +
>>> +        if ( tb->flags & TBF_EXCEPTION_ERRCODE )
>>> +            *ptr++ = tb->error_code;
>>> +
>>> +        *ptr++ = regs->eip;
>>> +        *ptr++ = regs->cs | (((unsigned int)*evt_mask) << 16);
>> Do you really need the cast here?
> 
> Does it promote correctly if the top bit of the mask is set?

It's a uint8_t -> plain int conversion, which is a zero-extension.
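
(As a standalone illustration, not part of the patch - integer promotion
already zero-extends the uint8_t before the shift, so the cast can be
dropped:)

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint8_t mask = 0x80;                       /* top bit of the upcall mask set */
        unsigned int word = 0xe033 | (mask << 16); /* mask promotes to int 0x80 first */

        printf("%#x\n", word);                     /* prints 0x80e033 - no sign-extension */
        return 0;
    }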

>>> +        *ptr++ = rflags;
>>> +
>>> +        if ( user_mode_frame )
>>> +        {
>>> +            *ptr++ = regs->esp;
>>> +            *ptr++ = regs->ss;
>>> +        }
>>> +
>>> +        /* Copy the constructed frame to the guest kernel stack. */
>>> +        bytes = _p(ptr) - _p(frame);
>>> +        ksp -= bytes;
>>> +
>>> +        if ( unlikely((missing = __copy_to_user(_p(ksp), frame, bytes)) != 0) )
>> While I don't think we need to be really bothered, it's perhaps still
>> worth noting in a comment that the wrapping behavior here is
>> wrong (and slightly worse than the assembly original), due to
>> (implicit) address arithmetic all being done with 64-bit operands.
> 
> Ah - At some point, I had a comment here explaining the lack of an
> __access_ok() check, but it appears to have got lost in a rebase.  I
> will try to reinstate it.

Just to avoid misunderstandings - if anything more than a comment gets
added, it ought to be __compat_access_ok().

> The wrapping behaviour around the 4GB => 0 boundary is undefined, and
> differs between Intel and AMD (as we discovered with XSA-186).  If we
> pass the exception back to the guest, we would need to swap #PF for
> #SS (for Intel), or wrap around properly (for AMD).
> 
> Would it be ok just to comment this point and leave it as is?

Sure - that's what I did ask for.
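
(Purely as a suggestion for the comment to be reinstated above the
__copy_to_user() call - the wording is illustrative, not Andrew's original
text:)

        /*
         * No boundary check: with the (implicit) 64-bit address arithmetic a
         * compat frame straddling the 4GB => 0 boundary doesn't wrap, and real
         * hardware behaviour there is vendor-specific anyway (#SS on Intel,
         * wrapping on AMD - cf. XSA-186), so this is deliberately left as is.
         */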

>>> +    /*
>>> +     * Clobber the injection information now it has been completed.  Buggy
>>> +     * attempts to inject the same event twice will hit the null_trap_bounce()
>>> +     * check above.
>>> +     */
>>> +    *tb = (struct trap_bounce){};
>> Ah, so that prevents tb becoming a pointer to const. I wonder
>> though whether, on a rather hot path, we really want to zap the
>> entire structure here. While I can see the value in satisfying
>> null_trap_bounce(), how about zapping just ->eip / ->cs on the
>> split paths above?
> 
> This ends up being two 8-byte writes of zeroes into a cache-hot line; it
> isn't by any means a slow part of this path, whereas the 16-bit write to
> clobber just %cs would be.
> 
> Irrespective of that, the following patch depends on this clobbering of
> ->flags.

Right, I did see this. Keep it as it is then.
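
(For readers following along, the alternative floated above would have been
roughly the following on each of the two split paths, i.e. narrow
partial-width stores rather than the whole-structure clear that the next
patch relies on:)

        /* Hypothetical alternative, not taken: */
        tb->eip = 0;
        tb->cs  = 0;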

>> Overall, did you compare generated code with the current
>> assembly implementation? That one surely would have had some
>> room for improvement, so the result here at least shouldn't be
>> worse than that.
> 
> The final C version (including failsafe, and some error handling the asm
> functions didn't have) is a bit less than twice the absolute size of the
> asm functions.

Size alone is no meaningful criterion anyway.

> I haven't done any performance analysis, but I trust the compiler to
> make better code overall (there are definitely pipeline stalls in the
> asm versions), and wouldn't be surprised if it is faster to execute.

That's the hope, but if it were double the number of instructions on
the common path (we can surely leave error paths out of
consideration), I wouldn't be convinced these are faster to execute
than the asm code with a couple of stalls (some of which we surely
could take care of). If this weren't on such highly critical paths, I
wouldn't be worried, but this code really needs to be fast (besides
the very reasonable desire of making it more maintainable). For
example, at this point I'm not (yet) convinced making this a single
function serving both 64- and 32-bit guests is really the right choice,
even if it costs only one or two extra conditionals.

Jan
