[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 1/2] x86/entry: Introduce POP_GPRS
On 13.03.2024 15:26, Andrew Cooper wrote: > The macro named RESTORE_ALL has several problems. It adjusts the stack > pointer despite this not being clear to the caller. It also goes against > recommendations in the optimisation guides because of trying to do too many > things at once. (i.e. there's a reason why compilers don't emit code looking > like this.) Not anymore; I'm sure they used to over a certain period of time, which is why 4d246723a85a ("x86: use MOV instead of PUSH/POP when saving/restoring register state") was created in the first place (and which you now say was a mistake, or at least has become a mistake in the over 10 years since then). > Introduce a new POP_GPRS macro which only POPs GPRs. Use it for the HVM paths > which are already using POPs. > > Also use it for restore_all_{xen,guest}(). This saves most of a cacheline > worth of code from two fastpaths: > > add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-99 (-99) > Function old new delta > restore_all_guest 378 330 -48 > restore_all_xen 165 114 -51 > > but it also avoids having an explicit modification to the stack pointer > between %rsp-relative accesses, which avoids stalls in the stack-engine > optimisations in some microarchitectures. Is there such a rule? All I was able to find (and even that only with quite a bit of effort, because the section it's in wouldn't have had me think of a stack pointer rule being there) is "Assembly/Compiler Coding Rule 22. (M impact, M generality) Avoid putting explicit references to ESP in a sequence of stack operations (POP, PUSH, CALL, RET)." I actually wonder whether %rsp really is special in the way you indicate - other registers, when used for memory accesses and being updated ought to have a similar stall issue? > --- a/xen/arch/x86/hvm/svm/entry.S > +++ b/xen/arch/x86/hvm/svm/entry.S > @@ -74,22 +74,9 @@ __UNLIKELY_END(nsvm_hap) > ALTERNATIVE "", svm_vmentry_spec_ctrl, X86_FEATURE_SC_MSR_HVM > ALTERNATIVE "", DO_SPEC_CTRL_DIV, X86_FEATURE_SC_DIV > > - pop %r15 > - pop %r14 > - pop %r13 > - pop %r12 > - pop %rbp > mov VCPU_svm_vmcb_pa(%rbx),%rax > - pop %rbx > - pop %r11 > - pop %r10 > - pop %r9 > - pop %r8 > - pop %rcx /* Skip %rax: restored by VMRUN. */ > - pop %rcx > - pop %rdx > - pop %rsi > - pop %rdi > + > + POP_GPRS rax=%rcx /* Skip %rax. It's restored by VMRUN. */ In light of you having asked my to try and decouple ABI and internal stack frame layout, I'm wary of encoding a dependency on the ordering of registers in the frame at a use site like this one. Imo the argument ought to merely indicate "skip %rax", with the macro taking care of how that skipping is actually carried out. > @@ -696,20 +697,19 @@ UNLIKELY_END(exit_cr3) > /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ > SPEC_CTRL_EXIT_TO_XEN /* Req: %r12=ist_exit %r14=end %rsp=regs, > Clob: abcd */ > > - RESTORE_ALL adj=8 > + POP_GPRS > > /* > * When the CPU pushed this exception frame, it zero-extended eflags. > * For an IST exit, SPEC_CTRL_EXIT_TO_XEN stashed shadow copies of > * spec_ctrl_flags and ver_sel above eflags, as we can't use any > GPRs, > * and we're at a random place on the stack, not in a CPUFINFO block. > - * > - * Account for ev/ec having already been popped off the stack. > */ > SPEC_CTRL_COND_VERW \ > - scf=STK_REL(EFRAME_shadow_scf, EFRAME_rip), \ > - sel=STK_REL(EFRAME_shadow_sel, EFRAME_rip) > + scf=STK_REL(EFRAME_shadow_scf, EFRAME_error_code), \ > + sel=STK_REL(EFRAME_shadow_sel, EFRAME_error_code) > > + add $8, %rsp > iretq How is this ADD different from the RESTORE_ALL one, in particular in light of the ORM rule quoted above (which surely extends to IRET as well)? It ought to be possible to avoid, by having POP_GPRS (optionally) move the %r15 value into the error code slot first thing (i.e. before %rsp starts being updated), and then having "pop %r15" last. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |