Re: [Xen-devel] [PATCH 28/45] xen: arm: arm64 trap handling.
On Thu, 2013-02-14 at 14:39 +0000, Ian Campbell wrote:
> > > +hyp_sync:
> > > +        entry   hyp=1
> > > +        msr     daifclr, #2
> > > +        adr     lr, return_to_hypervisor
> > > +        mov     x0, sp
> > > +        b       do_trap_hypervisor
> >
> > This pattern (call fnX with arg0 == sp and lr == fnY) is repeated
> > quite a few times.  Could we have another tidying macro for it?
>
> I'm half considering doing away with the preload lr+b and just using bl
> instead and putting the tail stuff in a macro like the entry stuff.
>
> But if we do stick with this way then sure.

I didn't go the exit macro route (still undecided about that) but I did
decide to drop the lr+b stuff. I have a feeling that the branch
predictor will do better with the simple bl version, which ISTR reading
incorporates a hint to the predictor that this is a subroutine call (but
I can't find that reference right now). TBH even that smells a bit too
much of premature optimisation given there's no actual hardware yet, but
the bl version is the straightforward/obvious thing to use so let's go
with that.
I did experiment with a macro:

@@ -96,6 +96,14 @@ lr      .req    x30             // link register
         .endm
 
 /*
+ * Call a function, passing sp as the first and only argument
+ */
+        .macro  call_with_sp, fn
+        mov     x0, sp
+        bl      \fn
+        .endm
+
+/*
  * Bad Abort numbers
  *-----------------
  */
@@ -130,15 +138,14 @@ hyp_error_invalid:
 hyp_sync:
         entry   hyp=1
         msr     daifclr, #2
-        adr     lr, return_to_hypervisor
-        mov     x0, sp
-        b       do_trap_hypervisor
+        call_with_sp do_trap_hypervisor
+        b       return_to_hypervisor
 
 hyp_irq:

But TBH I think that looks worse than the open coded:

 hyp_irq:
         entry   hyp=1
-        adr     lr, return_to_hypervisor
         mov     x0, sp
-        b       do_trap_irq
+        bl      do_trap_irq
+        b       return_to_hypervisor

> > > +ENTRY(return_to_new_vcpu)
> > > +ENTRY(return_to_guest)
> > > +ENTRY(return_to_hypervisor)
> > > +        ldp     x21, x22, [sp, #UREGS_PC]       // load ELR, SPSR
> > > +
> > > +        pop     x0, x1
> > > +        pop     x2, x3
> > > +        pop     x4, x5
> > > +        pop     x6, x7
> > > +        pop     x8, x9
> > > +
> > > +        /* XXX handle return to guest tasks, soft irqs etc */
> > > +
> > > +        msr     elr_el2, x21                    // set up the return data
> > > +        msr     spsr_el2, x22
> >
> > Done here because it's roughly half-way between the load and the
> > overwrite below?  Should we be using x28/x29 for this to give ourselves
> > even more pipeline space?

x21/22 are used as scratch registers in this way in the rest of the file
too and I'd rather be consistent about that. As it stands there are 5
instructions either side of this usage, I think that will do, at least
in the absence of any useful ability to measure things...

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel