[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] xen/arm: Trap and yield on WFE instructions
On Wed, 2014-07-16 at 15:47 +0530, Anup Patel wrote: > On 16 July 2014 15:39, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote: > > On Wed, 2014-07-16 at 15:37 +0530, Anup Patel wrote: > >> On 16 July 2014 15:06, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote: > >> > On Wed, 2014-07-16 at 14:34 +0530, Anup Patel wrote: > >> >> If we have a Guest/DomU with two or more of its VCPUs running > >> >> on same host CPU then it can quite likely happen that these > >> >> VCPUs fight for same spinlock and one of them will waste CPU > >> >> cycles in WFE instruction. This patch makes WFE instruction > >> >> trap for VCPU and forces VCPU to yield its timeslice. > >> >> > >> >> The KVM ARM/ARM64 also does similar thing for handling WFE > >> >> instructions. (Please refer, > >> >> https://lists.cs.columbia.edu/pipermail/kvmarm/2013-November/006259.html) > >> >> > >> >> In general, this patch is more of an optimization for an > >> >> oversubscribed system having number of VCPUs more than > >> >> underlying host CPUs. > >> > > >> > Sounds good. Do you have any numbers for the improvement? > >> > >> I did not try any benchmarks myself but hackbench shows good > >> improvement for KVM hence it is a worthy optimization for Xen too. > >> > >> I found this change missing for Xen hence this patch. > >> > >> > > >> >> Signed-off-by: Anup Patel <anup.patel@xxxxxxxxxx> > >> >> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@xxxxxxxxxx> > >> >> Tested-by: Pranavkumar Sawargaonkar <pranavkumar@xxxxxxxxxx> > >> >> --- > >> >> xen/arch/arm/traps.c | 27 ++++++++++++++++----------- > >> >> 1 file changed, 16 insertions(+), 11 deletions(-) > >> >> > >> >> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > >> >> index 686d8b7..bc9f67a 100644 > >> >> --- a/xen/arch/arm/traps.c > >> >> +++ b/xen/arch/arm/traps.c > >> >> @@ -90,7 +90,7 @@ void __cpuinit init_traps(void) > >> >> > >> >> /* Setup hypervisor traps */ > >> >> WRITE_SYSREG(HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_FMO|HCR_VM| > >> >> - HCR_TWI|HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP, HCR_EL2); > >> >> + HCR_TWE|HCR_TWI|HCR_TSC|HCR_TAC|HCR_SWIO|HCR_TIDCP, > >> >> HCR_EL2); > >> >> isb(); > >> >> } > >> >> > >> >> @@ -1803,16 +1803,21 @@ asmlinkage void do_trap_hypervisor(struct > >> >> cpu_user_regs *regs) > >> >> advance_pc(regs, hsr); > >> >> return; > >> >> } > >> >> - /* at the moment we only trap WFI */ > >> >> - vcpu_block(); > >> >> - /* The ARM spec declares that even if local irqs are masked in > >> >> - * the CPSR register, an irq should wake up a cpu from WFI > >> >> anyway. > >> >> - * For this reason we need to check for irqs that need > >> >> delivery, > >> >> - * ignoring the CPSR register, *after* calling SCHEDOP_block to > >> >> - * avoid races with vgic_vcpu_inject_irq. > >> >> - */ > >> >> - if ( local_events_need_delivery_nomask() ) > >> >> - vcpu_unblock(current); > >> >> + if ( hsr.bits & 0x1 ) { > >> > > >> > Please can you add an appropriate struct hsr_wfe_wfi to union hsr and > >> > use that instead of opencoding bit patterns in hsr.bits. > >> > >> Sure, I will update this and send it right away. > > > > Thanks. > > > >> >> + /* Yield the VCPU for WFE */ > >> >> + vcpu_force_reschedule(current); > >> > > >> > This is OK fine by me, but I'm wondering: Does the HW have any support > >> > for actually blocking until an event occurs in another vcpu (some sort > >> > of trap on sev)? I don't think so, nor am I really sure how such a h/w > >> > feature might work... > >> > >> Typically, on HW the WFE instruction forces cores to go in low power > >> mode. The core will come out of low power mode due to some interrupt > >> or SEV executed from another core. > > > > Right, but for virt in order to put a wfe-ing vcpu to sleep we would > > need to be able to get some sort of hypervisor event when another vcpu > > in the same guest executed sev. > > > > Otherwise the best we can do is to yield as you've done here, but that > > will still keep spinning if there are no other vcpus to run. > > Yes, we cannot trap SEV instruction. Plus it would be unwieldy to have to enable that for all *other* vcpus when one did a WFE. > Yield seems to be the only option here, unless we modify the guest and > replace SEV instruction with a hypercall. Perhaps one day we can add PARAVIRT_SPINLOCKS to arm. But not now ;-) Ian. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |