Re: [Xen-devel] [PATCH, RFC] x86/HVM: batch vCPU wakeups
>>> On 09.09.14 at 23:29, <tim@xxxxxxx> wrote:
> At 09:33 +0100 on 09 Sep (1410251617), Jan Beulich wrote:
>> --- a/xen/arch/x86/hvm/vlapic.c
>> +++ b/xen/arch/x86/hvm/vlapic.c
>> @@ -447,12 +447,30 @@ void vlapic_ipi(
>>
>>     default: {
>>         struct vcpu *v;
>> -        for_each_vcpu ( vlapic_domain(vlapic), v )
>> +        const struct domain *d = vlapic_domain(vlapic);
>> +        bool_t batch = d->max_vcpus > 2 &&
>> +                       (short_hand
>> +                        ? short_hand != APIC_DEST_SELF
>> +                        : vlapic_x2apic_mode(vlapic)
>> +                          ? dest_mode ? hweight16(dest) > 1
>> +                                      : dest == 0xffffffff
>> +                          : dest_mode
>> +                            ? hweight8(dest &
>> +                                       GET_xAPIC_DEST_FIELD(
>> +                                           vlapic_get_reg(vlapic,
>> +                                                          APIC_DFR))) > 1
>> +                            : dest == 0xff);
>
> Yoiks. Could you extract some of this into a helper function, say
> is_unicast_dest(), just for readability?

Sure.

>>  void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
>>  {
>> -    if ( !test_and_set_bit(nr, &softirq_pending(cpu))
>> -         && (cpu != smp_processor_id()) )
>> +    unsigned int this_cpu = smp_processor_id();
>
> The name clashes with the this_cpu() macro (obviously OK for compiling
> but harder to read) and can be dropped since...
>
>> +
>> +    if ( test_and_set_bit(nr, &softirq_pending(cpu))
>> +         || (cpu == this_cpu) )
>> +        return;
>> +
>> +    if ( !per_cpu(batching, this_cpu) || in_irq() )
>
> this_cpu(batching) is the preferred way of addressing this on Xen
> anyway - though I guess maybe it's _slightly_ less efficient to
> recalculate get_cpu_info() here than to indirect through the
> per_cpu_offsets[]? I haven't looked at the asm.

Right - multiple uses of this_cpu() in the same function are indeed
less efficient than latching smp_processor_id() into a local variable.

>>         smp_send_event_check_cpu(cpu);
>> +    else
>> +        set_bit(nr, &per_cpu(batch_mask, this_cpu));
>> +}
>> +
>> +void cpu_raise_softirq_batch_begin(void)
>> +{
>> +    ++per_cpu(batching, smp_processor_id());
>
> This one should definitely be using this_cpu().

Absolutely - unintended copy-and-paste effect.
>> +}
>> +
>> +void cpu_raise_softirq_batch_finish(void)
>> +{
>> +    unsigned int cpu, this_cpu = smp_processor_id();
>> +    cpumask_t *mask = &per_cpu(batch_mask, this_cpu);
>
> Again, this_cpu()?

No, as again the latched local variable yields better code.

>> +    ASSERT(per_cpu(batching, this_cpu));
>> +    for_each_cpu ( cpu, mask )
>> +        if ( !softirq_pending(cpu) )
>> +            cpumask_clear_cpu(cpu, mask);
>> +    smp_send_event_check_mask(mask);
>> +    cpumask_clear(mask);
>> +    --per_cpu(batching, this_cpu);
>> }
>
> One other thing that occurs to me is that we might do the batching in
> the caller (by gathering a mask of pcpus in the vlapic code) but that
> sounds messy. And it's not like softirq is so particularly
> latency-sensitive that we'd want to avoid this much overhead in the
> common case. So on second thoughts this way is better. :)

In fact I had it that way first, and it looked ugly (apart from making
it more cumbersome for potential other use sites to switch to this
batching model).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel