[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC 00/16] Old GIC (gic-vgic) optimizations for GICV2
Hi, On 29/11/2018 07:40, Andrii Anisov wrote: Hello, Again, I sent this cover letter only to myself. So, here it is, hope it does not break the thread. Sorry for the mess. It is seems to be part of the thread. Thank you! From: Andrii Anisov <andrii.anisov@xxxxxxxxx> Sent: Wednesday, November 28, 2018 11:31 PM Cc: Andrii Anisov Subject: [RFC 00/16] Old GIC (gic-vgic) optimizations for GICV2From: Andrii Anisov <andrii_anisov@xxxxxxxx> This patch series is an attempt to reduce IRQ latency with the old GIC implementation (gic-vgic). These patches originally based on XEN 4.10 release. The motivation was to improve benchmark results of a system given to a customer for evaluation. I will repeat what I said on patch #15 and expanding it a bit. This series is touching the old vGIC we plan to decommissioned soon. Amongst may issues with the old vGIC the majors one are the locking is fragile and edge/level interrupts are not handled correctly. Furthermore, a lot of those patches seem to borrow the ideas from the new vGIC. So I think it would be more beneficial to focus on the new vGIC that will be the only vGIC supported soon. This patch series is tailored for GICv2 on RCAR H3. > Several of the most controversial patches (i.e. LRs shadowing) were not shared to the customer, and here are for comments and discussion. We need benchmark and detailed explanation to understand where the latency comes from and how every patches reduce the latency. Ideally I would like to see the performance improvement done patch by patch so we can decide which one are worth it. I hope several patches from here could be upstreamed. Some as is, others with modifications. There are several simple ideas behind these changes: - reduce an excessive code (condition checks) This makes sense. - drop an excessive peripheral register accesses This needs discussion. I will have a look at the patches. - if not reduce, then move an excessive code out of spinlocks - if not drop, then move an excessive register accesses out of spinlocks In general spinlocks should not be contended otherwise you have other issues. Moving code of spinlock is only worth it if that code may return in most of the cases. With porting existing patches, I've got another idea that accessing PER_CPU variables like `current` and `lr_mask` is not really cheap. How come? The resulting code for lr_mask is pretty trivial. C function: uint64_t foo(void) { return this_cpu(lr_mask); } Assembly code: 44: 90000000 adrp x0, 0 <maintenance_interrupt> 48: 91000000 add x0, x0, #0x0 4c: d53cd041 mrs x1, tpidr_el2 50: f8616800 ldr x0, [x0, x1] 54: d65f03c0 ret So this is one memory load. Leaving aside nested virt, TPDIR_EL2 should be fast to access. So it should produce faster code keeping `lr_mask` solely in `struct vcpu` and pass `current` as a function parameter instead of calculation it each time in called functions. So you are going to save 3 instructions... But that's just a drop in the bucket compare to the rest of code. Actually, I would be surprised if you actually notice it on your benchmark. Overall, I think that there are other places to look at wherethe benefits will be much bigger. We can discuss about smaller one if the performance are still low. Cheers, Andrii Anisov (15): hack: drop GIC v3 support vgic: move pause_flags check out of vgic spinlock vgic: move irq_to_pending out of lock gic-vgic: Drop an excessive clear_lrs gic: drop interrupts enabling on interrupts processing gic-vgic:vgic: do not keep disabled IRQs in any of queues gic: separate ppi processing gic-vgic:vgic: avoid excessive conversions gic:vgic:gic-vgic: introduce non-atomic bitops irq: skip action avalability check for guest's IRQ gic-v2: Write HCR only on change gic-vgic: skip irqs locking hack: arm/domain: simplify context restore from idle vcpu hack: move gicv2 LRs reads and writes out of spinlocks gic: vgic: align frequently accessed data by cache line size Julien Grall (1): xen/arm: Re-enable interrupt later in the trap path xen/arch/arm/arm64/entry.S | 11 ++--- xen/arch/arm/configs/arm64_defconfig | 1 + xen/arch/arm/domain.c | 25 ++++++---- xen/arch/arm/gic-v2.c | 63 ++++++++++++++++-------- xen/arch/arm/gic-v3-lpi.c | 2 + xen/arch/arm/gic-v3.c | 2 + xen/arch/arm/gic-vgic.c | 84 +++++++++++++++++++------------- xen/arch/arm/gic.c | 32 +++++++++++-- xen/arch/arm/irq.c | 46 +++++++++++++++--- xen/arch/arm/traps.c | 6 +++ xen/arch/arm/vgic-v3-its.c | 3 +- xen/arch/arm/vgic.c | 93 ++++++++++++++++++++++++++++++------ xen/arch/arm/vgic/vgic.c | 2 + xen/include/asm-arm/config.h | 2 +- xen/include/asm-arm/gic.h | 10 ++-- xen/include/asm-arm/irq.h | 3 ++ xen/include/asm-arm/vgic.h | 24 ++++++---- xen/include/xen/sched.h | 1 + 18 files changed, 299 insertions(+), 111 deletions(-) -- Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |