
Re: [Xen-devel] [RFC PATCH v2 09/26] ARM: GICv3: introduce separate pending_irq structs for LPIs



Hi,

On 12/01/17 19:15, Stefano Stabellini wrote:
> On Thu, 12 Jan 2017, Andre Przywara wrote:
>> Hi Stefano,
>>
>> On 05/01/17 21:36, Stefano Stabellini wrote:
>>> On Thu, 22 Dec 2016, Andre Przywara wrote:
>>>> For the same reason that allocating a struct irq_desc for each
>>>> possible LPI is not an option, having a struct pending_irq for each LPI
>>>> is also not feasible. However we actually only need those when an
>>>> interrupt is on a vCPU (or is about to be injected).
>>>> Maintain a list of those structs that we can use for the lifecycle of
>>>> a guest LPI. We allocate new entries if necessary, but reuse
>>>> pre-owned entries whenever possible.
>>>> I added some locking around this list here, but my gut feeling is
>>>> that we don't need it, because this is a per-VCPU structure anyway.
>>>> If someone could confirm this, I'd be grateful.
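
To illustrate the allocate-or-reuse scheme described in the quoted patch
description, here is a minimal standalone sketch. All identifiers
(struct lpi_pending, get_lpi_pending, ...) are made up for illustration
and are not the names the patch itself uses; error handling and locking
are omitted.

#include <stdlib.h>

struct lpi_pending {
    struct lpi_pending *next;   /* linkage in the per-vCPU list */
    unsigned int virq;          /* virtual LPI; 0 marks the entry as unused,
                                   since valid LPI numbers start at 8192 */
    unsigned int status;        /* pending/inserted state flags */
};

struct vcpu_lpi_list {
    struct lpi_pending *head;
};

/*
 * Return the entry tracking @virq; failing that, reuse a free entry;
 * failing that, allocate a new one and link it into the list.
 */
static struct lpi_pending *get_lpi_pending(struct vcpu_lpi_list *list,
                                           unsigned int virq)
{
    struct lpi_pending *p, *free_entry = NULL;

    for ( p = list->head; p; p = p->next )
    {
        if ( p->virq == virq )
            return p;                   /* already tracked on this list */
        if ( p->virq == 0 && !free_entry )
            free_entry = p;             /* remember a reusable entry */
    }

    if ( free_entry )
    {
        free_entry->virq = virq;
        return free_entry;
    }

    p = calloc(1, sizeof(*p));          /* grow the pool on demand */
    if ( !p )
        return NULL;
    p->virq = virq;
    p->next = list->head;
    list->head = p;
    return p;
}

In the hypervisor the allocation would presumably use Xen's allocator
rather than calloc(), and entries would be marked free again once the
LPI has been handled by the guest, so the pool only grows up to the
maximum number of LPIs in flight at the same time.
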
>>>
>>> I don't think the list should be per-VCPU, because the underlying LPIs
>>> are global. 
>>
>> But _pending_ IRQs (regardless of their type) are definitely a per-VCPU
>> thing. I consider struct pending_irq something like an LR precursor or a
>> software representation of it.
>> Also the pending bitmap is a per-redistributor table.
>>
>> The problem is that struct pending_irq is pretty big, 56 bytes on arm64
>> if I did the math correctly. So the structs for the 32 private
>> interrupts per VCPU alone account for 1792 bytes. Actually I tried to add
>> another list head to it to be able to reuse the structure, but that
>> broke the build because struct vcpu got bigger than 4K.
>>
>> So the main reason I went for a dynamic pending_irq approach was that
>> the potentially big number of LPIs could lead to a lot of memory to be
>> allocated by Xen. And the ITS architecture does not provide any
>> guest-supplied memory table that could be used to store this information.
>> Also ...
> 
> Dynamic pending_irqs are good, but why one list per vcpu, rather than
> one list per domain, given that in our current design they can only be
> targeting one vcpu at any given time?

I believe the spec demands that an LPI can only be pending at one
redistributor at any given point in time, but I will ask Marc tomorrow.
I believe it isn't spelled out explicitly in the spec, but can be deduced.
But as the pending table is per-redistributor, I was assuming that
modelling this per VCPU is the right thing.
I will think about it; your rationale of the LPIs being global makes
some sense ...

> 
>>> Similarly, the struct pending_irq array is per-domain, only
>>> the first 32 (PPIs) are per vcpu. Besides, it shouldn't be a list :-)
>>
>> In reality the number of interrupts which are on a VCPU at any given
>> point in time is expected to be very low; in some previous experiments I
>> never found more than four. This is even more true for LPIs, which, due
>> to the lack of an active state, fall out of the system as soon as the
>> VCPU reads the ICC_IAR register.
>> So the list will be very short, usually, which made it very appealing to
>> just go with a list, especially for an RFC.
>>
>> Also having potentially thousands of those structures lying around
>> mostly unused doesn't sound very clever to me. Actually I was thinking
>> about using the same approach for the other interrupts (SPI/PPI/SGI) as
>> well, but I guess that doesn't give us much, apart from breaking
>> everything ;-)
>>
>> But that being said:
>> If we can prove that the number of LPIs actually is limited (because it
>> is bounded by the number of devices, which Dom0 controls), I am willing
>> to investigate if we can switch over to using one struct pending_irq per
>> LPI.
>> Or do you want me to just use a more advanced data structure for that?
> 
> I think we misunderstood each other. I am definitely not suggesting to
> have one struct pending_irq per LPI. Your idea of dynamically allocating
> them is good.

Sorry for the misunderstanding - I am relieved that I don't have to
change that ;-)

> The things I am concerned about are:
> 
> 1) the choice of a list as a data structure, instead of a hashtable or
>    a tree (see alpine.DEB.2.10.1610271725280.9978@sstabellini-ThinkPad-X260)
> 2) the choice of having one data structure (list or whatever) per vcpu,
>    rather than one per domain
> 
> In both cases, it's not a matter of opinion but a matter of numbers and
> performance. I would like to see some numbers to prove our choices right
> or wrong.

Got it. I will see what I can do. At the moment my ability to gather
numbers and performance data is severely limited, because I am running
everything on the model.
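
For reference on points 1) and 2) above: a per-domain structure could
for instance be a lookup table keyed by the LPI number, so that the ITS
emulation can find the struct pending_irq for a given LPI without
walking per-vCPU lists. The following is only a rough standalone sketch
with made-up names (lpi_hash_entry, domain_lpi_lookup, ...), not a
concrete proposal; a radix tree or similar could serve the same purpose.

#include <stdlib.h>

#define LPI_HASH_BUCKETS 64             /* arbitrary; would need tuning */

struct lpi_pending;                     /* as in the earlier sketch */

struct lpi_hash_entry {
    struct lpi_hash_entry *next;        /* chaining within a bucket */
    unsigned int lpi;                   /* key: the (virtual) LPI number */
    struct lpi_pending *pirq;           /* the dynamically allocated struct */
};

struct domain_lpi_table {
    struct lpi_hash_entry *bucket[LPI_HASH_BUCKETS];
};

/* Look up the pending_irq tracking @lpi, or NULL if none is allocated. */
static struct lpi_pending *domain_lpi_lookup(struct domain_lpi_table *t,
                                             unsigned int lpi)
{
    struct lpi_hash_entry *e;

    for ( e = t->bucket[lpi % LPI_HASH_BUCKETS]; e; e = e->next )
        if ( e->lpi == lpi )
            return e->pirq;

    return NULL;
}

Insertion and removal would need a per-domain lock, since several vCPUs
and the virtual ITS command handling could race on the table.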

Cheers,
Andre.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

