Xen project Mailing List

Re: [PATCH] x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping

To: "Roger Pau Monne" <roger.pau@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: "Alejandro Vallejo" <alejandro.vallejo@xxxxxxxxx>

Date: Mon, 21 Oct 2024 10:55:54 +0100

Cc: "Jan Beulich" <jbeulich@xxxxxxxx>, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx>, "Willi Junga" <xenproject@xxxxxx>, "David Woodhouse" <dwmw@xxxxxxxxxxxx>

Delivery-date: Mon, 21 Oct 2024 09:56:03 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri Oct 18, 2024 at 9:08 AM BST, Roger Pau Monne wrote:
> When using AMD-VI interrupt remapping the vector field in the IO-APIC RTE is
> repurposed to contain part of the offset into the remapping table. Previous
> to

For my own education: is that really a repurpose? Isn't the RTE vector field
itself simply remapped, just like any MSI?

> 2ca9fbd739b8 Xen had logic so that the offset into the interrupt remapping
> table would match the vector. Such logic was mandatory for end of interrupt to
> work, since the vector field (even when not containing a vector) is used by the
> IO-APIC to find for which pin the EOI must be performed.
>
> Introduce a table to store the EOI handlers when using interrupt remapping, so

The table seems to store the pre-IR vectors. Is this a matter of nomenclature
or a leftover from a previous implementation?

> that the IO-APIC driver can translate pins into EOI handlers without having to
> read the IO-APIC RTE entry. Note that to simplify the logic such table is used
> unconditionally when interrupt remapping is enabled, even if strictly it would
> only be required for AMD-Vi.

Given that last statement, it might be worth mentioning that the table is
bypassed when IR is off as well.

>
> Reported-by: Willi Junga <xenproject@xxxxxx>
> Suggested-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> Fixes: 2ca9fbd739b8 ('AMD IOMMU: allocate IRTE entries instead of using a static mapping')
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
>  xen/arch/x86/io_apic.c | 47 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
>
> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
> index e40d2f7dbd75..8856eb29d275 100644
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -71,6 +71,22 @@ static int apic_pin_2_gsi_irq(int apic, int pin);
>
>  static vmask_t *__read_mostly vector_map[MAX_IO_APICS];
>
> +/*
> + * Store the EOI handle when using interrupt remapping.

That explains the when, but not the what. This is "a LUT from IOAPIC pin to its
vector field", as far as I can see. The order in which it's meant to be indexed
would be a good addition here as well. I had to scroll down to its uses to
work out what this was.

> + *
> + * If using AMD-Vi interrupt remapping the IO-APIC redirection entry remapped
> + * format repurposes the vector field to store the offset into the Interrupt
> + * Remap table. This causes directed EOI to longer work, as the CPU vector no
> + * longer matches the contents of the RTE vector field. Add a translation
> + * table so that directed EOI uses the value in the RTE vector field when

nit: Might be worth mentioning that it's merely a cache and is populated
on demand from authoritative state in the IOAPIC.

> + * interrupt remapping is enabled.
> + *
> + * Note Intel VT-d Xen code still stores the CPU vector in the RTE vector field
> + * when using the remapped format, but use the translation table uniformly in
> + * order to avoid extra logic to differentiate between VT-d and AMD-Vi.
> + */
> +static unsigned int **apic_pin_eoi;

This should be signed to allow IRQ_VECTOR_UNASSIGNED, I think. Possibly
int16_t, matching arch_irq_desc->vector. This raises doubts about the existing
vectors here typed as unsigned too.

On naming, I'd prefer ioapic over apic, but that's an existing sin in the whole
file. Otherwise, while it's used for EOI ATM, isn't it really just an
ioapic_pin_vector?

> +
>  static void share_vector_maps(unsigned int src, unsigned int dst)
>  {
>      unsigned int pin;
> @@ -273,6 +289,13 @@ void __ioapic_write_entry(
>      {
>          __io_apic_write(apic, 0x11 + 2 * pin, eu.w2);
>          __io_apic_write(apic, 0x10 + 2 * pin, eu.w1);
> +        /*
> +         * Might be called before apic_pin_eoi is allocated. Entry will be
> +         * updated once the array is allocated and there's an EOI or write
> +         * against the pin.
> +         */
> +        if ( apic_pin_eoi )
> +            apic_pin_eoi[apic][pin] = e.vector;
>      }
>      else
>          iommu_update_ire_from_apic(apic, pin, e.raw);
> @@ -298,9 +321,17 @@ static void __io_apic_eoi(unsigned int apic, unsigned int vector, unsigned int p

Out of curiosity, how could this vector come to be unassigned as a parameter?
The existing code seems to assume that may happen.

>      /* Prefer the use of the EOI register if available */
>      if ( ioapic_has_eoi_reg(apic) )
>      {
> +        if ( apic_pin_eoi )
> +            vector = apic_pin_eoi[apic][pin];
> +
>          /* If vector is unknown, read it from the IO-APIC */
>          if ( vector == IRQ_VECTOR_UNASSIGNED )
> +        {
>              vector = __ioapic_read_entry(apic, pin, true).vector;
> +            if ( apic_pin_eoi )
> +                /* Update cached value so further EOI don't need to fetch it. */
> +                apic_pin_eoi[apic][pin] = vector;
> +        }
>
>          *(IO_APIC_BASE(apic)+16) = vector;
>      }
> @@ -1022,7 +1053,23 @@ static void __init setup_IO_APIC_irqs(void)
>
>      apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
>
> +    if ( iommu_intremap )
> +    {
> +        apic_pin_eoi = xzalloc_array(typeof(*apic_pin_eoi), nr_ioapics);
> +        BUG_ON(!apic_pin_eoi);
> +    }
> +
>      for (apic = 0; apic < nr_ioapics; apic++) {

This was here before, but it might be a good time to reformat this line and the
loop below.

> +        if ( iommu_intremap )
> +        {
> +            apic_pin_eoi[apic] = xmalloc_array(typeof(**apic_pin_eoi),
> +                                               nr_ioapic_entries[apic]);
> +            BUG_ON(!apic_pin_eoi[apic]);
> +
> +            for ( pin = 0; pin < nr_ioapic_entries[apic]; pin++ )
> +                apic_pin_eoi[apic][pin] = IRQ_VECTOR_UNASSIGNED;
> +        }
> +

Rather than doing this, we could have a single allocation for everything, and
store the different bases accounting for the number of pins of each IOAPIC.

    apic_pin_eoi[0] = base;
    for_each_ioapic
        apic_pin_eoi[i+1] = apic_pin_eoi[i] + nr_ioapic_entries[i];

>          for (pin = 0; pin < nr_ioapic_entries[apic]; pin++) {
>              /*
>               * add it to the IO-APIC irq-routing table:

Cheers,
Alejandro

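To make the single-allocation suggestion above a bit more concrete, here is a
rough standalone sketch. It is not Xen code: alloc_pin_vectors(),
free_pin_vectors() and VECTOR_UNASSIGNED are hypothetical names made up for
illustration, plain malloc()/calloc() stand in for the xmalloc family, and the
-1 sentinel and int16_t element type are assumptions following the typing
comment earlier in this reply.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed sentinel for this sketch; stands in for IRQ_VECTOR_UNASSIGNED. */
#define VECTOR_UNASSIGNED (-1)

/*
 * Hypothetical helper: build a pin -> vector cache for every IO-APIC out of
 * one contiguous pool. The returned array has nr_ioapics + 1 slots: slot i
 * points at the first pin of IO-APIC i, and the final slot points one past
 * the end of the pool. Returns NULL on allocation failure.
 */
int16_t **alloc_pin_vectors(unsigned int nr_ioapics,
                            const unsigned int *nr_entries)
{
    size_t total = 0, n;
    unsigned int i;
    int16_t **bases;
    int16_t *pool;

    assert(nr_ioapics > 0);

    /* Total number of pins across all IO-APICs. */
    for ( i = 0; i < nr_ioapics; i++ )
        total += nr_entries[i];
    assert(total > 0);

    bases = calloc((size_t)nr_ioapics + 1, sizeof(*bases));
    pool = malloc(total * sizeof(*pool));
    if ( !bases || !pool )
    {
        free(bases);
        free(pool);
        return NULL;
    }

    /* Nothing is cached yet: mark every pin as unassigned. */
    for ( n = 0; n < total; n++ )
        pool[n] = VECTOR_UNASSIGNED;

    /* Each base follows the previous one by that IO-APIC's pin count. */
    bases[0] = pool;
    for ( i = 0; i < nr_ioapics; i++ )
        bases[i + 1] = bases[i] + nr_entries[i];

    return bases;
}

/* Release both the pool (owned by slot 0) and the array of bases. */
void free_pin_vectors(int16_t **bases)
{
    if ( bases )
    {
        free(bases[0]);
        free(bases);
    }
}
```

Indexing stays table[apic][pin] exactly as in the patch, so callers would not
need to change. The trade-off is that the whole table lives or dies with one
pool, which seems fine here given it is set up once at boot.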