[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/io-apic: fix directed EOI when using AMd-Vi interrupt remapping



On Fri Oct 18, 2024 at 9:08 AM BST, Roger Pau Monne wrote:
> When using AMD-VI interrupt remapping the vector field in the IO-APIC RTE is
> repurposed to contain part of the offset into the remapping table.  Previous 
> to

For my own education. Is that really a repurpose? Isn't the RTE vector field
itself simply remapped, just like any MSI?

> 2ca9fbd739b8 Xen had logic so that the offset into the interrupt remapping
> table would match the vector.  Such logic was mandatory for end of interrupt 
> to
> work, since the vector field (even when not containing a vector) is used by 
> the
> IO-APIC to find for which pin the EOI must be performed.
>
> Introduce a table to store the EOI handlers when using interrupt remapping, so

The table seems to store the pre-IR vectors. Is this a matter of nomenclature
or leftover from a previous implementation?

> that the IO-APIC driver can translate pins into EOI handlers without having to
> read the IO-APIC RTE entry.  Note that to simplify the logic such table is 
> used
> unconditionally when interrupt remapping is enabled, even if strictly it would
> only be required for AMD-Vi.

Given that last statement it might be worth mentioning that the table is
bypassed when IR is off as well.

>
> Reported-by: Willi Junga <xenproject@xxxxxx>
> Suggested-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> Fixes: 2ca9fbd739b8 ('AMD IOMMU: allocate IRTE entries instead of using a 
> static mapping')
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
>  xen/arch/x86/io_apic.c | 47 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
>
> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
> index e40d2f7dbd75..8856eb29d275 100644
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -71,6 +71,22 @@ static int apic_pin_2_gsi_irq(int apic, int pin);
>  
>  static vmask_t *__read_mostly vector_map[MAX_IO_APICS];
>  
> +/*
> + * Store the EOI handle when using interrupt remapping.

That explains the when, but not the what. This is "a LUT from IOAPIC pin to its
vector field", as far as I can see. 

The order in which it's meant to be indexed would be a good addition here as
well. I had to scroll down to see how it was used to really see what this was.

> + *
> + * If using AMD-Vi interrupt remapping the IO-APIC redirection entry remapped
> + * format repurposes the vector field to store the offset into the Interrupt
> + * Remap table.  This causes directed EOI to longer work, as the CPU vector 
> no
> + * longer matches the contents of the RTE vector field.  Add a translation
> + * table so that directed EOI uses the value in the RTE vector field when

nit: Might be worth mentioning that it's a merely cache and is populated
on-demand from authoritative state in the IOAPIC.

> + * interrupt remapping is enabled.
> + *
> + * Note Intel VT-d Xen code still stores the CPU vector in the RTE vector 
> field
> + * when using the remapped format, but use the translation table uniformly in
> + * order to avoid extra logic to differentiate between VT-d and AMD-Vi.
> + */
> +static unsigned int **apic_pin_eoi;

This should be signed to allow IRQ_VECTOR_UNASSIGNED, I think. Possibly
int16_t, matching arch_irq_desc->vector. This raises doubts about the existing
vectors here typed as unsigned too.

On naming, I'd rather see ioapic rather than apic, but that's a an existing sin
in the whole file. Otherwise, while it's used for EOI ATM, isn't it really just
an ioapic_pin_vector?

> +
>  static void share_vector_maps(unsigned int src, unsigned int dst)
>  {
>      unsigned int pin;
> @@ -273,6 +289,13 @@ void __ioapic_write_entry(
>      {
>          __io_apic_write(apic, 0x11 + 2 * pin, eu.w2);
>          __io_apic_write(apic, 0x10 + 2 * pin, eu.w1);
> +        /*
> +         * Might be called before apic_pin_eoi is allocated.  Entry will be
> +         * updated once the array is allocated and there's an EOI or write
> +         * against the pin.
> +         */
> +        if ( apic_pin_eoi )
> +            apic_pin_eoi[apic][pin] = e.vector;
>      }
>      else
>          iommu_update_ire_from_apic(apic, pin, e.raw);
> @@ -298,9 +321,17 @@ static void __io_apic_eoi(unsigned int apic, unsigned 
> int vector, unsigned int p

Out of curiosity, how could this vector come to be unassigned as a parameter?
The existing code seems to assume that may happen.

>      /* Prefer the use of the EOI register if available */
>      if ( ioapic_has_eoi_reg(apic) )
>      {
> +        if ( apic_pin_eoi )
> +            vector = apic_pin_eoi[apic][pin];
> +
>          /* If vector is unknown, read it from the IO-APIC */
>          if ( vector == IRQ_VECTOR_UNASSIGNED )
> +        {
>              vector = __ioapic_read_entry(apic, pin, true).vector;
> +            if ( apic_pin_eoi )
> +                /* Update cached value so further EOI don't need to fetch 
> it. */
> +                apic_pin_eoi[apic][pin] = vector;
> +        }
>  
>          *(IO_APIC_BASE(apic)+16) = vector;
>      }
> @@ -1022,7 +1053,23 @@ static void __init setup_IO_APIC_irqs(void)
>  
>      apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
>  
> +    if ( iommu_intremap )
> +    {
> +        apic_pin_eoi = xzalloc_array(typeof(*apic_pin_eoi), nr_ioapics);
> +        BUG_ON(!apic_pin_eoi);
> +    }
> +
>      for (apic = 0; apic < nr_ioapics; apic++) {

Was here before, but it might be a good time to reformat this line and the loop
below.

> +        if ( iommu_intremap )
> +        {
> +            apic_pin_eoi[apic] = xmalloc_array(typeof(**apic_pin_eoi),
> +                                               nr_ioapic_entries[apic]);
> +            BUG_ON(!apic_pin_eoi[apic]);
> +
> +            for ( pin = 0; pin < nr_ioapic_entries[apic]; pin++ )
> +                apic_pin_eoi[apic][pin] = IRQ_VECTOR_UNASSIGNED;
> +        }
> +

Rather than doing this, we could have a single allocation for everything, and
store the different bases accounting for the number of pins of each IOAPIC.

    apic_pin_eoi[0] = base;
    for_each_ioapic
        apic_pin_eoi[i+1] = apic_pin_eoi[i] + nr_ioapic_entries[i];

>          for (pin = 0; pin < nr_ioapic_entries[apic]; pin++) {
>              /*
>               * add it to the IO-APIC irq-routing table:

Cheers,
Alejandro



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.