|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2] x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping
On 29.10.2024 09:26, Roger Pau Monné wrote:
> On Tue, Oct 29, 2024 at 08:40:26AM +0100, Jan Beulich wrote:
>> On 28.10.2024 18:22, Roger Pau Monné wrote:
>>> On Mon, Oct 28, 2024 at 12:33:42PM +0100, Jan Beulich wrote:
>>>> On 24.10.2024 17:48, Roger Pau Monne wrote:
>>>>> --- a/xen/arch/x86/io_apic.c
>>>>> +++ b/xen/arch/x86/io_apic.c
>>>>> @@ -71,6 +71,24 @@ static int apic_pin_2_gsi_irq(int apic, int pin);
>>>>>
>>>>> static vmask_t *__read_mostly vector_map[MAX_IO_APICS];
>>>>>
>>>>> +/*
>>>>> + * Store the EOI handle when using interrupt remapping.
>>>>> + *
>>>>> + * If using AMD-Vi interrupt remapping the IO-APIC redirection entry
>>>>> remapped
>>>>> + * format repurposes the vector field to store the offset into the
>>>>> Interrupt
>>>>> + * Remap table. This causes directed EOI to longer work, as the CPU
>>>>> vector no
>>>>> + * longer matches the contents of the RTE vector field. Add a
>>>>> translation
>>>>> + * table so that directed EOI uses the value in the RTE vector field when
>>>>> + * interrupt remapping is enabled.
>>>>> + *
>>>>> + * Note Intel VT-d Xen code still stores the CPU vector in the RTE
>>>>> vector field
>>>>> + * when using the remapped format, but use the translation table
>>>>> uniformly in
>>>>> + * order to avoid extra logic to differentiate between VT-d and AMD-Vi.
>>>>> + *
>>>>> + * The matrix is accessed as [#io-apic][#pin].
>>>>> + */
>>>>> +static uint8_t **io_apic_pin_eoi;
>>>>
>>>> Wasn't the conclusion from the v1 discussion that this needs to be a signed
>>>> type wider than 8 bits?
>>>>
>>>>> @@ -298,6 +323,9 @@ static void __io_apic_eoi(unsigned int apic, unsigned
>>>>> int vector, unsigned int p
>>>>> /* Prefer the use of the EOI register if available */
>>>>> if ( ioapic_has_eoi_reg(apic) )
>>>>> {
>>>>> + if ( io_apic_pin_eoi )
>>>>> + vector = io_apic_pin_eoi[apic][pin];
>>>>> +
>>>>> /* If vector is unknown, read it from the IO-APIC */
>>>>> if ( vector == IRQ_VECTOR_UNASSIGNED )
>>>>> vector = __ioapic_read_entry(apic, pin, true).vector;
>>>>
>>>> In addition to what Andrew said here, for this comparison the make sense
>>>> ...
>>>>
>>>>> @@ -1022,7 +1050,20 @@ static void __init setup_IO_APIC_irqs(void)
>>>>>
>>>>> apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
>>>>>
>>>>> + if ( iommu_intremap )
>>>>> + {
>>>>> + io_apic_pin_eoi = xzalloc_array(typeof(*io_apic_pin_eoi),
>>>>> nr_ioapics);
>>>>> + BUG_ON(!io_apic_pin_eoi);
>>>>> + }
>>>>> +
>>>>> for (apic = 0; apic < nr_ioapics; apic++) {
>>>>> + if ( iommu_intremap )
>>>>> + {
>>>>> + io_apic_pin_eoi[apic] =
>>>>> xzalloc_array(typeof(**io_apic_pin_eoi),
>>>>> +
>>>>> nr_ioapic_entries[apic]);
>>>>> + BUG_ON(!io_apic_pin_eoi[apic]);
>>>>> + }
>>>>
>>>> ... doesn't the array also need -1 (== IRQ_VECTOR_UNASSIGNED) filling,
>>>> rather than zero-filling?
>>>
>>> Replying here to both you and Andrews question. My analysis is that
>>> a sentinel is not needed. clear_IO_APIC_pin() is the only function
>>> that calls the EOI routine outside of the irq_desc handlers logic.
>>>
>>> It's used either by clear_IO_APIC(), which gets called before
>>> io_apic_pin_eoi is allocated,
>>
>> Or long after, from disable_IO_APIC().
>>
>>> or by check_timer() and/or
>>> unlock_ExtINT_logic() both of which will perform an
>>> ioapic_write_entry() before the clear_IO_APIC_pin() call.
>>
>> In unlock_ExtINT_logic() I see a call to ioapic_read_entry(), whereas the
>> call to ioapic_write_entry() happens only later. In check_timer() I'm also
>> uncertain a write would occur in _all_ cases. It certainly should occur,
>> or else chances are low that the timer interrupt would actually work. Yet
>> we surely want to avoid making hard to debug corner cases yet more subtle.
>
> Didn't mention it here, but setup_IO_APIC_irqs() will also perform an
> __ioapic_write_entry() call for almost all pins.
>
>>> I've done some XenRT testing with a modified patch that kept the
>>> io_apic_pin_eoi as unsigned int, used the sentinel as init value and
>>> added an assert in __io_apic_eoi() that the value in the array was
>>> never IRQ_VECTOR_UNASSIGNED when the io_apic_pin_eoi was allocated.
>>> This never triggered on any hardware XenRT tested on.
>>>
>>> Maybe this seems to fragile, and you both prefer to keep the sentinel
>>> just in case?
>>
>> Well, how certain are you that this testing in particular covered e.g. all
>> the quirk cases that check_timer() tries to deal with?
>
> That indeed I cannot guarantee, as I don't have coverage figures for
> the XenRT runs.
>
> Seeing as there's too much uncertainty about whether the array will be
> initialized, I will go back to the approach of having a sentinel.
> Andrew suggested to use a 16bit types, would you agree to that or do
> you prefer to use unsigned int as originally proposed?
I too prefer the more narrow type. I'm not overly fussed between int16_t
vs short.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |