[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU



On 10.10.2019 15:12, Roger Pau Monné  wrote:
> On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
>> On 10.10.2019 14:13, Roger Pau Monné  wrote:
>>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
>>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
>>>>> When interrupt remapping is enabled as part of enabling x2APIC the
>>>>
>>>> Perhaps "unmasked" instead of "the"?
>>>>
>>>>> IO-APIC entries also need to be translated to the new format and added
>>>>> to the interrupt remapping table.
>>>>>
>>>>> This prevents IOMMU interrupt remapping faults when booting on
>>>>> hardware that has unmasked IO-APIC pins.
>>>>
>>>> But in the end it only papers over whatever the spurious interrupts
>>>> result form, doesn't it? Which isn't to say this isn't an
>>>> improvement. Calling out the ExtInt case here may be worthwhile as
>>>> well, as would be pointing out that this case still won't work on
>>>> AMD IOMMUs.
>>>
>>> But the fix for the ExtINT AMD issue should be done in
>>> amd_iommu_ioapic_update_ire then, so that it can properly handle
>>> ExtINT delivery mode, not to this part of the code. I will look
>>> into it, but I think it's kind of tangential to the issue here.
>>
>> I'm not talking of you working on fixing this right away. I'm merely
>> asking that you mention here (a) the ExtInt special case and (b)
>> that this special case will (continue to) not work in the AMD case.
>>
>>>>> --- a/xen/arch/x86/apic.c
>>>>> +++ b/xen/arch/x86/apic.c
>>>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
>>>>>      iommu_enable_x2apic();
>>>>>      __enable_x2apic();
>>>>>  
>>>>> -    restore_IO_APIC_setup(ioapic_entries);
>>>>> +    restore_IO_APIC_setup(ioapic_entries, true);
>>>>>      unmask_8259A();
>>>>>  
>>>>>  out:
>>>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
>>>>>          printk("Switched to APIC driver %s\n", genapic.name);
>>>>>  
>>>>>  restore_out:
>>>>> -    restore_IO_APIC_setup(ioapic_entries);
>>>>> +    /*
>>>>> +     * NB: do not use raw mode when restoring entries if the iommu has 
>>>>> been
>>>>> +     * enabled during the process, because the entries need to be 
>>>>> translated
>>>>> +     * and added to the remapping table in that case.
>>>>> +     */
>>>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
>>>>
>>>> How is this different in the resume_x2apic() case? The IOMMU gets
>>>> enabled in the course of that as well. I.e. I'd expect you want
>>>> to pass "false" there, not "true".
>>>
>>> In the resume_x2apic case interrupt remapping should already be
>>> enabled or not, but that function is not going to enable interrupt
>>> remapping if it wasn't enabled before, hence the IO-APIC entries
>>> should already be using the interrupt remapping table and no
>>> translation is needed.
>>
>> Who / what would have enabled the IOMMU in the resume case?
> 
> I don't think the question is who enables interrupt remapping in the
> resume case (which is resume_x2apic when calling iommu_enable_x2apic
> AFAICT), the point here is that on resume the entries in the IO-APIC
> will already match the state of interrupt remapping, so they shouldn't
> be translated. If interrupt remapping was off before suspend it will
> still be off after resume, and there won't be any translation needed.
> The same is true if interrupt remapping is on before suspend.

I disagree: save_IO_APIC_setup() gets called from resume_x2apic(),
not prior to suspend.

>>>> Also how would "x2apic_enabled" reflect the transition? It may
>>>> have been "true" already before entering the function here.
>>>
>>> If x2apic_enabled == true at the point where restore_IO_APIC_setup
>>> is called interrupt remapping must be enabled, because AFAICT at this
>>> point it's not possible to have x2apic_enabled == true and interrupt
>>> remapping disabled.
>>>
>>> The issue I can see here is what happens if interrupt remapping was
>>> already enabled by the hardware, and entries in the IO-APIC are
>>> already using the remapping table. I would have to look into how to
>>> detect that case properly, but I think the proposed change is an
>>> improvement overall.
>>
>> Firmware handing off with the IOMMU (and hence interrupt remapping)
>> already enabled is a case which - afaict - we can't currently cope
>> with. Firmware handing off in x2APIC enabled state is typically
>> with the IOMMU (and hence interrupt remapping) still disabled. This
>> is not a forbidden mode, it's just that in such a configuration
>> interrupts can't be delivered to certain CPUs.
>>
>> In any event you need to properly distinguish x2APIC enabled state
>> (or the transition thereof) from IOMMU / interrupt remapping
>> enabled state (or the transition thereof). I.e. you want to avoid
>> raw mode restore if interrupt remapping state transitioned from
>> off to on in the process.
> 
> Right, and that's why the call to restore_IO_APIC_setup in
> x2apic_bsp_setup uses !x2apic_enabled as it's second parameter. If
> interrupt remapping has been enabled by the call to
> iommu_enable_x2apic x2apic_enabled must be true, and hence the entries
> in the IO-APIC need to be translated to use the interrupt remapping
> table. There's no path that can lead to restore_IO_APIC_setup with
> interrupt remapping enabled and x2APIC mode disabled (or with x2APIC
> enabled and interrupt remapping disabled).
> 
> Hence if interrupt remapping is off before calling x2apic_bsp_setup
> (which is what Xen expects to function properly) and x2apic_enabled ==
> true when calling restore_IO_APIC_setup it means interrupt remapping
> got enabled, and the IO-APIC entries need translation.

But the code in question sits on a shared success/error path, and
in the error case it matters whether x2apic_enabled was true already
on entry. I realize that io_apic_write() would suitably avoid going
the remapping path, but I think it would be more clear if the
distinction was already made properly at the call site.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.