|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing
On 20/10/2021 11:36, Jan Beulich wrote:
> x2apic_bsp_setup() gets called ahead of iommu_setup(), and since x2APIC
> mode (physical vs clustered) depends on iommu_intremap, that variable
> needs to be set to off as soon as we know we can't / won't enable the
> IOMMU, i.e. in particular when
> - parsing of the respective ACPI tables failed,
> - "iommu=off" is in effect, but not "iommu=no-intremap".
> Move the turning off of iommu_intremap from AMD specific code into
> acpi_iommu_init(), accompanying it by clearing of iommu_enable.
>
> Take the opportunity and also skip ACPI table parsing altogether when
> "iommu=off" is in effect anyway.
>
> Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> I've deliberately not added a Fixes: tag here, as I'm of the opinion
> that d8bd82327b0f ("AMD/IOMMU: obtain IVHD type to use earlier") only
> uncovered a pre-existing anomaly.
I agree it uncovered a pre-existing issue, but that doesn't mean a Fixes
tag should be omitted. That change very concretely regressed booting on
some systems in their pre-existing configuration.
The commit message needs to spell out a link to d8bd82327b0f, but it's
fine to say "that commit broke it by violating an unexpected ordering
dependency, but isn't really the source of the bug".
> This particularly applies to the "iommu=off" aspect.
There should be at least two Fixes tags, but I suspect trying to trace
the history of this mess is not a productive use of time.
> (This now allows me to remove an item from my TODO
> list: I was meaning to figure out why one of my systems wouldn't come
> up properly with "iommu=off", and I had never thought of combining this
> with "no-intremap".)
>
> Arguably "iommu=off" should turn off subordinate features in common
> IOMMU code, but doing so in parse_iommu_param() would be wrong (as
> there might be multiple "iommu=" to parse). This could be placed in
> iommu_supports_x2apic(), but see the next item.
I don't think we make any claim or implication that passing the same
option twice works. The problem here is the nested structure of
options, and the variable doing double duty.
>
> While the change here deals with apic_x2apic_probe() as called from
> x2apic_bsp_setup(), the check_x2apic_preenabled() path looks to be
> similarly affected. That call occurs before acpi_boot_init(), which is
> what calls acpi_iommu_init(). The ordering in setup.c is in part
> relatively fragile, which is why for the moment I'm still hesitant to
> move the generic_apic_probe() call down. Plus I don't have easy access
> to a suitable system to test this case. Thoughts?
I've written these thoughts before, but IOMMU handling it a catastrophic
mess. It needs burning to the ground and redoing from scratch.
We currently have two ways of turning on the IOMMU, depending on what
firmware does, and plenty ways of crashing Xen with cmdline options
which should work, and the legacy xAPIC startup routine is after
interrupts have been enabled, leading to an attempt to rewrite
interrupts in place to remap them. This in particular has lead to XSAs
due to trusting registers which can't be trusted, and the rewrite is
impossible to do safely.
The correct order is:
1) Parse DMAR/IVRS (but do not configure anything), MADT, current APIC
setting and cmdline arguments
2) Figure out whether we want to be in xAPIC or x2APIC mode, and whether
we need intremap. Change the LAPIC setting
3) Configure the IOMMUs. In particular, their interrupt needs to be
after step 2
4) Start up Xen's general IRQ infrastructure.
It's a fair chunk of work, but it will vastly simplify the boot logic
and let us delete the infinite flushing loops out of the IOMMU logic,
and we don't need any logic which has to second guess itself based on
what happened earlier on boot.
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -41,6 +41,23 @@ enum iommu_intremap __read_mostly iommu_
> bool __read_mostly iommu_intpost;
> #endif
>
> +void __init acpi_iommu_init(void)
> +{
> + if ( iommu_enable )
> + {
> + int ret = acpi_dmar_init();
> +
> + if ( ret == -ENODEV )
> + ret = acpi_ivrs_init();
> +
> + if ( ret )
> + iommu_enable = false;
> + }
> +
> + if ( !iommu_enable )
> + iommu_intremap = iommu_intremap_off;
> +}
This does fix my issue, so preferably with the Fixes tag reinstated,
Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Tested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
However, I don't think skipping parsing is a sensible move. Intremap is
utterly mandatory if during boot, we discover that our APIC ID is >254,
and iommu=no-intremap must be ignored in this case, or if the MADT says
we have CPUs beyond that limit and the user hasn't specified nr_cpus=1
or equivalent.
This applies to a future world with a sane boot order, but Xen needs to
know hardware_support_{dma,int}remapping (-> must parse the ACPI tables)
ahead of choosing whether to turn the facilities on or not. Fixing this
removes the double duty from variables.
~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |