|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG
On Wed, Aug 28, 2019 at 04:24:22PM +0100, Igor Druzhinin wrote:
> If MCFG area is not reserved in E820 Xen by default will defer its usage
> until Dom0 registers it explicitly after ACPI parser recognizes it as
> a reserved resource in DSDT. Having it reserved in E820 is not
> mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2)
> and firmware is free to keep a hole E820 in that place.
>
> Unfortunately, keeping it disabled at this point makes Xen fail to
> recognize many of PCIe extended capabilities early enough for newly added
> devices. Namely, (1) some of VT-d quirks are not applied during Dom0
> device handoff, (2) currently MCFG reservation report comes
> too late from Dom0 after some of PCI devices being registered in Xen
> by Dom0 kernel that break initialization of a number of PCIe capabilities
> (e.g. SR-IOV VF BAR sizing).
>
> Since introduction of ACPI parser in Xen is not possible add a "force"
> option that will allow Xen to use MCFG area even it's not properly
> reserved in E820.
>
> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
Thanks! I think this is the best way to solve the issue.
> ---
>
> I think we need to have this option to at least have a way to workaround
> problem (1). Problem (2) could be solved in Dom0 kernel by invoking
> xen_mcfg_late() earlier but before the first PCI bus enumertaion which
> currently happens somwhere during ACPI scan I think.
>
> The question is what the defult value for this option should be?
Have you tested this against a variety of hardware?
Have you found any box that has a wrong MMCFG area in the MCFG static
table? (ie: one that doesn't match the motherboard resource
reservation in the dynamic ACPI tables?)
>
> ---
> docs/misc/xen-command-line.pandoc | 8 +++++---
> xen/arch/x86/e820.c | 20 ++++++++++++++++++++
> xen/arch/x86/x86_64/mmconfig-shared.c | 34 ++++++++++++++++++++--------------
> xen/include/asm-x86/e820.h | 1 +
> 4 files changed, 46 insertions(+), 17 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.pandoc
> b/docs/misc/xen-command-line.pandoc
> index 7c72e31..f13b10c 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1424,11 +1424,13 @@ ordinary DomU, control domain, hardware domain, and -
> when supported
> by the platform - DomU with pass-through device assigned).
>
> ### mmcfg (x86)
> -> `= <boolean>[,amd-fam10]`
> +> `= List of [ <boolean>, force, amd-fam10 ]`
>
> -> Default: `1`
> +> Default: `true,force`
>
> -Specify if the MMConfig space should be enabled.
> +Specify if the MMConfig space should be enabled. If force option is specified
> +(default) MMConfig space will be used by Xen early in boot even if it's
> +not reserved by firmware in the E820 table.
>
> ### mmio-relax (x86)
> > `= <boolean> | all`
> diff --git a/xen/arch/x86/e820.c b/xen/arch/x86/e820.c
> index 8e8a2c4..edef72a 100644
> --- a/xen/arch/x86/e820.c
> +++ b/xen/arch/x86/e820.c
> @@ -37,6 +37,26 @@ struct e820map e820;
> struct e820map __initdata e820_raw;
>
> /*
> + * This function checks if any part of the range <start,end> is mapped
> + * with type.
> + */
> +int __init e820_any_mapped(u64 start, u64 end, unsigned type)
Please use uint64_t and unsigned int. Also it looks like this
function wants to return a bool instead of an int.
> +{
> + int i;
unsigned int.
> +
> + for (i = 0; i < e820.nr_map; i++) {
Coding style. Some parts of this file are using the Linux coding
style I think, but new additions should be using the Xen coding
style, see e820_change_range_type for example.
> + struct e820entry *ei = &e820.map[i];
const.
> +
> + if (type && ei->type != type)
> + continue;
> + if (ei->addr >= end || ei->addr + ei->size <= start)
> + continue;
> + return 1;
> + }
> + return 0;
> +}
> +
> +/*
> * This function checks if the entire range <start,end> is mapped with type.
> *
> * Note: this function only works correct if the e820 table is sorted and
> diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c
> b/xen/arch/x86/x86_64/mmconfig-shared.c
> index cc08b52..9fc0865 100644
> --- a/xen/arch/x86/x86_64/mmconfig-shared.c
> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c
> @@ -26,33 +26,34 @@
>
> #include "mmconfig.h"
>
> +static bool_t __read_mostly force_mmcfg = true;
> unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF;
>
> static int __init parse_mmcfg(const char *s)
> {
> const char *ss;
> - int rc = 0;
> + int val, rc = 0;
>
> do {
> ss = strchr(s, ',');
> if ( !ss )
> ss = strchr(s, '\0');
>
> - switch ( parse_bool(s, ss) )
> - {
> - case 0:
> - pci_probe &= ~PCI_PROBE_MMCONF;
> - break;
> - case 1:
> - break;
> - default:
> - if ( !cmdline_strcmp(s, "amd_fam10") ||
> - !cmdline_strcmp(s, "amd-fam10") )
> + if ( (val = parse_bool(s, ss)) >= 0 ) {
> + if ( val )
> + pci_probe |= PCI_PROBE_MMCONF;
> + else
> + pci_probe &= ~PCI_PROBE_MMCONF;
> + } else if ( (val = parse_boolean("amd_fam10", s, ss)) >= 0 ||
> + (val = parse_boolean("amd-fam10", s, ss)) >= 0 ) {
> + if ( val )
> pci_probe |= PCI_CHECK_ENABLE_AMD_MMCONF;
> else
> - rc = -EINVAL;
> - break;
> - }
> + pci_probe &= ~PCI_CHECK_ENABLE_AMD_MMCONF;
> + } else if ( (val = parse_boolean("force", s, ss)) >= 0)
> + force_mmcfg = val;
You could also consider adding a new flag to pci_probe, ie:
PCI_PROBE_FORCE_MMCFG.
> + else
> + rc = -EINVAL;
>
> s = ss + 1;
> } while ( *ss );
> @@ -355,6 +356,11 @@ static int __init is_mmconf_reserved(
> (unsigned int)cfg->start_bus_number,
> (unsigned int)cfg->end_bus_number);
> }
> + } else if (!e820_any_mapped(addr, addr + old_size - 1, 0)) {
I think it should be fine to also accept a MMCFG area that's partially
reserved and partially a hole in the memory map?
AFAICT the proposed code will only accept MMCFG regions that are
fully in a memory map hole.
Thanks, Roger.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |