[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG



On 28.08.2019 17:24, Igor Druzhinin wrote:
> If MCFG area is not reserved in E820 Xen by default will defer its usage
> until Dom0 registers it explicitly after ACPI parser recognizes it as
> a reserved resource in DSDT. Having it reserved in E820 is not
> mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2)
> and firmware is free to keep a hole E820 in that place.
> 
> Unfortunately, keeping it disabled at this point makes Xen fail to
> recognize many of PCIe extended capabilities early enough for newly added
> devices. Namely, (1) some of VT-d quirks are not applied during Dom0
> device handoff, (2) currently MCFG reservation report comes
> too late from Dom0 after some of PCI devices being registered in Xen
> by Dom0 kernel that break initialization of a number of PCIe capabilities
> (e.g. SR-IOV VF BAR sizing).
> 
> Since introduction of ACPI parser in Xen is not possible add a "force"
> option that will allow Xen to use MCFG area even it's not properly
> reserved in E820.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> ---
> 
> I think we need to have this option to at least have a way to workaround
> problem (1). Problem (2) could be solved in Dom0 kernel by invoking
> xen_mcfg_late() earlier but before the first PCI bus enumertaion which
> currently happens somwhere during ACPI scan I think.

Indeed (2) should be fixed as you say here. (1) should imo be fixed
differently too: pci_vtd_quirk() (and anything else that relies on
extended config space accesses) should be re-invoked once MCFG becomes
available for a given bus (range).

Nevertheless I think the command line option is a good thing to have.

> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1424,11 +1424,13 @@ ordinary DomU, control domain, hardware domain, and - 
> when supported
>  by the platform - DomU with pass-through device assigned).
>  
>  ### mmcfg (x86)
> -> `= <boolean>[,amd-fam10]`
> +> `= List of [ <boolean>, force, amd-fam10 ]`
>  
> -> Default: `1`
> +> Default: `true,force`
>  
> -Specify if the MMConfig space should be enabled.
> +Specify if the MMConfig space should be enabled. If force option is specified
> +(default) MMConfig space will be used by Xen early in boot even if it's
> +not reserved by firmware in the E820 table.

The term "force" is too heavy for my taste: You still require the
range to be at least in an E820 hole. "allow-e820-hole" otoh is a
little longish. Perhaps "relaxed"?

> --- a/xen/arch/x86/x86_64/mmconfig-shared.c
> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c
> @@ -26,33 +26,34 @@
>  
>  #include "mmconfig.h"
>  
> +static bool_t __read_mostly force_mmcfg = true;

Just bool please.

>  unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF;
>  
>  static int __init parse_mmcfg(const char *s)
>  {
>      const char *ss;
> -    int rc = 0;
> +    int val, rc = 0;
>  
>      do {
>          ss = strchr(s, ',');
>          if ( !ss )
>              ss = strchr(s, '\0');
>  
> -        switch ( parse_bool(s, ss) )
> -        {
> -        case 0:
> -            pci_probe &= ~PCI_PROBE_MMCONF;
> -            break;
> -        case 1:
> -            break;
> -        default:
> -            if ( !cmdline_strcmp(s, "amd_fam10") ||
> -                 !cmdline_strcmp(s, "amd-fam10") )
> +        if ( (val = parse_bool(s, ss)) >= 0 ) {
> +            if ( val )
> +               pci_probe |= PCI_PROBE_MMCONF;
> +            else
> +               pci_probe &= ~PCI_PROBE_MMCONF;
> +        } else if ( (val = parse_boolean("amd_fam10", s, ss)) >= 0 ||
> +                    (val = parse_boolean("amd-fam10", s, ss)) >= 0 ) {
> +            if ( val )
>                  pci_probe |= PCI_CHECK_ENABLE_AMD_MMCONF;
>              else
> -                rc = -EINVAL;
> -            break;
> -        }
> +                pci_probe &= ~PCI_CHECK_ENABLE_AMD_MMCONF;
> +        } else if ( (val = parse_boolean("force", s, ss)) >= 0)
> +            force_mmcfg = val;
> +        else
> +            rc = -EINVAL;

Brace placement is wrong in your additions. I also don't really see
why you do away with the switch().

> @@ -355,6 +356,11 @@ static int __init is_mmconf_reserved(
>                     (unsigned int)cfg->start_bus_number,
>                     (unsigned int)cfg->end_bus_number);
>          }
> +    } else if (!e820_any_mapped(addr, addr + old_size - 1, 0)) {
> +        if (!force_mmcfg)
> +            printk(KERN_WARNING "PCI: MCFG area at %lx is not reserved in 
> E820, "
> +                   "use mmcfg=force option\n", addr);

I don't think saying "use" is a good idea. If anything, make it less
strict - either by attaching "if ..." or by saying "consider using".

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.