[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Xen, MCFG acpi table and E820 address map



On Wed, Sep 04, 2013 at 10:16:26AM +0100, Jan Beulich wrote:
> >>> On 04.09.13 at 03:13, Santosh Jodh <Santosh.Jodh@xxxxxxxxxx> wrote:
> > Xen will use information from MCFG acpi table to access PCIe extended 
> > configuration space. However, Xen validates MCFG table by making sure that 
> > the addresses specified in the MCFG table is correctly marked as reserved 
> > in 
> > the E820 address map. If it is not, the MCFG table is ignored - thereby 
> > preventing Xen from accessing PCIe extended configuration space.
> > 
> > I recently came across a workstation class system that supports VT-d. This 
> > system BIOS has a valid MCFG table. The BIOS does NOT report the MCFG 
> > addresses as reserved in the E820 address map. However, the addresses ARE 
> > claimed as reserved via the ACPI motherboard resource devnode (PNP0C01) 
> > mechanism.

Could you tell me what machine this is? It would be good to know to develop
a patch against it.

> > 
> > On this system, Xen ignores the MCFG table as it fails the E820 check. dom0 
> > during early boot does the same thing. However, once ACPI driver is online, 
> > dom0 validates the MCFG table against the motherboard resource devnode and 
> > then accepts the MCFG table. You now end up with a situation where dom0 can 
> > correctly enumerate and use PCIe devices but Xen cannot access extended 
> > configuration space.
> > [...]
> > Is this machine an exception? As we move forward, are we likely to see more 
> > such systems (relying more on ACPI and less on E820 for reserving the MCFG 
> > range)? Is it worth adding a mmcfg=force option to xen to ignore E820 
> > validation result?
> 
> No, this is quite normal, and Xen has code to deal with that
> (PHYSDEVOP_pci_mmcfg_reserved). While our (forward ported)
> kernels have been making use of this since 2.6.27, I suppose
> upstream still doesn't (perhaps not the least because integration
> of this might once again raise concerns about Xen needing code
> changes in too many different places).
> 
> For reference I'm attaching the diff of the respective source file,
> in case someone finds this useful.

Thank you for providing the patch.

Is there a nicer way of doing this? Inserting Xen codepaths right
there is on the high list of no-no.

> 
> Jan
> 

> --- linux-3.11/arch/x86/pci/mmconfig-shared.c
> +++ head/arch/x86/pci/mmconfig-shared.c
> @@ -23,6 +23,10 @@
>  #include <asm/pci_x86.h>
>  #include <asm/acpi.h>
>  
> +#ifdef CONFIG_XEN
> +#include <xen/interface/physdev.h>
> +#endif
> +
>  #define PREFIX "PCI: "
>  
>  /* Indicate if the mmcfg resources have been placed into the resource table. 
> */
> @@ -437,6 +441,36 @@ static int is_acpi_reserved(u64 start, u
>       return mcfg_res.flags;
>  }
>  
> +static int xen_report_mmconf_reserved(const struct pci_mmcfg_region *cfg,
> +                                   int valid, int with_e820)
> +{
> +#ifdef CONFIG_XEN
> +     if (!with_e820) {
> +             struct physdev_pci_mmcfg_reserved r = {
> +                     .address = cfg->address,
> +                     .segment = cfg->segment,
> +                     .start_bus = cfg->start_bus,
> +                     .end_bus = cfg->end_bus,
> +                     .flags = valid ? XEN_PCI_MMCFG_RESERVED : 0
> +             };
> +             int rc;
> +
> +             rc = HYPERVISOR_physdev_op(PHYSDEVOP_pci_mmcfg_reserved, &r);
> +             switch (rc) {
> +             case 0: case -ENOSYS:
> +                     break;
> +             default:
> +                     pr_warn(PREFIX "Failed to report MMCONFIG reservation"
> +                             " state for %04x [bus%02x-%02x] to hypervisor"
> +                             " (%d)\n",
> +                             cfg->segment, cfg->start_bus, cfg->end_bus,
> +                             rc);
> +             }
> +     }
> +#endif
> +     return valid;
> +}
> +
>  typedef int (*check_reserved_t)(u64 start, u64 end, unsigned type);
>  
>  static int __ref is_mmconf_reserved(check_reserved_t is_reserved,
> @@ -456,7 +490,7 @@ static int __ref is_mmconf_reserved(chec
>       }
>  
>       if (size < (16UL<<20) && size != old_size)
> -             return 0;
> +             return xen_report_mmconf_reserved(cfg, 0, with_e820);
>  
>       if (dev)
>               dev_info(dev, "MMCONFIG at %pR reserved in %s\n",
> @@ -488,7 +522,7 @@ static int __ref is_mmconf_reserved(chec
>                               &cfg->res, (unsigned long) cfg->address);
>       }
>  
> -     return 1;
> +     return xen_report_mmconf_reserved(cfg, 1, with_e820);
>  }
>  
>  static int __ref pci_mmcfg_check_reserved(struct device *dev,


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.