[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 11/17] hvmloader: allocate MMCONFIG area in the MMIO hole


  • To: Thierry Escande <thierry.escande@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 29 Apr 2026 11:29:27 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MbB1E5VlT1DEVQMNTA5bsGnSuAnodCL0j0hXVGI5LAY=; b=g7px2A5qhcw400x57Z7BBdgYX6kOKF0gNyGJYpbFrIVZ+nvsQ+h5z8D9xYRp2dLAD46MPM84yEvkrNlYnTo49axIJuqd3kxJaZ9RKwFVSTkDTEc8t3yfnWJcNCUwEnzBSJ5elFRZMe0cXH8EEC700liXF+517Pm8rzVTReu6bfVaPKuAH9mSfVJIQA0CiGEg5nwNzXKsmxZS7R8TyLOCqAT5TnLoOUPwNGuEkoCs/gmMTjf41LmT6vLYxEb8+LJuM7L08OG024OHcTjtCrPrOjCC9Z0Hbwvtl/jSxG8+97761n8zulom+Fr5KihMeXnNkigRmZWtA6jVPB651NP7Tg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=Pp/Fq87Ej7pZmj9EhF9TwVsyz7IG/fUl9SoRMy/MF8QFekkIpoNAhiyIIhlT5xhxcsysbINip2AhYvMpDYRoGpUg3u530ZsDwKNssRsFSEiiXS8tVPTwvZ8xDXwbmtMhx9ktw+S1PXWQZTFslVh9x269XOEN3yaFG57tUa522JppMh58HTrkVkBaytEL8T/y5N7yH/aTuFFluH9J5tyZdl2+XmFr8DWmAK4TyW7T/f7xqGzXKH1W/QqyJAP3ZVw8EMPuBC7nVPg7wIARd5AX14i2jKmOyu5AwJyWgBHX1qR+LlnMuWlW1MmJiF36SeiTS4jv5BdV3v+O3WLxYAFovw==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=citrix.com header.i="@citrix.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Alexey Gerasimenko <x1917x@xxxxxxxxx>
  • Delivery-date: Wed, 29 Apr 2026 09:30:01 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Mar 13, 2026 at 04:35:04PM +0000, Thierry Escande wrote:
> The actual MMCONFIG size depends on the number of PCI buses available
> which should be covered by ECAM. Possible options are 64MB, 128MB and
> 256MB.

Are such values inherited from the real q35 impleemntation?

AFAICT the ACPI MCFG spec notes:

"The size of the memory mapped configuration region is indicated by
the start and end bus number fields in the Memory mapped Enhanced
configuration space base address allocation structure as shown in
Table 4-3. 0-255 is the range of allowed bus numbers supported for a
given PCI Segment Group."

So it's in principle possible to specify a MCFG that covers a single
bus, and then it would have a size of 256 * 4K = 1M.  Which avoids
wasting 63M of MMIO space in the low MMIO hole that's already fairly
tight on space.

Is this limitation possibly inherited from the way the ECAM region
position and size must be notified to the chipset?

And further seeing the code below - I found the answer myself, it's
because the chipset only supports negotiation those ECAM sizes.  We
could possibly expose a smaller region in MCFG, but doesn't seem like
a good move.

> As Xen is limited to the bus 0 currently, the lowest possible
> setting is used (64MB), defined via PCI_MAX_MCFG_BUSES in
> hvmloader/config.h. When multiple PCI buses support for Xen will be
> implemented, PCI_MAX_MCFG_BUSES may be replaced by a calculation of the
> number of buses according to PCI devices enumeration.
> 
> The MMCONFIG entry is inserted into bars array in the same manner like
> for any other BARs. In this case, the devfn field will point to MCH PCI
> device and bar_reg will contain PCIEXBAR register offset. It will be
> assigned a slot in the MMIO hole later in a very same way like for plain
> PCI BARs, with respect to its size and alignment. At this point, the
> actual base address and size of the ECAM space are passed to Xen using
> the HVMOP_set_ecam_space hypercall.
> 
> Signed-off-by: Alexey Gerasimenko <x1917x@xxxxxxxxx>
> Signed-off-by: Thierry Escande <thierry.escande@xxxxxxxxxx>
> ---
>  tools/firmware/hvmloader/config.h   |  4 +++
>  tools/firmware/hvmloader/pci.c      | 55 +++++++++++++++++++++++++++++
>  tools/firmware/hvmloader/pci_regs.h |  7 ++++
>  3 files changed, 66 insertions(+)
> 
> diff --git a/tools/firmware/hvmloader/config.h 
> b/tools/firmware/hvmloader/config.h
> index baaed91c7f..aa3158bca5 100644
> --- a/tools/firmware/hvmloader/config.h
> +++ b/tools/firmware/hvmloader/config.h
> @@ -55,6 +55,10 @@ extern uint32_t *cpu_to_apicid;
>  #define PCI_ISA_DEVFN       0x08    /* dev 1, fn 0 */
>  #define PCI_ISA_IRQ_MASK    0x0c20U /* ISA IRQs 5,10,11 are PCI connected */
>  #define PCI_ICH9_LPC_DEVFN  0xf8    /* dev 31, fn 0 */
> +#define PCI_MCH_DEVFN       0       /* bus 0, dev 0, func 0 */
> +
> +/* possible values are: 64, 128, 256 */
> +#define PCI_MAX_MCFG_BUSES  64
>  
>  #define ACPI_TIS_HDR_ADDRESS 0xFED40F00UL
>  
> diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
> index 6e6720adae..54c23ffdd8 100644
> --- a/tools/firmware/hvmloader/pci.c
> +++ b/tools/firmware/hvmloader/pci.c
> @@ -413,6 +413,58 @@ void pci_setup(void)
>          pci_devfn_decode_type[devfn] |= PCI_COMMAND_MASTER;
>      }
>  
> +    /*
> +     *  Calculate MMCONFIG area size and squeeze it into the bars array
> +     *  for assigning a slot in the MMIO hole
> +     */
> +    if ( is_running_on_q35 )
> +    {
> +        /* disable PCIEXBAR decoding for now */
> +        pci_writel(PCI_MCH_DEVFN, PCI_MCH_PCIEXBAR, 0);
> +        pci_writel(PCI_MCH_DEVFN, PCI_MCH_PCIEXBAR + 4, 0);
> +
> +        switch ( PCI_MAX_MCFG_BUSES )
> +        {
> +        case 64:
> +            bar_data = PCIEXBAR_64_BUSES | PCIEXBAR_ENABLE;
> +            bar_sz = MB(64);
> +            break;
> +
> +        case 128:
> +            bar_data = PCIEXBAR_128_BUSES | PCIEXBAR_ENABLE;
> +            bar_sz = MB(128);
> +            break;
> +
> +        case 256:
> +            bar_data = PCIEXBAR_256_BUSES | PCIEXBAR_ENABLE;
> +            bar_sz = MB(256);
> +            break;
> +
> +        default:
> +            /* unsupported number of buses specified */
> +            BUG();
> +        }
> +
> +        addr_mask = ~(bar_sz - 1);
> +
> +        for ( i = 0; i < nr_bars; i++ )
> +            if ( bars[i].bar_sz < bar_sz )
> +                break;
> +
> +        if ( i != nr_bars )
> +            memmove(&bars[i+1], &bars[i], (nr_bars-i) * sizeof(*bars));
> +
> +        bars[i].is_mem    = 1;
> +        bars[i].devfn     = PCI_MCH_DEVFN;
> +        bars[i].bar_reg   = PCI_MCH_PCIEXBAR;
> +        bars[i].bar_sz    = bar_sz;
> +        bars[i].addr_mask = addr_mask;
> +        bars[i].bar_data  = bar_data;
> +
> +        mmio_total += bar_sz;
> +        nr_bars++;
> +    }

I think it might be best if the ECAM fake BAR is the first element in
the bars array, so we ensure it's the first item to consume memory
from the low MMIO hole.  Not sure how that will work with the current
sorting of the resources based on their size, but it's imperative for
hvmloader to attempt to position ECAM ahead of the other device
resources IMO.

> +
>      if ( mmio_hole_size )
>      {
>          uint64_t max_ram_below_4g = GB(4) - mmio_hole_size;
> @@ -592,6 +644,9 @@ void pci_setup(void)
>              }
>          }
>  
> +        if ( bar_reg == PCI_MCH_PCIEXBAR )
> +            hvm_set_ecam_space(base, bar_sz);

As noted in a previous patch, it would be better if it's QEMU (as part
of handling the PCI_MCH_PCIEXBAR writes) that notifies Xen of the ECAM
window placement.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.