Re: [PATCH] x86/ioreq: Extend ioreq server to support multiple ioreq pages


  • To: Julian Vetter <julian.vetter@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 9 Feb 2026 17:21:55 +0000
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • Delivery-date: Mon, 09 Feb 2026 17:22:16 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 09/02/2026 12:30 pm, Julian Vetter wrote:
> A single shared ioreq page provides PAGE_SIZE/sizeof(struct ioreq) = 128
> slots, limiting HVM guests to 128 vCPUs. To support more vCPUs, extend
> the ioreq server to allocate multiple contiguous ioreq pages based on
> the max number of vCPUs.

This statement about the upper bound is correct, but it doesn't appear
to be what you implement.

> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
> index f5fd30ce12..13c638db53 100644
> --- a/xen/common/ioreq.c
> +++ b/xen/common/ioreq.c
> @@ -95,12 +95,15 @@ static struct ioreq_server *get_ioreq_server(const struct domain *d,
>  
>  static ioreq_t *get_ioreq(struct ioreq_server *s, struct vcpu *v)
>  {
> -    shared_iopage_t *p = s->ioreq.va;
> +    unsigned int vcpu_id = v->vcpu_id;
> +    unsigned int page_idx = vcpu_id / IOREQS_PER_PAGE;
> +    unsigned int slot_idx = vcpu_id % IOREQS_PER_PAGE;
> +    shared_iopage_t *p = s->ioreqs.page[page_idx].va;
>  
>      ASSERT((v == current) || !vcpu_runnable(v));
>      ASSERT(p != NULL);
>  
> -    return &p->vcpu_ioreq[v->vcpu_id];
> +    return &p->vcpu_ioreq[slot_idx];

Use xvmalloc_array().  It gives you contiguous VAs in Xen even if the
underlying memory is non-contiguous.

Notably it means you don't need to store the intermediate VAs, and this
final line stays the same.
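
For illustration, a minimal sketch of what I mean (assuming the
xvzalloc_array() helper from xen/xvmalloc.h; exact shape up to you):

    shared_iopage_t *va = xvzalloc_array(shared_iopage_t, nr_pages);

    if ( !va )
        return -ENOMEM;

    s->ioreq.va = va;

with a matching xvfree() on the teardown path.  The z-variant also
zeroes the allocation, so the per-page clear_page() calls go away, and
get_ioreq() can index straight across what used to be page boundaries:

    return &p->vcpu_ioreq[v->vcpu_id];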

>  }
>  
>  /*
> @@ -260,84 +263,120 @@ bool vcpu_ioreq_handle_completion(struct vcpu *v)
>  
>  static int ioreq_server_alloc_mfn(struct ioreq_server *s, bool buf)
>  {
> -    struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
>      struct page_info *page;
> +    unsigned int i, j, nr_pages = buf ? 1 : NR_IOREQ_PAGES;

This NR_IOREQ_PAGES must not be a compile-time constant.  From below, it's:

    DIV_ROUND_UP(HVM_MAX_VCPUS, IOREQS_PER_PAGE)

but wants to be:

    DIV_ROUND_UP(d->max_vcpus, IOREQS_PER_PAGE)

This way, small VMs only allocate one page, and only larger VMs allocate
more.

You probably want a helper taking d as a parameter, to avoid
open-coding it everywhere.
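
e.g. something like (name hypothetical):

    static inline unsigned int ioreq_nr_pages(const struct domain *d)
    {
        return DIV_ROUND_UP(d->max_vcpus, IOREQS_PER_PAGE);
    }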

>  
> -    if ( iorp->page )
> +    for ( i = 0; i < nr_pages; i++ )
>      {
> -        /*
> -         * If a guest frame has already been mapped (which may happen
> -         * on demand if ioreq_server_get_info() is called), then
> -         * allocating a page is not permitted.
> -         */
> -        if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
> -            return -EPERM;
> +        struct ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreqs.page[i];
>  
> -        return 0;
> -    }
> +        if ( iorp->page )
> +        {
> +            /*
> +             * If a guest frame has already been mapped (which may happen
> +             * on demand if ioreq_server_get_info() is called), then
> +             * allocating a page is not permitted.
> +             */
> +            if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
> +                return -EPERM;
> +            continue;  /* Already allocated */

Allocation should be done in one location only, and should allocate
everything needed.

I suspect that most of this complexity will disappear when switching to
xvmalloc_array().

> +        }
>  
> -    page = alloc_domheap_page(s->target, MEMF_no_refcount);
> +        page = alloc_domheap_page(s->target, MEMF_no_refcount);
> +        if ( !page )
> +            goto fail;
>  
> -    if ( !page )
> -        return -ENOMEM;
> +        if ( !get_page_and_type(page, s->target, PGT_writable_page) )
> +        {
> +            /*
> +             * The domain can't possibly know about this page yet, so failure
> +             * here is a clear indication of something fishy going on.
> +             */
> +            put_page_alloc_ref(page);
> +            domain_crash(s->emulator);
> +            return -ENODATA;
> +        }
>  
> -    if ( !get_page_and_type(page, s->target, PGT_writable_page) )
> -    {
> -        /*
> -         * The domain can't possibly know about this page yet, so failure
> -         * here is a clear indication of something fishy going on.
> -         */
> -        domain_crash(s->emulator);
> -        return -ENODATA;
> -    }
> +        /* Assign early so cleanup can find it */
> +        iorp->page = page;
>  
> -    iorp->va = __map_domain_page_global(page);
> -    if ( !iorp->va )
> -        goto fail;
> +        iorp->va = __map_domain_page_global(page);
> +        if ( !iorp->va )
> +            goto fail;
> +
> +        clear_page(iorp->va);
> +    }

As a note for the future, if you are doing a bulk indent/deindent, it's
generally better to do that in a prep patch.  Such patches are trivial
to review (git diff --ignore-all-space returns empty), and it makes the
subsequent change in logic legible.
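
e.g., with the whitespace-only prep patch at the head of your branch:

    $ git diff --ignore-all-space HEAD^ HEAD  # empty for a pure re-indent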

> - fail:
> -    put_page_alloc_ref(page);
> -    put_page_and_type(page);
> +fail:

For reasons of differing tooling, labels need a minimum of one space
before them.
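
i.e. this part of the hunk wants undoing:

    -fail:
    + fail: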

> diff --git a/xen/include/xen/ioreq.h b/xen/include/xen/ioreq.h
> index e86f0869fa..8604311cb4 100644
> --- a/xen/include/xen/ioreq.h
> +++ b/xen/include/xen/ioreq.h
> @@ -19,9 +19,16 @@
>  #ifndef __XEN_IOREQ_H__
>  #define __XEN_IOREQ_H__
>  
> +#include <xen/lib.h>

xen/macros.h please.  We're trying to remove xen/lib.h.
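
i.e.:

    -#include <xen/lib.h>
    +#include <xen/macros.h>

assuming DIV_ROUND_UP is all you're after; that lives in macros.h.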

>  #include <xen/sched.h>
>  
>  #include <public/hvm/dm_op.h>
> +#include <public/hvm/hvm_info_table.h>
> +#include <public/hvm/ioreq.h>

Why all 3?

> +
> +/* 4096 / 32 = 128 ioreq slots per page */
> +#define IOREQS_PER_PAGE  (PAGE_SIZE / sizeof(struct ioreq))
> +#define NR_IOREQ_PAGES   DIV_ROUND_UP(HVM_MAX_VCPUS, IOREQS_PER_PAGE)

IOREQS_PER_PAGE is fine to stay, but as indicated, NR_IOREQ_PAGES needs
adjusting.

~Andrew

