[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 22/23] libxl/arm: Introduce domctl command for IOMMU vSID/vRID mapping


  • To: Milan Djokic <milan_djokic@xxxxxxxx>
  • From: Luca Fancellu <Luca.Fancellu@xxxxxxx>
  • Date: Mon, 13 Apr 2026 16:24:43 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 4.158.2.129) smtp.rcpttodomain=epam.com smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com])
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=+rHOLqb1g2DktH40JBmSE0Qodwj7cYYJF2/7diWxsgY=; b=rRGAPTDSqBu+XSIrf0mnXxAWVi7mybePECglN0tlBjMJeVAkh6I6aiTzLy/Qo2NoXJptWW2yjBpm2vtn2J/xI/fSO5Int3xMBP+Wr6PIm8ZGPW2fZC6feldSfQ7cOB3KWqDsRCwFgVPMZfC1IvbkIVqSoCSg+A6V0apzlU5juHakrsX9m3XEprVjwRosDdVmnJN15hYDgqAL6k/aQ6Em/aIkL+oQcrrE9EP4Iky4UbfqDDvXLQ6fTxcZcceoCrlsqnwicPMyM26gMOolePigThBr+oG2oKmZETlnAxZzWqJUxGEADBbBj+GPYfjIEZbv34aiZd9nUlimxpZRAHhDnw==
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=+rHOLqb1g2DktH40JBmSE0Qodwj7cYYJF2/7diWxsgY=; b=aHEAMjOLe2y+zGKehL+4BRftJiTcr9ucnE3pnT0rNb56cc3FwquzpL0w+uiudpU6NOp94aXwg1sYNTDro+cSpUfSXbOFk4YDb7nLKtcrDQUqRC6h9/bzxj2jM9nOPERhRlA5QxweiPjMNixgW34Q3mavsfHgUjIjMGlsnmzAhQVidl+LJAt4u5q15Wg6g1LpN2+ps5NtcBHq35fIZd8ZU4vmgGmMjsdiJyGeIAU380u7XHmw7PSBvzZiRbSBx6CLkI/61iJaz2cRLXshDwMF48XUeaGC7GADD9WTVA0lMHi+NCsrUKsqUgSvRZILrrgsnUOosPBnukLatR6lcN1yDA==
  • Arc-seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=vOeCsu0vn7rf42EUghtSjZOmhHDDyrGSHSFxQWyN7KLv8znHmrO5WfmHXbWchNMmlyVxGybb9BIihYFafBT4Y7gWOUqYrjoLRVhVyTlG57N9ROra5ohZPYLr+SBZn9hAPu+4ga9ZDuVpZc12bXSis0OFc7FGpKeV284T/G3pQWX1qNtqWpmOVwY6SdnCv1/jJA3D40W0Wg3Ee4sunj6ELxi9sgyjFP71G/OJCdeWKj3hssUzY9j3/lsTNupufBkEChZslWAsMPm8JOlo5FRuZdhixT6gIVxvpfX08wH8oMbaP944tEvOi3ZF5mf4sOIWmXVd0V3ouH0+Y+ErwjFPag==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=f/Q0IzIX+wDBOKbQnJJFagaxAxO2fSLXeVWkiTXc7IcPt04w/pR1WHOxc9tX8Tu2dvzPhwW/Y68x58prA9SIBc8z1Z5pqPxk55tIj2arMJnId9ouLG9gohIP1IVdZPSsLJ4nOZ9ZUxvddhcEK93muBChsO50ALAv4vojrIceos0ENpVZC4bb3S6qRqJKsEV6WM9OJNhYooGDKSQq07tIOtSzO6qpIpsgqOEpi0Lgf6h8ULeph0CUYA/8OfvtbJmDZCh6EBQtw0puN/iYPZK4bufaf3EEiN3FXljP876tLy3BKrnBs9pBdm27gi/Uostu5YL7AyRUje4yTHQtcXDYTQ==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=arm.com header.i="@arm.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"; dkim=pass header.s=selector1 header.d=arm.com header.i="@arm.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"
  • Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 13 Apr 2026 16:26:02 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Thread-index: AQHcy2IISQU6+UC2eUGQqKsvfV2Idg==
  • Thread-topic: [PATCH v3 22/23] libxl/arm: Introduce domctl command for IOMMU vSID/vRID mapping

Hi Milan,

> On 31 Mar 2026, at 02:52, Milan Djokic <milan_djokic@xxxxxxxx> wrote:
> 
> For guests created via control domain (xl, zephyr xenlib), partial device
> tree is parsed and loaded on control domain side.
> SIDs in guests device tree have to be replaced with
> virtual SIDs which are mapped to physical SIDs. In order
> to do that, control domain has to request from Xen to create
> a new vSID and map it to original pSID for every guest device IOMMU
> stream ID. For this purpose, new domctl command 
> (XEN_DOMCTL_viommu_allocate_vid)
> is introduced which control domain can use to request a new vSID mapping and
> insert a new vSID into guest device tree once mapped.
> Requested vSID allocation using this interface for vPCI/DT devices.
> 
> Signed-off-by: Milan Djokic <milan_djokic@xxxxxxxx>
> ---
> tools/include/xenctrl.h             |  12 +++
> tools/libs/ctrl/xc_domain.c         |  23 +++++
> tools/libs/light/libxl_arm.c        | 127 +++++++++++++++++++++++++---
> xen/arch/arm/domctl.c               |  34 ++++++++
> xen/include/public/domctl.h         |  20 +++++
> xen/xsm/flask/hooks.c               |   4 +
> xen/xsm/flask/policy/access_vectors |   2 +
> 7 files changed, 212 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
> index d5dbf69c89..61be892cc8 100644
> --- a/tools/include/xenctrl.h
> +++ b/tools/include/xenctrl.h
> @@ -2659,6 +2659,18 @@ int xc_domain_set_llc_colors(xc_interface *xch, 
> uint32_t domid,
>                              const uint32_t *llc_colors,
>                              uint32_t num_llc_colors);
> 
> +/*
> + * Allocate guest IOMMU vSID and establish its mapping to pSID.
> + * It can only be used on domain DT creation.
> + * Currently used for ARM only, possibly for RISC-V in the
> + * future. Function has no effect for x86.
> + */
> +int xc_domain_viommu_allocate_vsid_range(xc_interface *xch,
> +                                         uint32_t domid,
> +                                         uint16_t nr_sids,
> +                                         uint32_t first_psid,
> +                                         uint32_t *first_vsid);
> +
> #if defined(__arm__) || defined(__aarch64__)
> int xc_dt_overlay(xc_interface *xch, void *overlay_fdt,
>                   uint32_t overlay_fdt_size, uint8_t overlay_op);
> diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
> index 01c0669c88..39ffe80e6d 100644
> --- a/tools/libs/ctrl/xc_domain.c
> +++ b/tools/libs/ctrl/xc_domain.c
> @@ -2222,6 +2222,29 @@ out:
> 
>     return ret;
> }
> +
> +int xc_domain_viommu_allocate_vsid_range(xc_interface *xch,
> +                                         uint32_t domid,
> +                                         uint16_t nr_sids,
> +                                         uint32_t first_psid,
> +                                         uint32_t *first_vsid)
> +{
> +    int err;
> +    struct xen_domctl domctl = {};
> +
> +    domctl.cmd = XEN_DOMCTL_viommu_alloc_vsid_range;
> +    domctl.domain = domid;
> +    domctl.u.viommu_alloc_vsid_range.first_psid = first_psid;
> +    domctl.u.viommu_alloc_vsid_range.nr_sids = nr_sids;
> +
> +    if ( (err = do_domctl(xch, &domctl)) != 0 )
> +        return err;
> +
> +    *first_vsid = domctl.u.viommu_alloc_vsid_range.first_vsid;
> +
> +    return 0;
> +}
> +
> /*
>  * Local variables:
>  * mode: C
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index 7b887898bb..79904b746c 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -955,6 +955,13 @@ static int make_vsmmuv3_node(libxl__gc *gc, void *fdt,
>     return 0;
> }
> 
> +/*
> + * Stores starting vSID of vPCI IOMMU SID range
> + * Used as a lookup value to avoid repeated
> + * vSID range allocation on every fdt resize.
> + */
> +static int vpci_first_vsid = -1;

What happen between one guest creation and another since this and 
viommu_stream_list
are global?

> +
> static int make_vpci_node(libxl__gc *gc, void *fdt,
>                           const struct arch_info *ainfo,
>                           struct xc_dom_image *dom)
> @@ -963,6 +970,9 @@ static int make_vpci_node(libxl__gc *gc, void *fdt,
>     const uint64_t vpci_ecam_base = GUEST_VPCI_ECAM_BASE;
>     const uint64_t vpci_ecam_size = GUEST_VPCI_ECAM_SIZE;
>     const char *name = GCSPRINTF("pcie@%"PRIx64, vpci_ecam_base);
> +    uint16_t iommu_range_size = 0x1000;
> +    uint32_t first_vsid;
> +    uint32_t first_psid = 0;
> 
>     res = fdt_begin_node(fdt, name);
>     if (res) return res;
> @@ -996,8 +1006,20 @@ static int make_vpci_node(libxl__gc *gc, void *fdt,
>         GUEST_VPCI_PREFETCH_MEM_SIZE);
>     if (res) return res;
> 
> +    /* request vSID range allocation if not already allocated */
> +    if (vpci_first_vsid < 0) {
> +        res = xc_domain_viommu_allocate_vsid_range(CTX->xch, 
> dom->guest_domid,
> +            iommu_range_size, first_psid, &first_vsid);
> +        if (res)
> +            return res;
> +        vpci_first_vsid = first_vsid;
> +    }
> +    else {
> +        first_vsid = vpci_first_vsid;
> +    }
> +
>     res = fdt_property_values(gc, fdt, "iommu-map", 4, 0,
> -                              GUEST_PHANDLE_VSMMUV3, 0, 0x10000);
> +                             GUEST_PHANDLE_VSMMUV3, first_vsid, 
> iommu_range_size);
>     if (res) return res;
> 
>     res = fdt_end_node(fdt);
> @@ -1326,11 +1348,92 @@ static int copy_partial_fdt(libxl__gc *gc, void *fdt, 
> void *pfdt)
>     return 0;
> }
> 
> -static int modify_partial_fdt(libxl__gc *gc, void *pfdt)
> +/*
> + * Store virtualized 'iommus' properties for every node attached to IOMMU
> + * and passthroughed to guest.
> + * Used as a lookup table for mapping <phandle pSID> -> <vhandle vSID>
> + */
> +struct viommu_stream {
> +    XEN_LIST_ENTRY(struct viommu_stream) entry;
> +    char path[128];          /* DT path, stable across resizes */
> +    fdt32_t *iommus;         /* fully virtualized iommus property */
> +};
> +
> +static XEN_LIST_HEAD(, struct viommu_stream) viommu_stream_list;
> +
> +/*
> + * Helper function which creates mapping of dt node to
> + * to virtualized 'iommus' property
> + * Mappings stored in a global 'viommu_stream_list' to
> + * make it reusable for every fdt resize
> + */
> +static int viommu_get_stream(libxl__gc *gc,
> +                             uint32_t domid,
> +                             const fdt32_t *prop,
> +                             int proplen,
> +                             const char* path, fdt32_t **iommus)
> +{
> +    int i, r;
> +    uint32_t vsid, psid;
> +    struct viommu_stream *viommu_stream;
> +
> +    /* Lookup if stream for target device is already allocated */
> +    XEN_LIST_FOREACH(viommu_stream, &viommu_stream_list, entry)
> +    {
> +        if (!strcmp(viommu_stream->path, path)) {
> +            *iommus = viommu_stream->iommus;
> +            return 0;
> +        }
> +    }
> +
> +    /* Allocate new viommu stream */
> +    viommu_stream = malloc(sizeof(struct viommu_stream));
> +    if (!viommu_stream)
> +        return ERROR_NOMEM;
> +    memset(viommu_stream, 0, sizeof(struct viommu_stream));
> +    viommu_stream->iommus = malloc(proplen);
> +    if (!viommu_stream->iommus)
> +        return ERROR_NOMEM;

we should free viommu_stream here

> +    memset(viommu_stream->iommus, 0, proplen);
> +
> +    LOG(DEBUG, "Creating vIOMMU stream for device %s",
> +        path);
> +
> +    /*
> +     * Virtualize device "iommus" property
> +     * (replace pIOMMU with vIOMMU phandle and pSIDs with mapped vSIDs)
> +     */
> +    for (i = 0; i < proplen / 8; ++i) {
> +        viommu_stream->iommus[i * 2] = cpu_to_fdt32(GUEST_PHANDLE_VSMMUV3);
> +        /* Allocate new vSID mapped to pSID */
> +        psid = fdt32_to_cpu(prop[i * 2 + 1]);
> +        r = xc_domain_viommu_allocate_vsid_range(CTX->xch, domid, 1, psid, 
> &vsid);
> +        if (r) {
> +            LOG(ERROR, "Can't allocate new vSID/vRID for guest IOMMU 
> device");
> +            return r;
> +        }
> +        viommu_stream->iommus[i * 2 + 1] = cpu_to_fdt32(vsid);
> +        LOG(DEBUG, "Mapped vSID: %u to pSID: %u", vsid, psid);
> +    }
> +
> +    strcpy(viommu_stream->path, path);
> +    *iommus =  viommu_stream->iommus;
> +
> +    XEN_LIST_INSERT_HEAD(&viommu_stream_list, viommu_stream, entry);
> +
> +    return 0;
> +}
> +
> +/*
> + * Used to update partial fdt when vIOMMU is enabled
> + * Maps dt properties of IOMMU devices to virtual IOMMU
> + */
> +static int viommu_modify_partial_fdt(libxl__gc *gc, void *pfdt, uint32_t 
> domid)
> {
> -    int nodeoff, proplen, i, r;
> +    int nodeoff, proplen, r;
>     const fdt32_t *prop;
>     fdt32_t *prop_c;
> +    char path[128];
> 
>     nodeoff = fdt_path_offset(pfdt, "/passthrough");
>     if (nodeoff < 0)
> @@ -1344,11 +1447,16 @@ static int modify_partial_fdt(libxl__gc *gc, void 
> *pfdt)
>         if (!prop)
>             continue;
> 
> -        prop_c = libxl__zalloc(gc, proplen);
> +        r = fdt_get_path(pfdt, nodeoff, path, sizeof(path));
> +        if ( r < 0 ) {
> +            LOG(ERROR, "Can't get passthrough node path");
> +            return r;
> +        }
> 
> -        for (i = 0; i < proplen / 8; ++i) {
> -            prop_c[i * 2] = cpu_to_fdt32(GUEST_PHANDLE_VSMMUV3);
> -            prop_c[i * 2 + 1] = prop[i * 2 + 1];
> +        r = viommu_get_stream(gc, domid, prop, proplen, path, &prop_c);
> +        if (r) {
> +            LOG(ERROR, "Can't get viommu stream");
> +            return r;
>         }
> 
>         r = fdt_setprop(pfdt, nodeoff, "iommus", prop_c, proplen);
> @@ -1360,7 +1468,6 @@ static int modify_partial_fdt(libxl__gc *gc, void *pfdt)
> 
>     return 0;
> }
> -
> #else
> 
> static int check_partial_fdt(libxl__gc *gc, void *fdt, size_t size)
> @@ -1379,7 +1486,7 @@ static int copy_partial_fdt(libxl__gc *gc, void *fdt, 
> void *pfdt)
>     return -FDT_ERR_INTERNAL;
> }
> 
> -static int modify_partial_fdt(libxl__gc *gc, void *pfdt)
> +static int viommu_modify_partial_fdt(libxl__gc *gc, void *pfdt, uint32_t 
> domid)
> {
>     LOG(ERROR, "partial device tree not supported");
> 
> @@ -1511,7 +1618,7 @@ next_resize:
>         if (info->arch_arm.viommu_type == LIBXL_VIOMMU_TYPE_SMMUV3) {
>             FDT( make_vsmmuv3_node(gc, fdt, ainfo, dom) );
>             if (pfdt)
> -                FDT( modify_partial_fdt(gc, pfdt) );
> +                FDT( viommu_modify_partial_fdt(gc, pfdt, dom->guest_domid) );
>         }
> 
>         for (i = 0; i < d_config->num_disks; i++) {
> diff --git a/xen/arch/arm/domctl.c b/xen/arch/arm/domctl.c
> index ad914c915f..c85853e4cb 100644
> --- a/xen/arch/arm/domctl.c
> +++ b/xen/arch/arm/domctl.c
> @@ -16,6 +16,7 @@
> #include <xen/types.h>
> #include <xsm/xsm.h>
> #include <public/domctl.h>
> +#include <asm/viommu.h>
> 
> void arch_get_domain_info(const struct domain *d,
>                           struct xen_domctl_getdomaininfo *info)
> @@ -179,6 +180,39 @@ long arch_do_domctl(struct xen_domctl *domctl, struct 
> domain *d,
>     }
>     case XEN_DOMCTL_dt_overlay:
>         return dt_overlay_domctl(d, &domctl->u.dt_overlay);
> +
> +#ifdef CONFIG_ARM_VIRTUAL_IOMMU
> +    case XEN_DOMCTL_viommu_alloc_vsid_range:
> +    {
> +        int rc = 0;
> +        uint16_t i;
> +        uint32_t vsid;
> +        struct xen_domctl_viommu_alloc_vsid_range *viommu_alloc_vsid_range =
> +            &domctl->u.viommu_alloc_vsid_range;
> +
> +        if ( viommu_alloc_vsid_range->pad )
> +            return -EINVAL;
> +
> +        for ( i = 0; i < viommu_alloc_vsid_range->nr_sids; i++ )
> +        {
> +            rc = viommu_allocate_free_vid(d, 
> viommu_alloc_vsid_range->first_psid
> +                                            + i, &vsid);

Since we are using like this, wouldn’t have been better to have an 
“viommu_allocate_free_vid_range()”?

> +            if( rc )
> +                return rc;
> +        }
> +
> +        if ( !rc )
> +        {
> +            /* Calculate first vSID from allocated range */
> +            viommu_alloc_vsid_range->first_vsid = vsid -
> +                viommu_alloc_vsid_range->nr_sids + 1;
> +            rc = copy_to_guest(u_domctl, domctl, 1);

we should use something like:
rc = copy_to_guest(u_domctl, domctl, 1);
if ( rc )
    rc = -EFAULT;

Cheers,
Luca


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.