[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 12/16] xen: implement new foreign copy hypercall


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Frediano Ziglio <freddy77@xxxxxxxxx>
  • Date: Mon, 15 Jun 2026 16:07:32 +0100
  • Arc-authentication-results: i=1; mx.google.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=CndTcKfCH82aUKj/f1b9neCoIs2mCiGx0N/nrnCEwEQ=; fh=OtBKR5Ep8z1KYzMIrXbXTC77nuY2r0q5Jx/RIZj9FDQ=; b=ZCrpbivbct07BhXxqxBuZ6aL7OojHpl/lpuuobTDcgALfMgL9vfP5V2O0dfQWYDW6n P6H3xAXSI0WiExOPLsEI9PTgmh21kEZzrfCOfifimnblWz562C7HpsQTDJv8iv71tFZf 3s2ruYuXK1DMaJc8H9kLzj4CHkRhi/1axFCmC8HPZXtGhPiIcAOqqxKpkCQE+Vx6wH0j nhj2B0vMa5a9oNpzrO412o9lyJtNdrTnUaIn067zGnBpjGuqWZ9w16EIbbGcpjEMPY6/ L0y4V4VDsFvnE+r4wXc1abdDZMxdr6PfRXu4LD+SKmj/UQvaoMMzh6q7anN7EQikRB/a 4YBA==; darn=lists.xenproject.org
  • Arc-seal: i=1; a=rsa-sha256; t=1781536064; cv=none; d=google.com; s=arc-20240605; b=eSMqIUotVPNeanjukE2NmekFRtcepv24JkR0abbMAPiRGIsWrN4dW/GwF2BUstbOtY eUbqYepCymSN0ltL6MyARureynVUNm5USEbM/dusIYyQIYsuzIacglox4QBO/O37Qqqd IqKI2t4R50ZlorwtWWK9GcUZLbzjAYPeo3oYECz7VlkgAKbWvu/9YXm2Tm0TZeADsKsg lbMewea42ojjFGVZiH+bg/GjSRZjjqacUTK6U75gYMBQVmUHsqctsGrGRJJnr+64Mk2F Zk9UpxHiz3bRJsfh+c8/V3a32bu+8cnI52TmlnyD24XI0kJ2kNLO/YkdezRcznmtNLtR WIEw==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=20251104 header.d=gmail.com header.i="@gmail.com" header.h="Cc:To:Subject:Message-ID:Date:From:In-Reply-To:References:MIME-Version"
  • Cc: Frediano Ziglio <frediano.ziglio@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Teddy Astie <teddy.astie@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Daniel Smith <dpsmith@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 15 Jun 2026 15:07:49 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, 15 Jun 2026 at 15:03, Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 15.06.2026 14:11, Frediano Ziglio wrote:
> > On Mon, 15 Jun 2026 at 08:41, Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >> On 13.06.2026 23:47, Frediano Ziglio wrote:
> >>> --- a/xen/common/memory.c
> >>> +++ b/xen/common/memory.c
> >>> @@ -1545,6 +1545,133 @@ static int acquire_resource(
> >>>      return rc;
> >>>  }
> >>>
> >>> +/*
> >>> + * The "noinline" qualifier avoids the compiler to create a large 
> >>> function
> >>> + * consuming quite a lot of stack.
> >>> + */
> >>> +static int noinline mem_foreigncopy(
> >>> +    XEN_GUEST_HANDLE_PARAM(xen_foreigncopy_t) arg)
> >>> +{
> >>> +    struct domain *d, *const currd = current->domain;
> >>> +    xen_foreigncopy_t copy;
> >>> +    int rc, direction;
> >>> +
> >>> +    if ( copy_from_guest(&copy, arg, 1) )
> >>> +        return -EFAULT;
> >>> +
> >>> +    if ( copy.flags & ~1U )
> >>> +        return -EINVAL;
> >>> +
> >>> +    direction = copy.flags & XENMEM_foreigncopy_direction;
> >>> +
> >>> +    if ( copy.nr_frames == 0 )
> >>> +        return 0;
> >>> +
> >>> +    rc = rcu_lock_remote_domain_by_id(copy.domid, &d);
> >>> +    if ( rc )
> >>> +        return rc;
> >>
> >> Extending my v4 remark: How about a "fully foreign" copy? I.e. one between 
> >> two
> >> pages in two foreign domains? Getting me back also to the un-answered v4
> >> question of mine as to MMUEXT_COPY_PAGE.
> >
> > I really thought I replied to this but I cannot find it.
> > MMUEXT_COPY_PAGE was the first attempt to do this but mmuext is
> > designed for PV and extending was made the code confusing.
>
> We discussed this on the x86 call, and yes - the major op being PV-only is
> getting in the way here.
>

Added a comment in the commit message:

    Extending MMUEXT_COPY_PAGE seems better on first sight but considering
    that MMUEXT is meant for PV only and trying to change that sub-op this
    solution is better.

> >> Further, as to the order of checks: I'm not going to insist on 
> >> re-ordering, yet
> >> I'd like to point out that copying 0 pages to/from a bad domid might better
> >> yield an error.
> >
> > Not strong about this, changed to return -EINVAL. Reordering after
> > this change won't make much sense, -EINVAL is returned both for wrong
> > flags or no frames.
>
> Please don't - copying 0 frames with all other arguments correct is not an
> error.
>

This comment seems to contradict  the not insisting  on re-ordering.
Changed the order so to check the domid before nr_frames == 0 :

    rc = rcu_lock_remote_domain_by_id(copy.domid, &d);
    if ( rc )
        return rc;

    if ( copy.nr_frames == 0 )
    {
        rcu_unlock_domain(d);
        return 0;
    }

> >>> +    do {
> >>> +        /*
> >>> +         * Arbitrary size.  Not too much stack space, and a reasonable 
> >>> stride
> >>> +         * for continuation checks.
> >>> +         */
> >>> +        xen_pfn_t gfn_list[32];
> >>> +        unsigned int todo = MIN(ARRAY_SIZE(gfn_list), copy.nr_frames);
> >>> +
> >>> +        rc = -EFAULT;
> >>> +        if ( copy_from_guest(gfn_list, copy.frame_list, todo) )
> >>> +            goto out;
> >>> +
> >>> +        for ( unsigned i = 0; i < todo; i++ )
> >>
> >> Nit: "unsigned int" please (like you have it above).
> >
> > Changed. Note that "long" is also not a type but a modifier. Would it
> > be good to change both the above to "unsigned" instead for consistency
> > with "long" ?
>
> No, the way we spell types has historical reasons and isn't tied to the
> exact lexical meaning of the keywords.
>
> >>> +        {
> >>> +            struct page_info *foreign_page;
> >>> +            void *foreign;
> >>> +            p2m_type_t p2mt;
> >>> +            const unsigned long valid_mask =
> >>> +#ifdef CONFIG_X86
> >>> +                p2m_to_mask(p2m_ram_rw) | p2m_to_mask(p2m_ram_logdirty);
> >>> +#else
> >>> +                p2m_to_mask(p2m_ram_rw);
> >>> +#endif
> >>
> >> What about, for example, p2m_ram_ro? Or p2m_ram_shared? Or p2m_grant_map_*?
> >> Etc. Any artificial constraining wants justifying in the description and/or
> >> mentioning in the public header.
> >
> > The base of this was taken from migration code where there is such a check.
> > I suppose that adding p2m_ram_ro (where available) won't hurt.
>
> Just to mention, to avoid another round trip just because of this: p2m_ram_ro
> has different meaning on x86 vs Arm/RISC-V.
>

That's confusing... should not this be fixed somehow?
It won't save much from a round-trip. Should I allow it or not ?

> > p2m_ram_shared I'm not sure but seems fine too.
> > For p2m_grant_map_* it feels a bit a security issue to me. It would
> > allow a guest to give access to pages of other domains. It's true that
> > the current domain would have to have write access to this domain
> > anyway but extend these permissions sounds something it should not be
> > able to do.
>
> It could copy the contents of the grant mapped page by other means. Why not
> allow it in this new sub-op as well then?
>

I'm more afraid of writing the content of the grant pages than copying it.

> Talking of security: When the page you copy to is owned by a PV guest, I
> think you further need to obtain a PGT_writable type ref. (Of course it then
> likely is easier to always do this, not just for PV.)
>

Wondering how save/restore (or migration) works in this case.

> >>> @@ -2012,6 +2139,13 @@ long do_memory_op(unsigned long cmd, 
> >>> XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>              start_extent);
> >>>          break;
> >>>
> >>> +    case XENMEM_foreigncopy:
> >>> +        if ( unlikely(start_extent) )
> >>> +            return -EINVAL;
> >>
> >> Please address review comments (verbally or by code changes) before 
> >> submitting
> >> a new version. Here I had asked "Why make this different from other 
> >> continuable
> >> sub-ops?"
> >>
> >
> > There's already a comment in the same file for similar reason
> >
> >     /*
> >      * Limiting nr_frames at (UINT_MAX >> MEMOP_EXTENT_SHIFT) isn't ideal.  
> > If
> >      * it ever becomes a practical problem, we can switch to mutating
> >      * xmar.{frame,nr_frames,frame_list} in guest memory.
> >      */
> >
> > so to avoid the doubt and possible future change I mutate the structure.
> > Also I use the mutation to give more information to the caller, using
> > "start_entent" won't allow this.
>
> You'll want to mention this in the description and/or a code comment. It
> wants to become clear that the inconsistency in behavior (with other sub-
> ops) is deliberate rather than accidental.
>

Added a comment in the code:

        if ( copy.nr_frames && hypercall_preempt_check() )
        {
            /*
             * Instead of using "start_extent" we update the structure back,
             * we update it back in anyway to tell caller were the copy
             * stopped.
             */
            rc = hypercall_create_continuation(
                __HYPERVISOR_memory_op, "lh", XENMEM_foreigncopy, arg);
            goto out;
        }

> >>> --- a/xen/include/public/memory.h
> >>> +++ b/xen/include/public/memory.h
> >>> @@ -740,7 +740,45 @@ struct xen_vnuma_topology_info {
> >>>  typedef struct xen_vnuma_topology_info xen_vnuma_topology_info_t;
> >>>  DEFINE_XEN_GUEST_HANDLE(xen_vnuma_topology_info_t);
> >>>
> >>> -/* Next available subop number is 29 */
> >>> +/*
> >>> + * Copy memory from/to a given domain.
> >>> + */
> >>> +#define XENMEM_foreigncopy 29
> >>> +struct xen_foreigncopy {
> >>> +    /* IN - The domain whose resource is to be copied. */
> >>
> >> There's still "resource" here, when this really is about RAM (memory) only,
> >> not any other kind of resource.
> >>
> >>> +    domid_t domid;
> >>> +
> >>> +    /* IN - Flags. */
> >>> +#define XENMEM_foreigncopy_from 0
> >>> +#define XENMEM_foreigncopy_to 1
> >>> +#define XENMEM_foreigncopy_direction 1
> >>> +    uint16_t flags;
> >>> +
> >>> +    /*
> >>> +     * IN
> >>> +     *
> >>> +     * As an IN parameter number of frames of the domain to be copied.
> >>> +     */
> >>> +    uint32_t nr_frames;
> >>
> >> This isn't just an input, as you update the field (and the handles below).
> >> This property of fields wants reflecting here, so callers know that they 
> >> (a)
> >> can't re-use the struct on a subsequent call without re-initializing the
> >> fields which may have changed, and (b) can't put the struct in r/o memory.
> >>
> >
> > Update comments:
> >
> > /*
> >  * Copy memory from/to a given domain.
> >  */
> > #define XENMEM_foreigncopy 29
> > struct xen_foreigncopy {
> >     /* IN - The domain whose memory is to be copied. */
> >     domid_t domid;
> >
> >     /* IN - Flags. */
> > #define XENMEM_foreigncopy_from 0
> > #define XENMEM_foreigncopy_to 1
> > #define XENMEM_foreigncopy_direction 1
> >     uint16_t flags;
> >
> >     /*
> >      * IN/OUT
> >      *
> >      * As an IN parameter number of frames of the domain to be copied.
> >      * On output on error updated number of frames left.
> >      */
> >     uint32_t nr_frames;
>
> This is updated not only on error, but also when encoding continuations.
>

Yes, but this seems more an implementation detail to me. I don't think
the caller cares about how the continuation is implemented.

> >>> +    /*
> >>> +     * IN
> >>> +     *
> >>> +     * Frames to be copied.
> >>> +     */
> >>> +    XEN_GUEST_HANDLE(xen_pfn_t) frame_list;
> >>> +
> >>> +    /*
> >>> +     * IN/OUT
> >>> +     *
> >>> +     * Userspace buffer to read/write from.
> >>> +     */
> >>> +    XEN_GUEST_HANDLE(uint8) buffer;
> >>
> >> With these two handles, there continues to be a need to (explicitly) deal
> >> with the compat case as well.
> >
> > I don't agree with this. Domains having access to other domains are
> > limited (like stub domains for Qemu) and won't be 32 bits today so why
> > allow 32 bits guests if not ever used?
>
> How do you know? Why shouldn't e.g. XTF be permitted to test this in all
> possible modes? And even if all arguments end up in favor of "no compat
> support", this then wants spelling out to make clear this wasn't an
> oversight, but rather a conscious decision.
>

XTF can do something like

#if COMPAT_GUEST
    /* Compat guests are not supported, return success. */
    return 0;
# endif

(or can be done in the Makefiles I suppose).

Added a comment in memory.h:

    /*
     * Copy memory from/to a given domain.
     * As this call requires target access and guest with target access won't be
     * compat guests supported for compat guests this is not implemented.
     */
    #define XENMEM_foreigncopy 29
    struct xen_foreigncopy {

> Jan

Frediano



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.