
[Xen-devel] [PATCH v3 07/15] argo: implement the register op



The register op is used by a domain to register a region of memory for
receiving messages from either a specified partner domain or, if a
wildcard is specified, from any domain.

This operation creates a mapping within Xen's private address space that
will remain resident for the lifetime of the ring. In subsequent commits,
the hypervisor will use this mapping to copy data from a sending domain into
this registered ring, making it accessible to the domain that registered the
ring to receive data.

Wildcard any-sender rings are disabled by default and registration will be
refused with EPERM unless they have been specifically enabled with the
argo-mac boot option introduced here. The default for wildcard rings is
'deny' because there is currently no means to protect a ring from DoS by a
noisy domain spamming it, which would affect other domains' ability to send
to it. This will be addressed with XSM policy controls in subsequent work.

Since denying access to any-sender rings is a significant functional
constraint, a new boot parameter is provided to override this: the
"argo-mac" option accepts the values 'permissive' and 'enforcing'.
Even though this is effectively a boolean setting, these descriptive
strings are used to make it obvious to an administrator that the choice
has potential security impact.
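For example, a host command line enabling Argo with wildcard any-sender
rings permitted might read as below (the `argo` boolean option comes from
an earlier patch in this series; `argo-mac` is introduced here):

```
argo=1 argo-mac=permissive
```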

The p2m type of the memory supplied by the guest for the ring must be
p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
is registered.

xen_argo_page_descr_t is introduced as a page descriptor type, conveying
both the physical address of the start of a page and its granularity. The
smallest supported page granularity is 4096 bytes, and the lower twelve
bits of the descriptor indicate the size of the page supplied. The
implementation of the hypercall op currently only supports 4K pages.

array_index_nospec is used to guard the result of the ring id hash
function. This is out of an abundance of caution: it is a very basic hash
function, and it operates on guest-supplied values immediately before they
are used as an array index.

Signed-off-by: Christopher Clark <christopher.clark6@xxxxxxxxxxxxxx>
---
v2 self: disallow ring resize via reregister
v2 feedback Jan: drop cookie, implement teardown
v2 feedback Jan: drop message from argo_message_op
v2 self: move hash_index function below locking comment
v2 self: OVERHAUL
v2 self/Jan: remove use of magic verification field and tidy up
v2 self: merge max and min ring size check clauses
v2 feedback v1#13 Roger: use OS-supplied roundup; drop from public header
v2 feedback #9, Jan: use the argo-mac bootparam at point of introduction
v2 feedback #9, Jan: rename boot opt variable to comply with convention
v2 feedback #9, Jan: rename the argo_mac bootparam to argo-mac
v2 feedback #9 Jan: document argo boot opt in xen-command-line.markdown
v1,2 feedback Jan/Roger/Paul: drop errno returning guest access functions
v1 feedback Roger, Jan: drop argo prefix on static functions
v1 feedback Roger: s/pfn/gfn/ and retire always-64-bit type
v2. feedback Jan: document the argo-mac boot opt
v2. feedback Jan: simplify re-register, drop mappings
v1 #13 feedback Jan: revise use of guest_handle_okay vs __copy ops

v1 #13 feedback, Jan: register op : s/ECONNREFUSED/ESRCH/
v1 #5 (#13) feedback Paul: register op: use currd in do_message_op
v1 #13 feedback, Paul: register op: use mfn_eq comparator
v1 #5 (#13) feedback Paul: register op: use currd in argo_register_ring
v1 #13 feedback Paul: register op: whitespace, unsigned, bounds check
v1 #13 feedback Paul: use of hex in limit constant definition
v1 #13 feedback Paul, register op: set nmfns on loop termination
v1 #13 feedback Paul: register op: do/while -> gotos, reindent
v1 argo_ring_map_page: drop uint32_t for unsigned int
v1. #13 feedback Julien: use page descriptors instead of gpfns.
   - adds ABI support for pages with different granularity.
v1 feedback #13, Paul: adjust log level of message
v1 feedback #13, Paul: use gprintk for guest-triggered warning
v1 feedback #13, Paul: gprintk and XENLOG_DEBUG for ring registration
v1 feedback #13, Paul: use gprintk for errs in argo_ring_map_page
v1 feedback #13, Paul: use ENOMEM if global mapping fails
v1 feedback Paul: overflow check before shift
v1: add define for copy_field_to_guest_errno
v1: fix gprintk use for ARM as its defn dislikes split format strings
v1: use copy_field_to_guest_errno
v1 feedback #13, Jan: argo_hash_fn: no inline, rename, change type
v1 feedback #13, Paul, Jan: EFAULT -> ENOMEM in argo_ring_map_page
v1 feedback #13, Jan: rename page var in argo_ring_map_page
v1 feedback #13, Jan: switch uint8_t* to void* and drop cast
v1 feedback #13, Jan: switch memory barrier to smp_wmb
v1 feedback #13, Jan: make 'ring' comment comply with single-line style
v1 feedback #13, Jan: use xzalloc_array, drop loop NULL init
v1 feedback #13, Jan: init bool with false rather than 0
v1 feedback #13 Jan: use __copy; define and use __copy_field_to_guest_errno
v1 feedback #13, Jan: use xzalloc, drop individual init zeroes
v1 feedback #13, Jan: prefix public namespace with xen
v1 feedback #13, Jan: blank line after op case in do_argo_message_op
v1 self: reflow comment in argo_ring_map_page to within 80 char len
v1 feedback #13, Roger: use true not 1 in assign to update_tx_ptr bool
v1 feedback #21, Jan: fold in the array_index_nospec hash function guards
v1 feedback #18, Jan: fold the max ring count limit into the series
v1 self: use unsigned long type for XEN_ARGO_REGISTER_FLAG_MASK
v1: feedback #15 Jan: handle upper-halves of hypercall args
v1. feedback #13 Jan: add comment re: page alignment
v1. self: confirm ring magic presence in supplied page array
v1. feedback #13 Jan: add comment re: minimum ring size
v1. feedback #13 Roger: use ASSERT_UNREACHABLE
v1. feedback Roger: add comment to hash function

 docs/misc/xen-command-line.pandoc  |  15 +
 xen/common/argo.c                  | 566 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/guest_access.h |   2 +
 xen/include/asm-x86/guest_access.h |   2 +
 xen/include/public/argo.h          |  72 +++++
 xen/include/xlat.lst               |   1 +
 6 files changed, 658 insertions(+)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index aea13eb..68d4415 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -193,6 +193,21 @@ This allows domains access to the Argo hypercall, which supports registration
 of memory rings with the hypervisor to receive messages, sending messages to
 other domains by hypercall and querying the ring status of other domains.
 
+### argo-mac
+> `= permissive | enforcing`
+
+> Default: `enforcing`
+
+Constrain the access control applied to the Argo communication mechanism.
+
+When `enforcing`, domains may not register rings that have wildcard specified
+for the sender which would allow messages to be sent to the ring by any domain.
+This is to protect rings and the services that utilize them against DoS by a
+malicious or buggy domain spamming the ring.
+
+When the boot option is set to `permissive`, this constraint is relaxed and
+wildcard any-sender rings are allowed to be registered.
+
 ### asid (x86)
 > `= <boolean>`
 
diff --git a/xen/common/argo.c b/xen/common/argo.c
index 86195d3..11988e7 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -23,16 +23,41 @@
 #include <xen/event.h>
 #include <xen/domain_page.h>
 #include <xen/guest_access.h>
+#include <xen/lib.h>
+#include <xen/nospec.h>
 #include <xen/time.h>
 #include <public/argo.h>
 
+#define MAX_RINGS_PER_DOMAIN            128U
+
+/* All messages on the ring are padded to a multiple of the slot size. */
+#define ROUNDUP_MESSAGE(a) (ROUNDUP((a), XEN_ARGO_MSG_SLOT_SIZE))
+
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_page_descr_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_register_ring_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
 
 /* Xen command line option to enable argo */
 static bool __read_mostly opt_argo_enabled;
 boolean_param("argo", opt_argo_enabled);
 
+/* Xen command line option for conservative or relaxed access control */
+bool __read_mostly opt_argo_mac_enforcing = true;
+
+static int __init parse_opt_argo_mac(const char *s)
+{
+    if ( !strcmp(s, "enforcing") )
+        opt_argo_mac_enforcing = true;
+    else if ( !strcmp(s, "permissive") )
+        opt_argo_mac_enforcing = false;
+    else
+        return -EINVAL;
+
+    return 0;
+}
+custom_param("argo-mac", parse_opt_argo_mac);
+
 typedef struct argo_ring_id
 {
     uint32_t port;
@@ -198,6 +223,31 @@ static DEFINE_RWLOCK(argo_lock); /* L1 */
 #define argo_dprintk(format, ... ) ((void)0)
 #endif
 
+/*
+ * This hash function is used to distribute rings within the per-domain
+ * hash tables (d->argo->ring_hash and d->argo->send_hash). The hash table
+ * will provide a struct if a match is found with an 'argo_ring_id' key:
+ * i.e. the key is a (domain id, port, partner domain id) tuple.
+ * Since port number varies the most in expected use, and the Linux driver
+ * allocates at both the high and low ends, incorporate high and low bits to
+ * help with distribution.
+ * Apply array_index_nospec as a defensive measure since this operates
+ * on user-supplied input and the array size that it indexes into is known.
+ */
+static unsigned int
+hash_index(const struct argo_ring_id *id)
+{
+    unsigned int hash;
+
+    hash = (uint16_t)(id->port >> 16);
+    hash ^= (uint16_t)id->port;
+    hash ^= id->domain_id;
+    hash ^= id->partner_id;
+    hash &= (ARGO_HTABLE_SIZE - 1);
+
+    return array_index_nospec(hash, ARGO_HTABLE_SIZE);
+}
+
 static void
 ring_unmap(struct argo_ring_info *ring_info)
 {
@@ -219,6 +269,78 @@ ring_unmap(struct argo_ring_info *ring_info)
     }
 }
 
+static int
+ring_map_page(struct argo_ring_info *ring_info, unsigned int i, void **out_ptr)
+{
+    if ( i >= ring_info->nmfns )
+    {
+        gprintk(XENLOG_ERR,
+               "argo: ring (vm%u:%x vm%d) %p attempted to map page %u of %u\n",
+                ring_info->id.domain_id, ring_info->id.port,
+                ring_info->id.partner_id, ring_info, i, ring_info->nmfns);
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfns || !ring_info->mfn_mapping )
+    {
+        ASSERT_UNREACHABLE();
+        ring_info->len = 0;
+        return -ENOMEM;
+    }
+
+    if ( !ring_info->mfn_mapping[i] )
+    {
+        /*
+         * TODO:
+         * The first page of the ring contains the ring indices, so both read
+         * and write access to the page is required by the hypervisor, but
+         * read-access is not needed for this mapping for the remainder of the
+         * ring.
+         * Since this mapping will remain resident in Xen's address space for
+         * the lifetime of the ring, and following the principle of least
+         * privilege, it could be preferable to:
+         *  # add a XSM check to determine what policy is wanted here
+         *  # depending on the XSM query, optionally create this mapping as
+         *    _write-only_ on platforms that can support it.
+         *    (eg. Intel EPT/AMD NPT).
+         */
+        ring_info->mfn_mapping[i] = map_domain_page_global(ring_info->mfns[i]);
+
+        if ( !ring_info->mfn_mapping[i] )
+        {
+            gprintk(XENLOG_ERR,
+                "argo: ring (vm%u:%x vm%d) %p attempted to map page %u of %u\n",
+                    ring_info->id.domain_id, ring_info->id.port,
+                    ring_info->id.partner_id, ring_info, i, ring_info->nmfns);
+            return -ENOMEM;
+        }
+        argo_dprintk("mapping page %"PRI_mfn" to %p\n",
+                     mfn_x(ring_info->mfns[i]), ring_info->mfn_mapping[i]);
+    }
+
+    if ( out_ptr )
+        *out_ptr = ring_info->mfn_mapping[i];
+
+    return 0;
+}
+
+static void
+update_tx_ptr(struct argo_ring_info *ring_info, uint32_t tx_ptr)
+{
+    void *dst;
+    uint32_t *p;
+
+    ASSERT(ring_info->mfn_mapping[0]);
+
+    ring_info->tx_ptr = tx_ptr;
+
+    dst = ring_info->mfn_mapping[0];
+    p = dst + offsetof(xen_argo_ring_t, tx_ptr);
+
+    write_atomic(p, tx_ptr);
+    smp_wmb();
+}
+
 static void
 wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent)
 {
@@ -371,6 +493,418 @@ partner_rings_remove(struct domain *src_d)
     }
 }
 
+static int
+find_ring_mfn(struct domain *d, gfn_t gfn, mfn_t *mfn)
+{
+    p2m_type_t p2mt;
+    int ret = 0;
+
+#ifdef CONFIG_X86
+    *mfn = get_gfn_unshare(d, gfn_x(gfn), &p2mt);
+#else
+    *mfn = p2m_lookup(d, gfn, &p2mt);
+#endif
+
+    if ( !mfn_valid(*mfn) )
+        ret = -EINVAL;
+#ifdef CONFIG_X86
+    else if ( p2m_is_paging(p2mt) || (p2mt == p2m_ram_logdirty) )
+        ret = -EAGAIN;
+#endif
+    else if ( (p2mt != p2m_ram_rw) ||
+              !get_page_and_type(mfn_to_page(*mfn), d, PGT_writable_page) )
+        ret = -EINVAL;
+
+#ifdef CONFIG_X86
+    put_gfn(d, gfn_x(gfn));
+#endif
+
+    return ret;
+}
+
+static int
+find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
+               uint32_t npage,
+               XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd,
+               uint32_t len)
+{
+    unsigned int i;
+    int ret = 0;
+    mfn_t *mfns;
+    uint8_t **mfn_mapping;
+
+    /*
+     * first bounds check on npage here also serves as an overflow check
+     * before left shifting it
+     */
+    if ( (unlikely(npage > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT))) ||
+         ((npage << PAGE_SHIFT) < len) )
+        return -EINVAL;
+
+    if ( ring_info->mfns )
+    {
+        /* Ring already existed: drop the previous mapping. */
+        gprintk(XENLOG_INFO,
+         "argo: vm%u re-register existing ring (vm%u:%x vm%d) clears mapping\n",
+                d->domain_id, ring_info->id.domain_id,
+                ring_info->id.port, ring_info->id.partner_id);
+
+        ring_remove_mfns(d, ring_info);
+        ASSERT(!ring_info->mfns);
+    }
+
+    mfns = xmalloc_array(mfn_t, npage);
+    if ( !mfns )
+        return -ENOMEM;
+
+    for ( i = 0; i < npage; i++ )
+        mfns[i] = INVALID_MFN;
+
+    mfn_mapping = xzalloc_array(uint8_t *, npage);
+    if ( !mfn_mapping )
+    {
+        xfree(mfns);
+        return -ENOMEM;
+    }
+
+    ring_info->npage = npage;
+    ring_info->mfns = mfns;
+    ring_info->mfn_mapping = mfn_mapping;
+
+    ASSERT(ring_info->npage == npage);
+
+    if ( ring_info->nmfns == ring_info->npage )
+        return 0;
+
+    for ( i = ring_info->nmfns; i < ring_info->npage; i++ )
+    {
+        xen_argo_page_descr_t pg_descr;
+        gfn_t gfn;
+        mfn_t mfn;
+
+        ret = __copy_from_guest_offset(&pg_descr, pg_descr_hnd, i, 1) ?
+                -EFAULT : 0;
+        if ( ret )
+            break;
+
+        /* Implementation currently only supports handling 4K pages */
+        if ( (pg_descr & XEN_ARGO_PAGE_DESCR_SIZE_MASK) !=
+                XEN_ARGO_PAGE_DESCR_SIZE_4K )
+        {
+            ret = -EINVAL;
+            break;
+        }
+        gfn = _gfn(pg_descr >> PAGE_SHIFT);
+
+        ret = find_ring_mfn(d, gfn, &mfn);
+        if ( ret )
+        {
+            gprintk(XENLOG_ERR,
+               "argo: vm%u: invalid gfn %"PRI_gfn" r:(vm%u:%x vm%d) %p %d/%d\n",
+                    d->domain_id, gfn_x(gfn), ring_info->id.domain_id,
+                    ring_info->id.port, ring_info->id.partner_id,
+                    ring_info, i, ring_info->npage);
+            break;
+        }
+
+        ring_info->mfns[i] = mfn;
+
+        argo_dprintk("%d: %"PRI_gfn" -> %"PRI_mfn"\n",
+                     i, gfn_x(gfn), mfn_x(ring_info->mfns[i]));
+    }
+
+    ring_info->nmfns = i;
+
+    if ( ret )
+        ring_remove_mfns(d, ring_info);
+    else
+    {
+        ASSERT(ring_info->nmfns == ring_info->npage);
+
+        gprintk(XENLOG_DEBUG,
+        "argo: vm%u ring (vm%u:%x vm%d) %p mfn_mapping %p npage %d nmfns %d\n",
+                d->domain_id, ring_info->id.domain_id,
+                ring_info->id.port, ring_info->id.partner_id, ring_info,
+                ring_info->mfn_mapping, ring_info->npage, ring_info->nmfns);
+    }
+
+    return ret;
+}
+
+static struct argo_ring_info *
+ring_find_info(const struct domain *d, const struct argo_ring_id *id)
+{
+    unsigned int ring_hash_index;
+    struct hlist_node *node;
+    struct argo_ring_info *ring_info;
+
+    ASSERT(rw_is_locked(&d->argo->lock));
+
+    ring_hash_index = hash_index(id);
+
+    argo_dprintk("d->argo=%p, d->argo->ring_hash[%u]=%p id=%p\n",
+                 d->argo, ring_hash_index,
+                 d->argo->ring_hash[ring_hash_index].first, id);
+    argo_dprintk("id.port=%x id.domain=vm%u id.partner_id=vm%d\n",
+                 id->port, id->domain_id, id->partner_id);
+
+    hlist_for_each_entry(ring_info, node, &d->argo->ring_hash[ring_hash_index],
+                         node)
+    {
+        struct argo_ring_id *cmpid = &ring_info->id;
+
+        if ( cmpid->port == id->port &&
+             cmpid->domain_id == id->domain_id &&
+             cmpid->partner_id == id->partner_id )
+        {
+            argo_dprintk("ring_info=%p\n", ring_info);
+            return ring_info;
+        }
+    }
+    argo_dprintk("no ring_info found\n");
+
+    return NULL;
+}
+
+static long
+register_ring(struct domain *currd,
+              XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
+              XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd,
+              uint32_t npage, bool fail_exist)
+{
+    xen_argo_register_ring_t reg;
+    struct argo_ring_id ring_id;
+    void *map_ringp;
+    xen_argo_ring_t *ringp;
+    struct argo_ring_info *ring_info;
+    struct argo_send_info *send_info = NULL;
+    struct domain *dst_d = NULL;
+    int ret = 0;
+    uint32_t private_tx_ptr;
+
+    if ( copy_from_guest(&reg, reg_hnd, 1) )
+    {
+        ret = -EFAULT;
+        goto out;
+    }
+
+    /*
+     * A ring must be large enough to transmit messages, so requires space for:
+     * * 1 message header, plus
+     * * 1 payload slot (payload is always rounded to a multiple of 16 bytes)
+     *   for the message payload to be written into, plus
+     * * 1 more slot, so that the ring cannot be filled to capacity with a
+     *   single message -- see the logic in ringbuf_insert -- allowing for this
+     *   ensures that there can be space remaining when a message is present.
+     * The above determines the minimum acceptable ring size.
+     */
+    if ( (reg.len < (sizeof(struct xen_argo_ring_message_header)
+                      + ROUNDUP_MESSAGE(1) + ROUNDUP_MESSAGE(1))) ||
+         (reg.len > XEN_ARGO_MAX_RING_SIZE) ||
+         (reg.len != ROUNDUP_MESSAGE(reg.len)) ||
+         (reg.pad != 0) )
+    {
+        ret = -EINVAL;
+        goto out;
+    }
+
+    ring_id.partner_id = reg.partner_id;
+    ring_id.port = reg.port;
+    ring_id.domain_id = currd->domain_id;
+
+    read_lock(&argo_lock);
+
+    if ( !currd->argo )
+    {
+        ret = -ENODEV;
+        goto out_unlock;
+    }
+
+    if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
+    {
+        if ( opt_argo_mac_enforcing )
+        {
+            ret = -EPERM;
+            goto out_unlock;
+        }
+    }
+    else
+    {
+        dst_d = get_domain_by_id(reg.partner_id);
+        if ( !dst_d )
+        {
+            argo_dprintk("!dst_d, ESRCH\n");
+            ret = -ESRCH;
+            goto out_unlock;
+        }
+
+        if ( !dst_d->argo )
+        {
+            argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
+            ret = -ECONNREFUSED;
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+
+        send_info = xzalloc(struct argo_send_info);
+        if ( !send_info )
+        {
+            ret = -ENOMEM;
+            put_domain(dst_d);
+            goto out_unlock;
+        }
+        send_info->id = ring_id;
+    }
+
+    write_lock(&currd->argo->lock);
+
+    if ( currd->argo->ring_count >= MAX_RINGS_PER_DOMAIN )
+    {
+        ret = -ENOSPC;
+        goto out_unlock2;
+    }
+
+    ring_info = ring_find_info(currd, &ring_id);
+    if ( !ring_info )
+    {
+        ring_info = xzalloc(struct argo_ring_info);
+        if ( !ring_info )
+        {
+            ret = -ENOMEM;
+            goto out_unlock2;
+        }
+
+        spin_lock_init(&ring_info->lock);
+
+        ring_info->id = ring_id;
+        INIT_HLIST_HEAD(&ring_info->pending);
+
+        hlist_add_head(&ring_info->node,
+                       &currd->argo->ring_hash[hash_index(&ring_info->id)]);
+
+        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x vm%d)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.port,
+                ring_id.partner_id);
+    }
+    else
+    {
+        if ( ring_info->len )
+        {
+            /*
+             * If the caller specified that the ring must not already exist,
+             * fail the attempt to add a completed ring which already exists.
+             */
+            if ( fail_exist )
+            {
+                argo_dprintk("disallowed reregistration of existing ring\n");
+                ret = -EEXIST;
+                goto out_unlock2;
+            }
+
+            if ( ring_info->len != reg.len )
+            {
+                /*
+                 * Change of ring size could result in entries on the pending
+                 * notifications list that will never trigger.
+                 * Simple blunt solution: disallow ring resize for now.
+                 * TODO: investigate enabling ring resize.
+                 */
+                gprintk(XENLOG_ERR,
+                    "argo: vm%u attempted to change ring size(vm%u:%x vm%d)\n",
+                        currd->domain_id, ring_id.domain_id, ring_id.port,
+                        ring_id.partner_id);
+                /*
+                 * Could return EINVAL here, but if the ring didn't already
+                 * exist then the arguments would have been valid, so: EEXIST.
+                 */
+                ret = -EEXIST;
+                goto out_unlock2;
+            }
+
+            gprintk(XENLOG_DEBUG,
+                    "argo: vm%u re-registering existing ring (vm%u:%x vm%d)\n",
+                    currd->domain_id, ring_id.domain_id, ring_id.port,
+                    ring_id.partner_id);
+        }
+    }
+
+    ret = find_ring_mfns(currd, ring_info, npage, pg_descr_hnd, reg.len);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u failed to find ring mfns (vm%u:%x vm%d)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.port,
+                ring_id.partner_id);
+
+        ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+
+    /*
+     * The first page of the memory supplied for the ring has the xen_argo_ring
+     * structure at its head, which is where the ring indexes reside.
+     */
+    ret = ring_map_page(ring_info, 0, &map_ringp);
+    if ( ret )
+    {
+        gprintk(XENLOG_ERR,
+                "argo: vm%u failed to map ring mfn 0 (vm%u:%x vm%d)\n",
+                currd->domain_id, ring_id.domain_id, ring_id.port,
+                ring_id.partner_id);
+
+        ring_remove_info(currd, ring_info);
+        goto out_unlock2;
+    }
+    ringp = map_ringp;
+
+    private_tx_ptr = read_atomic(&ringp->tx_ptr);
+
+    if ( (private_tx_ptr >= reg.len) ||
+         (ROUNDUP_MESSAGE(private_tx_ptr) != private_tx_ptr) )
+    {
+        /*
+         * Since the ring is a mess, attempt to flush the contents of it
+         * here by setting the tx_ptr to the next aligned message slot past
+         * the latest rx_ptr we have observed. Handle ring wrap correctly.
+         */
+        private_tx_ptr = ROUNDUP_MESSAGE(read_atomic(&ringp->rx_ptr));
+
+        if ( private_tx_ptr >= reg.len )
+            private_tx_ptr = 0;
+
+        update_tx_ptr(ring_info, private_tx_ptr);
+    }
+
+    ring_info->tx_ptr = private_tx_ptr;
+    ring_info->len = reg.len;
+    currd->argo->ring_count++;
+
+    if ( send_info )
+    {
+        spin_lock(&dst_d->argo->send_lock);
+
+        hlist_add_head(&send_info->node,
+                       &dst_d->argo->send_hash[hash_index(&send_info->id)]);
+
+        spin_unlock(&dst_d->argo->send_lock);
+    }
+
+ out_unlock2:
+    if ( !ret && send_info )
+        xfree(send_info);
+
+    if ( dst_d )
+        put_domain(dst_d);
+
+    write_unlock(&currd->argo->lock);
+
+ out_unlock:
+    read_unlock(&argo_lock);
+
+ out:
+    return ret;
+}
+
 long
 do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
            XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3,
@@ -392,6 +926,38 @@ do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
 
     switch (cmd)
     {
+    case XEN_ARGO_OP_register_ring:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd =
+            guest_handle_cast(arg1, xen_argo_register_ring_t);
+        XEN_GUEST_HANDLE_PARAM(xen_argo_page_descr_t) pg_descr_hnd =
+            guest_handle_cast(arg2, xen_argo_page_descr_t);
+        /* arg3 is npage */
+        /* arg4 is flags */
+        bool fail_exist = arg4 & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST;
+
+        if ( unlikely(arg3 > (XEN_ARGO_MAX_RING_SIZE >> PAGE_SHIFT)) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+        /*
+         * Check access to the whole array here so we can use the faster __copy
+         * operations to read each element later.
+         */
+        if ( unlikely(!guest_handle_okay(pg_descr_hnd, arg3)) )
+            break;
+        /* arg4: reserve currently-undefined bits, require zero.  */
+        if ( unlikely(arg4 & ~XEN_ARGO_REGISTER_FLAG_MASK) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = register_ring(currd, reg_hnd, pg_descr_hnd, arg3, fail_exist);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 8997a1c..70e9a78 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -29,6 +29,8 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
 /* Is the guest handle a NULL reference? */
 #define guest_handle_is_null(hnd)        ((hnd).p == NULL)
 
+#define guest_handle_is_aligned(hnd, mask) (!((uintptr_t)(hnd).p & (mask)))
+
 /* Offset the given guest handle into the array it refers to. */
 #define guest_handle_add_offset(hnd, nr) ((hnd).p += (nr))
 #define guest_handle_subtract_offset(hnd, nr) ((hnd).p -= (nr))
diff --git a/xen/include/asm-x86/guest_access.h b/xen/include/asm-x86/guest_access.h
index ca700c9..8dde5d5 100644
--- a/xen/include/asm-x86/guest_access.h
+++ b/xen/include/asm-x86/guest_access.h
@@ -41,6 +41,8 @@
 /* Is the guest handle a NULL reference? */
 #define guest_handle_is_null(hnd)        ((hnd).p == NULL)
 
+#define guest_handle_is_aligned(hnd, mask) (!((uintptr_t)(hnd).p & (mask)))
+
 /* Offset the given guest handle into the array it refers to. */
 #define guest_handle_add_offset(hnd, nr) ((hnd).p += (nr))
 #define guest_handle_subtract_offset(hnd, nr) ((hnd).p -= (nr))
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index 4818684..8947230 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -31,6 +31,26 @@
 
 #include "xen.h"
 
+#define XEN_ARGO_DOMID_ANY       DOMID_INVALID
+
+/*
+ * The maximum size of an Argo ring is defined to be: 16MB
+ *  -- which is 0x1000000 bytes.
+ * A byte index into the ring is at most 24 bits.
+ */
+#define XEN_ARGO_MAX_RING_SIZE  (0x1000000ULL)
+
+/*
+ * Page descriptor: encoding both page address and size in a 64-bit value.
+ * Intended to allow ABI to support use of different granularity pages.
+ * example of how to populate:
+ * xen_argo_page_descr_t pg_desc =
+ *      (physaddr & PAGE_MASK) | XEN_ARGO_PAGE_DESCR_SIZE_4K;
+ */
+typedef uint64_t xen_argo_page_descr_t;
+#define XEN_ARGO_PAGE_DESCR_SIZE_MASK   0x0000000000000fffULL
+#define XEN_ARGO_PAGE_DESCR_SIZE_4K     0
+
 typedef struct xen_argo_addr
 {
     uint32_t port;
@@ -56,4 +76,56 @@ typedef struct xen_argo_ring
 #endif
 } xen_argo_ring_t;
 
+typedef struct xen_argo_register_ring
+{
+    uint32_t port;
+    domid_t partner_id;
+    uint16_t pad;
+    uint32_t len;
+} xen_argo_register_ring_t;
+
+/* Messages on the ring are padded to a multiple of this size. */
+#define XEN_ARGO_MSG_SLOT_SIZE 0x10
+
+struct xen_argo_ring_message_header
+{
+    uint32_t len;
+    xen_argo_addr_t source;
+    uint32_t message_type;
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    uint8_t data[];
+#elif defined(__GNUC__)
+    uint8_t data[0];
+#endif
+};
+
+/*
+ * Hypercall operations
+ */
+
+/*
+ * XEN_ARGO_OP_register_ring
+ *
+ * Register a ring using the indicated memory.
+ * Also used to reregister an existing ring (eg. after resume from hibernate).
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_register_ring_t)
+ * arg2: XEN_GUEST_HANDLE(xen_argo_page_descr_t)
+ * arg3: unsigned long npages
+ * arg4: unsigned long flags
+ */
+#define XEN_ARGO_OP_register_ring     1
+
+/* Register op flags */
+/*
+ * Fail exist:
+ * If set, reject attempts to (re)register an existing established ring.
+ * If clear, reregistration occurs if the ring exists, with the new ring
+ * taking the place of the old, preserving tx_ptr if it remains valid.
+ */
+#define XEN_ARGO_REGISTER_FLAG_FAIL_EXIST  0x1
+
+/* Mask for all defined flags. unsigned long type so ok for both 32/64-bit */
+#define XEN_ARGO_REGISTER_FLAG_MASK 0x1UL
+
 #endif
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 9f616e4..9c9d33f 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -150,3 +150,4 @@
 ?      flask_transition                xsm/flask_op.h
 ?      argo_addr                       argo.h
 ?      argo_ring                       argo.h
+?      argo_register_ring              argo.h
-- 
2.7.4

