[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 13/25] argo: implement the register op



On Wed, Dec 12, 2018 at 8:48 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Fri, Nov 30, 2018 at 05:32:52PM -0800, Christopher Clark wrote:
> > +static inline uint16_t
> > +argo_hash_fn(const struct argo_ring_id *id)
>
> No need for the argo_ prefix for static functions, this is already an
> argo specific file.

Although the compiler could live without the prefix, I'm finding it helpful to
very easily determine that functions being used are not defined elsewhere
within Xen; so I've left the prefix as is for version two of this series.

> > +{
> > +    uint16_t ret;
> > +
> > +    ret = (uint16_t)(id->addr.port >> 16);
> > +    ret ^= (uint16_t)id->addr.port;
> > +    ret ^= id->addr.domain_id;
> > +    ret ^= id->partner;
> > +
> > +    ret &= (ARGO_HTABLE_SIZE - 1);
>
> I'm having trouble figuring out what this is supposed to do, I think a
> comment and the expected hash formula will help make sure the code is
> correct.

Fair point. I've added a comment with explanation in the next version.

+ * This hash function is used to distribute rings within the per-domain
+ * hash table (d->argo->ring_hash). The hash table will provide a
+ * 'ring_info' struct if a match is found with a 'xen_argo_ring_id' key:
+ * ie. the key is a (domain id, port, partner domain id) tuple.
+ * There aren't many hash table buckets, and this doesn't need to be
+ * cryptographically robust. Since port number varies the most in
+ * expected use, and the Linux driver allocates at both the high and
+ * low ends, incorporate high and low bits to help with distribution.

> Also doesn't this need to be documented in the public header?

No, it's only for internal use within the hypervisor and designed to
meet the hypervisor's hashing requirement with a small table.

> > +    ASSERT(ring_info->mfns);
> > +    ASSERT(ring_info->mfn_mapping);
>
> We are trying to move away from such assertions, and instead use
> constructions that would prevent issues in non-debug builds. I would
> write the above asserts as:
>
> if ( !ring_info->mfns || !ring_info->mfn_mapping )
> {
>     ASSERT_UNREACHABLE();
>     return -E<something>;
> }
>
> That way non-debug builds won't trigger page faults if there's indeed
> a way to get here with the wrong state, and debug builds will still
> hit an assert.

ack, will do so in v2.

> > +    *mfn = get_gfn_unshare(d, pfn, &p2mt);
>
> Is this supposed to work for PV guests?

Yes -- and they seem to work OK. Am I missing something?

> > +#else
> > +    *mfn = p2m_lookup(d, _gfn(pfn), &p2mt);
> > +#endif
> > +
> > +    if ( !mfn_valid(*mfn) )
> > +        ret = -EINVAL;
> > +#ifdef CONFIG_X86
> > +    else if ( p2m_is_paging(p2mt) || (p2mt == p2m_ram_logdirty) )
> > +        ret = -EAGAIN;
> > +#endif
> > +    else if ( (p2mt != p2m_ram_rw) ||
> > +              !get_page_and_type(mfn_to_page(*mfn), d, PGT_writable_page) )
> > +        ret = -EINVAL;
> > +
> > +#ifdef CONFIG_X86
> > +    put_gfn(d, pfn);
>
> If you do this put_gfn here, by the time you check that the gfn -> mfn
> matches your expectations the guest might have somehow changed the gfn
> -> mfn mapping already (for example by ballooning down memory?)

If the guest does that, I think it only harms itself. If for some reason
a memory access is denied, then the op would just fail. I don't think
there's a more serious consequence to be worried about.

Above, if we're going to use the mfn, then we've just done a successful:
    get_page_and_type(mfn_to_page(*mfn), d, PGT_writable_page)

which should hold it in a state that we're ok with until we're done
with it -- see put_page_and_type in argo_ring_remove_mfns.

> > +        /* W(L2) protects all the elements of the domain's ring_info */
> > +        write_lock(&d->argo->lock);
>
> I don't understand this W(L2) nomenclature, is this explain somewhere?

Yes, sort of. Lock "L2" is the per-domain argo lock, identified in a
comment near the top of the file. It's a read-write lock, so 'W' means:
take the write lock on it.

> Also there's no such comment when you take the global argo_lock above.

L2 covers more interesting work than L1, which is why there are more
comments pertaining to it than L1.

> > +/*
> > + * Messages on the ring are padded to 128 bits
> > + * Len here refers to the exact length of the data not including the
> > + * 128 bit header. The message uses
> > + * ((len + 0xf) & ~0xf) + sizeof(argo_ring_message_header) bytes.
> > + * Using typeof(a) make clear that this does not truncate any high-order 
> > bits.
> > + */
> > +#define ARGO_ROUNDUP(a) (((a) + 0xf) & ~(typeof(a))0xf)
>
> Why not just use ROUNDUP?
>
> And in any case this shouldn't be on the public header IMO, since it's
> not part of the interface AFAICT.

Well, in version two it's now: XEN_ARGO_ROUNDUP :-)
because it does need to be in the public header because it's used within the
Linux device driver, and items in that public Xen header need the 'xen' prefix
(so they now do).  Within the Linux code, it's used to choose a sensible ring
size, and also used when manipulating the rx_ptr on the guest side.

Thanks for the feedback - almost all should be covered in v2, I think.

Christopher

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.