
Re: [Xen-devel] [RFC PATCH 03/13] xen-netback: implement TX persistent grants



On 19 May 2015, at 17:23, Wei Liu <wei.liu2@xxxxxxxxxx> wrote:
> On Tue, May 12, 2015 at 07:18:27PM +0200, Joao Martins wrote:
>> Introduces persistent grants for the TX path, which follows a code path
>> similar to that of grant mapping.
>> 
>> It starts by checking if there's a persistent grant available for header
>> and frags grefs and if so setting it in tx_pgrants. If no persistent grant
>> is found in the tree for the header it will resort to grant copy (but
>> preparing the map ops and add them laster). For the frags it will use the
>                                     ^
>                                     later
> 
>> tree page pool, and in case of no pages it falls back to grant map/unmap
>> using mmap_pages. When the skb destructor callback gets called we release the
>> slot and persistent grant within the callback to avoid waking up the
>> dealloc thread. As long as there are no unmaps to be done the dealloc thread
>> will remain inactive.
>> 
> 
> This scheme looks complicated. Can we just use one
> scheme at a time? What's the rationale for using this combined scheme?
> Maybe you're thinking about using a max_grants < ring_size to save
> memory?

Yes, my purpose was to allow max_grants < ring_size to save on the amount of
memory mapped. In a bulk transfer test with iperf, the tree held fewer than
160 TX grants at peak without affecting the maximum performance; with pktgen,
though, the tree fills completely.
The second reason is to handle the case of a (malicious?) frontend providing
more grefs than the max allowed, in which case I fall back to grant map/unmap.
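
In sketch form, the per-gref decision is roughly the following (hedged
pseudocode: get_persistent_gnt() and tx_gnts_tree are from this patch,
while pgrant_pool_page() is a stand-in name for the tree page pool lookup,
not the actual accessor):

	/* Sketch of the per-gref decision flow, not the literal patch code. */
	static bool tx_gref_uses_pgrant(struct xenvif_queue *queue,
					grant_ref_t ref)
	{
		struct persistent_gnt *gnt;

		if (!queue->vif->persistent_grants)
			return false;		/* feature off: plain map/unmap */

		gnt = get_persistent_gnt(&queue->tx_gnts_tree, ref);
		if (!IS_ERR_OR_NULL(gnt))
			return true;		/* hit: reuse the mapped page */

		/* Miss: try to grow the tree from the page pool; when the
		 * pool is exhausted (or the gref is already in flight),
		 * fall back to grant map/unmap via mmap_pages.
		 */
		return pgrant_pool_page(&queue->tx_gnts_tree) != NULL;
	}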

> 
> I've only skimmed the patch. I will do a detailed review once we're sure
> this is the right way to go.
> 
>> Results show an improvement of 46% (1.82 vs 1.24 Mpps, 64 byte packets)
>> measured with pktgen, and of over 48% (21.6 vs 14.5 Gbit/s) measured with
>> iperf (TCP) with 4 parallel flows on a 1 queue vif, DomU to Dom0.
>> Tests ran on an Intel Xeon E5-1650 v2 with HT disabled.
>> […]
>> int xenvif_init_queue(struct xenvif_queue *queue)
>> {
>>      int err, i;
>> @@ -496,9 +524,23 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>>                        .ctx = NULL,
>>                        .desc = i };
>>              queue->grant_tx_handle[i] = NETBACK_INVALID_HANDLE;
>> +            queue->tx_pgrants[i] = NULL;
>> +    }
>> +
>> +    if (queue->vif->persistent_grants) {
>> +            err = init_persistent_gnt_tree(&queue->tx_gnts_tree,
>> +                                           queue->tx_gnts_pages,
>> +                                           XEN_NETIF_TX_RING_SIZE);
>> +            if (err)
>> +                    goto err_disable;
>>      }
>> 
>>      return 0;
>> +
>> +err_disable:
>> +    netdev_err(queue->vif->dev, "Could not reserve tree pages.");
>> +    queue->vif->persistent_grants = 0;
> 
> You can just move the above two lines under `if (err)'.
> 
> Also see below.
In the next patch this is also a common cleanup path. I did it this way to
avoid moving this hunk around.


>> +    return 0;
>> }
>> 
>> void xenvif_carrier_on(struct xenvif *vif)
>> @@ -654,6 +696,10 @@ void xenvif_disconnect(struct xenvif *vif)
>>              }
>> 
>>              xenvif_unmap_frontend_rings(queue);
>> +
>> +            if (queue->vif->persistent_grants)
>> +                    deinit_persistent_gnt_tree(&queue->tx_gnts_tree,
>> +                                               queue->tx_gnts_pages);
> 
> If the init function fails on queue N (N>0) you now leak resources.
Correct. I should return -ENOMEM (on init failure) and not 0.
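
Something like this for the tail of the hunk above (a sketch of the fix,
keeping the persistent_grants reset since the next patch shares this
cleanup path):

	err_disable:
		netdev_err(queue->vif->dev, "Could not reserve tree pages.\n");
		queue->vif->persistent_grants = 0;
		/* Propagate the error (-ENOMEM) instead of returning 0. */
		return err;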


>>      }
>> }
>> 
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 332e489..529d7c3 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -269,6 +269,11 @@ static inline unsigned long idx_to_kaddr(struct xenvif_queue *queue,
>>      return (unsigned long)pfn_to_kaddr(idx_to_pfn(queue, idx));
>> }
>> 
>> +static inline void *page_to_kaddr(struct page *page)
>> +{
>> +    return pfn_to_kaddr(page_to_pfn(page));
>> +}
>> +
>> #define callback_param(vif, pending_idx) \
>>      (vif->pending_tx_info[pending_idx].callback_struct)
>> 
>> @@ -299,6 +304,29 @@ static inline pending_ring_idx_t pending_index(unsigned i)
>>      return i & (MAX_PENDING_REQS-1);
>> }
>> 
>> +/* Creates a new persistent grant and adds it to the tree.
>> + */
>> +static struct persistent_gnt *xenvif_pgrant_new(struct persistent_gnt_tree *tree,
>> +                                            struct gnttab_map_grant_ref *gop)
>> +{
>> +    struct persistent_gnt *persistent_gnt;
>> +
>> +    persistent_gnt = kmalloc(sizeof(*persistent_gnt), GFP_KERNEL);
> 
> xenvif_pgrant_new() can be called from NAPI, which runs in softirq
> context, where you aren't allowed to sleep.

Silly mistake, I will fix this. The whole point of the tree page pool was
that we aren't allowed to sleep, both here and in ndo_start_xmit (in patch #6).
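
Concretely, the allocation either has to become atomic or come from entries
pre-allocated at queue init alongside the page pool; a minimal sketch of the
first option:

	/* Sketch only: softirq context forbids GFP_KERNEL, so allocate
	 * atomically. (The alternative is pre-allocating the entries at
	 * queue init, together with the tree page pool.)
	 */
	persistent_gnt = kmalloc(sizeof(*persistent_gnt), GFP_ATOMIC);
	if (unlikely(!persistent_gnt))
		return NULL;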

>> […]
>> 
>> +/* Checks if there's a persistent grant available for gref and
>> + * if so, set it also in the tx_pgrants array that keeps the ones
>> + * in use.
>> + */
>> +static bool xenvif_tx_pgrant_available(struct xenvif_queue *queue,
>> +                                   grant_ref_t ref, u16 pending_idx,
>> +                                   bool *can_map)
>> +{
>> +    struct persistent_gnt_tree *tree = &queue->tx_gnts_tree;
>> +    struct persistent_gnt *persistent_gnt;
>> +    bool busy;
>> +
>> +    if (!queue->vif->persistent_grants)
>> +            return false;
>> +
>> +    persistent_gnt = get_persistent_gnt(tree, ref);
>> +
>> +    /* If gref is already in use we fallback, since it would
>> +     * otherwise mean re-adding the same gref to the tree
>> +     */
>> +    busy = IS_ERR(persistent_gnt);
>> +    if (unlikely(busy))
>> +            persistent_gnt = NULL;
>> +
> 
> Under what circumstance can we retrieve an already in use persistent
> grant? You seem to suggest this is a bug in the RX case.

A guest could try to share the same mapped page in multiple frags,
in which case I fall back to map/unmap. I think this is a limitation in
the way we manage the persistent gnts, where we can only have a single
reference to a persistent grant in flight.
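
For context, the lookup follows the blkback model, where a grant found in
the tree is flagged as active and a second lookup of the same gref returns
ERR_PTR(-EBUSY) until the grant is put back. A sketch of that check (the
flag name is borrowed from xen-blkback and may differ in this series):

	/* Inside the tree lookup, once the gref has been found: */
	if (test_and_set_bit(PERSISTENT_GNT_ACTIVE, persistent_gnt->flags))
		return ERR_PTR(-EBUSY);	/* gref already in flight */
	return persistent_gnt;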

>> […]
>>                      MAX_PENDING_REQS);
>>              index = pending_index(queue->dealloc_prod);
>> @@ -1691,7 +1939,10 @@ void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success)
>>              smp_wmb();
>>              queue->dealloc_prod++;
>>      } while (ubuf);
>> -    wake_up(&queue->dealloc_wq);
>> +    /* Wake up only when there are grants to unmap */
>> +    if (dealloc_prod_save != queue->dealloc_prod)
>> +            wake_up(&queue->dealloc_wq);
>> +
>>      spin_unlock_irqrestore(&queue->callback_lock, flags);
>> 
>>      if (likely(zerocopy_success))
>> @@ -1779,10 +2030,13 @@ int xenvif_tx_action(struct xenvif_queue *queue, int budget)
>> 
>>      xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops);
>> 
>> -    if (nr_cops == 0)
>> +    if (!queue->vif->persistent_grants &&
>> +        nr_cops == 0)
> 
> You can just move nr_cops to the previous line.
> 
> Wei.

