Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
On 2013-7-11 3:37, Wei Liu wrote:
> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
>> On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
>>> From: Xi Xiong <xixiong@xxxxxxxxxx>
>>>
>>> [ note: I've just cherry picked this onto net-next, and only compile
>>>   tested. This is an RFC only. -msw ]
>>
>> Should probably rebase it on net.git because it is a bug fix.
>
> Let's worry about that later...
>
>>> Currently the number of RX slots required to transmit a SKB to
>>> xen-netfront can be miscalculated when an interface uses a MTU
>>> larger than PAGE_SIZE. If the slot calculation is wrong, xen-netback
>>> can pause the queue indefinitely or reuse slots. The former
>>> manifests as a loss of connectivity to the guest (which can be
>>> restored by lowering the MTU set on the interface). The latter
>>> manifests with "Bad grant reference" messages from Xen such as:
>>>
>>>   (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
>>>
>>> and kernel messages within the guest such as:
>>>
>>>   [  180.419567] net eth0: Invalid extra type: 112
>>>   [  180.868620] net eth0: rx->offset: 0, size: 4294967295
>>>   [  180.868629] net eth0: rx->offset: 0, size: 4294967295
>>>
>>> BUG_ON() assertions can also be hit if RX slots are exhausted while
>>> handling a SKB.
>>>
>>> This patch changes xen_netbk_rx_action() to count the number of RX
>>> slots actually consumed by netbk_gop_skb() instead of using
>>> nr_frags + 1. This prevents under-counting the number of RX slots
>>> consumed when a SKB has a large linear buffer.
>>>
>>> Additionally, we now store the estimated number of RX slots required
>>> to handle a SKB in the cb overlay. This value is used to determine
>>> if the next SKB in the queue can be processed.
>>>
>>> Finally, the logic in start_new_rx_buffer() can cause RX slots to be
>>> wasted when setting up copy grant table operations for SKBs with
>>> large linear buffers. For example, a SKB with skb_headlen() equal to
>>> 8157 bytes that starts 64 bytes 64 bytes from the start of the page
>>> will
>>
>> Duplicated "64 bytes".
>>
>> And this change looks like an improvement, not a bug fix. Probably
>> submit a separate patch for this?
>
>>> consume three RX slots instead of two. This patch changes the "head"
>>> parameter to netbk_gop_frag_copy() to act as a flag. When set,
>>> start_new_rx_buffer() will always place as much data as possible
>>> into each RX slot.
>>>
>>> Signed-off-by: Xi Xiong <xixiong@xxxxxxxxxx>
>>> Reviewed-by: Matt Wilson <msw@xxxxxxxxxx>
>>> [ msw: minor code cleanups, rewrote commit message, adjusted
>>>   code to count RX slots instead of meta structures ]
>>> Signed-off-by: Matt Wilson <msw@xxxxxxxxxx>
>>> Cc: Annie Li <annie.li@xxxxxxxxxx>
>>> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
>>> Cc: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
>>> Cc: netdev@xxxxxxxxxxxxxxx
>>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx
>>> ---
>>>  drivers/net/xen-netback/netback.c |   51 ++++++++++++++++++++++--------------
>>>  1 files changed, 31 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>>> index 64828de..82dd207 100644
>>> --- a/drivers/net/xen-netback/netback.c
>>> +++ b/drivers/net/xen-netback/netback.c
>>> @@ -110,6 +110,11 @@ union page_ext {
>>>  	void *mapping;
>>>  };
>>>
>>> +struct skb_cb_overlay {
>>> +	int meta_slots_used;
>>> +	int peek_slots_count;
>>> +};
>>> +
>>>  struct xen_netbk {
>>>  	wait_queue_head_t wq;
>>>  	struct task_struct *task;
>>> @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>>>  {
>>>  	unsigned int count;
>>>  	int i, copy_off;
>>> +	struct skb_cb_overlay *sco;
>>>
>>>  	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>>>
>>> @@ -411,6 +417,9 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>>>  			offset = 0;
>>>  		}
>>>  	}
>>> +
>>> +	sco = (struct skb_cb_overlay *)skb->cb;
>>> +	sco->peek_slots_count = count;
>>>
>>>  	return count;
>>>  }
>>>
>>> @@ -443,13 +452,12 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
>>>  }
>>>
>>>  /*
>>> - * Set up the grant operations for this fragment. If it's a flipping
>>> - * interface, we also set up the unmap request from here.
>>> + * Set up the grant operations for this fragment.
>>>   */
>>>  static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>>  				struct netrx_pending_operations *npo,
>>>  				struct page *page, unsigned long size,
>>> -				unsigned long offset, int *head)
>>> +				unsigned long offset, int head, int *first)
>>>  {
>>>  	struct gnttab_copy *copy_gop;
>>>  	struct netbk_rx_meta *meta;
>>> @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>>  		if (bytes > size)
>>>  			bytes = size;
>>>
>>> -		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
>>> +		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
>>>  			/*
>>>  			 * Netfront requires there to be some data in the head
>>>  			 * buffer.
>>>  			 */
>>> -			BUG_ON(*head);
>>> +			BUG_ON(*first);
>>>
>>>  			meta = get_next_rx_buffer(vif, npo);
>>>  		}
>>> @@ -529,10 +537,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>>  	}
>>>
>>>  	/* Leave a gap for the GSO descriptor. */
>>> -	if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
>>> +	if (*first && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
>>>  		vif->rx.req_cons++;
>>>
>>> -	*head = 0; /* There must be something in this buffer now. */
>>> +	*first = 0; /* There must be something in this buffer now. */
>>>  	}
>>>  }
>>>
>>> @@ -558,7 +566,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
>>>  	struct xen_netif_rx_request *req;
>>>  	struct netbk_rx_meta *meta;
>>>  	unsigned char *data;
>>> -	int head = 1;
>>> +	int first = 1;
>>>  	int old_meta_prod;
>>>
>>>  	old_meta_prod = npo->meta_prod;
>>> @@ -594,16 +602,16 @@ static int netbk_gop_skb(struct sk_buff *skb,
>>>  		len = skb_tail_pointer(skb) - data;
>>>
>>>  		netbk_gop_frag_copy(vif, skb, npo,
>>> -				    virt_to_page(data), len, offset, &head);
>>> +				    virt_to_page(data), len, offset, 1, &first);
>>>  		data += len;
>>>  	}
>>>
>>>  	for (i = 0; i < nr_frags; i++) {
>>>  		netbk_gop_frag_copy(vif, skb, npo,
>>> -				skb_frag_page(&skb_shinfo(skb)->frags[i]),
>>> -				skb_frag_size(&skb_shinfo(skb)->frags[i]),
>>> -				skb_shinfo(skb)->frags[i].page_offset,
>>> -				&head);
>>> +				    skb_frag_page(&skb_shinfo(skb)->frags[i]),
>>> +				    skb_frag_size(&skb_shinfo(skb)->frags[i]),
>>> +				    skb_shinfo(skb)->frags[i].page_offset,
>>> +				    0, &first);
>>>  	}
>>>
>>>  	return npo->meta_prod - old_meta_prod;
>>> @@ -661,10 +669,6 @@ static void netbk_add_frag_responses(struct xenvif *vif, int status,
>>>  	}
>>>  }
>>>
>>> -struct skb_cb_overlay {
>>> -	int meta_slots_used;
>>> -};
>>> -
>>>  static void xen_netbk_rx_action(struct xen_netbk *netbk)
>>>  {
>>>  	struct xenvif *vif = NULL, *tmp;
>>> @@ -690,19 +694,26 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>>>  	count = 0;
>>>
>>>  	while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) {
>>> +		RING_IDX old_rx_req_cons;
>>> +
>>>  		vif = netdev_priv(skb->dev);
>>>  		nr_frags = skb_shinfo(skb)->nr_frags;
>>> +		old_rx_req_cons = vif->rx.req_cons;
>>>
>>>  		sco = (struct skb_cb_overlay *)skb->cb;
>>>  		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
>>>
>>> -		count += nr_frags + 1;
>>> +		count += vif->rx.req_cons - old_rx_req_cons;
>>>
>>>  		__skb_queue_tail(&rxq, skb);
>>>
>>> +		skb = skb_peek(&netbk->rx_queue);
>>> +		if (skb == NULL)
>>> +			break;
>>> +		sco = (struct skb_cb_overlay *)skb->cb;
>>> +
>>>  		/* Filled the batch queue? */
>>> -		/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
>>> -		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
>>> +		if (count + sco->peek_slots_count >= XEN_NETIF_RX_RING_SIZE)
>>>  			break;
>>>  	}
>>>
>>> --
>>> 1.7.4.5
>>
>> Using req_cons to count is OK, but since you've already stored the
>> number of slots in sco->peek_slots_count before actually queueing the
>> packet, why don't you use that directly?
>>
>> Wei.
>
> Ah, I must be off my head here. I mixed up several patches: 1) my RFC
> patch that removes MAX_SKB_FRAGS in the RX path, 2) Annie's patch, and
> 3) your patch.
>
> The reason that your patch uses req_cons instead of the pre-calculated
> value is that the value returned by xen_netbk_count_skb_slots() is
> actually *wrong* -- that's what Annie tried to fix in her patch. After
> fixing xen_netbk_count_skb_slots(), we would need the above snippet
> (or the snippet I proposed in my RFC patch) to prevent overrunning the
> ring.
>
> So a proper patch for this issue (not counting RX slots correctly,
> causing the ring to be overrun) would be the above two aspects
> combined. Comments?
>
> Wei.

I agree, two patches look fine:

1. xen_netbk_count_skb_slots() returns a wrong value; this could be
   fixed by my patch.
2. Protection from overrunning the ring; this could be implemented by
   Wei's/Matt's patch.

Thanks
Annie