Re: [Xen-devel] [PATCH RFC V2] xen/netback: Count ring slots properly when larger MTU sizes are used
> -----Original Message-----
> From: Matt Wilson [mailto:msw@xxxxxxxxxx]
> Sent: Wednesday, December 12, 2012 3:05 AM
> To: Palagummi, Siva
> Cc: Ian Campbell; xen-devel@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH RFC V2] xen/netback: Count ring slots
> properly when larger MTU sizes are used
>
> On Tue, Dec 11, 2012 at 10:25:51AM +0000, Palagummi, Siva wrote:
> > > -----Original Message-----
> > > From: Matt Wilson [mailto:msw@xxxxxxxxxx]
> > > Sent: Thursday, December 06, 2012 11:05 AM
> > > To: Palagummi, Siva
> > > Cc: Ian Campbell; xen-devel@xxxxxxxxxxxxx
> > > Subject: Re: [Xen-devel] [PATCH RFC V2] xen/netback: Count ring slots
> > > properly when larger MTU sizes are used
> > >
> > > On Wed, Dec 05, 2012 at 11:56:32AM +0000, Palagummi, Siva wrote:
> > > > Matt,
> > > [...]
> > > > You are right. The above chunk, which is already part of the
> > > > upstream, is unfortunately incorrect for some cases. We also ran
> > > > into issues in our environment around a week back and found this
> > > > problem. The count will be different based on the head len because
> > > > of the optimization that start_new_rx_buffer is trying to do for
> > > > large buffers. A hole of size "offset_in_page" will be left in
> > > > the first page during the copy if the remaining buffer size
> > > > is >= PAGE_SIZE. This subsequently affects the copy_off as well.
> > > >
> > > > So xen_netbk_count_skb_slots actually needs a fix to calculate the
> > > > count correctly based on the head len, and also a fix to calculate
> > > > properly the copy_off to which the data from the fragments gets
> > > > copied.
> > >
> > > Can you explain more about the copy_off problem? I'm not seeing it.
> >
> > You can clearly see below that copy_off is input to
> > start_new_rx_buffer while copying frags.
>
> Yes, but that's the right thing to do. copy_off should be set to the
> destination offset after copying the last byte of linear data, which
> means "skb_headlen(skb) % PAGE_SIZE" is correct.
>

No, it is not correct, for two reasons. First, what if skb_headlen(skb)
is exactly a multiple of PAGE_SIZE? copy_off would then be set to ZERO,
and if some data exists in the frags, ZERO will be passed in as the
copy_off value and start_new_rx_buffer will return FALSE. Second, there
is the obvious case in the current code where a hole of size
"offset_in_page(skb->data)" is left in the first buffer after the first
pass whenever the remaining data to be copied would overflow the first
buffer.

> > So if the buggy "count" calculation below is fixed based on the
> > offset_in_page value, then the copy_off value will also change
> > accordingly.
>
> This calculation is not incorrect. You should only need as many
> PAGE_SIZE buffers as you have linear data to fill.
>

This calculation is incorrect and does not match the actual slots used,
as the code stands today, unless some new change is made either in
netbk_gop_skb or in start_new_rx_buffer.
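To make that mismatch concrete, here is a rough userspace model of the
head copy. It is only a sketch of the behaviour described in this
thread, not the actual netback code: slots_for_head() and the
simplified start_new_rx_buffer() below are illustrative stand-ins, and
PAGE_SIZE and MAX_BUFFER_OFFSET are assumed to be 4096. For a head of
2*PAGE_SIZE starting at offset 32 in its first page it reports 3 ring
buffers consumed, while DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE)
predicts only 2.

/*
 * Rough model of how the linear head is fed to the ring, based on the
 * behaviour described in this thread (not the actual netback code).
 * PAGE_SIZE and MAX_BUFFER_OFFSET are assumed to be 4096.
 */
#include <stdio.h>

#define PAGE_SIZE          4096UL
#define MAX_BUFFER_OFFSET  PAGE_SIZE
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Stand-in for the decision under discussion: start a new buffer when
 * the current one is full, or when this chunk would straddle two
 * buffers and we are no longer on the head buffer. */
static int start_new_rx_buffer(unsigned long copy_off, unsigned long size,
                               int head)
{
        if (copy_off == MAX_BUFFER_OFFSET)
                return 1;
        if (copy_off + size > MAX_BUFFER_OFFSET &&
            size <= MAX_BUFFER_OFFSET && copy_off && !head)
                return 1;
        return 0;
}

/* Walk the head page by page, the way the copy loop does, and count
 * the ring buffers actually consumed. */
static unsigned long slots_for_head(unsigned long offset, unsigned long len)
{
        unsigned long copy_off = 0;
        unsigned long slots = 1;        /* the head slot is claimed up front */
        int head = 1;

        while (len > 0) {
                /* one source chunk = the rest of the current source page */
                unsigned long bytes = PAGE_SIZE - offset;

                if (bytes > len)
                        bytes = len;
                len -= bytes;
                offset = 0;     /* later source chunks are page aligned */

                while (bytes > 0) {
                        unsigned long chunk = bytes;

                        if (start_new_rx_buffer(copy_off, chunk, head)) {
                                slots++;
                                copy_off = 0;
                        }
                        if (copy_off + chunk > MAX_BUFFER_OFFSET)
                                chunk = MAX_BUFFER_OFFSET - copy_off;

                        copy_off += chunk;
                        bytes -= chunk;
                        head = 0;
                }
        }
        return slots;
}

int main(void)
{
        unsigned long off = 32, len = 2 * PAGE_SIZE;

        printf("head of %lu bytes at offset %lu: %lu slots consumed, "
               "count_skb_slots says %lu\n",
               len, off, slots_for_head(off, len),
               DIV_ROUND_UP(len, PAGE_SIZE));
        return 0;
}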
> >         count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> >
> >         copy_off = skb_headlen(skb) % PAGE_SIZE;
> >
> >         if (skb_shinfo(skb)->gso_size)
> >                 count++;
> >
> >         for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> >                 unsigned long size =
> >                         skb_frag_size(&skb_shinfo(skb)->frags[i]);
> >                 unsigned long bytes;
> >                 while (size > 0) {
> >                         BUG_ON(copy_off > MAX_BUFFER_OFFSET);
> >
> >                         if (start_new_rx_buffer(copy_off, size, 0)) {
> >                                 count++;
> >                                 copy_off = 0;
> >                         }
> >
> > So a correct calculation should be somewhat like below, because of
> > the optimization in start_new_rx_buffer for larger sizes.
>
> start_new_rx_buffer() should not be starting a new buffer after the
> first pass copying the linear data.
>
> >         linear_len = skb_headlen(skb);
> >         count = (linear_len <= PAGE_SIZE)
> >                 ? 1
> >                 : DIV_ROUND_UP(offset_in_page(skb->data) + linear_len,
> >                                PAGE_SIZE);
> >
> >         copy_off = ((offset_in_page(skb->data) + linear_len) < 2*PAGE_SIZE)
> >                 ? linear_len % PAGE_SIZE
> >                 : (offset_in_page(skb->data) + linear_len) % PAGE_SIZE;
>
> A change like this makes the code much more difficult to understand.
> :-)

It would have been easier had we written the logic using a for loop,
similar to how the counting is done for the data in the frags. In fact
I did make a mistake in the above calculations :-( . A proper version
should probably look somewhat like below.

        linear_len = skb_headlen(skb);
        count = (linear_len <= PAGE_SIZE)
                ? 1
                : DIV_ROUND_UP(offset_in_page(skb->data) + linear_len,
                               PAGE_SIZE);
        copy_off = (linear_len <= PAGE_SIZE)
                ? linear_len
                : (offset_in_page(skb->data) + linear_len - 1) % PAGE_SIZE + 1;

> > > > max_required_rx_slots may also require a fix to account for the
> > > > additional slot that may be required in case mtu >= PAGE_SIZE; for
> > > > a worst case scenario, at least another +1. One thing that is
> > > > still puzzling here is that max_required_rx_slots seems to be
> > > > assuming that the linear length in the head will never be greater
> > > > than the mtu size, but that doesn't seem to be the case all the
> > > > time. I wonder if it requires some kind of fix there, or special
> > > > handling when count_skb_slots exceeds max_required_rx_slots.
> > >
> > > We should only be using the number of pages required to copy the
> > > data. The fix shouldn't be to anticipate wasting ring space by
> > > increasing the return value of max_required_rx_slots().
> > >
> >
> > I do not think we are wasting any ring space. We are just ensuring
> > that we have enough before proceeding ahead.
>
> For some SKBs with large linear buffers, we certainly are wasting
> space. Go back and read the explanation in
> http://lists.xen.org/archives/html/xen-devel/2012-12/msg00274.html
>

I think I probably did not put my point across clearly enough.
xen_netbk_rx_ring_full uses the max_required_rx_slots value, and
xen_netbk_rx_ring_full is called to decide whether a vif is schedulable
or not. So in case the mtu value is >= PAGE_SIZE, an additional buffer
would be required for a worst case scenario, which is not taken care of
by the current calculations. Of course, if your new fix changes the
code not to leave a hole in the first buffer, then this correction may
not be required. But I am not the right person to decide the
implications of the fix you are proposing. The current
start_new_rx_buffer seems to be trying to make the copies PAGE aligned
and also to reduce the number of copy operations.

For example, let us say SKB_HEAD_LEN is, for whatever reason,
4*PAGE_SIZE and offset_in_page is 32. As per the existing logic of
start_new_rx_buffer, and with the fix I am proposing for count and
copy_off, we will calculate and occupy 5 ring buffers and will use 5
copy operations. If we fix it the way you are proposing, not to leave a
hole in the first buffer, by modifying start_new_rx_buffer, then it
will occupy 4 ring buffers but will require 8 copy operations as per
the existing logic in netbk_gop_skb while copying the head!!
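To illustrate that trade-off, below is a small standalone sketch of the
two layouts. It is again only a model (lay_out_head() and the "fill"
flag are hypothetical, and PAGE_SIZE == MAX_BUFFER_OFFSET == 4096 is
assumed), not the netback code itself. With fill == 0 it mimics the
current skip-to-a-fresh-buffer behaviour; with fill == 1 it tops up the
partially used first buffer instead. For a 4*PAGE_SIZE head at offset
32 it prints 5 buffers / 5 copy operations for the current behaviour
and 4 buffers / 8 copy operations for the no-hole variant.

/*
 * Model of the two head layouts discussed above: with fill == 0 the
 * tail of the first buffer is left unused (the current behaviour),
 * with fill == 1 the first buffer is topped up before moving on.
 * Illustrative only, not netback code; PAGE_SIZE == MAX_BUFFER_OFFSET
 * is assumed.
 */
#include <stdio.h>

#define PAGE_SIZE 4096UL

struct layout {
        unsigned long buffers;  /* ring slots consumed */
        unsigned long copies;   /* copy operations issued */
};

static struct layout lay_out_head(unsigned long offset, unsigned long len,
                                  int fill)
{
        struct layout l = { 1, 0 };     /* head slot claimed up front */
        unsigned long copy_off = 0;
        int head = 1;

        while (len > 0) {
                /* one source chunk = the rest of the current source page */
                unsigned long bytes = PAGE_SIZE - offset;

                if (bytes > len)
                        bytes = len;
                len -= bytes;
                offset = 0;

                while (bytes > 0) {
                        unsigned long chunk = bytes;
                        /* "fill": only move on when the buffer is full.
                         * Otherwise also skip ahead when the chunk would
                         * straddle two buffers (leaving the hole). */
                        int new_buf = (copy_off == PAGE_SIZE) ||
                                      (!fill &&
                                       copy_off + chunk > PAGE_SIZE &&
                                       chunk <= PAGE_SIZE &&
                                       copy_off && !head);

                        if (new_buf) {
                                l.buffers++;
                                copy_off = 0;
                        }
                        if (copy_off + chunk > PAGE_SIZE)
                                chunk = PAGE_SIZE - copy_off;

                        l.copies++;     /* one copy op per destination piece */
                        copy_off += chunk;
                        bytes -= chunk;
                        head = 0;
                }
        }
        return l;
}

int main(void)
{
        struct layout skip = lay_out_head(32, 4 * PAGE_SIZE, 0);
        struct layout fill = lay_out_head(32, 4 * PAGE_SIZE, 1);

        printf("leave hole (current): %lu buffers, %lu copy ops\n",
               skip.buffers, skip.copies);
        printf("fill first buffer:    %lu buffers, %lu copy ops\n",
               fill.buffers, fill.copies);
        return 0;
}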
Thanks
Siva

> > > [...]
> > > > >
> > > > > Why increment count by the /estimated/ count instead of the actual
> > > > > number of slots used? We have the number of slots in the line just
> > > > > above, in sco->meta_slots_used.
> > > >
> > > > Count actually refers to ring slots consumed rather than meta_slots
> > > > used. Count can be different from meta_slots_used.
> > >
> > > Aah, indeed. This can end up being too pessimistic if you have lots
> > > of frags that require multiple copy operations. I still think that
> > > it would be better to calculate the actual number of ring slots
> > > consumed by netbk_gop_skb() to avoid other bugs like the one you
> > > originally fixed.
> > >
> >
> > The counting done in count_skb_slots is exactly that. The fix done
> > above is to make the two match, so that there is no need to
> > recalculate again.
>
> Today, the counting done in count_skb_slots() *does not* match the
> number of buffer slots consumed by netbk_gop_skb().
>
> Matt

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel