Re: [UNIKRAFT PATCH v4 10/12] plat/xen/drivers/net: Add transmit operation
Hi Costin,

Since I worked on the lwip adoption, I found something in your tx function. Please see it inline.

Thanks,
Simon

On 23.10.20 17:54, Costin Lupu wrote:

Hi Sharan, please see inline.

On 10/22/20 4:38 PM, Sharan Santhanam wrote:

Hello Costin, please find the review comments inline.

Thanks & Regards,
Sharan

On 8/21/20 3:04 PM, Simon Kuenzer wrote:
On 21.08.20 13:24, Costin Lupu wrote:
On 8/21/20 1:49 PM, Simon Kuenzer wrote:
On 21.08.20 11:32, Costin Lupu wrote:
On 8/20/20 6:49 PM, Simon Kuenzer wrote:
On 13.08.20 10:53, Costin Lupu wrote:

I found that if you set the offset to the following, the driver does support sending packets that have a headroom:

    tx_req->offset = (uint16_t) uk_netbuf_headroom(pkt);

At the function entrance I would also check that this is the only segment and that the pkt buffer is page aligned:

    UK_ASSERT(!pkt->next);
    UK_ASSERT(((unsigned long) pkt->buf & ~PAGE_MASK) == 0);

Instead of building the grant with the data pointer, you could use the buf address instead. That one should be page aligned, and the offset with the headroom would then correctly point to the data. Please also double-check with series 1560: https://patchwork.unikraft.org/project/unikraft/list/?series=1560 With this you have your assumptions for a zero-copy tx confirmed.

+ tx_req->size = (uint16_t) pkt->len;
+ tx_req->flags = 0;
+ tx_req->id = id;
+
+ txq->ring.req_prod_pvt = req_prod + 1;
+ wmb(); /* Ensure backend sees requests */
+
+ RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&txq->ring, notify);
+ if (notify)
+     notify_remote_via_evtchn(txq->evtchn);
+
+ status |= UK_NETDEV_STATUS_SUCCESS;
+
+ /* some cleanup */
+ local_irq_save(flags);
+ count = network_tx_buf_gc(txq);

The clean-up should happen before enqueuing the given packet to the transmit queue; it should be the first thing xmit() does. Since we need to do it anyway, we do not gain an advantage by delaying this operation.

Of course there is an advantage.
You push the packets in order for them to be processed ASAP by the backend and deal with the cleaning up afterwards. It's a kind of parallelization; otherwise you would delay the transmissions.

We can also deal much better with full rings. Otherwise we unnecessarily drop packets for transmission although there would have been space once clean-up finished. After calling the cleanup you can check if there is space on the ring and return if we ran out of it (add an `unlikely` to the condition so that we speed up the successful sending case).

The only scenario where this would solve anything is when the ring is full and, just before you decide to send a new packet, the backend starts processing your previously transmitted packets, leaving you at least one free slot. There are two unlikely events here: first, the ring is full, and second, the backend starts to process again when the ring gets full (for the latter, it's unlikely that the backend will wake up given that it couldn't process 256 transmitted packets, where 256 is the current number of slots). And for this very unlikely context you would introduce a delay for all transmissions if we followed your suggestion. Anyway, both Linux and FreeBSD do the cleanup after pushing the requests, so I don't think we have much more to say about this.

Hmm... I double-checked with Intel DPDK's ixgbe (10 Gig) driver: http://git.dpdk.org/dpdk-stable/tree/drivers/net/ixgbe/ixgbe_rxtx.c?h=20.02#n230 They clean the slots before sending; I derived my conclusion from there. However, I noticed now that there is one minor difference: they do both, cleanup and enqueuing, before they notify the card about changes on the ring. In that case it does not add any delay if you swap the order of cleanup and enqueuing. In your case this is probably different because you can clean up without notifying.
Here I agree with you that we would add unnecessary xmit delays by swapping the order.

ixgbe is a native driver, so there is no backend driver in that case. It's a different scenario.

You could consider the card itself a backend driver, but fine; there is an ABI difference between ixgbe and netback. However, in general, having a full ring is not an uncommon and unlikely case. In my experience, this can already happen with a single TCP connection if your network stack is feeding next packets fast enough and the current TCP window allows sending them. It also depends on when the driver domain is scheduled, on which CPU cores your netfront domain and the driver domain sit, and even on the underlying physical device, which might be busy with other traffic.

What I tried to say with the event-probabilities explanation is that you add delay for the common case in order to remove delay for the much less probable corner case.

I got that and agreed.

Another thing we have in mind for uknetdev, and this is probably different from Linux/BSD's implementation: we plan to change the xmit function to do batching (as DPDK does). Sharan could push virtio-net performance further with this. You are able to reduce the number of notifies to the backend because you enqueue multiple packets at once, and due to the locality of batching you can utilize CPU caches way better.

What about this idea for netfront: we could first check for available space and, if needed, clean up just as many extra slots as we need to complete the transmission (for now this is anyway just one slot). Then we proceed with enqueuing and do the cleanup of the remaining slots after we have notified the backend. What do you think?

This will stay just as it is for the current patches. Feel free to change anything after upstreaming these patches.

No, it won't stay like this. At least, you have to call the clean-up when the ring is full before you leave the non-blocking function.
Otherwise you end up never getting space back on a full ring, but I think this is clear.

Btw, I'm waiting for reviews on the other patches as well before I send the v5.

Yes, sure.