[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [UNIKRAFT PATCH v4 10/12] plat/xen/drivers/net: Add transmit operation
Hello Costin, Please find the review comments inline: Thanks & Regards Sharan On 8/21/20 3:04 PM, Simon Kuenzer wrote: On 21.08.20 13:24, Costin Lupu wrote:On 8/21/20 1:49 PM, Simon Kuenzer wrote:On 21.08.20 11:32, Costin Lupu wrote:On 8/20/20 6:49 PM, Simon Kuenzer wrote:On 13.08.20 10:53, Costin Lupu wrote:Whenever a packet is transmitted, the request describing it is writtenin theshared ring. The packet itself is written in a page referenced by therequest and shared, as well, with the backend. At the end of each transmit operation, acleanup operation is performed free'ing the resources allocated for thepreviously transmitted packets. Signed-off-by: Costin Lupu <costin.lupu@xxxxxxxxx> Signed-off-by: Razvan Cojocaru <razvan.cojocaru93@xxxxxxxxx> --- plat/xen/drivers/net/netfront.c | 102 ++++++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) diff --git a/plat/xen/drivers/net/netfront.c b/plat/xen/drivers/net/netfront.c index 858d124a..f2b81329 100644 --- a/plat/xen/drivers/net/netfront.c +++ b/plat/xen/drivers/net/netfront.c @@ -39,6 +39,7 @@ #include <uk/alloc.h> #include <uk/netdev_driver.h> #include <xen-x86/mm.h> +#include <xen-x86/irq.h> #include <xenbus/xenbus.h> #include "netfront.h" #include "netfront_xb.h" @@ -78,6 +79,106 @@ static uint16_t get_id_from_freelist(uint16_t *freelist) return id; } +static int network_tx_buf_gc(struct uk_netdev_tx_queue *txq) +{ + RING_IDX prod, cons; + netif_tx_response_t *tx_rsp; + uint16_t id; + bool more_to_do; + int count = 0; + + do { + prod = txq->ring.sring->rsp_prod; + rmb(); /* Ensure we see responses up to 'rp'. */ + + for (cons = txq->ring.rsp_cons; cons != prod; cons++) { + tx_rsp = RING_GET_RESPONSE(&txq->ring, cons); + + if (tx_rsp->status == NETIF_RSP_NULL) + continue; + + if (tx_rsp->status == NETIF_RSP_ERROR) + uk_pr_err("packet error\n");"netdev%u: Transmission error on txq %u\n" ?Ack.+ + id = tx_rsp->id; + UK_ASSERT(id < NET_TX_RING_SIZE); + + gnttab_end_access(txq->gref[id]); + txq->gref[id] = GRANT_INVALID_REF; + + add_id_to_freelist(id, txq->freelist); + uk_semaphore_up(&txq->sem); Shouldn't we free up the netbuf we receive from the ring? + + count++; + } + + txq->ring.rsp_cons = prod; + + RING_FINAL_CHECK_FOR_RESPONSES(&txq->ring, more_to_do); + } while (more_to_do); + + return count; +} + +static int netfront_xmit(struct uk_netdev *n, + struct uk_netdev_tx_queue *txq, + struct uk_netbuf *pkt) +{ + struct netfront_dev *nfdev; + unsigned long flags; + uint16_t id; + RING_IDX req_prod; + netif_tx_request_t *tx_req; + int notify; + int status = 0, count; + + UK_ASSERT(n != NULL); + UK_ASSERT(txq != NULL); + UK_ASSERT(pkt != NULL); + UK_ASSERT(pkt->len < PAGE_SIZE); + + nfdev = to_netfront_dev(n); + + /* get request id */ + uk_semaphore_down(&txq->sem); + local_irq_save(flags); + id = get_id_from_freelist(txq->freelist);What happens when we are out of space and the ring is full? The xmit function should be non-blocking so that we can use it in polling mode. We should return in such a case that the transmission queue is currently full...Ack.+ local_irq_restore(flags); + + /* get request */ + req_prod = txq->ring.req_prod_pvt; + tx_req = RING_GET_REQUEST(&txq->ring, req_prod); > + + /* setup grant for buffer data */ + txq->gref[id] = tx_req->gref = + gnttab_grant_access(nfdev->xendev->otherend_id, + virt_to_mfn(pkt->data), 1); + UK_ASSERT(tx_req->gref != GRANT_INVALID_REF); + + tx_req->offset = 0; + tx_req->size = (uint16_t) pkt->len; + tx_req->flags = 0; + tx_req->id = id; + + txq->ring.req_prod_pvt = req_prod + 1; + wmb(); /* Ensure backend sees requests */ + + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&txq->ring, notify); + if (notify) + notify_remote_via_evtchn(txq->evtchn); > + + status |= UK_NETDEV_STATUS_SUCCESS; + + /* some cleanup */ + local_irq_save(flags); + count = network_tx_buf_gc(txq);The clean-up should happen before enqueuing the given packet to thetransmit queue, it is the first thing of xmit(). Since we anyway need todo it and we do not have an advantage by delaying this operation.Of course there is an advantage. You push the packets in order for themto be processed ASAP by the backend and deal with cleaning up afterwards. It's a kind of parallelization. Otherwise you would delay the transmissions.We can also deal much better with full rings. Otherwise we unnecessarily drop a packets for transmission although there would have been space when clean-up finished. After calling cleaning you can check if there is space on the ring and return if we run out of it (use an `unlikely` to the condition so that we speed up the successful sending case).The only scenario when this would solve anything would be when you havethe ring full and just before you decide to send a new packet thebackend starts processing your previous transmitted packets, leaving youat least one free slot. There are 2 unlikely events here: first, the ring is full, and second, the backend starts to process again when the ring gets full (for the latter event, it's unlikely that the backend will wake up given that it couldn't process 256 transmitted packets, where 256 is the current number of slots). And for this very unlikely context you would introduce a delay for all the transmissions if we followed your suggestion. Anyway, both Linux and FreeBSD do the cleanup after pushing the requests, so I don't think we have much more to say about this.Hum... I double-checked with Intel DPDK's ixgbe (10 Gig) driver:http://git.dpdk.org/dpdk-stable/tree/drivers/net/ixgbe/ixgbe_rxtx.c?h=20.02#n230They clean the slots before sending. I derived my conclusion from there. However I noticed now that there is one minor difference: They do both - cleanup and enqueuing - before they notify the card about changes on thering. In that case it does not add any delay if you swap the order of cleanup and enqueuing. In your case this is probably different because you can cleanup without notifying. Here, I agree with you that we add unnecessary xmit delays by swapping the order.ixgbe is a native driver, so there is no backend driver in that case. It's a different scenario.You could consider the card itself as a backend driver, but fine. There is a ABI difference between ixgbe and netback.However and in general, having a full ring is not an uncommon and unlikely case. To my experience, this can already happen with a singleTCP connection if your network stack is feeding next packets fast enoughand if the current TCP window allows to send them. It also depends on when the driver domain is scheduled; on which CPU core your netfront domain and the driver domain sit; and even on the underlying physical device that might be busy with other traffic.What I tried to say with the events probabilities explanation is that you add delay for the common case in order to remove delay for the much less probable corner case.I got that and agreed.Another thing we are having in mind for uknetdev - and this is probably different to Linux/BSD's implementation - we plan to change the xmit function to do batching (as DPDK does). Sharan could push virtio-net performance further with this. You are able to reduce the number ofnotifies to the backend because you enqueue multiple packets at once anddue to locality of batching you can utilize CPU caches ways better. What about this idea for netfront: We could do first a check foravailable space and - if needed - clean-up just as many extra slots thatyou need to complete the transmission (for now this is anyways just one slot). Then we proceed with enqueuing and do the cleanup of the rest of the slots after we notified the backend. What do you think?This will stay just as it is for the current patches. Feel free to change anything after upstreaming these patches.No, it won't stay like this. At least, you have to call clean-up when the ring is full before you leave the non-blocking function. Otherwise you end up in never getting space back on a full ring, but I think this is clear.Btw, I'm waiting for reviews on the other patches as well before I sendthe v5.Yes, sure.+ local_irq_restore(flags); + + status |= likely(count > 0) ? UK_NETDEV_STATUS_MORE : 0x0; + + return status; +} + static int netfront_rxq_enqueue(struct uk_netdev_rx_queue *rxq, struct uk_netbuf *netbuf) { @@ -598,6 +699,7 @@ static int netfront_add_dev(struct xenbus_device *xendev) } /* register netdev */ + nfdev->netdev.tx_one = netfront_xmit; nfdev->netdev.ops = &netfront_ops; rc = uk_netdev_drv_register(&nfdev->netdev, drv_allocator, DRIVER_NAME); if (rc < 0) {Thanks, Simon
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |