[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH net-next 2/2] xen-netback: Turn off the carrier if the guest is not able to receive



On 05/08/14 13:45, Wei Liu wrote:
On Mon, Aug 04, 2014 at 04:20:58PM +0100, Zoltan Kiss wrote:
[...]
  struct xenvif {
diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index fbdadb3..48a55cd 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -97,7 +101,16 @@ int xenvif_poll(struct napi_struct *napi, int budget)
  static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id)
  {
        struct xenvif_queue *queue = dev_id;
+       struct netdev_queue *net_queue =
+               netdev_get_tx_queue(queue->vif->dev, queue->id);

+       /* QUEUE_STATUS_RX_PURGE_EVENT is only set if either QDisc was off OR
+        * the carrier went down and this queue was previously blocked
+        */

Could you change "blocked" to "stalled" so that the comment matches the
code closely?
Ok

@@ -1935,6 +1934,75 @@ static void xenvif_start_queue(struct xenvif_queue 
*queue)
                xenvif_wake_queue(queue);
  }

+/* Only called from the queue's thread, it handles the situation when the guest
+ * doesn't post enough requests on the receiving ring.
+ * First xenvif_start_xmit disables QDisc and start a timer, and then either 
the
+ * timer fires, or the guest send an interrupt after posting new request. If it
+ * is the timer, the carrier is turned off here.
+ * */

Please remove that extra "*".
Ok

+static void xenvif_rx_purge_event(struct xenvif_queue *queue)
+{
+       /* Either the last unsuccesful skb or at least 1 slot should fit */
+       int needed = queue->rx_last_skb_slots ?
+                    queue->rx_last_skb_slots : 1;
+
+       /* It is assumed that if the guest post new slots after this, the RX
+        * interrupt will set the QUEUE_STATUS_RX_PURGE_EVENT bit and wake up
+        * the thread again
+        */

Basically in this state machine you have a tuple (RX_STALLED bit,
PURGE_EVENT bit, carrier state). This whole state transaction is very
scary, any chance you can draw a graph like the xenbus state machine in
xenbus.c?

I fear that after three month noone can easily understand this code
unless he / she spends half a day reading the code. And without defining
what state is legal it's very hard to tell what behavior is expected and
what is not.
Ok


+       set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);
+       if (!xenvif_rx_ring_slots_available(queue, needed)) {
+               rtnl_lock();
+               if (netif_carrier_ok(queue->vif->dev)) {
+                       /* Timer fired and there are still no slots. Turn off
+                        * everything except the interrupts
+                        */
+                       netif_carrier_off(queue->vif->dev);
+                       skb_queue_purge(&queue->rx_queue);
+                       queue->rx_last_skb_slots = 0;
+                       if (net_ratelimit())
+                               netdev_err(queue->vif->dev, "Carrier off due to lack of 
guest response on queue %d\n", queue->id);

Line too long.
Ok


@@ -1944,8 +2012,12 @@ int xenvif_kthread_guest_rx(void *data)
                wait_event_interruptible(queue->wq,
                                         rx_work_todo(queue) ||
                                         queue->vif->disabled ||
+                                        test_bit(QUEUE_STATUS_RX_PURGE_EVENT, 
&queue->status) ||

Line too long.
Ok


                                         kthread_should_stop());

+               if (kthread_should_stop())
+                       break;
+
                /* This frontend is found to be rogue, disable it in
                 * kthread context. Currently this is only set when
                 * netback finds out frontend sends malformed packet,
@@ -1955,24 +2027,21 @@ int xenvif_kthread_guest_rx(void *data)
                 */
                if (unlikely(queue->vif->disabled && queue->id == 0))
                        xenvif_carrier_off(queue->vif);

I think you also need to check vif->disabled flag in your following code
so that you don't accidently re-enable a rogue vif in a queue whose id
!= 0.
Yes.

Further more "disabled" can be transformed to a bit in vif->status.
You can incorporate such change in your previous patch or a separate
prerequisite patch.
Yes, I've already done that on my non-multiqueue branch.

-
-               if (kthread_should_stop())
-                       break;
-
-               if (queue->rx_queue_purge) {
+               else if 
(unlikely(test_and_clear_bit(QUEUE_STATUS_RX_PURGE_EVENT,
+                                                    &queue->status))) {
+                       xenvif_rx_purge_event(queue);
+               } else if (!netif_carrier_ok(queue->vif->dev)) {
+                       /* Another queue stalled and turned the carrier off, so
+                        * purge the internal queue of queues which were not
+                        * blocked
+                        */

"blocked" -> "stalled"?
Ok

In theory even one queue stalls all other queues can still make
progress, isn't it?
This patch makes sure that if a queue is stalled, none of the others can transmit, even if they would be able to do so. It is documented at the definition of QUEUE_STATUS_RX_STALLED.


Wei.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.