
Re: [PATCH] xen/netfront: Fix TX response spurious interrupts



You probably also want to send this to the Linux kernel mailing list.

Le 10/07/2025 à 18:14, Anthoine Bourgeois a écrit :
> We found at Vates that there are a lot of spurious interrupts when
> benchmarking the Xen PV drivers. This issue appeared with the patch
> that addresses security issue XSA-391 (see Fixes below). On an iperf
> benchmark, spurious interrupts can represent up to 50% of the
> interrupts.
>
> Spurious interrupts are interrupts that are raised for nothing: there is
> no work to do. This happens because the function that handles the
> interrupts ("xennet_tx_buf_gc") is also called at the end of the request
> path to garbage collect the responses received during the transmission
> load.
>
> The request path is doing the work that the interrupt handler should
> have done otherwise. This is particularly true when there is more than
> one vcpu, and it gets worse linearly with the number of vcpus/queues.
>
> Moreover, this problem is amplified by the penalty imposed on a spurious
> interrupt. When an interrupt is found spurious, the interrupt chip
> delays the EOI to slow down the backend. This delay allows more
> responses to be handled by the request path, making it even less likely
> that the next interrupt will find any work to do, which creates a new
> spurious interrupt.
>
> This causes a performance issue. The solution here is to remove the calls
> from the request path and let the interrupt handler do the processing of
> the responses. This approach removes spurious interrupts (<0.05%) and
> also has the benefit of freeing up cycles in the request path, allowing
> it to process more work, which improves performance compared to masking
> the spurious interrupts one way or another.
>
> Some vif throughput performance figures from 8-vCPU, 4GB RAM HVM
> guests:
>
> Without this patch:
> vm -> dom0: 4.5Gb/s
> vm -> vm:   7.0Gb/s
>
> Without XSA-391 patch (revert of b27d47950e48):
> vm -> dom0: 8.3Gb/s
> vm -> vm:   8.7Gb/s
>
> With XSA-391 and this patch:
> vm -> dom0: 11.5Gb/s
> vm -> vm:   12.6Gb/s
>
> Fixes: b27d47950e48 ("xen/netfront: harden netfront against event channel storms")
> Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@xxxxxxxxxx>
> ---
>   drivers/net/xen-netfront.c | 5 -----
>   1 file changed, 5 deletions(-)
>
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 9bac50963477..a11a0e949400 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -638,8 +638,6 @@ static int xennet_xdp_xmit_one(struct net_device *dev,
>       tx_stats->packets++;
>       u64_stats_update_end(&tx_stats->syncp);
>
> -     xennet_tx_buf_gc(queue);
> -
>       return 0;
>   }
>
> @@ -849,9 +847,6 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
>       tx_stats->packets++;
>       u64_stats_update_end(&tx_stats->syncp);
>
> -     /* Note: It is not safe to access skb after xennet_tx_buf_gc()! */
> -     xennet_tx_buf_gc(queue);
> -
>       if (!netfront_tx_slot_available(queue))
>               netif_tx_stop_queue(netdev_get_tx_queue(dev, queue->id));
>

Is there a risk of hitting a condition where the ring is full and the
event channel notification is not raised (which would cause the interrupt
never to be called, and no responses to be collected again)?

Teddy


Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
