
Re: [Xen-devel] (repeatable) cross-domain networking failure



Maybe add some tracing to the backend driver -- it's possible the
backend isn't sending responses for those packets back to domU, and so
things seize up for a while. If no responses are being generated it is
because the backend thinks the packets are still in flight, so there
would be some bug-hunting to find out why that is.

 -- Keir

> 
> Summary:
> 
> After sending some UDP traffic between two xen domains (Domain 0 and 
> Domain 1) the networking between the domains fails. This failure is 100% 
> repeatable.
> 
> In more detail:
> 
> I have two Xen domains. They run the kernels from the 2.0.3 release. (I've
> run into the same problem with 2.0.1 as well.) Domain 0 has 5 physical
> ethernet interfaces, and a virtual interface to Domain 1. Domain 1 has just
> the virtual interface to Domain 0.
> 
> D0 is configured with IP address 192.168.0.1, and D1 with 192.168.1.1. The 
> netmask is set to 255.255.0.0.
> 
> When I bring up D1, I can ping D1 from D0, ssh into D1, etc.
> 
> I then start a UDP server in D0, and a traffic generator in D1. After the
> traffic generator sends its 128th packet, networking between the domains
> fails. The 128th packet is received successfully by the UDP server, but no
> later traffic of any kind (UDP, TCP, ICMP, or ARP) arrives in D0.
> 
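The test described above could be reproduced with something like the following. This is only a minimal sketch: the original server and traffic generator are not shown in the report, so the port number (9999), the 64-byte payload, and the 4-byte sequence number at the start of each packet are all assumptions made for illustration.

```python
import socket
import time


def make_server_socket(host="0.0.0.0", port=9999, timeout=2.0):
    """Create a bound UDP socket for the receiving side (D0 in the report)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(timeout)
    return sock


def receive_packets(sock, expected):
    """Collect sequence numbers until `expected` packets arrive or a timeout hits."""
    seen = []
    while len(seen) < expected:
        try:
            data, _addr = sock.recvfrom(2048)
        except socket.timeout:
            break  # nothing arrived within the timeout: report what we have
        # The first 4 bytes carry the sender's sequence number (an assumption
        # of this sketch, not something the original generator is known to do).
        seen.append(int.from_bytes(data[:4], "big"))
    return seen


def send_packets(host, port, count, payload_size=64, interval=0.001):
    """Send `count` small numbered UDP packets (the D1 side in the report)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for seq in range(1, count + 1):
            payload = seq.to_bytes(4, "big") + b"x" * (payload_size - 4)
            sock.sendto(payload, (host, port))
            time.sleep(interval)
```

Running `receive_packets(make_server_socket(), 200)` in D0 and `send_packets("192.168.0.1", 9999, 200)` in D1 would show the highest sequence number that reached the user-level process, which makes it easy to confirm whether the stall always lands on the same packet.
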
> Looking at the interrupt counts in /proc/interrupts, I see that D0 no longer
> receives packets sent by D1. D1, however, does receive packets sent by D0.
> (To be clear, D0->D1 traffic is ICMP ping requests, unrelated to the UDP
> traffic. There is no UDP traffic sent from D0 to D1.)
> 
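Comparing interrupt counts by eye is error-prone, so a small helper along these lines may be useful. It is a sketch that assumes the usual /proc/interrupts layout (a header row naming the CPUs, then one row per interrupt with a label, a colon, and one count per CPU); the function names are made up for this example.

```python
def parse_interrupts(text):
    """Map each interrupt label to its count summed over all CPUs."""
    counts = {}
    lines = text.splitlines()
    if not lines:
        return counts
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an interrupt row
        total = 0
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller/device description
            total += int(field)
        counts[label.strip()] = total
    return counts


def changed_counts(before, after):
    """Return {label: increase} for every interrupt whose count changed."""
    return {
        label: after[label] - before.get(label, 0)
        for label in after
        if after[label] != before.get(label, 0)
    }


def read_interrupts(path="/proc/interrupts"):
    """Take one snapshot of the interrupt counters."""
    with open(path) as f:
        return parse_interrupts(f.read())
```

Taking one `read_interrupts()` snapshot before sending a burst of traffic and another afterwards, then passing both to `changed_counts`, shows which interrupt lines are still advancing in each domain.
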
> (I suspect the stuff in this paragraph doesn't matter, but include it for
> completeness.) Eventually, D0's ARP cache entry for D1 expires. D0 ARPs for
> D1, and D1 replies. But D0 never receives these replies. And eventually, D1
> stops replying to the ARPs entirely. (D1's sending behavior is observed via
> tcpdump running in the console connection to D1.)
> 
> Note that the networking failure only occurs if the UDP packets are delivered
> to a user-level process in D0. In particular, UDP traffic to D0's kernel NFS
> server does not induce the failure. Nor does traffic sent to D0 for which
> there is no user process to accept the packets. And neither does traffic
> which is forwarded on to other hosts via NAT. (I haven't tested the regular
> forwarding case.)
> 
> Also, for what it's worth, Domain 0's network connectivity on its other
> interfaces (which are connected to the world at large) is unaffected.
> 
> Looking through the mailing list archive, I saw a prior bug that seemed 
> similar, but involved IP fragmentation. That is not the case here, as the UDP 
> packets sent by D1 are small (<100 bytes).
> 
> Any suggestions for debugging this?
> 
> Thanks,
> mukesh
> 
> 
> -------------------------------------------------------
> The SF.Net email is sponsored by: Beat the post-holiday blues
> Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
> It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/xen-devel
> 





 

