Re: [Xen-devel] (repeatable) cross-domain networking failure
Nivedita Singhvi <niv@xxxxxxxxxx> wrote:

> I don't have boxes at the moment and can't reproduce till Monday,
> but can you show us the output of netstat -uan and netstat -s
> on both domains? Is there stuff in the receive or send queues?

The detailed output of netstat follows. But there is nothing in the send queue on domU, and nothing in the receive queue on dom0. (The UDP server in question is running on port 2000.)

On dom0:

    $ netstat -uan
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    udp        0      0 0.0.0.0:1024            0.0.0.0:*
    udp        0      0 0.0.0.0:2049            0.0.0.0:*
    udp        0      0 0.0.0.0:514             0.0.0.0:*
    udp        0      0 0.0.0.0:1027            0.0.0.0:*
    udp        0      0 155.98.36.34:1028       155.98.32.70:8509       ESTABLISHED
    udp        0      0 0.0.0.0:775             0.0.0.0:*
    udp        0      0 0.0.0.0:653             0.0.0.0:*
    udp        0      0 192.168.0.1:2000        192.168.1.1:1024        ESTABLISHED
    udp        0      0 224.4.0.1:2917          0.0.0.0:*
    udp        0      0 224.4.0.1:2917          0.0.0.0:*
    udp        0      0 224.4.0.1:2917          0.0.0.0:*
    udp        0      0 0.0.0.0:111             0.0.0.0:*
    udp        0      0 0.0.0.0:759             0.0.0.0:*

On domU:

    # netstat -uan
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    udp        0      0 192.168.1.1:1024        192.168.0.1:2000        ESTABLISHED

The netstat -s output is a bit long, so I've attached it instead of including it inline.

> And was all the udp traffic going to the same port? i.e. any
> successful udp traffic to another endpoint?

All the traffic was going to port 2000. Trying to send UDP traffic from domU to a different port in dom0 (after the networking failure) does not succeed. (If you're asking whether traffic could be sent to multiple ports while the networking is functional, I believe the answer is yes, but I would need to double-check.)

> What does ifconfig on dom0 show? Are there any error messages
> in /var/log/messages?

    $ ifconfig vif1.0
    vif1.0    Link encap:Ethernet  HWaddr AA:00:01:7B:92:C2
              inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.0.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:134 errors:0 dropped:0 overruns:0 frame:0
              TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:5884 (5.7 Kb)  TX bytes:676 (676.0 b)

    $ sudo tail /var/log/messages
    Jan 16 19:34:09 node1 ntpd[993]: kernel time sync disabled 0041
    Jan 16 19:35:15 node1 ntpd[993]: kernel time sync enabled 0001
    Jan 16 19:39:29 node1 ntpd[993]: synchronized to 155.98.33.74, stratum=2
    Jan 16 19:49:07 node1 ntpd[993]: time correction of -18001 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
    Jan 16 19:59:15 node1 sshd(pam_unix)[1457]: session opened for user mukesh by (uid=30245)
    Jan 16 19:59:18 node1 sshd(pam_unix)[1486]: session opened for user mukesh by (uid=30245)
    Jan 16 19:59:30 node1 sshd(pam_unix)[1517]: session opened for user mukesh by (uid=30245)
    Jan 16 20:09:29 node1 modprobe: modprobe: Can't open dependencies file /lib/modules/2.4.27-xen0/modules.dep (No such file or directory)
    Jan 16 20:09:44 node1 last message repeated 2 times
    Jan 16 20:16:02 node1 kernel: device vif1.0 entered promiscuous mode

>> Looking at the interrupt counts in /proc/interrupts, I see that
>> D0 no longer receives packets sent by D1. D1, however, does
>> receive packets sent by D0. (To be clear, D0->D1 traffic is ICMP
>> ping requests, unrelated to the UDP traffic. There is not UDP
>> traffic sent from D0 to D1.)
>
> Is there any other successful traffic from D0 -> D1 (tcp?)

Any traffic from D0 to D1 is successful, even after the network stops working. This includes ICMP, UDP, and TCP. (Sorry if my comment about "There is not UDP traffic sent from D0 to D1" was confusing. What I meant was that I wasn't sending any UDP traffic from D0 to D1, not that such traffic fails.)

This is subject to the limitation mentioned in my first message.
Namely, that dom0's ARP cache entry for domU eventually times out. At that point, dom0 attempts to ARP for domU's MAC. domU sees this, and replies (as seen by tcpdump on domU). But dom0 never gets the ARP replies, so eventually D0->D1 traffic fails as well. (E.g. "telnet 192.168.1.1" returns "No route to host".)

Also, let me add some more detail to my original report:

1. The networking fails after the 128th UDP packet received in dom0, even if I restart domU. Specifically:
   - If I send one UDP packet from domU to dom0, shut down domU, and start a fresh domU, then I can only send 127 (rather than 128) UDP packets from the new domU before networking will fail.
   - If I shut down domU after the networking failure, and start a new domU, networking between the new domU and dom0 does not work.

2. The server run in dom0 is

       nc -l -u -p 2000

3. The traffic generator run in domU is

       i=0;
       while true; do
           ((++i));
           echo $i
           echo $i | nc -u -w 1 192.168.0.1 2000
       done &

thanks,
mukesh

Attachment: netstat-dom0.txt
Attachment: netstat-domU.txt
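
For anyone trying to pin down the 128-packet threshold, a bounded variant of the traffic generator may be easier to work with than the infinite loop. The following is only a sketch: the function name send_burst and the SEND_CMD override are hypothetical and were not part of the original test setup; the nc flags are the ones from the generator in the report.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): send exactly COUNT
# numbered UDP datagrams to HOST:PORT, then print how many were sent.
# SEND_CMD is an assumed override for the sender command; by default it is
# the same "nc -u -w 1" invocation used by the original traffic generator.
send_burst() {
    count="$1"
    host="$2"
    port="$3"
    i=0
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        # One datagram per iteration, carrying the sequence number.
        echo "$i" | ${SEND_CMD:-nc -u -w 1} "$host" "$port"
    done
    echo "sent $i datagrams"
}
```

From a fresh domU, something like "send_burst 127 192.168.0.1 2000" followed by single-packet bursts would show whether the failure really lands on the 128th datagram counted by dom0.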
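
The send/receive queue check asked about above can also be scripted so it is easy to repeat on both domains. This is a minimal sketch, assuming the classic net-tools "netstat -uan" column layout shown in this message (Proto, Recv-Q, Send-Q, Local Address, Foreign Address); the function name udp_queues is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): read "netstat -uan"
# output on stdin and print the Recv-Q/Send-Q values of every UDP socket
# whose local or foreign address ends in the given port.
udp_queues() {
    port="$1"
    awk -v port="$port" '
        $1 == "udp" && ($4 ~ (":" port "$") || $5 ~ (":" port "$")) {
            # Column 2 is Recv-Q, column 3 is Send-Q.
            status = ($2 + $3 > 0) ? "QUEUED" : "empty"
            printf "%s -> %s recv=%d send=%d %s\n", $4, $5, $2, $3, status
        }
    '
}

# Usage on a live system:
#   netstat -uan | udp_queues 2000
```

A line marked QUEUED on dom0 would mean datagrams reached the socket but were not read; with the output in this report, both ends show empty queues.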