[Xen-users] pv_ops kernel and network problems (checksum offloading?)
Hi list,

I'm experiencing some very strange network problems when using a masquerading router domU with pv_ops kernels.

First of all, here is some ASCII art explaining my network configuration:

                    +---------------------+
                 +--|-eth0   domU2   eth1-|-----+
                 |  +---------------------+     |
                 |                              |
                 |  +---------------------+     |
                 |  |        domU1   eth1-|--+  |
                 |  +---------------------+  |  |
                 |                           |  |
         +-------|---------------------------|--|--------+
         |       | vif2.0             vif1.1 |  | vif2.1 |
Internet |       |                           |  |        |
   <-----|----- brexternal      dom0      brinternal     |
         | eth0                                          |
         +-----------------------------------------------+

domU1 intentionally has no internet connection, and domU2 acts as masquerading router for the internal network. The configuration is very basic; on domU2 I've issued the following commands:

  # echo 1 > /proc/sys/net/ipv4/ip_forward
  # iptables -A POSTROUTING -t nat -s <internal/net> -j MASQUERADE

Now the problems:

1. ICMP

When I try to ping an internet host from domU1, the dom0 kernel logs the following message for every ICMP echo request packet domU1 tries to send:

--- cut ---
Attempting to checksum a non-TCP/UDP packet, dropping a protocol 1 packet
--- cut ---

IP protocol 1 is ICMP, so this matches. Using tcpdump I've been able to follow the ping packets along their way:

  domU1-eth1 -> vif1.1 -> brinternal -> vif2.1 -> domU2-eth1 -> domU2-eth0

The packet never reaches vif2.0 - it gets dropped somewhere in between (judging by the message I see, I would expect the dom0 kernel to be the problem). Issuing the same ping command directly on domU2 works without any problems.

2. TCP

When I try to connect to an internet host by TCP from domU1, I see very odd behavior: the TCP SYN packet leaves dom0 on eth0 as desired and reaches the remote host.
But the remote host never responds with a SYN/ACK packet, so I took a deeper look with tcpdump and Wireshark: the packet *seems* to leave dom0 eth0 with a correct TCP checksum but arrives at the remote host with the TCP checksum ALWAYS set to 0xeeee, which is of course wrong, so the remote host drops the SYN packet. But I'm quite sure the packet actually leaves dom0 with the wrong checksum already.

Next I remembered the early Xen 3 days, when we were forced to use ethtool to disable checksum offloading everywhere, so I did the same: I used "ethtool -K <interface> tx off" for EVERY interface in the communication path (domU1-eth1, vif1.1, brinternal, vif2.1, domU2-eth1, domU2-eth0, vif2.0, brexternal and dom0-eth0). The only effect is that I now see the packet leaving dom0 at eth0 with a wrong checksum (0xeeee).

I have no problem connecting to this host directly from domU2.

My system configuration:

  Debian lenny amd64 everywhere
  Xen 3.4.2 (Debian unstable, built for lenny)
  dom0 kernel: pv_ops from Jeremy's tree (changeset 8735edb4a976105fd29c97c00c6d14760537e4ee)
  domU kernel: pv_ops 2.6.29-2 (from Debian unstable)
    (I would like to go to a newer kernel, but there's that other nasty bug :))

This looks like some sort of checksum offloading bug in the pv_ops kernel tree that kicks in when using a domU to route (and masquerade) other traffic.

Any ideas?

Regards,
Markus

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
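
For reference, the per-interface "ethtool -K <interface> tx off" step described in the post could be scripted roughly as follows. This is only a sketch under assumptions: the function name disable_tx_offload and the DRY_RUN variable are made up for illustration and do not come from the post, and the interface list covers only the dom0 side of the path. The eth0/eth1 interfaces of domU1 and domU2 would have to be handled from inside those domains.

```shell
#!/bin/sh
# Sketch: turn off TX checksum offload on a list of interfaces.
# disable_tx_offload and DRY_RUN are illustrative names (assumptions),
# not something taken from the original post.
# With DRY_RUN=1 (the default here) the commands are only printed.

disable_tx_offload() {
    for iface in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show what would be executed.
            echo "ethtool -K $iface tx off"
        else
            # Real run: needs root and the ethtool binary.
            ethtool -K "$iface" tx off
        fi
    done
}

# dom0 part of the path named in the post.
disable_tx_offload vif1.1 brinternal vif2.1 vif2.0 brexternal eth0
```

Run as root with DRY_RUN=0 to actually apply the settings; "ethtool -k <interface>" (lowercase k) then lists the current offload state of an interface, so the result can be checked per hop.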