Re: [Xen-users] Network Issues on Migration
On Fri, Jan 09, 2009 at 02:17:34PM -0500, Wendell Dingus wrote:
>
> I've read and experimented extensively and being in desperate need of
> "finishing" this setup and getting it deployed live, would like to see if
> anyone has any suggestions on the last hangup we seem to have.
>
> Two SuperMicro 1U servers with dual quad-core CPUs and 16GB RAM each. CentOS
> 5.2 x86_64 and its Xen implementation. The only thing non "stock" CentOS at
> this point are the Intel IGB drivers. The RHEL/CentOS drivers for Intel IGB
> appear to have a bug with DHCP over a bridged interface which the latest
> drivers downloaded straight from Intel cured for us.
>
> Anyway, both are attached to shared FC storage and are doing RHCS with both
> IP and disk-based quorum. CLVMD with a shared VG for creating LVs in as
> containers for VMs. That part is all working very well.
>
> Each DOM0 has 2 physical NICs and both are bridged. Additionally we added a
> virbr0 as a bridged per-DOM0 local network as well.
>
> When any VM boots up it can ping and traceroute on any of its respective
> networks perfectly. Inbound/outbound data flow of any kind appears perfect as
> well. Once a VM is migrated or live-migrated to the other DOM0 though the
> ability to ping or traceroute ceases. Sessions via ssh or httpd either
> inbound or outbound continue to work fine though.
>
> When a VM boots I see this in dmesg:
> netfront: Initialising virtual ethernet driver.
> netfront: device eth0 has flipping receive path.
>
> I read something about a CRC problem and had each of them do "ethtool -K
> eth{n} tx off" but don't think that was necessary in this instance, I've
> never seen any error messages about CRC errors. The described problem and
> solution I followed was not heavily detailed and it was just an attempt to
> see if that helped with the problem.
>
> The following was added to the end of /etc/sysctl.conf on both DOM0's only
> (per the excellent wiki article):
> net.ipv4.icmp_echo_ignore_broadcasts = 1
> net.ipv4.conf.all.accept_redirects = 0
> net.ipv4.conf.all.send_redirects = 0
>
> The other oddity about this is that a VM started on server1 and live migrated
> to server2, a running ping only pauses a short while then picks right back up
> and continues to be successful. Migrating it back to server1 or initially
> starting a VM on server2 and migrating it to server1 is where the ping
> "stuck" issue comes into play. We were very careful and documented well as we
> installed both boxes, in an attempt to keep them as identical as possible. I
> fear this behavior proves that's not the case though, ugh...
>
> After migrating from 2 to 1 and then trying a ping (and waiting a good long
> while before ctrl-c'ing this):
> PING 192.168.77.1 (192.168.77.1) 56(84) bytes of data.
> 64 bytes from 192.168.77.1: icmp_seq=1 ttl=64 time=0.000 ms
>
> --- 192.168.77.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.000/0.000/0.000/0.000 ms
>
> Very strange... Additionally a "service network restart" at this point
> results in all interfaces going down, loopback being reinitialized and then
> it hangs on trying to bring up eth0. I can ctrl-c it three times as it pauses
> on each interface, then "ifconfig" and see all the IPs are still there. Still
> can't ping but can "telnet google.com 80" for instance. Odd...
>
> So anyway, any pointers or suggestions you might have, would be greatly
> appreciated...
>

https://www.redhat.com/archives/rhelv5-announce/2008-October/msg00000.html

Some entries from the RHEL 5.3 beta changelog:

+ Timer problems after migration were fixed
+ Lengthy network outage after migrations was fixed

Dunno if that's what you're seeing..

-- Pasi

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
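
A possible way to check whether the "timer problems after migration" changelog entry is relevant: the ping output quoted in the thread reports time=0.000 ms and then stalls while TCP sessions keep working, which would be consistent with (but does not prove) the guest clock misbehaving after a migration. The sketch below is a hypothetical check and not something posted in the thread. It assumes Python 3.3 or later is available in the guest (for time.monotonic); the function names and the 0.5 second tolerance are made up for illustration.

```python
import time


def measure_clock_deltas(interval_s=1.0):
    """Sleep for interval_s and return (wall_delta, monotonic_delta) in seconds."""
    wall_start = time.time()
    mono_start = time.monotonic()
    time.sleep(interval_s)
    return time.time() - wall_start, time.monotonic() - mono_start


def clock_looks_suspect(measured_delta_s, requested_s, tolerance_s=0.5):
    """Return True if a clock advanced by an amount that differs from the
    requested sleep by more than tolerance_s (an arbitrary illustrative value)."""
    return abs(measured_delta_s - requested_s) > tolerance_s


if __name__ == "__main__":
    requested = 2.0
    wall, mono = measure_clock_deltas(requested)
    print("wall clock advanced by %.3f s" % wall)
    print("monotonic clock advanced by %.3f s" % mono)
    if clock_looks_suspect(wall, requested) or clock_looks_suspect(mono, requested):
        print("clock advance does not match the sleep interval; timer may be off")
    else:
        print("clocks advanced as expected over this interval")
```

Running this inside the guest once before and once after a migration from server2 to server1 would show whether either clock stops tracking the sleep interval on the affected host. A clean result does not rule out other causes.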