
[Fwd: [Xen-users] Xen, NIC bonding, ARP problem]




I wanted to link to this post of mine but could not find it in the archive; I don't know whether there was a problem with the archive. So here's a re-post.

-------- Original Message --------
Subject: [Xen-users] Xen, NIC bonding, ARP problem
Date: Fri, 30 Mar 2007 10:05:45 +0200
From: Dominik Klein <dk@xxxxxxxxxxxxxxxx>
To: xen-users@xxxxxxxxxxxxxxxxxxx

Hi xen-users

Xen seems to have problems with NIC bonding and the ARP protocol. I
experienced exactly the same behaviour as described here:
http://arcknowledge.com/gmane.comp.emulators.xen.user/2005-10/msg00154.html

In short:
A ping from domU to a host on the same network only starts working after
about 10 seconds (the delay varies, but it is fairly certain to work after
10 seconds).
tcpdump shows ARP replies from the target, but arp -a in domU shows that
they do not reach domU immediately. Replies need to be sent a couple of
times before they get to domU.
tcpdump also shows that each ARP request is sent twice.

Hard-coding MAC-addresses to the arp-table in domU solves this problem,
but that does not seem like a good solution.
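
For reference, the hard-coding workaround mentioned above could look roughly like the sketch below. It only prints the iproute2 command instead of running it, so it can be reviewed first. The function name static_arp_cmd and the address and MAC values are hypothetical placeholders, not taken from the setup described in this post.

```shell
#!/bin/sh
# Print (do not run) the command that pins a permanent neighbour (ARP)
# entry inside domU. All arguments are supplied by the caller.
#   $1 = IPv4 address of the peer
#   $2 = MAC address of the peer
#   $3 = interface inside domU (defaults to eth0)
static_arp_cmd() {
    peer_ip=$1
    peer_mac=$2
    dev=${3:-eth0}
    printf 'ip neigh replace %s lladdr %s dev %s nud permanent\n' \
        "$peer_ip" "$peer_mac" "$dev"
}

# Example with made-up values (documentation address range, hypothetical MAC):
static_arp_cmd 192.0.2.10 00:16:3e:00:00:01 eth0
```

Running the printed command as root in domU would add the static entry; it has to be repeated for every peer, which is why it is only a stopgap.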

Just to be sure we are talking about the same thing: I am talking about
"active-backup" bonding.
(see /usr/src/linux/Documentation/networking/bonding.txt
or e.g. http://www.mjmwired.net/kernel/Documentation/networking/bonding.txt)

The network-script used to configure Xen for NIC bonding is listed here:
http://lists.xensource.com/archives/html/xen-users/2006-04/msg00186.html

And I think I have found the reason why the described problems happen:

In a normal Xen setup, the network looks like this:
domU (say ID=1) sees eth0
this eth0 is represented as vif1.0 in dom0 and connected to xenbr0,
which goes "out" through peth0

brctl show
bridge name     bridge id               STP enabled     interfaces
xenbr0          8000.feffffffffff       no              vif0.0
                                                        peth0
                                                        vif1.0

Neither xenbr0 nor vif1.0 nor peth0 reply to ARP requests. ARP is
completely handled by domU.

excerpt from ip addr list
xenbr0: <BROADCAST,NOARP,UP>
vif1.0: <BROADCAST,NOARP,UP>
peth0: <BROADCAST,NOARP,UP>

Furthermore, none of these interfaces has an IP-address assigned (in dom0).

In an active-backup NIC bonding setup, both NICs used for the bonding
device are configured with the same MAC address, but only the currently
active one has the ARP flag set (ip link set $dev arp on).
In my case, bond0 is made of eth0 and eth2. eth2 is the currently active
NIC.
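
A small sketch for checking which slave is currently active, assuming the bonding driver in use exposes its state through sysfs (class/net/<bond>/bonding/active_slave). The function name active_slave and the optional second argument for an alternative sysfs root are illustrative additions, not part of the original setup.

```shell
#!/bin/sh
# Print the currently active slave of a bonding device.
#   $1 = bond name (defaults to bond0)
#   $2 = sysfs root (defaults to /sys; overridable, e.g. for testing)
active_slave() {
    bond=${1:-bond0}
    root=${2:-/sys}
    f="$root/class/net/$bond/bonding/active_slave"
    # Fail with a message if the bond or its sysfs entry does not exist.
    [ -r "$f" ] || { echo "no bonding info for $bond" >&2; return 1; }
    cat "$f"
}

# Usage: active_slave bond0
```

On the setup described here, the call would be expected to report eth2 as long as that NIC stays active.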

So the network looks like this in a bonding setup with xen:
domU (again, ID=1) sees eth0
this eth0 is represented as vif1.0 in dom0 and connected to xenbr0,
which goes "out" through bond0

brctl show xenbr0
bridge name     bridge id               STP enabled     interfaces
xenbr0          8000.000423c0b33c       no              vif0.0
                                                        bond0
                                                        vif1.0

Now comes the tricky part. ARP is *not* deactivated for xenbr0 and bond0.

excerpt from ip addr list
xenbr0: <BROADCAST,MULTICAST,UP>
vif1.0: <BROADCAST,NOARP,UP>
bond0: <BROADCAST,MULTICAST,MASTER,UP>
eth0: <BROADCAST,MULTICAST,NOARP,SLAVE,UP>
eth2: <BROADCAST,MULTICAST,SLAVE,UP>
remember: eth2 is the currently active card in bond0!
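
To compare the two setups quickly, the flag excerpts can be filtered for interfaces that still answer ARP. The following is a minimal sketch that expects the trimmed "name: <FLAGS>" form used in the excerpts above, not the full output of ip addr list; the function name arp_enabled_ifaces is a hypothetical helper.

```shell
#!/bin/sh
# Read lines of the form "name: <FLAG,FLAG,...>" on stdin and print the
# names of the interfaces whose flag list does NOT contain NOARP.
# Lines that do not match that form are ignored.
arp_enabled_ifaces() {
    awk -F': ' '/^[^ ]+: </ {
        flags = $2
        gsub(/[<>]/, "", flags)          # strip the angle brackets
        n = split(flags, f, ",")
        noarp = 0
        for (i = 1; i <= n; i++) if (f[i] == "NOARP") noarp = 1
        if (!noarp) print $1
    }'
}

# Usage: paste the excerpt into a file and run
#   arp_enabled_ifaces < flags.txt
```

Fed with the bonding excerpt above, this lists xenbr0, bond0 and eth2, while for the normal Xen setup excerpt it prints nothing.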

And also: bond0 AND xenbr0 have IP addresses assigned. This seemed weird
to me at first, but I didn't actually know what to do about it.
Another weird thing in comparison to a "normal xen network setup" is
that the network-bridge-bonding script does not create peth[02], but
keeps using bond0 "as is".

So here's what I already tried to solve this problem:

ip link set bond0 arp off
no difference

ip link set xenbr0 arp off
no difference in domU
dom0 can no longer talk to unknown hosts, as it does not do any ARP any more

ip addr flush dev bond0
no difference

ip addr flush dev xenbr0
no difference in domU
dom0 can no longer do any IP networking, as xenbr0 is the device the
routes are set for.
Adding appropriate routes for bond0 (which still has the addresses) does
not solve this problem.

So if anybody has an idea how to get NIC bonding to work together with Xen,
please let me know. If you need any more information, just ask. I know
this is a fairly complex situation but I would really appreciate some
help here.

For completeness: I am using openSuSE 10.2 with Xen 3.0.4 and kernel
2.6.16.33-xen in dom0 and domU.

Regards
Dominik

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

