Re: [Xen-users] using ipoib with xcp
On Fri, Apr 16, 2010 at 01:14:46PM +0200, Trygve Sanne Hardersen wrote:
> I've looked a bit into the Open vSwitch source code and it seems to me
> like MAC addresses can only be 6 bytes, but the IB addresses are 20 bytes.
> I'm also seeing this in the Open vSwitch log:
>
> |00043|bridge|INFO|created port ib0 on bridge brib0
> |00044|dpif|WARN|dp0: failed to add ib0 as port: Invalid argument
> |00045|bridge|ERR|failed to add ib0 interface to dp0: Invalid argument
> |00046|bridge|ERR|ib0 interface not in dp0, dropping
> |00047|bridge|ERR|ib0 port has no interfaces, dropping
>
> I've tried to report this on the Open vSwitch discuss list, but my
> messages do not seem to get through.

Did you subscribe to the list?

-- Pasi

> Thanks!
> Trygve
>
> On Wed, Apr 14, 2010 at 6:23 PM, Trygve Sanne Hardersen
> <[1]trygve@xxxxxxxxxxxxx> wrote:
>
> I've finally got to spend some time looking further into this.
> I now believe the underlying problem is that Open vSwitch is unable to
> connect the brib0 bridge interface to the ib0 physical interface. I
> suspect the cause of this to be the long MAC address of the Infiniband
> NICs, but so far I have not found a workaround for the issue.
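The 6-byte versus 20-byte suspicion can be checked directly from sysfs, which exposes the hardware address length and ARP hardware type of each interface. The following is only a minimal sketch: the helper names link_info and fits_ethernet_bridge are made up for illustration, and the idea that a switch limited to 6-byte MACs will refuse anything else is the assumption under discussion in this thread, not a confirmed property of Open vSwitch.

```python
from pathlib import Path

ETHERNET_ADDR_LEN = 6      # bytes in an Ethernet MAC address
ARPHRD_ETHER = 1           # Linux ARP hardware type for Ethernet
ARPHRD_INFINIBAND = 32     # Linux ARP hardware type for InfiniBand


def link_info(iface, sysfs_root="/sys/class/net"):
    """Return (address, addr_len, arp_type) for a network interface.

    Reads the standard sysfs attributes 'address', 'addr_len' and 'type'.
    sysfs_root is a parameter only so the function can be pointed at a
    test directory.
    """
    base = Path(sysfs_root) / iface
    address = (base / "address").read_text().strip()
    addr_len = int((base / "addr_len").read_text())
    arp_type = int((base / "type").read_text())
    return address, addr_len, arp_type


def fits_ethernet_bridge(iface, sysfs_root="/sys/class/net"):
    """True if the interface has an Ethernet-type, 6-byte address.

    Hypothetical check: assumes the bridge only accepts such ports.
    """
    _, addr_len, arp_type = link_info(iface, sysfs_root)
    return arp_type == ARPHRD_ETHER and addr_len == ETHERNET_ADDR_LEN
```

On a dom0 with OFED loaded, calling link_info("ib0") and link_info("eth0") side by side should make the difference in address length and hardware type visible without relying on ifconfig.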
> These are the relevant devices for my setup:
>
> [root@hypoxcp1 ~]# ifconfig
> brib0     Link encap:Ethernet  HWaddr 80:00:00:48:FE:80
>           inet addr:10.1.2.2  Bcast:10.1.2.255  Mask:255.255.255.0
>           inet6 addr: fe80::8200:ff:fe48:fe80/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b)  TX bytes:720 (720.0 b)
>
> eth0      Link encap:Ethernet  HWaddr 00:30:48:CC:5C:A4
>           inet6 addr: fe80::230:48ff:fecc:5ca4/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:31755 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:10544 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:4284224 (4.0 MiB)  TX bytes:1433336 (1.3 MiB)
>
> ib0       Link encap:InfiniBand  HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>           inet addr:10.1.2.102  Bcast:10.1.2.255  Mask:255.255.255.0
>           UP BROADCAST MULTICAST  MTU:2044  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:128
>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
> xenbr0    Link encap:Ethernet  HWaddr 00:30:48:CC:5C:A4
>           inet addr:10.1.1.2  Bcast:10.1.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::230:48ff:fecc:5ca4/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:25892 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:10538 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:3586527 (3.4 MiB)  TX bytes:1432868 (1.3 MiB)
>
> The ifconfig command reports the wrong (or truncated) MAC address for
> the ib0 device.
> The real address can be found using other commands:
>
> [root@hypoxcp1 ~]# cat /sys/class/net/ib0/address
> 80:00:00:48:fe:80:00:00:00:00:00:00:00:30:48:ff:ff:cc:0b:25
> [root@hypoxcp1 ~]# ip link show ib0
> 4: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast qlen 128
>     link/infiniband 80:00:00:48:fe:80:00:00:00:00:00:00:00:30:48:ff:ff:cc:0b:25 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>
> As mentioned earlier in this thread I've had issues with duplicate MAC
> addresses in /etc/ovs-vswitchd.conf, but a clean install somehow fixed
> that issue, so the proper MAC address is now added to the file:
>
> [root@hypoxcp1 ~]# cat /etc/ovs-vswitchd.conf
> bridge.brib0.mac=80:00:00:48:fe:80:00:00:00:00:00:00:00:30:48:ff:ff:cc:0b:25
> bridge.brib0.port=brib0
> bridge.brib0.port=ib0
> bridge.brib0.port=vif1.2
> bridge.brib0.xs-network-uuids=6455dd7f-4a61-43b8-a49d-656f749c4ac6
> bridge.xenbr0.mac=00:30:48:cc:5c:a4
> bridge.xenbr0.port=eth0
> bridge.xenbr0.port=vif1.1
> bridge.xenbr0.port=xenbr0
> bridge.xenbr0.xs-network-uuids=528d85a4-f582-c181-54eb-acf09ac7dcf4
> bridge.xenbr1.mac=00:30:48:cc:5c:a5
> bridge.xenbr1.port=eth1
> bridge.xenbr1.port=vif1.0
> bridge.xenbr1.port=xenbr1
> bridge.xenbr1.xs-network-uuids=4f033ff5-5a56-629c-1c27-0765ba7c03bb
>
> I'm no expert on XCP and Open vSwitch, but I believe it works something
> like this:
>
> 1. XAPI writes /etc/ovs-vswitchd.conf based on the XCP DB
> 2. XAPI starts up Open vSwitch
> 3. Open vSwitch creates the interfaces defined in /etc/ovs-vswitchd.conf
>
> To me it seems like the MAC address for the brib0 interface
> is truncated, and I believe this causes Open vSwitch to not bind brib0
> and ib0 together:
>
> [root@hypoxcp1 ~]# ovs-ofctl show brib0
> Apr 14 15:38:31|00001|ofctl|INFO|connecting to unix:/var/run/brib0.mgmt
> features_reply (xid=0x6bb27f3f): ver:0x97, dpid:32f493d6e290
> n_tables:2, n_buffers:256
> features: capabilities:0x17, actions:0x3ff
>  LOCAL(brib0): addr:80:00:00:48:fe:80, config: 0, state:0
> Apr 14 15:38:31|00002|ofctl|INFO|connecting to unix:/var/run/brib0.mgmt
> get_config_reply (xid=0x9b99aaf1): miss_send_len=0
> [root@hypoxcp1 ~]# ovs-ofctl show xenbr0
> Apr 14 15:38:19|00001|ofctl|INFO|connecting to unix:/var/run/xenbr0.mgmt
> features_reply (xid=0x836b0867): ver:0x97, dpid:f68bde598f51
> n_tables:2, n_buffers:256
> features: capabilities:0x17, actions:0x3ff
>  1(eth0): addr:00:30:48:cc:5c:a4, config: 0, state:0
>      current:    1GB-FD COPPER AUTO_NEG
>      advertised: 10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-FD COPPER AUTO_NEG
>      supported:  10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-FD COPPER AUTO_NEG
>  LOCAL(xenbr0): addr:00:30:48:cc:5c:a4, config: 0, state:0
> Apr 14 15:38:19|00002|ofctl|INFO|connecting to unix:/var/run/xenbr0.mgmt
> get_config_reply (xid=0x2b665ea): miss_send_len=0
>
> As you see the binding to ib0 is missing, and the MAC of brib0 is
> different from that in /etc/ovs-vswitchd.conf.
>
> As previously stated I can communicate between XCP hosts on both brib0
> and ib0 using this setup. The problem is that VIFs on the brib0 network
> are not reachable.
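To make the "truncated MAC" observation concrete, here is a small sketch that compares the address written to /etc/ovs-vswitchd.conf with the one ovs-ofctl reports for brib0. The helper names octets and is_truncation_of are made up for illustration; the two address strings are copied from the output quoted above. It only shows that the reported address equals the leading six octets of the configured one, it does not prove where the truncation happens.

```python
def octets(addr):
    """Split a colon-separated hardware address into lowercase octets."""
    return addr.strip().lower().split(":")


def is_truncation_of(short_addr, long_addr):
    """True if short_addr equals the leading octets of long_addr."""
    short, full = octets(short_addr), octets(long_addr)
    return len(short) < len(full) and full[:len(short)] == short


# Value of bridge.brib0.mac in /etc/ovs-vswitchd.conf (from the thread)
configured = "80:00:00:48:fe:80:00:00:00:00:00:00:00:30:48:ff:ff:cc:0b:25"
# Address shown for LOCAL(brib0) by "ovs-ofctl show brib0" (from the thread)
reported = "80:00:00:48:fe:80"

print(len(octets(configured)), len(octets(reported)))   # 20 6
print(is_truncation_of(reported, configured))            # True
```

The same six octets also appear as the HWaddr of brib0 in the ifconfig output, which is consistent with the bridge keeping only the first 6 bytes of the 20-byte IPoIB address.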
> I have the following IB interfaces on a single host:
>
> ib0 - [2]10.1.2.2/24
> brib0 - [3]10.1.2.102/24
> vif1.2 - [4]10.1.2.202/24
>
> From within the VM that uses vif1.3 I try to ping brib0 and ib0 and
> watch the traffic on the XCP host:
>
> [root@hypoxcp1 ~]# tcpdump -i vif1.2
> tcpdump: WARNING: vif1.2: no IPv4 address assigned
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on vif1.2, link-type EN10MB (Ethernet), capture size 96 bytes
> 15:54:59.948660 arp who-has 10.1.2.102 tell 10.1.2.202
> 15:55:00.948643 arp who-has 10.1.2.102 tell 10.1.2.202
> 15:55:01.948645 arp who-has 10.1.2.102 tell 10.1.2.202
> [root@hypoxcp1 ~]# tcpdump -i brib0
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on brib0, link-type EN10MB (Ethernet), capture size 96 bytes
> 15:54:22.612723 arp who-has 10.1.2.102 tell 10.1.2.202
> 15:54:23.612643 arp who-has 10.1.2.102 tell 10.1.2.202
> 15:54:24.612642 arp who-has 10.1.2.102 tell 10.1.2.202
> [root@hypoxcp1 ~]# tcpdump -i ib0
> tcpdump: WARNING: arptype 32 not supported by libpcap - falling back to cooked socket
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on ib0, link-type LINUX_SLL (Linux cooked), capture size 96 bytes
>
> The packets never reach ib0.
>
> This setup adds the following IP routes to the XCP host:
>
> [root@hypoxcp1 ~]# route -n
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> 10.1.1.0        0.0.0.0         255.255.255.0   U     0      0        0 xenbr0
> 10.1.2.0        0.0.0.0         255.255.255.0   U     0      0        0 ib0
> 10.1.2.0        0.0.0.0         255.255.255.0   U     0      0        0 brib0
> 169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 brib0
> 0.0.0.0         10.1.1.1        0.0.0.0         UG    0      0        0 xenbr0
>
> If I remove the ib0 route I can talk to brib0 and ib0 from vif1.3, but
> only on the same physical machine. Inter-host and inter-vm over network
> communication breaks without that route.
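The routing table above contains the same 10.1.2.0/24 network twice, once via ib0 and once via brib0, and the thread reports that behaviour changes with the order of those two entries. A rough sketch for spotting such overlaps in "route -n" output follows; the function name overlapping_routes is made up for illustration, and the parser assumes the eight-column layout shown above.

```python
from collections import defaultdict

# "route -n" output as quoted in the thread
ROUTE_OUTPUT = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.1.1.0        0.0.0.0         255.255.255.0   U     0      0        0 xenbr0
10.1.2.0        0.0.0.0         255.255.255.0   U     0      0        0 ib0
10.1.2.0        0.0.0.0         255.255.255.0   U     0      0        0 brib0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 brib0
0.0.0.0         10.1.1.1        0.0.0.0         UG    0      0        0 xenbr0
"""


def overlapping_routes(route_text):
    """Map (destination, genmask) to interfaces when more than one has it."""
    by_network = defaultdict(list)
    for line in route_text.splitlines():
        fields = line.split()
        # Skip the title and header lines; data rows have exactly 8 columns.
        if len(fields) != 8 or fields[0] == "Destination":
            continue
        destination, genmask, iface = fields[0], fields[2], fields[7]
        by_network[(destination, genmask)].append(iface)
    return {net: ifs for net, ifs in by_network.items() if len(ifs) > 1}


print(overlapping_routes(ROUTE_OUTPUT))
```

For the table in this message the only overlap reported is 10.1.2.0/255.255.255.0 on ib0 and brib0, listed in the order the kernel printed them.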
> I also tried using "bridge" networking instead of "vswitch", but the
> system behaves the same way AFAICT, though the configuration is of
> course different.
>
> I'm not sure what to try next. I could use the IB network for the
> management interface and not run any VMs on it, but please let me know
> if you have any idea what's wrong.
>
> Thanks!
> Trygve
>
> On Thu, Apr 8, 2010 at 11:40 AM, Trygve Sanne Hardersen
> <[5]trygve@xxxxxxxxxxxxx> wrote:
>
> Hi
>
> Yes, I believe the packets are lost between brib0 and ib0, so they are
> never sent across the network but it works on a single host.
> I'll do some more testing and let you know what I find.
>
> Thanks!
> Trygve
>
> On Thu, Apr 8, 2010 at 10:40 AM, Dave Scott
> <[6]Dave.Scott@xxxxxxxxxxxxx> wrote:
>
> > Hi,
> >
> > Is it true that you have managed to get VM <-> Host connectivity
> > working but not VM <-> VM (across host) connectivity working?
> >
> > If so then it would be interesting to use something like tcpdump to
> > find out where the packets are going missing. If they're entering
> > the vswitch and then getting lost then it would be worth talking
> > about this on the openvswitch mailing list.
> >
> > Another possibility is to revert to non-vswitch based networking in
> > dom0: try writing "bridge" to /etc/xensource/network.conf and
> > rebooting.
> >
> > Cheers,
> > Dave
> >
> > From: [7]xen-users-bounces@xxxxxxxxxxxxxxxxxxx
> > [mailto:[8]xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
> > Trygve Sanne Hardersen
> > Sent: 07 April 2010 23:02
> > To: Xen
> > Subject: [Xen-users] using ipoib with xcp
> >
> > Hello,
> >
> > I have been playing with the XCP for a while now, and must say I'm
> > very excited about the technology. I had no prior experience with Xen
> > so it has taken me a while to understand the concepts, but now I
> > feel most important issues are solved and I've purchased some
> > hardware to build my (tiny) cloud on.
> >
> > The box is a Supermicro 1026TT-IBXF, so I have 2 x Ethernet and 1 x
> > Infiniband (IB) NICs per node. I want to use the IB NIC to provide
> > fast connectivity between the domUs, while the Ethernet NICs will be
> > used for the XCP management interface and ISP connectivity.
> >
> > I have successfully built OFED 1.5.1 in the XCP DDK VM and
> > installed OFED in the XCP 0.1.1 dom0. From there I can bring up the
> > IB network, but I'm having problems getting this to work properly
> > within XCP virtual machines. This is what happens:
> >
> > Starting out I have 2 nodes in a pool; both are clean with only lo,
> > eth0/xenbr0 and eth1/xenbr1 configured. I run the following commands
> > to add the IB NICs to the pool:
> >
> > xe pif-scan host-uuid=NODE1
> > xe pif-plug uuid=NODE1_IB0
> > xe pif-scan host-uuid=NODE2
> > xe pif-plug uuid=NODE2_IB0
> >
> > As expected this adds ib0/brib0 on both nodes and a single pool-wide
> > network, but there is no connectivity between the hosts after I give
> > brib0 an IP:
> >
> > xe pif-reconfigure-ip uuid=NODE1_IB0 IP=10.1.2.2 netmask=255.255.255.0 mode=static
> > xe pif-reconfigure-ip uuid=NODE2_IB0 IP=10.1.2.3 netmask=255.255.255.0 mode=static
> > ping 10.1.2.2 --> reply
> > ping 10.1.2.3 --> destination host unavailable
> >
> > However if I also give ib0 an IP and use this as gateway for brib0,
> > connectivity is achieved:
> >
> > ifconfig ib0 10.1.2.22 netmask 255.255.255.0
> > xe pif-reconfigure-ip uuid=NODE1_IB0 IP=10.1.2.2 netmask=255.255.255.0 gateway=10.1.2.22 mode=static
> > ifconfig ib0 10.1.2.33 netmask 255.255.255.0
> > xe pif-reconfigure-ip uuid=NODE2_IB0 IP=10.1.2.3 netmask=255.255.255.0 gateway=10.1.2.33 mode=static
> > ping 10.1.2.2 --> reply
> > ping 10.1.2.3 --> reply
> > ping 10.1.2.22 --> reply
> > ping 10.1.2.33 --> reply
> >
> > This is very well, but when I add a VIF on the IB network to a VM it
> > is not able to communicate through it:
> >
> > xe vif-create device=2 mac=random network-uuid=IB_NET vm-uuid=NODE1_IBVM
> > ifconfig eth2 10.1.2.122 netmask 255.255.255.0
> > xe vif-create device=2 mac=random network-uuid=IB_NET vm-uuid=NODE2_IBVM
> > ifconfig eth2 10.1.2.133 netmask 255.255.255.0
> > ping 10.1.2.122 --> reply
> > ping 10.1.2.22 --> destination host unavailable
> > ping 10.1.2.2 --> destination host unavailable
> > ping 10.1.2.133 --> destination host unavailable
> > ping 10.1.2.33 --> destination host unavailable
> > ping 10.1.2.3 --> destination host unavailable
> >
> > I believe that the problem lies somewhere in the routing table
> > configuration. This setup gives the following routing table:
> >
> > 10.1.2.0        0.0.0.0         255.255.255.0   U     0      0        0 ib0
> > 10.1.2.0        0.0.0.0         255.255.255.0   U     0      0        0 brib0
> >
> > If I delete and then add the brib0 route, the route order is
> > changed:
> >
> > 10.1.2.0        0.0.0.0         255.255.255.0   U     0      0        0 brib0
> > 10.1.2.0        0.0.0.0         255.255.255.0   U     0      0        0 ib0
> >
> > Using this the VM can talk to the host (and vice versa), but not
> > across the network. Connectivity between ib0/brib0 over the network
> > is also broken.
> >
> > I've also noticed that the same MAC is added to
> > /etc/ovs-vswitchd.conf multiple times for brib0:
> >
> > bridge.brib0.mac=80:00:00:48:fe:80:00:00:00:00:00:00:00:30:48:ff:ff:cc:0b:25
> > bridge.brib0.mac=80:00:00:48:fe:80:00:00:00:00:00:00:00:30:48:ff:ff:cc:0b:25
> > bridge.brib0.mac=80:00:00:48:fe:80:00:00:00:00:00:00:00:30:48:ff:ff:cc:0b:25
> >
> > I've tried removing some of these but that does not seem to have any
> > effect. My experience with IP routing and especially vswitch is
> > limited and I'm not sure what to try from here. I've tried various
> > configurations but no luck so far.
> >
> > Note that I'm testing with 2 XCP nodes configured in a pool. I've
> > also checked that the PIFs are in the same order on both nodes (the
> > reference mentions this). The MTU (1500) of brib0 differs from that
> > of ib0 (2044), but changing this does not solve the problem.
> >
> > Any help is much appreciated.
> > Thanks!
> >
> > Trygve
>
> --
> HypoBytes Ltd.
> Trygve Sanne Hardersen
> Akersveien 24F
> 0177 Oslo
> Norway
>
> [9]hypobytes.com
> +47 40 55 30 25
>
> References
>
> Visible links
> 1. mailto:trygve@xxxxxxxxxxxxx
> 2. http://10.1.2.2/24
> 3. http://10.1.2.102/24
> 4. http://10.1.2.202/24
> 5. mailto:trygve@xxxxxxxxxxxxx
> 6. mailto:Dave.Scott@xxxxxxxxxxxxx
> 7. mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx
> 8. mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx
> 9. http://hypobytes.com/
> 10. http://hypobytes.com/
> 11. http://hypobytes.com/
> 12. http://hypobytes.com/
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users