
[Xen-devel] Failure in Xen network PV driver when acceleration enabled



Hi,

I'm experiencing a failure in DomU networking, seeing messages such as
"net eth0: rx->offset: 0, size: 4294967295" in the DomU.

The setup is:
 - Dom0 running recent 3.6-rc3
 - DomU 1 running a Ceph cluster
 - DomU 2 running off a block device hosted by that Ceph cluster.

So basically when DomU 2 makes a block device request, the path is (I think):

 - DomU 2 Frontend block driver
 - Dom0 Backend block driver
 - Dom0 RBD Ceph block driver
 - Dom0 Kernel TCP connection to DomU 1
 - Dom0 Backend network driver for DomU 1
 - DomU 1 Frontend network driver

And I'm seeing those "net eth0: rx->offset: 0, size: 4294967295"
messages in DomU1 dmesg. DomU2 doesn't finish booting at all.
From what I can see in the Ceph logs, it seems that the DomU 1
receives corrupted messages.
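
As a side note on the size value in that message: 4294967295 is 2^32 - 1,
which is what -1 looks like when read as an unsigned 32-bit integer. One
possible reading (an assumption, not confirmed against the driver source) is
that a negative status is being printed with an unsigned format. The helper
name as_u32 below is made up for illustration; it only checks the arithmetic:

```shell
#!/bin/sh
# Reinterpret a signed integer as an unsigned 32-bit value by keeping
# only its low 32 bits. as_u32 is a hypothetical helper, not a Xen tool.
as_u32() {
    printf '%u\n' $(( $1 & 0xFFFFFFFF ))
}

# -1 masked to 32 bits is 0xFFFFFFFF.
as_u32 -1
```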

I've been digging a bit and it seems issuing a

ethtool -K vif1.0 tx off

in the dom0 prevents the issue. (vif1.0 being the DomU1 virtual
network interface)

Note that it needs to be done in the dom0 on the VIF, and not in the domU on
the eth0 interface like I originally tried. It's also not enough to
disable gso and/or tso; you need to turn off all tx acceleration.
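
The workaround above can be sketched as a small dom0-side script. This is
only a sketch: the interface name vif1.0 comes from the example above and
will differ per domain, and the VIF and DRY_RUN variables and the
disable_tx_offload function are names invented here. By default it only
prints the command it would run:

```shell
#!/bin/sh
# Run in dom0, against the backend VIF (not eth0 inside the domU).
# VIF defaults to the interface from the example; adjust for your guest.
VIF="${VIF:-vif1.0}"
# DRY_RUN=1 (the default) prints the command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

disable_tx_offload() {
    cmd="ethtool -K $1 tx off"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        # Needs root; "ethtool -k $1" shows the resulting offload settings.
        $cmd
    fi
}

disable_tx_offload "$VIF"
```

Setting DRY_RUN=0 applies the change. The setting is not persistent, so it
would have to be reapplied whenever the VIF is recreated, for instance after
the guest restarts.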

I originally reported this on xen-users and ceph-devel, but now that the
failure has been narrowed down to a Xen PV net bug, xen-devel seems more
appropriate. Thread archive available at
http://lists.xen.org/archives/html/xen-users/2012-08/msg00321.html


Cheers,

      Sylvain
