Re: [Xen-devel] network misbehaviour with gplpv and 2.6.30
On Tue, Jul 21, 2009 at 11:13 AM, Paul Durrant <paul.durrant@xxxxxxxxxx> wrote:
> James Harper wrote:
>>> Are you saying that ring slot n has only NETRXF_extra_info and
>>> *not* NETRXF_more_data?
>>
>> Yes. From the debug I have received from Andrew Lyon, NETRXF_more_data
>> is _never_ set.
>>
>> From what Andrew tells me (and it's not unlikely that I misunderstood),
>> the packets in question come from a physical machine external to the
>> machine running Xen. I can't quite understand how that could be, as they
>> are 'large' packets (>1514 bytes total packet length) which should only
>> be locally originated. Unless he's running with jumbo frames (are you,
>> Andrew?).
>
> It's not unusual for h/w drivers to support 'LRO', i.e. they re-assemble
> consecutive in-order TCP segments into a large packet before passing it up
> the stack. I believe these would manifest themselves as TSOs coming into
> the transmit side of netback, just as locally originated large packets
> would.
>
>> I've asked for some more debug info, but he's in a different timezone to
>> me and probably isn't awake yet. I'm less and less inclined to think
>> that this is actually a problem with GPLPV and more a problem with
>> netback (or a physical network driver) in 2.6.30, but a tcpdump in dom0,
>> in an HVM without GPLPV, and maybe in a Linux domU should tell us more.
>
> Yes, a tcpdump of what's being passed into netback in dom0 should tell us
> what's happening.
>
>   Paul

I did more testing, including running various wireshark captures which
James looked at. The problem is not the GPLPV drivers, as it also affects
the Linux PV netfront driver; it seems to be a dom0 problem. Packets
arrive with frame.len < 72 but ip.len > 72, which of course causes
terrible throughput in domU networking, and it also crashed the GPLPV
drivers until James added a check for the condition (see
http://xenbits.xensource.com/ext/win-pvdrivers.hg?rev/0436238bcda5).
Now it triggers a warning message instead, for example:

  XenNet XN_HDR_SIZE + ip4_length (2974) > total_length (54)

Yesterday I noticed something quite interesting: if I switch off receive
checksum offloading on the dom0 NIC (ethtool -K peth0 rx off), the network
performance in domU is much improved, but something is still wrong,
because some network performance tests are still very slow and a different
warning message is triggered in the XenNet driver:

  XenNet Size Mismatch 54 (ip4_length + XN_HDR_SIZE) != 60 (total_length)

Now the really strange thing is that if I then re-enable rx checksum
offload (ethtool -K peth0 rx on), everything works perfectly: networking
throughput is the same as with 2.6.29 and no warning messages are
triggered in the XenNet driver.

The dom0 NIC is an 82575EB. I have tried both the 1.3.16-k2 driver
included in 2.6.30 and the 1.3.19.3 driver downloaded from Intel's support
site; I will try another NIC if I can find one.

I don't understand how toggling rx offload off and on can fix the problem,
but it does.

Andy
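
For readers following along: the warning messages quoted above amount to
comparing the IPv4 total-length field of a received frame against the
number of bytes actually delivered with it. Below is a minimal,
illustrative sketch of that kind of consistency check in C. The names
ETH_HDR_SIZE and check_rx_frame() are hypothetical stand-ins; this is not
the actual GPLPV or netfront code, just the shape of the comparison the
warnings describe.

  /*
   * Illustration only: a minimal sketch of the sort of length-consistency
   * check the XenNet warnings above describe. ETH_HDR_SIZE stands in for
   * XN_HDR_SIZE; check_rx_frame() is a made-up name.
   */
  #include <stdint.h>
  #include <stdio.h>

  #define ETH_HDR_SIZE 14                 /* Ethernet header, no VLAN tag */

  /* Return non-zero if the IPv4 total-length field disagrees with the
   * number of bytes actually received for the frame. */
  static int check_rx_frame(const uint8_t *frame, size_t total_length)
  {
      uint16_t ip4_length;

      if (total_length < ETH_HDR_SIZE + 20)
          return -1;                      /* too short for Ethernet + IPv4 */

      /* IPv4 total length is bytes 2-3 of the IP header, big-endian */
      ip4_length = (uint16_t)((frame[ETH_HDR_SIZE + 2] << 8) |
                               frame[ETH_HDR_SIZE + 3]);

      if (ETH_HDR_SIZE + ip4_length > total_length) {
          fprintf(stderr, "size mismatch: %d + %d > %zu\n",
                  ETH_HDR_SIZE, (int)ip4_length, total_length);
          return 1;
      }
      return 0;
  }

  int main(void)
  {
      /* A 54-byte frame whose IPv4 header claims 1500 bytes of data,
       * similar to the frame.len < ip.len frames described above. */
      uint8_t frame[54] = { 0 };
      frame[ETH_HDR_SIZE + 2] = 0x05;     /* total length = 0x05dc = 1500 */
      frame[ETH_HDR_SIZE + 3] = 0xdc;

      return check_rx_frame(frame, sizeof(frame));
  }

The two warnings appear to come from slightly different variants of this
check (one a strict inequality, one an equality test), but the underlying
observation is the same: the IP header claims more, or different, data
than the frame actually carried.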