
RE: [Xen-users] Slow TCP performance between Windows Vista and XenPV-on-HVM guest



> Subject: RE: [Xen-users] Slow TCP performance between Windows Vista
> and XenPV-on-HVM guest
> 
> I have always had to disable LSO in my setups (although the other
> acceleration features work fine). Could it possibly be a NIC driver
> issue in Dom0?
> 
> 
> Rob
> 
> -----Original Message-----
> From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
> [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Fischer,
> Anna
> Sent: 12 June 2010 04:40
> To: James Harper; xen-users@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-users] Slow TCP performance between Windows Vista
> and XenPV-on-HVM guest
> 
> > Subject: RE: [Xen-users] Slow TCP performance between Windows Vista
> > and XenPV-on-HVM guest
> >
> > > > Are you capturing packets on the Windows machine or on the Dom0?
> > >
> > > Dom0. Note that the Windows machine doesn't even run Xen or
> > > anything; it is just some random machine on the network. Only the
> > > Linux guest runs on Xen.
> >
> > Yes, I'd figured that.
> >
> > >
> > >
> > > > If you are using tcpdump on Dom0, make sure you use '-s0' so that
> > > > you capture the entire packet, and possibly '-v' as well. Without
> > > > capturing the entire packet, tcpdump can't tell you if the checksum
> > > > is correct or not. Even if the checksum is incorrect on Dom0 it
> > > > doesn't necessarily tell you that there is a problem though. A bad
> > > > checksum on received packets on the Windows machine would
> > > > definitely suggest a problem though.
> > >
> > > I capture with Ethereal, and I definitely catch all packets. If this
> > > were a checksum problem, then communication wouldn't work at all.
> > > However, SSH and other (slower) connections work just fine. The
> > > problem is only on bulk data transfer using TCP. If the Linux guest
> > > were sending a packet with an invalid checksum, then the Windows
> > > machine would *never* send out the ACK. However, it is actually
> > > sending out the ACK, but only after the retransmit, to ACK the
> > > *retransmitted* packet. If this were a checksum problem, then the
> > > retransmitted packet would also have an invalid checksum, and so it
> > > would basically never be ACKed.
> > >
> > > I have read about Vista's TCP "auto-tuning" feature, and I wonder if
> > > something like this might be the problem here, one that the Xen
> > > guest cannot cope with?
> > >
> >
> > It might then be a 'large send' problem.
> 
> Yes, my guess was that it must be something like this.
> 
> > That would manifest itself as low volume traffic being mostly okay,
> > but as the throughput increased, packets larger than the MTU would be
> > sent from DomU via Dom0, with the intent that the hardware will split
> > them up into packets of at most MTU size. If those were dropped
> > somewhere then the retransmit would happen, and the retransmit would
> > typically not use the 'large' packet, so it would probably work.
> 
> Is that so? I don't know much about the TCP implementation, but would it
> disable offloading for a retransmit?
> 
> 
> > tcpdump should show >1500 byte packets in Dom0 on the vif interface
> > belonging to the DomU, and in the DomU if this is happening.
> 
> No, I only see < 1500. I capture on the VIF and on the physical device
> in Dom0.
> 
> 
> > Use ethtool in DomU to disable as many offload features as possible
> > and see if things improve.
> 
> Hardware offload is disabled on the NIC inside the Linux guest, on the
> VIF in Dom0 and also on the NIC in Dom0: all offload features, including
> checksum offload. My guess was also that this must be the problem
> because, as I said before, it works with exactly the same guest running
> on VMware. But on VMware it obviously doesn't run the Xen
> netfront/netback drivers, so my guess was that some configuration there
> might be the issue. But as I said, switching off hardware offload does
> not make any difference at all. At the moment it does not run any
> hardware offloading.
> 
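
One way to double-check the offload state described above on each interface (the guest NIC, the VIF, and the physical NIC in Dom0) is to parse the output of `ethtool -k`. The following is only a minimal sketch: the helper names and the sample text are hypothetical, and it assumes the usual "feature: on/off" line layout, which can vary between ethtool versions.

```python
import subprocess


def parse_offload_features(ethtool_output):
    """Map each feature name in `ethtool -k` style output to True (on) or False (off)."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "name: value" line
        words = value.split()
        # Skip header lines such as "Features for eth0:" that carry no on/off state.
        if words and words[0] in ("on", "off"):
            features[name.strip()] = words[0] == "on"
    return features


def enabled_offloads(interface):
    """Run `ethtool -k <interface>` and return the features still switched on."""
    out = subprocess.run(
        ["ethtool", "-k", interface],
        capture_output=True, text=True, check=True,
    ).stdout
    return sorted(name for name, on in parse_offload_features(out).items() if on)


# Illustrative sample only, not output captured from the setup in this thread.
SAMPLE = """\
Features for eth0:
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: on
generic-segmentation-offload: off [fixed]
"""

if __name__ == "__main__":
    still_on = [n for n, on in parse_offload_features(SAMPLE).items() if on]
    print("still enabled:", still_on)
```

Calling `enabled_offloads("eth0")` in the guest, and the same function with the VIF and physical interface names in Dom0, should return an empty list if every offload feature really is off. The interface names here are placeholders.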

I have since found out that the problem is a bug in the NAT function in Dom0, 
specifically in the TCP connection tracking module. It fails to rewrite delayed 
ACK packets from newer Windows (Vista / 7) machines. When I configure the 
Windows TCP stack with TCP_NODELAY, it all works.
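
For reference, TCP_NODELAY is the standard socket option that disables Nagle's algorithm on a TCP socket. The message above does not say how the option was applied on the Windows side, so the following is only a sketch of the per-socket form, under the assumption that the application owning the connection can set it itself. The function name is hypothetical.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with TCP_NODELAY set (Nagle's algorithm disabled)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Assumption: the option is set per socket by the application; the
    # original message does not describe the exact mechanism used.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm that the stack accepted it.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print("TCP_NODELAY enabled:", bool(nodelay))
```

This only changes the behaviour of the endpoint that sets the option. It is a workaround on the Windows side and does not touch the connection tracking code in Dom0.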

Anna

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


 

