
[Xen-API] [XCP-1.1] High OVS cpu load and unresponsive host network while VMPR archive phase is running



We see high vswitch CPU load while the VM protection archive phase is
running, which results in broken network connections and unresponsive pool
servers. Any help in solving this problem is welcome.

XCP build: 1.1.0-50674c
OVS build: 1.4.2
NICs: BCM5709 Gigabit TOE iSCSI Offload
OVS NIC bonding: active/active
Pool Nodes: Dell R610
Storage type: LVMoiSCSI

The archive phase starts at 03:00 AM. Shortly after that, OVS logs poll_loop
events and high CPU usage. After some hours (3-4) the whole host network
becomes unresponsive, except for the offloaded iSCSI connections to the NetApp
guest system image LUN (bnx2i cnic). We snapshot and archive only guest system
images (mostly 8GB per image); data volumes are mounted directly by the guest
VMs (iSCSI).

We ran an XCP-1.0 pool on Intel servers for the last two years with a lot of
VLAN trunks, active/active bonds, cheap switches, self-made DRBD-replicated
storage, and OVS-1.0.1 IIRC. We never saw such behavior there.

Thanks
Christian










 

