
Re: [Xen-devel] NFS issue with xenserver 6.2



On 14/03/14 01:25, Umair Azam wrote:
Hi Zoli,

When the NFS timeout log entries appear, I am still able to ping the storage
machine, which stays up with almost no load. However, according to xentop,
the XenServer load goes up to 70% (3 vCPUs are allocated on a Core 2 Duo
machine, 1 GB RAM to dom0) and the CPU usage of the CloudStack secondary
storage VM goes up to 160%. The problem arises when CloudStack tries to
launch the secondary storage VM on the hypervisor: at that point "nfs server
not responding, timed out" log entries begin to appear on the XenServer host
and then the machine reboots itself (possibly because HA is enabled).
So the Dom0->NFS connection goes down when you start up the secondary storage VM, right? Does this secondary storage VM access the same NFS? Where is the disk of this VM stored? What is stored on that NFS btw?


I have replaced the Ethernet cables, the switch and the NICs, but I am still
facing this strange issue and cannot figure out why it arises. I have also
seen the following entries appearing many times in the logs.

Mar 14 06:04:30 xenserver-1 scripts-vif: Called as "add vif" domid:2
devid:0 mode:bridge
Mar 14 06:04:30 xenserver-1 scripts-vif: Called as "online vif" domid:2
devid:0 mode:bridge
Mar 14 06:04:30 xenserver-1 scripts-vif: Setting vif2.0 MTU 1500
Mar 14 06:04:30 xenserver-1 scripts-vif: Adding vif2.0 to xapi0 with
address fe:ff:ff:ff:ff:ff
Mar 14 06:04:30 xenserver-1 scripts-vif: Failed to ip link set vif2.0
address fe:ff:ff:ff:ff:ff
Mar 14 06:04:30 xenserver-1 kernel: [ 2890.509223] device vif2.0 entered
promiscuous mode
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] - Called with
vif_type=vif, domid=2, devid=0, network_mode=bridge, action=filter
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] - attempting to acquire
lock /var/lock/ebtables.lock
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] - acquired lock
/var/lock/ebtables.lock
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] - ['/sbin/ip', 'link',
'set', 'vif2.0', 'down']
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] - ['/sbin/ebtables', '-L',
'FORWARD_vif2.0']
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] -
['/usr/bin/xenstore-read', '/local/domain/0/backend/vif/2/0/mac']
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] -
['/usr/bin/xenstore-read', '/xapi/2/private/vif/0/locking-mode']
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] -
['/usr/bin/xenstore-read', '/xapi/2/private/vif/0/ipv4-allowed']
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] -
['/usr/bin/xenstore-read', '/xapi/2/private/vif/0/ipv6-allowed']
Mar 14 06:04:30 xenserver-1 python:
/opt/xensource/libexec/setup-vif-rules[23804] - Got locking config:
MAC=0e:00:a9:fe:00:68; locking_mode=unlocked; ipv4_allowed=;
ipv6_allowed=
These log entries look normal; they appear whenever a vif is created. This one seems to have port locking.




Umair Azam

On 3/11/2014 8:36 PM, Zoltan Kiss wrote:
On 11/03/14 00:39, Umair Azam wrote:
Hi,

I am using XenServer 6.2 and facing an NFS timeout issue. This issue is
mentioned in the 6.0 release notes, so why am I facing it in the latest
release (6.2)?

Mar 11 02:49:05 xenserver-1 kernel: [ 1848.148548] nfs: server
10.11.17.33 not responding, timed out

  * In some 10 Gigabit Ethernet environments, occasional performance
    problems with disk throughput on NFS SRs have been observed. The
    problem can be identified by a log entry in /var/log/messages similar
    to: kernel: nfs: server 10.0.0.1 not responding, timed out. Citrix
    continues to investigate this issue with an aim to resolve it in a
    future release. [CA-59187]

http://support.citrix.com/article/CTX130418
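
To see how often the message quoted above occurs, and for which server, the log can be scanned with a few lines of Python. This is only a rough sketch: the function name count_nfs_timeouts is made up for illustration, and the pattern assumes the message appears on a single line in /var/log/messages exactly as in the entry quoted in this thread.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Mar 11 02:49:05 xenserver-1 kernel: [ 1848.148548] nfs: server 10.11.17.33 not responding, timed out
# The pattern is an assumption based on the log entry quoted in this thread.
NFS_TIMEOUT_RE = re.compile(r"nfs: server (\S+) not responding, timed out")


def count_nfs_timeouts(lines):
    """Count NFS timeout messages per server address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = NFS_TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, passing an open file object for /var/log/messages to count_nfs_timeouts() returns a Counter keyed by server address.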

That problem was solved a long time ago, so this is probably something
different. If it is reproducible, you should check why the host loses its
connection to the NFS server. Things to check:
- can you ping its IP?
- what is the load? top, xentop and "watch -n 1 ovs-dpctl show" can be
useful here; the latter shows how many network flows you have at one
time in OVS. A rapid increase (i.e. more than a hundred per second) in
"missed: " suggests there are lots of connections going around
- "ovs-dpctl dump-flows <bridgename>" shows the actual flows, you can
actually see if there is a flow entry for that traffic
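
The "missed" check above can also be automated rather than watched by eye. The following is a minimal sketch, not a tested tool: it assumes the counter line in the "ovs-dpctl show" output looks like "lookups: hit:N missed:N lost:N", the helper names are hypothetical, and it needs Python 3.7 or newer, so it may have to be adapted for the older Python in dom0.

```python
import re
import subprocess
import time

# Assumed format of the counter line in `ovs-dpctl show` output, e.g.
#   lookups: hit:37994604 missed:651168 lost:0
MISSED_RE = re.compile(r"missed:\s*(\d+)")


def parse_missed(show_output):
    """Return the sum of all 'missed' counters found in the output."""
    return sum(int(n) for n in MISSED_RE.findall(show_output))


def missed_rate(previous, current, interval_s):
    """Misses per second between two samples taken interval_s apart."""
    return (current - previous) / interval_s


def watch(interval_s=1.0, threshold=100.0):
    """Poll ovs-dpctl forever and flag samples whose miss rate exceeds threshold."""
    cmd = ["ovs-dpctl", "show"]
    previous = parse_missed(subprocess.check_output(cmd, text=True))
    while True:
        time.sleep(interval_s)
        current = parse_missed(subprocess.check_output(cmd, text=True))
        rate = missed_rate(previous, current, interval_s)
        flag = "  <-- rapid increase" if rate > threshold else ""
        print(f"missed/s: {rate:.0f}{flag}")
        previous = current
```

Calling watch() on the host starts the polling loop; the default threshold of 100 per second simply mirrors the "more than a hundred per second" rule of thumb above.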

I can't comment on how to debug the storage manager side, but the
checks above could be useful.

Zoli







 

