
Re: [Xen-devel] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x217/0x220



On 31/05/17 00:18, Boris Ostrovsky wrote:
> On 05/30/2017 06:27 AM, Steven Haigh wrote:
>> Just wanted to give this a nudge to try and get some suggestions on
>> where to go / what to do about this.
>>
>> On 28/05/17 09:44, Steven Haigh wrote:
>>> The last couple of days running on kernel 4.9.29 and 4.9.30 with Xen
>>> 4.9.0-rc6 I've had a number of ethernet lock ups that have taken my
>>> system off the network.
>>>
>>> This is a new development - but I'm not sure if it's kernel or xen related.
> 
> Since no one seems to have seen this it would be useful to narrow it down
> a bit.
> 
> Do you observe this on rc5? Or with 4.9.28 kernel? Any particular load
> that you are using? Do you see this on a specific NIC?

This install is currently using Xen 4.9-rc7 and kernel 4.9.30. I suspect
the ethernet adapter lockups may be connected to disk activity, but I
haven't been able to prove this in any valid way yet.

I am currently running this script on the server in question to try and
get a log of how often the adapter locks up. I only added the logger
line tonight - so I don't have a great deal of historical data to add as
yet.

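#!/bin/bash
# Ping 10.1.1.2 every 5 seconds; if there is no reply, log the event
# and reset the NIC's MII so the link comes back up.
while true; do
        if ! ping -c1 10.1.1.2 > /dev/null 2>&1; then
                logger 'No response. Resetting enp5s0'
                mii-tool -R enp5s0
        fi
        sleep 5
done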
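
To test the suspected link with disk activity, the loop above could also
record a disk I/O counter each time the ping fails. The following is only
a rough sketch: the function name disk_sectors_total is made up for
illustration, and it assumes the usual /proc/diskstats layout (field 6 =
sectors read, field 10 = sectors written). It sums over every listed
device, so whole disks and their partitions are both counted; it is only
meant as a relative indicator.

```shell
# Hypothetical helper: total sectors read + written across all block
# devices in a diskstats-format file (defaults to /proc/diskstats).
disk_sectors_total() {
    # Field 6 is sectors read, field 10 is sectors written.
    awk '{ total += $6 + $10 } END { printf "%.0f\n", total }' \
        "${1:-/proc/diskstats}"
}
```

Inside the failure branch, something like
logger "No response. Disk sectors so far: $(disk_sectors_total)"
would put the counter in the journal, and the difference between
successive entries would show how much disk I/O happened in between.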

What I have right now in dmesg + journalctl is:
# dmesg
[221834.898685] r8169 0000:05:00.0 enp5s0: link down
[221834.898768] br10: port 1(vlan10) entered disabled state
[221834.898827] br203: port 1(vlan203) entered disabled state
[221834.905380] r8169 0000:05:00.0 enp5s0: link up
[221834.905748] br10: port 1(vlan10) entered blocking state
[221834.905749] br10: port 1(vlan10) entered forwarding state
[221834.906162] br203: port 1(vlan203) entered blocking state
[221834.906162] br203: port 1(vlan203) entered forwarding state
[221834.906176] r8169 0000:05:00.0 enp5s0: link down
[221835.949483] br10: port 1(vlan10) entered disabled state
[221835.949515] br203: port 1(vlan203) entered disabled state
[221838.069998] r8169 0000:05:00.0 enp5s0: link up
[221838.070538] br10: port 1(vlan10) entered blocking state
[221838.070540] br10: port 1(vlan10) entered forwarding state
[221838.071055] br203: port 1(vlan203) entered blocking state
[221838.071057] br203: port 1(vlan203) entered forwarding state

# journalctl | grep Resetting
May 31 00:20:10 xenhost: No response. Resetting enp5s0
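
Once more entries accumulate, the resets could be tallied per day to see
how often the adapter locks up. A minimal sketch, assuming the syslog-style
"Mon DD HH:MM:SS" timestamps shown above; the function name
count_resets_per_day is hypothetical:

```shell
# Hypothetical helper: read journal lines on stdin and print one line
# per day in the form "Mon DD <number of resets>".
count_resets_per_day() {
    grep 'Resetting enp5s0' |
        awk '{ count[$1 " " $2]++ }
             END { for (day in count) print day, count[day] }' |
        sort
}
```

It would be used as: journalctl | count_resets_per_day
Note that the final sort is alphabetical on the month name, not
chronological.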

> Have you checked hypervisor log (xl dmesg)?

The last lines I see in 'xl dmesg' are:
(XEN) Scrubbing Free RAM on 1 nodes using 4 CPUs
(XEN) .................................done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 456kB init memory

This would indicate that nothing additional is being logged here.

If it matters, the xl info follows:
# xl info
host                   : xenhost
release                : 4.9.30-1.el7xen.x86_64
version                : #1 SMP Fri May 26 06:16:37 AEST 2017
machine                : x86_64
nr_cpus                : 4
max_cpu_id             : 3
nr_nodes               : 1
cores_per_socket       : 4
threads_per_core       : 1
cpu_mhz                : 3303
hw_caps                : bfebfbff:179ae3bf:28100800:00000001:00000001:00000000:00000000:00000100
virt_caps              : hvm
total_memory           : 16308
free_memory            : 1785
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 9
xen_extra              : -rc
xen_version            : 4.9-rc
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit2
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
xen_commandline        : placeholder dom0_mem=2048M cpufreq=xen dom0_max_vcpus=1 dom0_vcpus_pin sched=credit2 console=tty0 console=com1 com1=115200,8n1
cc_compiler            : gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
cc_compile_by          : mockbuild
cc_compile_domain      : crc.id.au
cc_compile_date        : Sun May 28 10:08:40 AEST 2017
build_id               : 0848a8631a9064b3de53cdfe71c996e929ce2539
xend_config_format     : 4

-- 
Steven Haigh

Email: netwiz@xxxxxxxxx
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
