Re: [Xen-devel] [PATCH v2 0/2] Add xen-crashd.
On 11/29/13 05:26, Ian Campbell wrote:
On Fri, 2013-11-15 at 14:20 -0500, Don Slutz wrote:
Ian Campbell:
Add 1st pass on some documentation on crash's remote protocol.
My concern with this was that we were using some sort of internal crash
protocol which has no ABI stability guarantees etc. Documenting it in
the Xen tree doesn't really do anything to alleviate that concern. It
should be a protocol which is published by the crash folks not us.
I have no issues with this. The only documentation I can find is:
 http://people.redhat.com/anderson/crash_whitepaper/
Ideally they would agree to some sort of protocol stability level, or
maybe you can show that the protocol had inbuilt backward and forward
compatibility capabilities already?
It may not have the best backwards and forwards compatibility that
could be designed. However so far I have been able to add features
to a newer crash that have no issues with older "crashd" servers.
And older crash code works fine with the newer "crashd" servers.
This is not the 1st one of these I have coded, just the 1st that I
can release.
Even more concerning is [0] where one of the crash maintainers says:
It's been deprecated for almost 10 years now. I don't understand how
you have been able to even get it to build, never mind work as the mail
thread indicates?
We surely don't want to be adding code which relies on a protocol which
has been deprecated for 10 years!
The main reason that I know of is that crash in active mode (i.e.
running live on machine A) is just so much simpler to use than
using a remote crash on machine B talking to a crashd on machine A.
This is because the crashd on machine A is in "live" mode. This
means that slow or unresponsive systems cannot be examined using the
remote protocol. And keeping the right kernel versions on machine B
that you need is just overhead.
With all this in mind, I was not surprised that it had been
deprecated for 10 years. However with Xen in the mix, the machine A
no longer needs to be active to run "crashd", in fact it can be
paused, or running, or crashed, or shutdown, etc.
Daniel K asked about gdbsx -- can that not speak to crash somehow?
It is clearly possible to write a bridge from the remote crash
protocol to a remote gdb server, but needing to run 2 servers to
connect up crash is, to me, too complex. I could also embed the
xen-crashd code in gdbsx by adding command line options. However,
very little code would be shared. Since I based xen-crashd off of
xenctx, it currently uses libxc calls. gdbsx uses ioctl() directly
to do the hypercalls. It does not appear to support physical
addresses. It does not appear to support virtual address to physical
address conversion. Quoting from the crash whitepaper:
    Furthermore, to examine the contents of a live system's kernel
    internals from user space, the only readily available option has
    been to use gdb on /proc/kcore. While gdb is an incredibly
    powerful tool, it is designed to debug user programs, and is not
    at all "kernel-aware". Consequently, using gdb alone has limited
    usefulness when looking at kernel memory, essentially constrained
    to the printing of kernel data structures if the vmlinux file was
    built with the -g C flag, the disassembly of kernel text, and raw
    data dumps.
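
To make the virtual to physical conversion point concrete, here is a minimal sketch of the linear-map case for an x86_64 2.6.18-era guest like the one in the session further down. It is not code from xen-crashd, crash, or gdbsx. The function name virt_to_phys and all four layout constants are assumptions about that kernel's memory map, and anything outside the two linear regions (modules, vmalloc) would need a real page table walk, which is not shown.

```python
# Sketch only: linear-map translation for an x86_64 2.6.18-era kernel.
# All constants below are assumptions about that kernel's layout, not
# values taken from xen-crashd or crash.
START_KERNEL_MAP = 0xffffffff80000000  # assumed base of the kernel text mapping
MODULES_VADDR = 0xffffffff88000000     # assumed start of the module area
PAGE_OFFSET = 0xffff810000000000       # assumed base of the direct mapping
DIRECT_MAP_END = 0xffffc10000000000    # assumed end of the direct mapping


def virt_to_phys(vaddr, phys_base=0):
    """Translate a linearly mapped kernel virtual address to physical."""
    if START_KERNEL_MAP <= vaddr < MODULES_VADDR:
        # Kernel text/data: offset from the text mapping, shifted by
        # phys_base (the value given to crash with --machdep).
        return vaddr - START_KERNEL_MAP + phys_base
    if PAGE_OFFSET <= vaddr < DIRECT_MAP_END:
        # Direct mapping of physical memory.
        return vaddr - PAGE_OFFSET
    # Everything else is not linearly mapped.
    raise ValueError("address needs a page table walk: %#x" % vaddr)


if __name__ == "__main__":
    # Addresses taken from the crash output in this mail.
    print(hex(virt_to_phys(0xffffffff802eeae0, phys_base=0x200000)))
    print(hex(virt_to_phys(0xffff8100babd9000)))
```

Under these assumptions the swapper task address maps to 0x4eeae0 and the eth1 net_device to 0xbabd9000. A server that can only read guest virtual addresses cannot serve requests for physical addresses like these.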
Or run on /proc/vmcore directly, or be extended to do so?
There is no /proc/vmcore in this case. Extending dom0 Linux to
provide /proc/1/vmcore, /proc/2/vmcore, etc. (i.e.
/proc/<domid>/vmcore) would be a big change, and designing a
security model for these would also not be quick.
Maybe this will help:
[root@dcs-xen-54 tmp]# xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  2048     8     r-----    3928.9
P-1-0                                        1  3080     1     -b----      18.0
[root@dcs-xen-54 tmp]# /usr/lib/xen/bin/xen-crashd 1&
[1] 1447
[root@dcs-xen-54 tmp]#  2 Dec 13 11:38:01.042 socket ready on port 5001 after 1 bind call
[root@dcs-xen-54 tmp]# crash --machdep phys_base=0x200000 localhost:5001 /usr/lib/debug/lib/modules/2.6.18-128.el5/vmlinux
crash 6.1.4
Copyright (C) 2002-2013  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

 2 Dec 13 11:38:08.917 Accepted a connection.
WARNING: daemon cannot access /proc/version
NOTE: setting phys_base to: 0x200000
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
      KERNEL: /usr/lib/debug/lib/modules/2.6.18-128.el5/vmlinux
    DUMPFILE: /dev/mem@localhost (remote live system)
        CPUS: 1
        DATE: Mon Dec 2 11:37:02 2013
      UPTIME: 00:33:11
LOAD AVERAGE: 0.01, 0.00, 0.00
       TASKS: 81
    NODENAME: P-1-0.TC5.CloudSwitch.com
     RELEASE: 2.6.18-128.el5
     VERSION: #1 SMP Wed Jan 21 10:41:14 EST 2009
     MACHINE: x86_64  (2400 Mhz)
      MEMORY: 3 GB
         PID: 0
     COMMAND: "swapper"
        TASK: ffffffff802eeae0  [THREAD_INFO: ffffffff803dc000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)
crash> net
   NET_DEVICE     NAME   IP ADDRESS(ES)
ffffffff80321e80  lo     127.0.0.1
ffff8100babd9000  eth1   172.16.64.65
ffff8100b6c96000  sit0
crash> q
[1]+  Done                    /usr/lib/xen/bin/xen-crashd 1
Is almost the same as:
[root@dcs-xen-54 tmp]# xl dump-core 1 p-1-0.vmore
[root@dcs-xen-54 tmp]# crash p-1-0.vmore /usr/lib/debug/lib/modules/2.6.18-128.el5/vmlinux
crash 6.1.4
Copyright (C) 2002-2013  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
      KERNEL: /usr/lib/debug/lib/modules/2.6.18-128.el5/vmlinux
    DUMPFILE: p-1-0.vmore
        CPUS: 1
        DATE: Mon Dec 2 11:05:09 2013
      UPTIME: 00:01:18
LOAD AVERAGE: 2.00, 0.70, 0.24
       TASKS: 81
    NODENAME: P-1-0.TC5.CloudSwitch.com
     RELEASE: 2.6.18-128.el5
     VERSION: #1 SMP Wed Jan 21 10:41:14 EST 2009
     MACHINE: x86_64  (2400 Mhz)
      MEMORY: 3 GB
       PANIC: ""
         PID: 0
     COMMAND: "swapper"
        TASK: ffffffff802eeae0  [THREAD_INFO: ffffffff803dc000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)
     WARNING: panic task not found
crash> net
   NET_DEVICE     NAME   IP ADDRESS(ES)
ffffffff80321e80  lo     127.0.0.1
ffff8100babd9000  eth1   172.16.64.65
ffff8100b6c96000  sit0
crash> quit
With the changes in crash 7.0.4 (yet to be released), crash can be
invoked in a remote "not live" mode, which is how it runs on a
vmcore file.
So if a DomU is paused, "xl dump-core;crash" and
"xen-crashd;crash" will give the exact same answers, with the
xen-crashd case taking a lot less real time.
   -Don Slutz
Ian.
[0]
http://thread.gmane.org/gmane.linux.kernel.crash-dump.crash-utility/4714/focus=4736