[XenPPC] Re: libvirt status of bug tracking regarding bad paddr
This is how the addresses are passed through the stack (read this in a monospaced font):

userspace-libvirt    linux-kernel                            Xen
alloc/ioctl       -> incoming         -> xencomm_map      -> incoming         -> xencomm_inline_to_guest
0xffba2eb0        -> ffba2eb000000000 -> bfba2eb000000000 -> bfba2eb000000000 -> 0x3fba2eb000000000

The xencomm_map function is from our patch queue for the xen-2.6.18 powerpc merge.
It uses xencomm_create_inline, which calls xencomm_pa and adds the XENCOMM_INLINE_FLAG flag.
With the correct 0x80000000 00000000 addresses this works, but it fails for the wrong 0xff... address received in the bug scenario.
All effects further down are only consequences of that.

Since the numeric value of 0xffba2eb0 in userspace differs from the one received by the kernel (0xffba2eb000000000), and it is just the upper and lower halves that are swapped, this may be some kind of little/big endian bug (0x00000000ffba2eb0 would be right).

In libvirt's src/xen_internal.c the assignment of the buffer struct is a bit confusing because it relies on global variables, passes things through the stack, etc.
This is how I think it works (or rather doesn't), and somewhere in here I get lost:

-> the identified userspace call causing the bug is xenHypervisorDoV2Sys with hypervisor_version=2 && sys_interface_version=3
-> the identified call used dominfos->v2d5 to assign it to buffer.v
-> difference between v2 and v2d5: v2d5 uses ALIGN_64 for all 64-bit variables, which is sysctl version 3 style
-> the .v above is likewise used to pack buffer into a 64-bit aligned struct for sysctl version 3
-> the dominfo struct used here is assigned in virXen_getdomaininfo and is &(dominfo->v2); or &(dominfo->v0);
-> I already debugged this: the critical call comes from xenHypervisorInit. There the local &info struct gets passed to virXen_getdomaininfo, which later accesses it with &(dominfo->v2);
=> dominfos->v2d5 should be empty, maybe that's why it fails?

1. info of type xen_getdomaininfo is allocated locally in xenHypervisorInit
2. then it is passed to virXen_getdomaininfo
3. virXen_getdomaininfo allocates a new local xen_getdomaininfolist
   called dominfos
4. depending on hypervisor_version, either the v0 or the v2 part of the
   submitted info struct gets assigned there (!no v2d5 assignment!)
5. now we should have a xen_getdomaininfolist that is filled with either
   "dominfos.v0=&(dominfo->v0);" or "dominfos.v2=&(dominfo->v2);", but v2d5
   is never assigned and is therefore undefined
6. this dominfos struct is passed by reference to virXen_getdomaininfolist
7. depending on hypervisor_version and sys_interface_version, the code in
   our bug scenario now accesses the v2d5 field (which in the best case
   should be undefined)
8. this op struct, with the assignment from v2d5 as buffer.v, now gets
   passed to Xen, which later fails to handle it

Because the stack is not cleaned and the calls to sysctl version 2 and 3 are
very close together and similar, it might be possible that we accidentally
access the data of the v2 call with the v3 layout, and because of that our
buffer address may be shifted by 32 bits.

As a quick test I did the following in virXen_getdomaininfo:

memset(&dominfos, 0, sizeof(dominfos));

-> zeroing a local variable should never hurt unless something like the buggy
stack usage described above is going on, and as expected I get segmentation
faults if I zero the dominfos struct.
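The address chain at the top of this mail can be reproduced with plain arithmetic. The following is only a sketch under assumptions: the helper names are hypothetical, the kernel base 0xC000000000000000 and the flag value 0x8000000000000000 are assumed because they reproduce the numbers in the trace, and the real xencomm_pa may do more than a subtraction. The first helper models a 32-bit big-endian process writing a pointer into the start of an 8-byte slot that is later read as one 64-bit value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values, chosen only because they reproduce the trace above. */
#define ASSUMED_KERNEL_BASE 0xC000000000000000ULL /* ppc64 kernel base (assumption) */
#define ASSUMED_INLINE_FLAG 0x8000000000000000ULL /* XENCOMM_INLINE_FLAG (assumption) */

/* Write a 32-bit pointer value into the first four bytes of a zeroed
 * 8-byte slot in big-endian byte order, then read all eight bytes back
 * as a single big-endian 64-bit number. */
uint64_t slot_as_seen_by_64bit_reader(uint32_t ptr32)
{
    unsigned char slot[8];
    uint64_t value = 0;
    int i;

    memset(slot, 0, sizeof(slot));
    slot[0] = (unsigned char)(ptr32 >> 24);
    slot[1] = (unsigned char)(ptr32 >> 16);
    slot[2] = (unsigned char)(ptr32 >> 8);
    slot[3] = (unsigned char)ptr32;

    for (i = 0; i < 8; i++)
        value = (value << 8) | slot[i];
    return value;
}

/* Hypothetical model of the kernel side: subtract the kernel base and
 * set the inline flag. */
uint64_t kernel_inline_desc(uint64_t vaddr)
{
    /* The subtraction only makes sense at or above the assumed base. */
    assert(vaddr >= ASSUMED_KERNEL_BASE);
    return (vaddr - ASSUMED_KERNEL_BASE) | ASSUMED_INLINE_FLAG;
}

/* Hypothetical model of the Xen side: strip the inline flag again. */
uint64_t xen_inline_to_guest(uint64_t desc)
{
    return desc & ~ASSUMED_INLINE_FLAG;
}
```

Feeding 0xffba2eb0 through the three helpers in order gives the same three values as the trace, which fits the suspicion that the pointer sits in the wrong half of a 64-bit field before it ever reaches the kernel.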
The initial approach to fix that is:
if (hypervisor_version < 2) {
    dominfos.v0 = &(dominfo->v0);
} else {
    if (sys_interface_version < 3)
        dominfos.v2 = &(dominfo->v2);
    else
        dominfos.v2d5 = &(dominfo->v2d5);
}
But I realized that &(dominfo->v2) == &(dominfo->v0) == &(dominfo->v2d5).
All three should be subparts of a struct, and therefore the addresses should
not be the same.
union xen_getdomaininfolist {
    struct xen_v0_getdomaininfo *v0;
    struct xen_v2_getdomaininfo *v2;
    struct xen_v2d5_getdomaininfo *v2d5;
};
typedef union xen_getdomaininfolist xen_getdomaininfolist;
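One thing worth checking against the declaration above: members of a C union all start at the same address and share the same storage. A minimal sketch with hypothetical stand-in types (the demo_* names are not from libvirt) shows both effects. The second helper assumes that xen_getdomaininfo is declared as a union as well, which this mail does not confirm; if it is, identical member addresses would be expected rather than a bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three record layouts. */
struct demo_v0   { int domain; };
struct demo_v2   { int domain; long pad; };
struct demo_v2d5 { int domain; long long aligned; };

/* Same shape as xen_getdomaininfolist: one pointer slot with three names. */
union demo_infolist {
    struct demo_v0   *v0;
    struct demo_v2   *v2;
    struct demo_v2d5 *v2d5;
};

/* Assumption: the record type itself is a union of the three layouts. */
union demo_info {
    struct demo_v0   v0;
    struct demo_v2   v2;
    struct demo_v2d5 v2d5;
};

/* Returns 1 when a pointer stored through .v2 is also read back via .v2d5. */
int v2_write_visible_through_v2d5(void)
{
    static struct demo_v2 record;
    union demo_infolist list;

    list.v2 = &record;
    return (void *)list.v2d5 == (void *)&record;
}

/* Returns 1 when all three members of the record union share one address. */
int members_share_address(void)
{
    union demo_info info;

    return (void *)&info.v0 == (void *)&info.v2 &&
           (void *)&info.v2 == (void *)&info.v2d5;
}
```

Under this reading, assigning dominfos.v2 already sets dominfos.v2d5 to the same pointer, which matches the debug output where both print the same value.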
Here is the debugging output related to this from gdb and my debugging
statements.

gdb --args /usr/bin/python mytest.py

I may be overlooking something, but I think the addresses should be
different, and the "print *op" in xenHypervisorDoV2Sys should not always see
getdomaininfolist and getdomaininfolists3 filled at the same time.
Breakpoint 2 at 0xf671534: file xen_internal.c, line 824.
Pending breakpoint "xenHypervisorDoV2Sys" resolved
Initialize Libvirtmod (sleepwait)
virXen_getdomaininfo - assigning &(dominfo->v2) '0xffb29eb0'
virXen_getdomaininfolist - enter for firstdomain '0' maxids '1' on handle '9'
virXen_getdomaininfolist - sys_interface_version '2' hypervisor_version '2'
virXen_getdomaininfolist - dominfos->v2 '0xffb29eb0' dominfos->v2d5 '0xffb29eb0'
virXen_getdomaininfolist - sleepwait
[Switching to Thread 0xf7fc2000 (LWP 4876)]

Breakpoint 2, xenHypervisorDoV2Sys (handle=9, op=0xffb29d00) at xen_internal.c:824
warning: Source file is more recent than executable.
824 memset(&hc, 0, sizeof(hc));
(gdb) print *op
$1 = {cmd = 6, interface_version = 0, u = {
  getdomaininfolist = {first_domain = 0, max_domains = 1, buffer = 0xffb29eb0, num_domains = 1},
  getdomaininfolists3 = {first_domain = 0, max_domains = 1, buffer = {v = 0xffb29eb0, pad = 18424963504277553153}, num_domains = 0},
  getschedulerid = {sched_id = 0},
  padding = {0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 255 '\377', 178 '\262', 158 '\236', 176 '\260', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 0 '\0' <repeats 112 times>}}}
(gdb) c
Continuing.
virXen_getdomaininfo - assigning &(dominfo->v2d5) '0xffb29eb0'
virXen_getdomaininfolist - enter for firstdomain '0' maxids '1' on handle '9'
virXen_getdomaininfolist - sys_interface_version '3' hypervisor_version '2'
virXen_getdomaininfolist - dominfos->v2 '0xffb29eb0' dominfos->v2d5 '0xffb29eb0'
virXen_getdomaininfolist - sleepwait

Breakpoint 2, xenHypervisorDoV2Sys (handle=9, op=0xffb29d00) at xen_internal.c:824
824 memset(&hc, 0, sizeof(hc));
(gdb) print *op
$2 = {cmd = 6, interface_version = 0, u = {
  getdomaininfolist = {first_domain = 0, max_domains = 1, buffer = 0xffb29eb0, num_domains = 0},
  getdomaininfolists3 = {first_domain = 0, max_domains = 1, buffer = {v = 0xffb29eb0, pad = 18424963504277553152}, num_domains = 1},
  getschedulerid = {sched_id = 0},
  padding = {0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 255 '\377', 178 '\262', 158 '\236', 176 '\260', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 0 '\0' <repeats 108 times>}}}
(gdb) c
Continuing.
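The decimal pad values that gdb prints for buffer in the output above are easier to judge once split into 32-bit halves. A small sketch (the helper names are hypothetical, only the two pad values come from the gdb session):

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits of a 64-bit value. */
uint32_t upper_half(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Lower 32 bits of a 64-bit value. */
uint32_t lower_half(uint64_t v)
{
    return (uint32_t)(v & 0xffffffffu);
}

/* Checks the two pad values from the gdb session against the pointer
 * value 0xffb29eb0 that the debug statements printed. */
void check_gdb_pad_values(void)
{
    /* $1: sys_interface_version 2 */
    assert(upper_half(18424963504277553153ULL) == 0xffb29eb0u);
    assert(lower_half(18424963504277553153ULL) == 1u);

    /* $2: sys_interface_version 3 */
    assert(upper_half(18424963504277553152ULL) == 0xffb29eb0u);
    assert(lower_half(18424963504277553152ULL) == 0u);
}
```

In both prints the 32-bit userspace pointer occupies the upper half of the 64-bit pad field, so anything that reads pad as a whole sees 0xffb29eb0 followed by eight more hex digits. In $1 the lower half is 1, which appears to be the overlapping num_domains of the version 2 layout.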
-> Bug (triggered by this second call, the one with sys_interface_version 3)

I attached the current debug/workaround/testfix diff (not nice but working ;-) ) as well as my minimal debug python script.
It's late today. Have fun continuing, Jerone ;)

Jerone Young wrote:
> On Tue, 2007-06-26 at 15:29 +0200, Christian Ehrhardt wrote:
> > Hi,
> > here is a very rough text that shows how far I got today. It's more or
> > less a text file where I copy&paste everything I find over the day. It is
> > a lot of stuff, but reading it brings us completely onto the same page on
> > this issue. It should be reasonably readable if you read it top to bottom.
> >
> > BTW - I got no mail from you yesterday. This might have two reasons:
> > a) you did not send one - well, that's no problem for me ;)
> > b) you sent one and I didn't receive it - I had this issue once with
> >    Hollis, where mails needed up to 5 days. Since then we always used both
> >    the Notes and the IMAP mail addresses to be sure (it was an IMAP server
> >    issue)
> >
> > First FYI:
> > In addition to your mail you need the following to be able to compile
> > libvirt, virt-manager and virtinst:
> > libssldevel ncurses-devel libtool libxml2 gnutls python-urlgrabber
> > pygtk2-devel gtk2-devel
> > now you have a complete list ;)
>
> Thanks, I probably just didn't put these there. But yes, all of these need
> to be added to the grand list :-)
>
> > Then, if you have your custom xen with 2.6.18 xenolinux somewhere, you
> > need to make the headers available:
> > ln -s /root/xen_2.6.18/xen-unstable.hg/linux-2.6.18-xen.hg/include/xen/public/ /usr/include/xen/linux
> > ln -s /root/xen_2.6.18/xen-unstable.hg/xen/include/public /usr/include/xen
> > Replace /root/xen_2.6.18/xen-unstable.hg with your xen directory
> > Replace xen_2.6.18/xen-unstable.hg/linux-2.6.18-xen.hg with your xenolinux
> > directory
>
> Well, what you really need is just the xen headers, and they can be found
> in the xen source. Just downloading xen-unstable and doing
> "make xen-install" (I think that's it) will install all the needed headers
> onto the system.
>
> > install virtinst
> > The compilation of virtinst causes the same bug as described for
> > "virsh -r -t -d 5 connect".
> > It occurs while trying to execute autobuild. I isolated the part
> > "python setup.py test" of the autobuild process of virtinst.
> > (XEN) pfn2mfn: Dom[0] pfn 0x3fefdfc000000 is not a valid page
> > (XEN) paddr_to_maddr: Dom:0 bad paddr: 0x3fefdfc000000000
>
> I have a feeling this has little to do with virtinst and more to do with
> python exposing a problem in our Xen kernel.
>
> > In setup.py I tracked it down to the call that also appears in a stack
> > trace:
> > The buggy thing is the "import tests.xmlconfig" that gets generated. If I
> > skip the xmlconfig part, test.coverage gets loaded without issues, which
> > implies that the load of test.xmlconfig should not be a path/directory
> > issue.
>
> This is very interesting. I figure another python module would set this
> off also. Maybe test.xmlconfig is executing something that the others are
> not.
>
> > -> A simple python script just containing the "import tests.xmlconfig"
> >    causes the bug, use this to debug this issue
> > -> The code printing that "bad paddr" is in "arch/powerpc/usercopy.c"
> >    => where is the connection
>
> Ok yeap, just as I figured.
> I'm willing to bet this starts in arch/powerpc/platforms/xen/hcall.c
> I'll see if I get some free time this evening to create the 2.6.18 kernel
> you guys have, to narrow it down even more.
>
> > pbclient4:~/libvirt/virtinst-0.103.0 # cat mytest.py
> > import pdb
> > pdb.set_trace();
> > import tests.xmlconfig
> >
> > Tracked down with the pdb debugger (every -> is the triggering function
> > one step deeper in the stack)
> >
> > pbclient4:~/libvirt/virtinst-0.103.0 # gdb --args /usr/bin/python mytest.py
> > import xmlconfig
> > -> From Guest Guest
> > -> import libvirt
> >    This is not virtinst, it is the libvirt python binding in
> >    /usr/local/python2.5/site-packages/libvirt.py
> > -> import libvirtmod
> >    => this is the C mapper that maps python to C functions (partially
> >    generated code)
> > -> in that code the bug is in virInitialize(), which is called once
> >    initially
> > + use GDB to debug virInitialize
> >   This is no longer the python bindings in libvirt/python, it is
> >   src/libvirt.c:57
> > -> without bug through some virRegisterDriver (driver=0xf6b7488) at
> >    libvirt.c:222 for test & qemu

--
Regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
Ehrhardt@xxxxxxxxxxxxxxxxxx
Ehrhardt@xxxxxxxxxx

IBM Deutschland Entwicklung GmbH
Chairman of the Supervisory Board: Johann Weihen
Management: Herbert Kircher
Registered office: Böblingen
Registration court: Amtsgericht Stuttgart, HRB 243294

diff -r 57c3b9568ea6 python/libvir.c
--- a/python/libvir.c Thu Jul 19 13:52:36 2007 +0200
+++ b/python/libvir.c Fri Jul 20 16:33:12 2007 +0200
@@ -15,6 +15,8 @@
#include "libvirt_wrap.h"
#include "libvirt-py.h"
+#define DEBUG_ERROR
+
extern void initlibvirtmod(void);
PyObject *libvirt_virDomainGetUUID(PyObject *self ATTRIBUTE_UNUSED, PyObject *args);
@@ -95,7 +97,7 @@ libvirt_virErrorFuncHandler(ATTRIBUTE_UN
PyObject *result;
#ifdef DEBUG_ERROR
- printf("libvirt_virErrorFuncHandler(%p, %s, ...) called\n", ctx, msg);
+ printf("libvirt_virErrorFuncHandler(%p) called\n", ctx);
#endif
if ((err == NULL) || (err->code == VIR_ERR_OK))
@@ -688,10 +690,22 @@ initlibvirtmod(void)
if (initialized != 0)
return;
+ // DEBUG
+ printf("Initialize Libvirtmod (sleepwait)\n");
+ sleep(5);
+
virInitialize();
+
+ // DEBUG
+ printf("post Initialize Libvirtmod / pre py_InitModule (sleepwait)\n");
+ sleep(5);
/* intialize the python extension module */
Py_InitModule((char *) "libvirtmod", libvirtMethods);
+
+ // DEBUG
+ printf("post py_InitModule (sleepwait)\n");
+ sleep(5);
initialized = 1;
}
diff -r 57c3b9568ea6 python/libvir.py
--- a/python/libvir.py Thu Jul 19 13:52:36 2007 +0200
+++ b/python/libvir.py Fri Jul 20 16:31:17 2007 +0200
@@ -4,6 +4,9 @@
# Check python/generator.py in the source distribution of libvir
# to find out more about the generation process
#
+import pdb
+pdb.set_trace()
+
import libvirtmod
import types
diff -r 57c3b9568ea6 src/libvirt.c
--- a/src/libvirt.c Thu Jul 19 13:52:36 2007 +0200
+++ b/src/libvirt.c Fri Jul 20 17:07:51 2007 +0200
@@ -72,7 +72,7 @@ virInitialize(void)
if (qemuRegister() == -1) return -1;
#endif
#ifdef WITH_XEN
- if (xenUnifiedRegister () == -1) return -1;
+ if (xenUnifiedRegister() == -1) return -1;
#endif
#ifdef WITH_REMOTE
if (remoteRegister () == -1) return -1;
diff -r 57c3b9568ea6 src/xen_internal.c
--- a/src/xen_internal.c Thu Jul 19 13:52:36 2007 +0200
+++ b/src/xen_internal.c Sun Jul 22 05:45:45 2007 +0200
@@ -36,6 +36,8 @@
#include <xen/sched.h>
#include "xml.h"
+
+#define DEBUG
/* #define DEBUG */
/*
@@ -904,24 +906,36 @@ virXen_getdomaininfolist(int handle, int
{
int ret = -1;
+
if (mlock(XEN_GETDOMAININFOLIST_DATA(dominfos),
XEN_GETDOMAININFO_SIZE * maxids) < 0) {
virXenError(VIR_ERR_XEN_CALL, " locking",
XEN_GETDOMAININFO_SIZE * maxids);
return (-1);
}
+
+ printf("%s - enter for firstdomain '%d' maxids '%d' on handle '%d'\n",__func__,first_domain,maxids,handle);
+ printf("%s - sys_interface_version '%d' hypervisor_version '%d'\n",__func__,sys_interface_version,hypervisor_version);
+ printf("%s - dominfos->v2 '%p' dominfos->v2d5 '%p'\n",__func__,dominfos->v2,dominfos->v2d5);
+ printf("%s - sleepwait\n",__func__);
+ sleep(5);
+
if (hypervisor_version > 1) {
xen_op_v2_sys op;
memset(&op, 0, sizeof(op));
op.cmd = XEN_V2_OP_GETDOMAININFOLIST;
+
+ printf("%s - allocated new and clean xen_op_v2_sys\n",__func__);
if (sys_interface_version < 3) {
+ printf("%s - assiigning getdomaininfolist stuff \n",__func__);
op.u.getdomaininfolist.first_domain = (domid_t) first_domain;
op.u.getdomaininfolist.max_domains = maxids;
op.u.getdomaininfolist.buffer = dominfos->v2;
op.u.getdomaininfolist.num_domains = maxids;
} else {
+ printf("%s - assiigning getdomaininfolist3 stuff \n",__func__);
op.u.getdomaininfolists3.first_domain = (domid_t) first_domain;
op.u.getdomaininfolists3.max_domains = maxids;
op.u.getdomaininfolists3.buffer.v = dominfos->v2d5;
@@ -973,11 +987,20 @@ virXen_getdomaininfo(int handle, int fir
virXen_getdomaininfo(int handle, int first_domain,
xen_getdomaininfo *dominfo) {
xen_getdomaininfolist dominfos;
+ memset(&dominfos, 0, sizeof(dominfos));
if (hypervisor_version < 2) {
dominfos.v0 = &(dominfo->v0);
+ printf("%s - assigning &(dominfo->v0) '%p'\n",__func__,&(dominfo->v0));
} else {
- dominfos.v2 = &(dominfo->v2);
+ if (sys_interface_version < 3) {
+ dominfos.v2 = &(dominfo->v2);
+ printf("%s - assigning &(dominfo->v2) '%p'\n",__func__,&(dominfo->v2));
+ }
+ else {
+ dominfos.v2d5 = &(dominfo->v2d5);
+ printf("%s - assigning &(dominfo->v2d5) '%p'\n",__func__,&(dominfo->v2d5));
+ }
}
return virXen_getdomaininfolist(handle, first_domain, 1, &dominfos);
@@ -1762,7 +1785,7 @@ xenHypervisorInit(void)
if ((ret != -1) && (ret != 0)) {
#ifdef DEBUG
- fprintf(stderr, "Using new hypervisor call: %X\n", ret);
+ printf("Using new hypervisor call: %X\n", ret);
#endif
hv_version = ret;
xen_ioctl_hypercall_cmd = cmd;
@@ -1779,7 +1802,7 @@ xenHypervisorInit(void)
ret = ioctl(fd, cmd, (unsigned long) &v0_hc);
if ((ret != -1) && (ret != 0)) {
#ifdef DEBUG
- fprintf(stderr, "Using old hypervisor call: %X\n", ret);
+ printf("Using old hypervisor call: %X\n", ret);
#endif
hv_version = ret;
xen_ioctl_hypercall_cmd = cmd;
@@ -1808,7 +1831,7 @@ xenHypervisorInit(void)
ipt = malloc(sizeof(virVcpuInfo));
if (ipt == NULL){
#ifdef DEBUG
- fprintf(stderr, "Memory allocation failed at xenHypervisorInit()\n");
+ printf("Memory allocation failed at xenHypervisorInit()\n");
#endif
return(-1);
}
Attachment:
mytest.py