Re: [Xen-devel] HVM domains crash after upgrade from XEN 4.5.1 to 4.5.2
On 19.11.15 at 11:38, Andrew Cooper wrote:
> On 19/11/15 10:24, Jan Beulich wrote:
>> On 19.11.15 at 00:17, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> The disassembly of do_IRQ now looks like a plausible function, but the
>>> consistently faulting address has no plausible way of generating a
>>> double fault. I suspect therefore that something has caused memory
>>> corruption in Xen .text section.
>>>
>>> Dump of assembler code for function do_IRQ:
>>>    0xffff82d080176577 <+0>:   push   %rbp
>>>    0xffff82d080176578 <+1>:   mov    %rsp,%rbp
>>>    0xffff82d08017657b <+4>:   push   %r15
>>>    0xffff82d08017657d <+6>:   push   %r14
>>>    0xffff82d08017657f <+8>:   push   %r13
>>>    0xffff82d080176581 <+10>:  push   %r12
>>>    0xffff82d080176583 <+12>:  push   %rbx
>>>    0xffff82d080176584 <+13>:  lea    -0x1058(%rsp),%rsp
>>>    0xffff82d08017658c <+21>:  orq    $0x0,(%rsp)
>>>    0xffff82d080176591 <+26>:  lea    0x1020(%rsp),%rsp
>>
>> The orq surely has potential for causing a double fault, if %rsp is
>> near the stack limit. The two LEAs look suspect, presumably a result
>> of some non-standard option passed to gcc. Removing that option might
>> already be a step forward.

Andrew, Jan - thanks again.

In terms of non-standard options passed to gcc, I have tried to make sense
of what flags are actually being used during the build process. I am not
absolutely sure, but I think the options passed to gcc are as follows:

I do have system-wide flags which are used for non-debug builds:

    CFLAGS="-march=native -O2 -pipe -fomit-frame-pointer"
    CXXFLAGS="${CFLAGS}"
    LDFLAGS="-Wl,-O1 -Wl,--as-needed"

For builds with debug symbols (using splitdebug) there are system-wide
overrides as follows:

    CFLAGS="-march=native -O2 -pipe -ggdb"
    CXXFLAGS="${CFLAGS}"
    LDFLAGS: I'd assume that this inherits its value from the system-wide
             setting of LDFLAGS

For xen (the hypervisor) the build system seems to do the following:

    CFLAGS="" (i.e. unset CFLAGS)

To me this indicates that the rest stays untouched (i.e. either standard
or debug flags).

For xen-tools (includes e.g. hvmloader) the build system appears to do the
following:

    CFLAGS="" (i.e. unset CFLAGS)
    CXXFLAGS="${CXXFLAGS} -fno-strict-overflow"
    LDFLAGS="" (i.e. unset LDFLAGS)

So I think there's probably nothing really fancy in my options to gcc.

> Actually yes - that is a huge quantity of stack usage. (The actual
> behaviour looks very suspect - it appears to be completely pointless).
>
> The #DF handler reports that %rsp in the exception frame is within
> range. Having said that,
>
> (XEN) [ 2.788209] rbp: ffff83080ca8ed78 rsp: ffff83080ca8dcf8 r8: ffff83080ca9d558
> ...
> (XEN) [ 2.837474] Valid stack range: ffff83080ca8e000-ffff83080ca90000, sp=ffff83080ca8dcf8, tss.esp0=ffff83080ca8ffc0
> (XEN) [ 2.848969] No stack overflow detected. Skipping stack trace.
>
> In this case, the stack pointer *is* out of range, and has hit the guard
> page. This means:
>
> 1) There is some bug in the stack overflow detection in the #DF handler.
> 2) Whatever options Gentoo compiles Xen with is sufficient to overflow
>    the 8K hypervisor stack.

Thanks
Atom2

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
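
For anyone who wants to re-check the numbers quoted in this message, here is a minimal sketch of the arithmetic. It is not Xen code: the helper names in_stack_range and probe_offsets are made up for this illustration, the constants are copied from the quoted #DF log and the do_IRQ prologue, and treating the upper bound of the range as exclusive is an assumption.

```python
# Values copied from the quoted "(XEN)" log lines.
STACK_LOW = 0xffff83080ca8e000   # start of "Valid stack range"
STACK_HIGH = 0xffff83080ca90000  # end of "Valid stack range"
SP = 0xffff83080ca8dcf8          # reported sp / rsp

# Offsets copied from the quoted do_IRQ prologue.
PROBE_DROP = 0x1058     # lea -0x1058(%rsp),%rsp
PROBE_RESTORE = 0x1020  # lea 0x1020(%rsp),%rsp


def in_stack_range(sp, low, high):
    """Hypothetical helper: True if sp lies in [low, high).

    Whether the real check treats the upper bound as inclusive is an
    assumption here; it does not matter for the value in the log.
    """
    return low <= sp < high


def probe_offsets(drop, restore):
    """Hypothetical helper describing the lea/orq/lea sequence.

    Returns (bytes below the incoming %rsp that the orq touches,
             net bytes still reserved after the second lea).
    """
    return drop, drop - restore


if __name__ == "__main__":
    print("stack size:", hex(STACK_HIGH - STACK_LOW))
    print("sp in range:", in_stack_range(SP, STACK_LOW, STACK_HIGH))
    print("sp below low bound by:", hex(STACK_LOW - SP))
    touched, reserved = probe_offsets(PROBE_DROP, PROBE_RESTORE)
    print("orq touches", hex(touched), "bytes below rsp;",
          "net reserved:", hex(reserved))
```

With the quoted values the range is 0x2000 bytes (the 8K stack), sp lies 0x308 bytes below the lower bound, and the orq writes 0x1058 bytes below %rsp as it stood after the register pushes, while only 0x38 bytes remain reserved once the second lea has run.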