Xen project Mailing List

Re: [Xen-devel] [PATCHv10 0/9] Xen: extend kexec hypercall for use with pv-ops kernels

To: Daniel Kiper <daniel.kiper@xxxxxxxxxx>

From: David Vrabel <david.vrabel@xxxxxxxxxx>

Date: Fri, 8 Nov 2013 13:13:59 +0000

Cc: Keir Fraser <keir@xxxxxxx>, kexec@xxxxxxxxxxxxxxxxxxx, Jan Beulich <jbeulich@xxxxxxxx>, xen-devel@xxxxxxxxxxxxx

Delivery-date: Fri, 08 Nov 2013 13:14:22 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Keir,

Sorry, forgot to CC you on this series.

Can we have your opinion on whether this kexec series can be merged?
And if not, what further work and/or testing is required?

On 07/11/13 21:16, Daniel Kiper wrote:
> On Wed, Nov 06, 2013 at 02:49:37PM +0000, David Vrabel wrote:
>> The series (for Xen 4.4) improves the kexec hypercall by making Xen
>> responsible for loading and relocating the image. This allows kexec
>> to be usable by pv-ops kernels and should allow kexec to be usable
>> from a HVM or PVH privileged domain.
>>
>> I have now tested this with a Linux kernel image using the VGA console
>> which was what was causing problems in v9 (this turned out to be a
>> kexec-tools bug).
>>
>> The required patch series for kexec-tools will be posted shortly and
>> are available from the xen-v7 branch of:
>
> In general it works. However, quite often I am not able to execute panic
> kernel. Machine hangs with following message:

I cannot reproduce any failures, neither on my dev box nor on any of the
automated XenServer tests that run on a range of different hardware
platforms.

I find kexec to be very reliable and an earlier version of this series
has been in production within XenServer for a while now and has seen
real use in the field.

None of the issues reported so far have been regressions but failures in
specific uses of the new support for pv-ops kernels.

I really can't see how I can do anything else to make this series
acceptable for merging.

In my opinion, the current implementation is so broken[1] and useless[2]
that anything that even vaguely looks like it might work is significant
improvement, and something that is deployed usefully in production
should definitely be merged.

[1] Uses code provided by the guest to jump out of Xen into the image
which works only through luck. Does not (and has never) worked reliably
with 32-bit dom0.

[2] Does not work at all (and will never work) with upstream kernels.
> (XEN) Domain 0 crashed: Executing crash image
>
> gdb shows:
>
> (gdb) bt
> #0 0xffff82d0801a0092 in do_nmi_crash (regs=<optimized out>) at crash.c:113
> #1 0xffff82d0802281d9 in nmi_crash () at entry.S:666
> #2 0x0000000000000000 in ?? ()
> (gdb)
>
> Especially second bt line scares me... ;-)))
>
> I have not been able to identify why NMI was activated because
> stack is completely cleared.

All this you have described here is correct and expected behavior,
which, quite frankly, you should have been able to see with even the
most cursory look at the code.

> Additionally, my compiler fails because it detects unused result
> variable in xen/common/kimage.c:kimage_crash_alloc().

Yes, sorry about that. That was fallout from a last minute trivial
cleanup. I've posted an updated patch correcting this.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
