Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> -----Original Message-----
> From: Boris Ostrovsky [mailto:boris.ostrovsky@xxxxxxxxxx]
> Sent: 06 June 2017 18:00
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; 'Jan Beulich' <JBeulich@xxxxxxxx>
> Cc: xen-devel (xen-devel@xxxxxxxxxxxxxxxxxxxx) <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
>
> On 06/06/2017 12:28 PM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of
> >> Paul Durrant
> >> Sent: 06 June 2017 16:52
> >> To: 'Jan Beulich' <JBeulich@xxxxxxxx>
> >> Cc: xen-devel (xen-devel@xxxxxxxxxxxxxxxxxxxx) <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> >> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>
> >>> -----Original Message-----
> >>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >>> Sent: 06 June 2017 16:11
> >>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> >>> Cc: xen-devel (xen-devel@xxxxxxxxxxxxxxxxxxxx) <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> >>> Subject: Re: [Xen-devel] debian stretch dom0 + xen 4.9 fails to boot
> >>>
> >>>>>> On 06.06.17 at 16:32, <Paul.Durrant@xxxxxxxxxx> wrote:
> >>>> I've been having fun setting up a new test rig...
> >>>>
> >>>> I have a skull canyon NUC and I put debian stretch (rc4) on it (so
> >>>> that's a 4.9 kernel) and then tried building and installing the latest
> >>>> Xen staging-4.9 code. The system failed to boot... basically it got
> >>>> stuck before even managing to get sufficiently into Xen to spit out
> >>>> anything on the console.
> >>>> Xen 4.8 OTOH booted just fine so I started bisecting and after 14
> >>>> iterations I got down to the following commit is being the problem:
> >>>>
> >>>> commit c0655e492e6b33e26ec9cd33f59725d0db89cdd0
> >>>> Author: Juergen Gross <jgross@xxxxxxxx>
> >>>> Date:   Fri Mar 24 14:18:54 2017 +0100
> >>>>
> >>>>     x86: split boot trampoline into permanent and temporary part
> >>>>
> >>>>     The hypervisor needs a trampoline in low memory for early boot and
> >>>>     later for bringing up cpus and during wakeup from suspend. Today this
> >>>>     trampoline is kept completely even if most of it isn't needed later.
> >>>>
> >>>>     Split the trampoline into a permanent part and a temporary part needed
> >>>>     at early boot only. Introduce a new entry at the boundary.
> >>>>
> >>>>     Reduce the stack for wakeup code in order for the permanent
> >>>>     trampoline to fit in a single page. 4k of stack seems excessive, about
> >>>>     3k should be more than enough.
> >>>>
> >>>>     Add an ASSERT() to the linker script to ensure the wakeup stack is
> >>>>     always at least 3k.
> >>>>
> >>>>     Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
> >>>>     Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> >>>>
> >>>> To verify this I checked out master, reverted that commit, and tried
> >>>> again. The NUC still booted fine.
> >>>
> >>> Well, interesting, but I don't think it is very realistic to expect any
> >>> fix with just the information you supply. There must be something
> >>> rather special about that system, and likely it would help if we
> >>> knew what that is. E.g. an unusual E820 map. Worse would be if
> >>> they used memory outside of properly marked E820 regions in a
> >>> way colliding with what we do.
> >>>
> >>> Otherwise I'm afraid we need to hope for you to debug the issue.
> >>>
> >> Yes, I was posting this more a heads-up for the moment, so that 4.9 does
> >> not go out with this regression.
> >>
> >> I will try to figure out what is going on... My initial thoughts on
> >> looking at what the patch does are that it may be something to do with
> >> the fact I am using a vga console rather than a serial one. I need to try
> >> another 4.9 on another system (gigabyte brix) to see if the problem
> >> manifests there too. I'll also have to play with the BIOS settings on the
> >> skull canyon.
> >>
> > The problem definitely doesn't manifest on the brix, so the next theory is
> > that it is something to do with the BIOS of the skull canyon.
> >
>
> FWIW, one of machines in our test farm choked on this very patch. I
> don't remember details now but essentially it turned out that syslinux
> (we are pxe-booting) could not handle changes in ELF sections layout
> (the way syslinux calculated how to load the binary into memory resulted
> in overlap of some sort).
>
> I hacked it (mboot.c32 specifically) to work around this but never came
> up with a proper solution.
>

In my case it was grub2... and thinking about it I am running an older
version on the brix so I guess it may still manifest there if I update.
Either way it sounds like it may be better to revert the patch until the
issue is better understood.

  Paul

> -boris

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
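
As an illustrative aside on the "overlap of some sort" that Boris mentions: the thread does not say how syslinux or grub2 actually compute load addresses, so the following is only a minimal sketch of the kind of check that can reveal such a problem. It assumes the load ranges have already been extracted by hand (for example from the program headers of the Xen binary). The names `Segment` and `find_overlaps` and the sample addresses are hypothetical and do not come from Xen, syslinux, or grub2.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    """One load range; all fields are illustrative, not taken from a real binary."""
    name: str    # hypothetical label for the range
    paddr: int   # physical address the range is loaded at
    memsz: int   # number of bytes it occupies in memory


def find_overlaps(segments: List[Segment]) -> List[Tuple[str, str]]:
    """Return pairs of segment names whose [paddr, paddr + memsz) ranges intersect."""
    # Ignore empty ranges, then sort by start address so that each range
    # only needs to be compared against the ranges starting after it.
    ordered = sorted((s for s in segments if s.memsz > 0), key=lambda s: s.paddr)
    overlaps = []
    for i, first in enumerate(ordered):
        first_end = first.paddr + first.memsz
        for second in ordered[i + 1:]:
            if second.paddr >= first_end:
                break  # later ranges start even further out, so stop here
            overlaps.append((first.name, second.name))
    return overlaps


if __name__ == "__main__":
    # Made-up addresses, purely to exercise the check.
    layout = [
        Segment("text", 0x100000, 0x200000),
        Segment("trampoline", 0x2F0000, 0x2000),
        Segment("data", 0x300000, 0x100000),
    ]
    for a, b in find_overlaps(layout):
        print(f"{a} overlaps {b}")
```

Whether the failure seen here is really of this form is not established in the thread; the sketch only makes concrete what an overlap between load ranges would mean.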