
Re: [Xen-devel] Re: Test results for save/restore with upstream pv_ops domU kernels, 2.6.32.7 works OK



On Mon, Feb 01, 2010 at 04:26:05PM +0100, Andrew Jones wrote:
> On 02/01/2010 04:10 PM, Pasi Kärkkäinen wrote:
> > On Fri, Jan 29, 2010 at 12:53:38PM +0200, Pasi Kärkkäinen wrote:
> >> On Fri, Jan 29, 2010 at 10:35:32AM +0000, Ian Campbell wrote:
> >>> On Thu, 2010-01-28 at 21:25 +0000, Pasi Kärkkäinen wrote:
> >>>> Hello,
> >>>>
> >>>> I just tried some save/restore tests with Fedora 12 Linux 2.6.31.12 
> >>>> kernels.
> >>>> The exact Fedora kernel versions are: 2.6.31.12-174.2.3.fc12.i686.PAE 
> >>>> and 2.6.31.12-174.2.3.fc12.x86_64.
> >>>>
> >>>> Dom0 for these tests was CentOS 5.4 (Xen 3.1.2).
> >>>>
> >>>> - F12 32bit 1vcpu PV guest: 
> >>>>  save+restore OK, BUG() in guest dmesg after restore [1]
> >>>>
> >>>> - F12 64bit 1vcpu PV guest:
> >>>>  save+restore OK, BUG() in guest dmesg after restore [2]
> >>>
> >>> I think those are the same underlying bug and are fixed by 
> >>> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=777df2b017ef34b2d1a172bf769582158839a860
> >>>
> >>
> >> Ok. 
> >>
> >> There was also this traceback at the beginning of boot, on all 
> >> 32bit/64bit, 1vcpu/2vcpu guest combinations:
> >>
> >> Performance Counters: Core2 events, Intel PMU driver.
> >> ------------[ cut here ]------------
> >> WARNING: at arch/x86/kernel/apic/apic.c:247 
> >> native_apic_write_dummy+0x32/0x3e() (Not tainted)
> >> Modules linked in:
> >> Pid: 0, comm: swapper Not tainted 2.6.31.12-174.2.3.fc12.i686.PAE #1
> >> Call Trace:
> >>  [<c043db4b>] warn_slowpath_common+0x70/0x87
> >>  [<c041cfb2>] ? native_apic_write_dummy+0x32/0x3e
> >>  [<c043db74>] warn_slowpath_null+0x12/0x15
> >>  [<c041cfb2>] native_apic_write_dummy+0x32/0x3e
> >>  [<c0411e04>] perf_counters_lapic_init+0x30/0x32
> >>  [<c09b3b1b>] init_hw_perf_counters+0x2bc/0x355
> >>  [<c09b3628>] identify_boot_cpu+0x21/0x23
> >>  [<c09b378e>] check_bugs+0xb/0xdc
> >>  [<c047fd73>] ? delayacct_init+0x47/0x4c
> >>  [<c09ab8b4>] start_kernel+0x31c/0x330
> >>  [<c09ab081>] i386_start_kernel+0x70/0x77
> >>  [<c09ae2bb>] xen_start_kernel+0x4b9/0x4c1
> >>  [<c04090a1>] ? syscall_exit+0x1/0x16
> >> ---[ end trace a7919e7f17c0a725 ]---
> >>
> >> Full boot logs here:
> >> http://pasik.reaktio.net/xen/debug/fedora/
> >>
> > 
> > 
> > This boot-time traceback disappeared when I updated the guest to 2.6.32.7.
> > 
> > 
> >>
> >>>>
> >>>> - F12 32bit 2vcpu PV guest:
> >>>>  save doesn't work, guest stays as "migrating-f12test32" in "xm list" 
> >>>> forever and has to be "xm destroy"ed.
> >>>>
> >>>> - F12 64bit 2vcpu PV guest:
> >>>>  save doesn't work, guest stays as "migrating-f12test64" in "xm list" 
> >>>> forever and has to be "xm destroy"ed.
> >>>>
> >>>>
> >>>> What's the best way to debug a failing "xm save"? There were no errors 
> >>>> in "xm log" or in "xm dmesg".
> >>>
> >>> I think you might see some stuff in /var/log/xen/something but I don't
> >>> have any particular tips apart from "add printf/printk".
> >>>
> >>
> >> I'll check /var/log/xen/.
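
For the record: on a CentOS 5.4 dom0 the files to watch there are
typically xend.log and xend-debug.log. A minimal way to catch the
failure, assuming the guest is named "f12test32" as in the tests above:

    # Follow the xend logs while reproducing the hanging save;
    # the guest name and the save file path are just examples:
    tail -f /var/log/xen/xend.log /var/log/xen/xend-debug.log &
    xm save f12test32 /tmp/f12test32.save
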
> >>
> >>>> Also the guest "xm console" doesn't show anything before it dies.
> >>>>
> >>>> Is it possible some of the save/restore related patches didn't make it 
> >>>> to 2.6.31.x stable kernels? 
> >>>
> >>> AFAIK they only went into the 2.6.32 stable branch. Unfortunately I
> >>> think the 2.6.31 stable series has come to an end now.
> >>>
> >>
> >> Ok. I'll test 2.6.32.latest as well.
> >>
> > 
> > I grabbed upstream kernel.org Linux 2.6.32.7, and tested the following 
> > combinations:
> > 
> > - F12 32bit 1vcpu PV guest
> > - F12 32bit 2vcpu PV guest
> > - F12 64bit 1vcpu PV guest
> > - F12 64bit 2vcpu PV guest
> > 
> > Save+restore was successful for all of the above guests running 2.6.32.7. 
> > No BUGs or tracebacks anymore.
> > 
> > Any tips for git magic to get all the recent save/restore fixes that went 
> > into 2.6.32.x, so I could send them to the Fedora people to apply to the 
> > F12 kernel? 
> > 
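
Partially answering my own git question: one rough way to list candidate
commits, assuming a clone of the stable tree (linux-2.6.32.y) and guessing
that the relevant fixes live under the Xen directories:

    # Commits between 2.6.32 and 2.6.32.7 that touch the Xen code;
    # the path filter is only a guess at where the fixes live:
    git log --oneline v2.6.32..v2.6.32.7 -- arch/x86/xen/ drivers/xen/
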
> 
> This is great news, but it might be a good idea to try 2 or more
> save/restore rounds in a row first, if you haven't already. In the past
> I've seen the 1st save/restore work but then the 2nd round fail, although
> usually there's some symptom of badness on the 1st round as well.
> 

I forgot to mention that I tried twice with all of the above guests :)
It seems stable.

I just did one more test: save+restore 5 times in a row with a 4-vcpu PV guest. 
No problems found.
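
For reference, a loop along these lines does the same kind of repeated
round-trip test (the guest name and save file path are made up):

    # Save and restore the same PV guest five times in a row;
    # "f12test64" and the save path are placeholders:
    for i in 1 2 3 4 5; do
        xm save f12test64 /var/tmp/f12test64.save
        xm restore /var/tmp/f12test64.save
    done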

-- Pasi


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

