[Xen-devel] RE: latest xen-unstable fails to boot on Dell D630 (likely hpet/Cstate problem)
OK, thanks for looking for the problem. Since you can't
reproduce it, it is likely a problem specific to the Dell D630
motherboard or BIOS or HPET or something like that.
Since I have a workaround (max_cstate=2), I will just
continue to use the workaround.

I don't have a dock but if I get the serial port working
at some point, I will try to reproduce the problem again.
Also, when Ke's work on RTC emulation is completed,
let me know and I will give that a try.

Thanks,
Dan

> -----Original Message-----
> From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx]
> Sent: Sunday, December 13, 2009 1:26 AM
> To: Dan Magenheimer; Yu, Ke; Xen-Devel (E-mail)
> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
> hpet/Cstate problem)
>
>
> Hi, Dan
> We still can't reproduce this failure locally, even with
> Merom laptop. Do you have the dock with your Dell 630, and I
> think the dock should have the serial port support, and maybe
> you can get the failure log through it. If we can get the
> failure log, it should be helpful to identify this issue.
> Also I have analyzed the Cset #20072 and Cst20073, and have
> no any clue which can lead to this issue. In addition, I
> also talked with Ke, he said he could reproduce another issue
> related to hwclock, but for this issue, he also can't catch
> it in any platforms. :(
> Thanks!
> Xiantao
>
> -----Original Message-----
> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> Sent: Wednesday, December 09, 2009 11:50 PM
> To: Zhang, Xiantao; Yu, Ke; Xen-Devel (E-mail)
> Subject: RE: latest xen-unstable fails to boot on Dell D630
> (likely hpet/Cstate problem)
>
> > Could you attach the failure log ?
>
> I can't get any failure logs because dom0 fails to boot.
> The failure conditions are the same as described
> here:
>
> http://lists.xensource.com/archives/html/xen-devel/2009-10/msg01027.html
>
> However, I have attached the xm dmesg output from
> a successful boot (with max_cstate=2).
>
> > In addition, does this system have ioapic support ?
>
> I think so. See attached log.
>
> > I think hpet doesn't use MSI, right ?
>
> I don't think so.
>
> Dan
>
> > -----Original Message-----
> > From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx]
> > Sent: Tuesday, December 08, 2009 9:40 PM
> > To: Dan Magenheimer; Yu, Ke; Xen-Devel (E-mail)
> > Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
> > hpet/Cstate problem)
> >
> >
> > Dan Magenheimer wrote:
> > > FYI, 20073+20093+20149 boots properly and xend starts
> > > WITH max_cstate=2, but dom0 FAILs to boot unless
> > > max_cstate=2 is added as a Xen boot parameter.
> >
> > Could you attach the failure log ? In addition, does this
> > system have ioapic support ? I think hpet doesn't use MSI, right ?
> > Xiantao
> >
> >
> > > So I still think something changed at 20073 that
> > > causes Merom+RHEL5dom0 to fail to boot due to not
> > > recovering from deep C-state (after dom0 runs
> > > /sbin/hwclock ... Ke Yu knows how to reproduce
> > > the problem).
> > >
> > > Thanks,
> > > Dan
> > >
> > >> -----Original Message-----
> > >> From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx]
> > >> Sent: Tuesday, December 08, 2009 6:44 PM
> > >> To: Dan Magenheimer; Yu, Ke; Xen-Devel (E-mail)
> > >> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
> > >> hpet/Cstate problem)
> > >>
> > >>
> > >> Dan,
> > >> Don't use Cset20073 for testing separately, since it needs
> > >> two minor fixes check-ined by the Cset #20093 and #20149.
> > >> Except this, Keir also has a typo in Cset #20076 fixed by
> > >> Cset #20092. In addition, one serious issue is also
> > >> introduced in #Cset20084 which is fixed in Cset #20140. I
> > >> remembered Pod also has issues which can crash hypervisor
> > >> before Cset #20100. Thus, it is too hard to identify this
> > >> issue through bisect before #Cset20149, since these issues
> > >> are introduced and fixed crossedly.
> > >> Certainly, if you want
> > >> to test Cset #20073, you at least have to apply the
> > >> Cset#20093 and #20149 on top of it. :)
> > >> Xiantao
> > >>
> > >>
> > >> Dan Magenheimer wrote:
> > >>>> But I'll give bisecting a try.
> > >>>
> > >>> Looks like the problem has been around for awhile. It appears
> > >>> the problem starts at c/s 20073. Xiantao cc'ed since 20073 was his
> > >>> patch.
> > >>>
> > >>> 20070 boots OK without max_cstate=2
> > >>>
> > >>> 20072 boots most of the way without max_cstate=2 but crashes
> > >>> before a login prompt (when xend is starting I think)
> > >>>
> > >>> 20073 FAILS to boot without max_cstate=2 but crashes before a
> > >>> login prompt
> > >>>
> > >>> 20082 FAILS to boot without max_cstate=2 but crashes
> > >>> before a login prompt with max_cstate=2
> > >>>
> > >>> 20143 FAILS to boot without max_cstate=2 but boots OK with
> > >>> max_cstate=2
> > >>>
> > >>> Note that I have NOT bisected tools, just the hypervisor
> > >>> so the crashes are likely due to a newer xend failing on
> > >>> an older hypervisor (which is irrelevant to this problem).
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Dan Magenheimer
> > >>>> Sent: Tuesday, December 08, 2009 10:42 AM
> > >>>> To: Yu, Ke; Xen-Devel (E-mail)
> > >>>> Subject: RE: latest xen-unstable fails to boot on Dell D630
> > >>>> (likely hpet/Cstate problem)
> > >>>>
> > >>>>
> > >>>>> case, if convenient, could you help to do some bisect to see
> > >>>>> which cset cause this bug?
> > >>>>
> > >>>> I can do this, but because it is often no longer easy to
> > >>>> bisect Xen because of interdependencies with other
> > >>>> components, I was hoping that Keir or you or someone might
> > >>>> have some idea of what changeset might have caused the regression.
> > >>>> But I'll give bisecting a try.
> > >>>>
> > >>>>> max_cstate=2), when dom0 hangs, is xen still alive, E.g. can
> > >>>>> Xen response to three Ctrl-'A' in serial?
> > >>>>
> > >>>> Unfortunately, I can't seem to get a Xen console working on
> > >>>> the Merom machine, and the problem can't be reproduced on
> > >>>> my other machine where the Xen console is working (because
> > >>>> Conroe doesn't support deep C).
> > >>>>
> > >>>>> -----Original Message-----
> > >>>>> From: Yu, Ke [mailto:ke.yu@xxxxxxxxx]
> > >>>>> Sent: Tuesday, December 08, 2009 12:08 AM
> > >>>>> To: Dan Magenheimer; Xen-Devel (E-mail)
> > >>>>> Subject: RE: latest xen-unstable fails to boot on Dell D630
> > >>>>> (likely hpet/Cstate problem)
> > >>>>>
> > >>>>>
> > >>>>>> -----Original Message-----
> > >>>>>> In this thread, I observed that I was unable to
> > >>>>>> provoke deep C state (C3) on my Dell D630, which has
> > >>>>>> a Intel Merom (dual-core laptop) processor. At that
> > >>>>>> time, when I tried enabling hpetbroadcast, dom0 boot failed.
> > >>>>>>
> > >>>>>> http://lists.xensource.com/archives/html/xen-devel/2009-10/msg01027.html
> > >>>>>>
> > >>>>>> As it turned out, all RHEL5-based (maybe RHEL4- also) dom0
> > >>>>>> default installation run /sbin/hwclock, which IIRC takes
> > >>>>>> the RTC away from Xen and gives it to dom0. Since the
> > >>>>>> Xen hpet emulation does not do RTC emulation, bad things
> > >>>>>> then happen when a deep Cstate is entered (dom0 apparently
> > >>>>>> never wakes up). I think Ke Yu has also reproduced this problem.
> > >>>>>>
> > >>>>>> Sometime in the last few weeks, some patch in xen-unstable
> > >>>>>> apparently changed some defaults and xen-unstable will
> > >>>>>> no longer boot with this processor/dom0, with or without
> > >>>>>> hpetbroadcast on the Xen command line. However, specifying
> > >>>>>> max_cstate=2 on the Xen command line allows a successful
> > >>>>>> dom0 boot, so I suspect the problem is the same (or at
> > >>>>>> least very similar).
> > >>>>>>
> > >>>>>> I did a quick scan for hpet changes and found c/s 20497,
> > >>>>>> but backing it out made no difference.
> > >>>>>>
> > >>>>>> I have a workaround for now, but since it is likely that
> > >>>>>> many customers (including all of Oracle's OVS customers)
> > >>>>>> use a RHEL5-based dom0 boot sequence, and Merom processors
> > >>>>>> work fine otherwise, it would be nice to get this identified
> > >>>>>> and fixed before 4.0.
> > >>>>>
> > >>>>> Let's firstly figure out which component the issue resides.
> > >>>>>
> > >>>>> Firstly, in the default boot (i.e. without specifying
> > >>>>> max_cstate=2), when dom0 hangs, is xen still alive, E.g. can
> > >>>>> Xen response to three Ctrl-'A' in serial?
> > >>>>>
> > >>>>> If only dom0 hangs, it is probably that RTC malfunction make
> > >>>>> incorrect dom0 time and lead dom0 fail to boot. Then RTC
> > >>>>> emulation in hypervisor should fix this issue.
> > >>>>>
> > >>>>> If Xen also hangs, it should be another bug, i.e. hpet
> > >>>>> broadcast does not wake up CPU in deep C states. in this
> > >>>>> case, if convenient, could you help to do some bisect to see
> > >>>>> which cset cause this bug?
> > >>>>>
> > >>>>> Best Regards
> > >>>>> Ke

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
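
The hypervisor bisection results reported in the thread can be summarized in a short sketch. The changeset numbers come from Dan's report; the dictionary layout and the `first_bad` helper are hypothetical names used only for illustration. The sketch assumes a changeset counts as "bad" when dom0 fails to boot without max_cstate=2, and treats the later xend crash at 20072 as unrelated, as the report suggests.

```python
# Outcomes reported in the thread for booting WITHOUT max_cstate=2.
# True  = dom0 got past the hang (20072 reached xend startup before an
#         apparently unrelated crash, so it is assumed good here).
# False = dom0 failed to boot.
RESULTS_WITHOUT_MAX_CSTATE = {
    20070: True,
    20072: True,
    20073: False,
    20082: False,
    20143: False,
}


def first_bad(results):
    """Return the lowest failing changeset, or None if all are good.

    Hypothetical helper: it assumes the regression is monotonic and
    raises ValueError if a good changeset appears after a bad one,
    since the results would then not point at a single culprit.
    """
    bad_seen = None
    for cset in sorted(results):
        if results[cset]:
            if bad_seen is not None:
                raise ValueError(
                    f"good changeset {cset} follows bad changeset {bad_seen}"
                )
        elif bad_seen is None:
            bad_seen = cset
    return bad_seen


if __name__ == "__main__":
    print(first_bad(RESULTS_WITHOUT_MAX_CSTATE))
```

With the reported data this points at 20073, matching the conclusion in the thread. As Xiantao notes, several changesets in this range depend on later fixes (for example 20093 and 20149 on top of 20073), so a plain ordering like this is only a rough guide.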