
RE: [Xen-ia64-devel] Console problem on domU on tip?


  • To: "Xu, Anthony" <anthony.xu@xxxxxxxxx>, "Tian, Kevin" <kevin.tian@xxxxxxxxx>, <xen-ia64-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Magenheimer, Dan (HP Labs Fort Collins)" <dan.magenheimer@xxxxxx>
  • Date: Mon, 19 Dec 2005 07:50:49 -0800
  • Delivery-date: Mon, 19 Dec 2005 15:54:08 +0000
  • List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
  • Thread-index: AcYBELBGMu7ZHSeYRUaEav49mCDvgwAAVFMwAADLrDAAIbQEoAARHDxgAAFwgsAApRrdUAAOEX2AAAAuljA=
  • Thread-topic: [Xen-ia64-devel] Console problem on domU on tip?

 Sorry, fat fingers...

In ia64_pal_call_static:

        1:      mov psr.l = loc3
                mov ar.rsc = loc4

Is there a race condition possible here if the first
instruction turns on psr.ic (but not serialized yet)
and the second causes a mandatory RSE fault?

> -----Original Message-----
> From: Magenheimer, Dan (HP Labs Fort Collins) 
> Sent: Monday, December 19, 2005 8:48 AM
> To: 'Xu, Anthony'; Tian, Kevin; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
> 
> I have been distracted tracking another bug...
> 
> Here's where I got:
> 
> The machine is a new (April 2005) HP rx2620 so it is
> not old firmware.   I can't reproduce it on a machine
> with an ITP (which does have older firmware).
> 
> This PAL call is never used in Linux, though there is a
> routine coded for it.  It is the only
> PAL call coded in Linux that occurs with psr.ic off.
> 
> The crash I am seeing occurs either during the PAL call or
> immediately upon return.
> 
> Is it OK to 
> 
> 
> > -----Original Message-----
> > From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx] 
> > Sent: Monday, December 19, 2005 2:02 AM
> > To: Tian, Kevin; Magenheimer, Dan (HP Labs Fort Collins); 
> > xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
> > 
> > Dan,
> > Have you had time to verify the discussion below?
> > 
> > Thanks
> > -Anthony
> > 
> > >-----Original Message-----
> > >From: Tian, Kevin
> > > >Sent: December 16, 2005 10:16
> > >To: Xu, Anthony; 'Magenheimer, Dan (HP Labs Fort Collins)';
> > >'xen-ia64-devel@xxxxxxxxxxxxxxxxxxx'
> > >Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
> > >
> > >>From: Xu, Anthony
> > > >>Sent: December 16, 2005 9:54
> > >>
> > >>>Also, why panic if it fails?
> > >>>
> > >
> > > >Panic is not required here, and we could just print out a warning message.
> > > >Previously the panic was kept there to help our debugging at an early stage.
> > >
> > >>
> > >>
> > >>>Does the problem happen only on VTI?  Or both VTI and non-VTI on
> > >>>split-cache machines?
> > >>
> > > >>Sometimes, it makes domain0 crash at the very beginning of the domain0 boot
> > > >>process, especially on MP machines.
> > >>
> > >>
> > >>Thanks
> > >>-Anthony
> > >
> > > >One addition: that problem definitely exists on new split-cache processors,
> > > >for dom0/domU. For a VTI domain, we have logic within the device model to
> > > >ensure consistency.
> > >
> > >Thanks,
> > >Kevin
> > >>
> > >>
> > >>>-----Original Message-----
> > >>>From: Magenheimer, Dan (HP Labs Fort Collins)
> > >>[mailto:dan.magenheimer@xxxxxx]
> > > >>>Sent: December 16, 2005 1:39
> > >>>To: Tian, Kevin; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> > >>>Cc: Xu, Anthony
> > >>>Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
> > >>>
> > >>>> >Is this code fragment necessary for VTI to boot domU
> > >>>> >or is it OK to remove?
> > >>>>
> > > >>>>        The comment is inaccurate and it should be domU. That I/D cache
> > > >>>> sync step is mandatory to boot domU on new IA64 processors which have
> > > >>>> split L2 I/D caches. Without such an I/D cache sync, the control
> > > >>>> panel loads domU's kernel image, which only affects the D-side cache.
> > > >>>> If there are stale entries in the I-side cache within the same range
> > > >>>> of the dom0 image, people will see the machine going weird.
> > >>>
> > > >>>I don't understand... how can there be stale entries in the I-cache?
> > > >>>The instructions have just been written to memory (through D-cache)
> > > >>>and no instructions in this domain have yet been executed.
> > > >>>I do see that the D-cache needs to be flushed so that memory is
> > > >>>coherent, but are there better ways to do that without a PAL call?
> > >>>
> > > >>>>        Normally I/D cache sync shouldn't cause any problem. Possibly
> > > >>>> there's some problem with the PAL calling code, like an incorrect ITLB
> > > >>>> mapping for PAL or a similar issue...
> > >>>
> > > >>>Although the ia64_pal_cache_flush routine is defined in Linux's pal.h,
> > > >>>it doesn't appear to be used anywhere in Linux so there is no use
> > > >>>model to copy.  I suspect there is some use model for the call that
> > > >>>we don't understand, for example maybe it should only be called with
> > > >>>physical &progress?  It definitely fails every time on one of
> > > >>>my (newer) machines and disabling the PAL call makes the problem
> > > >>>go away.
> > >>>
> > > >>>> Though it's intermittent, please keep this code
> > > >>>> there for correctness.
> > >>>
> > >>>Since the call is definitely failing under some circumstances
> > >>>that we don't understand, I'm inclined to at least put the code
> > >>>in an #ifdef CONFIG_SPLIT_CACHE
> > >>>
> > >>>Does the problem happen only on VTI?  Or both VTI and non-VTI on
> > >>>split-cache machines?
> > >>>
> > >>>Thanks,
> > >>>Dan
> > >>>
> > >>>P.S. I tried Anthony's patch (which moves the PAL call after
> > >>>new_thread()) but it still crashes.
> > 
> 
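Two suggestions from the quoted thread, guarding the I/D cache sync with #ifdef CONFIG_SPLIT_CACHE and printing a warning instead of panicking when it fails, could be sketched roughly as below. This is only an illustration under stated assumptions: sync_icache_after_load, flush_fn and the two fake_flush_* helpers are hypothetical names invented here, and the function pointer merely stands in for the real PAL cache-flush wrapper, whose actual signature is not reproduced.

```c
#include <stdio.h>

/* Hypothetical stand-in for the real PAL cache-flush wrapper: returns 0 on
 * success and a nonzero status on failure. Not the Xen or Linux API. */
typedef long (*flush_fn)(void);

/* Assumed build-time switch, as proposed in the thread. Defined here only
 * so that the sketch compiles with the flush path enabled. */
#define CONFIG_SPLIT_CACHE

/* Sync the I/D caches after a domain image has been loaded.
 * Returns 0 if the flush succeeded or was compiled out, and -1 if the
 * flush reported a failure (a warning is printed; nothing panics). */
static int sync_icache_after_load(flush_fn flush)
{
#ifdef CONFIG_SPLIT_CACHE
    long status = flush();
    if (status != 0) {
        fprintf(stderr, "warning: I/D cache sync failed, status %ld\n",
                status);
        return -1;
    }
#else
    (void)flush;    /* unified-cache build: nothing to do */
#endif
    return 0;
}

/* Hypothetical fakes used to exercise both paths. */
static long fake_flush_ok(void)   { return 0; }
static long fake_flush_fail(void) { return -1; }
```

Calling sync_icache_after_load(fake_flush_fail) prints the warning and returns -1 so the caller can decide how to proceed; it does not address why the PAL call itself crashes on the affected machine.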
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel

 

