[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2/3] Handle GNTST_eagain in kernel drivers

To: Olaf Hering <olaf@xxxxxxxxx>
From: Konrad Rzeszutek Wilk <konrad@xxxxxxxxxx>
Date: Tue, 3 Jan 2012 14:19:53 -0400
Cc: andres@xxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, Adin Scannell <adin@xxxxxxxxxxx>, JBeulich@xxxxxxxx, adin@xxxxxxxxxxxxxxx
Delivery-date: Tue, 03 Jan 2012 18:20:25 +0000
List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Mon, Jan 02, 2012 at 05:06:12PM +0100, Olaf Hering wrote:
> On Sat, Dec 17, Konrad Rzeszutek Wilk wrote:
> 
> > > +#define gnttab_check_GNTST_eagain_while(__HCop, __HCarg_p)               
> > >                         \
> > 
> > So why does this have to be a macro? What is the advantage of that
> > versus making this a function?
> 
> I dont remember why I turned this into a macro instead of a function.
> 
> > > +do {                                                                     
> > >                                                                         \
> > > + u8 __hc_delay = 1;                                                      
> > >                                                         \
> > > + int __ret;                                                              
> > >                                                                 \
> > > + while (unlikely((__HCarg_p)->status == GNTST_eagain && __hc_delay)) {   
> > > \
> > > +         msleep(__hc_delay++);                                           
> > >                                                 \
> > 
> > Ugh. Not sure what happend to this, but there are tons of '\' at the
> > end.
> 
> A multiline macro needs backslashes at the end.

Yes. I should have been more specific. The '\' are out of aligment.
> 
> > So why msleep? Why not go for a proper yield? Call the safe_halt()
> > function?
> 
> It needs some interuptible sleep, whatever is best in this context.
> 
> > > +         __ret = HYPERVISOR_grant_table_op(__HCop, (__HCarg_p), 1);      
> > >                 \
> > > +         BUG_ON(__ret);                                                  
> > >                                                         \
> > > + }                                                                       
> > >                                                                         \
> > > + if (__hc_delay == 0) {                                                  
> > >                                                 \
> > 
> > So this would happen if we rolled over __hc_delay, so did this more than
> > 255 times? Presumarily this can happen if the swapper in dom0 crashes..
> 
> Or if something in the paging paths goes wrong.
> 
> > I would recommend you use 'WARN' here and include tons of details.
> > This is a pretty serious issues, is it not?
> 
> Either the host is really busy and cant page-in quick enough even after
> so-many seconds. Or something in the pager/hypervisor does not work
> right. In either case, a backtrace wont help much as it does only cause
> noise. The printk below prints the function name (I think thats the
> reason why it is a macro) to give some hint. 

OK, we can do this differently. Make a function that does the majority
of this, and one of the arguments is a 'const char *name' and use a
macro that does:

#define gnttab_check_GNTST_eagain_while(__HCop, __HCarg_p)
real_function(__func__, __Hcop, __HCarg_p,...)

or such.

If this problem does occur (the swapper died in dom0) should the
printk at least use printk_ratelimited so that we don't cause too much
noise?
> 
> > > +         printk(KERN_ERR "%s: gnt busy\n", __func__,);                   
> > >                         \
> > > +         (__HCarg_p)->status = GNTST_bad_page;                           
> > >                                 \
> > > + }                                                                       
> > >                                                                         \
> > > + if ((__HCarg_p)->status != GNTST_okay)                                  
> > >                                 \
> > > +         printk(KERN_ERR "%s: gnt status %x\n",                          
> > >                                 \
> > > +                 __func__, (__HCarg_p)->status);                         
> > >                                         \
> > 
> > Use GNTTABOP_error_msgs. Also should we continue? What is the
> > impact if we do continue? The times this is hit is if the status
> > is not GNTS_okay and if it is not GNTS_eagain - so what are the
> > situations in which this happens and what can the domain do
> > to recover? Should there be some helpfull message instead of
> > just "gnt status: X" ?
> 
> The caller has to deal with the various !GNTST_okay states anyway, this
> patch wont change that fact.

Ok, so then we don't really need the printk right? As the caller
would presumarily do the right thing and also print the error?

> 
> Olaf
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

Follow-Ups:
- Re: [Xen-devel] [PATCH 2/3] Handle GNTST_eagain in kernel drivers
  - From: Olaf Hering

References:
- Re: [Xen-devel] [PATCH 2/3] Handle GNTST_eagain in kernel drivers
  - From: Olaf Hering

Prev by Date: Re: [Xen-devel] [PATCH v2] libxl: add support for yajl 2.x
Next by Date: Re: [Xen-devel] [PATCH 2/3] Handle GNTST_eagain in kernel drivers
Previous by thread: Re: [Xen-devel] [PATCH 2/3] Handle GNTST_eagain in kernel drivers
Next by thread: Re: [Xen-devel] [PATCH 2/3] Handle GNTST_eagain in kernel drivers
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.