
Re: [Xen-devel] [PATCH] ARM: cache coherence problem in guestcopy.c



On Thu, 2013-06-20 at 08:34 +0000, Jaeyong Yoo wrote:
> > On Tue, 2013-06-18 at 13:18 +0100, Ian Campbell wrote:
> > > On Tue, 2013-06-18 at 12:05 +0000, Jaeyong Yoo wrote:
> > > > > We were also speculating that we probably want some DMBs in
> > > > > context_switch_{from,to} as well as at return_to_guest.
> > > > 
> > > > Actually, I just learned about the ReOrder Buffer, and it looks that way.
> > 
> > Does this patch help with the issue you are seeing?
> 
> I tried the combinations and it does not work.

Thanks, this is useful to know. It may be that our analysis is simply
flawed and something else is at work in both cases.

>  I think my problem may stem
> from a different cause. Since this problem happens while
> we try to migrate domU, something really weird may be happening.
> 
> Actually, one of my colleagues told me that the problem I'm having
> magically disappeared when he tried copying more vcpu registers and
> adding lots of printks in various places.

printks change all sorts of things: timing, barrier usage (they use
barriers themselves, e.g. in the serial driver), and the amount of code
between memory accesses. The extra code widens the gap in the
instruction stream between accesses which might require a barrier, so
the unwanted ordering may simply no longer happen.

>  At the moment, I'm not sure whether this is a
> general problem in Xen or a problem caused by poor migration, but I'm
> inclined to believe it is caused by the poor migration. I'm still
> investigating this issue and will let you know if anything turns up.

Thanks.

> 
> > I think in theory it should *BUT* (and it's a big But)...
> > 
> > ... it doesn't work for me on an issue I've been seeing booting dom0 on
> > the Foundation model. However doing the dmb in map_domain_page() *twice*
> 
> BTW, did you put barriers in map_domain_page?
> The patch below looks like it touches unmap_domain_page.

The first hunk is map_domain_page.

> > does work for me. In fact doing a dmb+nop works too. This is
> > inexplicable to me. It may be a model bug or it may (more likely) be
> > indicative of some other underlying issue. I also tried dsb() instead of
> > dmb() and that doesn't make a difference.
> > 
> > Anyway, I'm interested in its behaviour on real hardware.
> > 
> > These are doing full system barriers. I think in reality only a "dmb
> > nsh" should be needed for this use case, since we only care about the
> > local processor. I didn't want to go there when the supposedly more
> > obvious case wasn't doing as I expected!
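
For readers following the barrier discussion, here is a minimal sketch of
the two flavours mentioned above: a full-system "dmb" (equivalent to
"dmb sy") and the non-shareable "dmb nsh". It assumes GCC-style inline
assembly on ARMv7 or later. The macro and function names are hypothetical
and are not taken from the Xen tree or from the patch in this thread. On
other architectures the macros fall back to a generic full fence so that
the sketch still builds.

```c
/*
 * Hypothetical names for illustration only; not copied from Xen.
 * On ARMv7+/AArch64 these emit real DMB instructions. Elsewhere they
 * fall back to the compiler's generic full memory fence.
 */
#if (defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7) || \
    defined(__aarch64__)
#define barrier_full_system() __asm__ __volatile__("dmb sy"  ::: "memory")
#define barrier_local_only()  __asm__ __volatile__("dmb nsh" ::: "memory")
#else
#define barrier_full_system() __sync_synchronize()
#define barrier_local_only()  __sync_synchronize()
#endif

static volatile int first_word;   /* stands in for the earlier access */
static volatile int second_word;  /* stands in for the later access */

/* Order two stores with the full-system barrier. */
static int store_pair_full(int a, int b)
{
    first_word = a;
    barrier_full_system();  /* first store ordered before the second */
    second_word = b;
    return first_word + second_word;
}

/* Same ordering, but only with respect to the local processor. */
static int store_pair_local(int a, int b)
{
    first_word = a;
    barrier_local_only();   /* ordering only within the non-shareable domain */
    second_word = b;
    return first_word + second_word;
}
```

Both functions produce the same result on a single processor. The only
difference is how far the ordering guarantee extends, which is why the
narrower variant may be enough when only the local processor matters.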


