Re: [Xen-devel] [PATCH 6/7] xen/arm: flush D-cache and I-cache when appropriate
At 16:53 +0100 on 26 Oct (1351270394), Stefano Stabellini wrote:
> On Fri, 26 Oct 2012, Tim Deegan wrote:
> > At 18:35 +0100 on 24 Oct (1351103740), Stefano Stabellini wrote:
> > > > I don't think this is necessary - why not just pass va directly to the
> > > > inline asm?  We don't care what register it's in (and if we did I'm not
> > > > convinced this would guarantee it was r0).
> > > >
> > > > > +    asm volatile (
> > > > > +        "dsb;"
> > > > > +        STORE_CP32(0, DCCMVAC)
> > > > > +        "isb;"
> > > > > +        : : "r" (r0) : "memory");
> > > >
> > > > Does this need a 'memory' clobber?  Can we get away with just saying it
> > > > consumes *va as an input?  All we need to be sure of is that the
> > > > particular thing we're flushing has been written out; no need to stop
> > > > any other optimizations.
> > >
> > > you are right on both points
> > >
> > > > I guess it might need to be re-cast as a macro so the compiler knows how
> > > > big *va is?
> > >
> > > I don't think it is necessary, after all the size of a register has to
> > > be the same of a virtual address
> >
> > But it's the size of the thing in memory that's being flushed that
> > matters, not the size of the pointer to it!
> >
> > E.g. after a PTE write we
> > need a 64-bit memory input operand to stop the compiler from hoisting
> > any part of the PTE write past the cache flush.  (well OK we explicitly use
> > a 64-bit atomic write for PTE writes, but YKWIM).
>
> The implementation of write_pte is entirely in assembly so I doubt that
> the compiler is going to reorder it.

Augh!  Yes, like I said, PTE writes are fine.

> However I see your point in case of flush_xen_dcache_va.
> Wouldn't a barrier() at the beginning of the function be enough?

More than enough.  That would be exactly equivalent to the "memory"
clobber above.  What I'm arguing for is a _less_ restrictive constraint,
that only restricts delaying writes, and only affects the thing actually
being flushed (whatever size that is).
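
To make the difference between the two constraints concrete, here is a
minimal sketch, not the code from the patch.  The names CLEAN_LINE_ASM,
clean_with_clobber and clean_with_input are made up for illustration,
the mcr encoding for DCCMVAC is written out by hand rather than through
Xen's STORE_CP32 macro, and the asm body is an empty stand-in when not
building for 32-bit ARM (on ARM it assumes an ARMv7 target so that dsb
and isb assemble).

```c
#include <stdint.h>

#if defined(__arm__)
/* DCCMVAC: clean one D-cache line by virtual address. */
#define CLEAN_LINE_ASM "dsb; mcr p15, 0, %0, c7, c10, 1; isb;"
#else
/* Assumption: off ARM we only want the constraints to compile. */
#define CLEAN_LINE_ASM ""
#endif

/*
 * Variant A: "memory" clobber (same effect on the compiler as barrier()).
 * The compiler must assume the asm may read or write any memory, so it
 * cannot keep any value cached in a register across this point.
 */
static inline void clean_with_clobber(const uint64_t *va)
{
    __asm__ __volatile__ (CLEAN_LINE_ASM : : "r" (va) : "memory");
}

/*
 * Variant B: memory input operand.
 * The compiler only learns that the asm reads the 64-bit object *va, so
 * pending stores to that object must be emitted first; accesses to
 * unrelated memory are not constrained.
 */
static inline void clean_with_input(const uint64_t *va)
{
    __asm__ __volatile__ (CLEAN_LINE_ASM : : "r" (va), "m" (*va));
}
```

In variant B the size of the "m" operand is the size of the pointed-to
type, which is why the type of the object matters and not the size of
the pointer.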

For larger regions we should have a function with a single barrier at
the top and then a loop of DCCMVAC writes.  For single objects smaller
than a cacheline we need to pass the object itself as a memory input
operand.  Probably we should also have a compile-time check that the
object is smaller than the smallest supported cache-line (i.e. one
DCCMVAC is enough).

Tim.
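
The two cases described above could be sketched roughly as follows.
This is an illustration under assumptions, not the eventual Xen code:
MIN_CACHELINE_BYTES is set to a made-up value of 32, the names
flush_dcache_range and flush_dcache_obj are hypothetical, and the asm
strings are empty stand-ins when not building for 32-bit ARM.  The
range function returns the number of lines it touched purely so the
loop can be checked.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: smallest supported D-cache line size, in bytes. */
#define MIN_CACHELINE_BYTES 32

#if defined(__arm__)
#define DSB     "dsb;"
#define ISB     "isb;"
#define DCCMVAC "mcr p15, 0, %0, c7, c10, 1;"  /* clean line by VA */
#else
/* Off ARM: no-op stand-ins so that the sketch still compiles. */
#define DSB     ""
#define ISB     ""
#define DCCMVAC ""
#endif

/* Larger regions: one full barrier at the top, then one DCCMVAC per line. */
static inline size_t flush_dcache_range(const void *p, size_t size)
{
    uintptr_t va  = (uintptr_t)p & ~(uintptr_t)(MIN_CACHELINE_BYTES - 1);
    uintptr_t end = (uintptr_t)p + size;
    size_t lines = 0;

    /* Compiler barrier plus dsb: all earlier writes are issued first. */
    __asm__ __volatile__ (DSB : : : "memory");
    for ( ; va < end; va += MIN_CACHELINE_BYTES, lines++ )
        __asm__ __volatile__ (DCCMVAC : : "r" (va));
    /* Wait for the cleans to finish before continuing. */
    __asm__ __volatile__ (DSB ISB);

    return lines;
}

/*
 * Single small object: pass the object itself as an "m" input so only
 * writes to it are ordered before the flush.  Being a macro, it sees the
 * real type of obj and can check its size at compile time.
 */
#define flush_dcache_obj(obj) do {                                    \
    _Static_assert(sizeof(obj) <= MIN_CACHELINE_BYTES,                \
                   "object does not fit in one cache line");          \
    __asm__ __volatile__ (DSB DCCMVAC ISB                             \
                          : : "r" (&(obj)), "m" (obj));               \
} while (0)
```

Note that a size check alone does not cover an object that is small
but not naturally aligned, since it could still straddle two lines; a
real implementation would probably want to check alignment as well, or
fall back to the range function.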