[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] mm: fix LLVM code-generation issue
On Thu, Nov 22, 2018 at 05:46:19PM +0000, Julien Grall wrote: > > > On 11/22/18 5:04 PM, George Dunlap wrote: > > On 11/22/18 4:45 PM, Julien Grall wrote: > > > Hi Roger, > > > > > > On 11/22/18 4:39 PM, Roger Pau Monné wrote: > > > > On Thu, Nov 22, 2018 at 04:22:34PM +0000, Andrew Cooper wrote: > > > > > On 22/11/2018 16:07, Roger Pau Monné wrote: > > > > > > On Thu, Nov 22, 2018 at 03:23:41PM +0000, Andrew Cooper wrote: > > > > > > > On 22/11/2018 15:20, Roger Pau Monné wrote: > > > > > > > > On Thu, Nov 22, 2018 at 02:03:55PM +0000, Julien Grall wrote: > > > > > > > > > Hi Jan, > > > > > > > > > > > > > > > > > > On 11/22/18 1:36 PM, Jan Beulich wrote: > > > > > > > > > > > > > On 22.11.18 at 14:31, <andrew.cooper3@xxxxxxxxxx> > > > > > > > > > > > > > wrote: > > > > > > > > > > > I think Julien's point is that without explicitly > > > > > > > > > > > barriers, CPU0's > > > > > > > > > > > update to system_state may not be visible on CPU1, even > > > > > > > > > > > though the > > > > > > > > > > > mappings have been shot down. > > > > > > > > > > > > > > > > > > > > > > Therefore, from the processors point of view, it did > > > > > > > > > > > everything > > > > > > > > > > > correctly, and hit a real pagefault. > > > > > > > > > > Boot time updates of system_state should be of no interest > > > > > > > > > > here, > > > > > > > > > > as at that time the APs are all idling. > > > > > > > > > That's probably true today. But this code looks rather > > > > > > > > > fragile as > > > > > > > > > you don't > > > > > > > > > know how this is going to be used in the future. > > > > > > > > > > > > > > > > > > If you decide to gate init code with system_state, then you > > > > > > > > > need > > > > > > > > > a barrier > > > > > > > > > to ensure the code is future proof. > > > > > > > > Wouldn't it be enough to declare system_state as volatile? > > > > > > > No. volatility (or lack thereof) is a compiler level construct. > > > > > > > > > > > > > > ARM has weaker cache coherency than x86, so a write which has > > > > > > > completed > > > > > > > on one CPU0 in the past may legitimately not be visible on CPU1 > > > > > > > yet. > > > > > > > > > > > > > > If you need guarantees about the visibility of updated, you must > > > > > > > use > > > > > > > appropriate barriers. > > > > > > Right. There's some differences between ARM and x86, ARM sets > > > > > > SYS_STATE_active and continues to make use of init functions. In any > > > > > > case I have the following diff: > > > > > > > > > > > > diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c > > > > > > index e83221ab79..cf50d05620 100644 > > > > > > --- a/xen/arch/arm/setup.c > > > > > > +++ b/xen/arch/arm/setup.c > > > > > > @@ -966,6 +966,7 @@ void __init start_xen(unsigned long > > > > > > boot_phys_offset, > > > > > > serial_endboot(); > > > > > > system_state = SYS_STATE_active; > > > > > > + smp_wmb(); > > > > > > create_domUs(); > > > > > > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c > > > > > > index 9cbff22fb3..41044c0b6f 100644 > > > > > > --- a/xen/arch/x86/setup.c > > > > > > +++ b/xen/arch/x86/setup.c > > > > > > @@ -593,6 +593,7 @@ static void noinline init_done(void) > > > > > > unsigned long start, end; > > > > > > system_state = SYS_STATE_active; > > > > > > + smp_wmb(); > > > > > > domain_unpause_by_systemcontroller(dom0); > > > > > > > > > > I'm afraid that that won't do anything to help at all. > > > > > > > > > > smp_{wmb,rmb}() must be in matched pairs, and mb() must be matched > > > > > with > > > > > itself. > > > > > > > > Then I'm not sure about whether our previous plan still stands, are we > > > > OK with using ACCESS_ONCE here and forgetting about the memory > > > > barriers given the current usage? > > > > > > The problem is not the current usage but how it could be used. Debugging > > > memory ordering is quite a pain so I would prefer this to be fixed > > > correctly. > > > > But in this case it wouldn't be a pain, because the only possible > > failure mode is if the processor faults trying to read opt_bootscrub, right? > > Possibly. But I don't see any reason to defer the fix until someone comes up > with unreliable crash. If we have to go down that route, shouldn't we also protect system_state with a lock so that it cannot be modified by a CPU while it's being read from another? Image an AP reads system_state < SYS_STATE_active, then BSP sets system_state = SYS_STATE_active and clears the init mappings, then when the AP tries to read an init variable it would get a page fault. Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |