[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: Memory ordering question in the shutdown deferral code
On Mon, 21 Sep 2020, Julien Grall wrote: > On 21/09/2020 13:55, Durrant, Paul wrote: > > > (+ Xen-devel) > > > > > > Sorry I forgot to CC xen-devel. > > > > > > On 21/09/2020 12:38, Julien Grall wrote: > > > > Hi all, > > > > > > > > I have started to look at the deferral code (see > > > > vcpu_start_shutdown_deferral()) because we need it for LiveUpdate and > > > > Arm will soon use it. > > > > > > > > The current implementation is using an smp_mb() to ensure ordering > > > > between a write then a read. The code looks roughly (I have slightly > > > > adapted it to make my question more obvious): > > > > > > > > domain_shutdown() > > > > d->is_shutting_down = 1; > > > > smp_mb(); > > > > if ( !vcpu0->defer_shutdown ) > > > > { > > > > vcpu_pause_nosync(v); > > > > v->paused_for_shutdown = 1; > > > > } > > > > > > > > vcpu_start_shutdown_deferral() > > > > vcpu0->defer_shutdown = 1; > > > > smp_mb(); > > > > if ( unlikely(d->is_shutting_down) ) > > > > vcpu_check_shutdown(v); > > > > > > > > return vcpu0->defer_shutdown; > > > > > > > > smp_mb() should only guarantee ordering (this may be stronger on some > > > > arch), so I think there is a race between the two functions. > > > > > > > > It would be possible to pause the vCPU in domain_shutdown() because > > > > vcpu0->defer_shutdown wasn't yet seen. > > > > > > > > Equally, vcpu_start_shutdown_deferral() may not see d->is_shutting_down > > > > and therefore Xen may continue to send the I/O. Yet the vCPU will be > > > > paused so the I/O will never complete. > > > > > > > > The barrier enforces global order, right? > > It is not clear to me what you mean by "global ordering". This seems to > suggest a very expensive synchronization barrier between all the processors. > > From an arch-agnostic PoV, smp_mb() will enforce an ordering between > loads/stores but it doesn't guarantee *when* they will be observed. > > > So, if domain_shutdown() pauses the vcpu then is_shutting_down must > > necessarily be visible all cpus. > > That's not the guarantee provided by smp_mb() (see above). I simplified the code further to help us reason about it: thread#1 | thread#2 | 1) WRITE A | WRITE B 2) BARRIER | BARRIER 3) READ B | READ A I think it is (theoretically) possible for thread#1 to be at 1) and about to do 2), while thread#2 goes ahead and does 1) 2) 3). By the time thread#1 does 2), thread#2 has already completed the entire sequence. If thread#2 has already done 2), and thread#1 is about to do 3), then I think we are guaranteed that thread#1 will see the new value of B. Or is this the core of the issue we are discussing? If it works the way I wrote, then it would confirm Paul's view. For your information I went to check what the Linux memory model has to say about this. It says: "smp_mb() guarantees to restore sequential consistency among accesses that use READ_ONCE, WRITE_ONCE(), or stronger. For example, the following Linux-kernel code would forbid non-SC outcomes" It is interesting that they chose the words "restore sequential consistency". It would be difficult to come up with an example that has "sequential consistency" but doesn't work the way described earlier.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |