[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths
On Thu, 2017-11-09 at 06:08 -0700, Jan Beulich wrote: > > > > On 09.11.17 at 12:01, <raistlin@xxxxxxxx> wrote: > > > > pCPU1 > > ===== > > current == vCPU1 > > context_switch(next == idle) > > !! __context_switch() is skipped > > vcpu_migrate(vCPU1) > > anything_that_uses_or_touches_context() > > > > So, it must be anything_that_uses_or_touches_context() --knowing it > > will be reading or touching the pCPU context-- that syncs the > > state, > > before using or touching it (which is, e.g., what Jan's patch > > does). > > Well, generally after the vcpu_migrate(vCPU1) above we expect > nothing to happen at all on the pCPU, until another (non-idle) > vCPU gets scheduled onto it. > Ah, yes, my bad! What if I take vcpu_migrate() out of the above exec- trace (which is what I wanted to do in my email already)? pCPU1 ===== current == vCPU1 context_switch(next == idle) !! __context_switch() is skipped anything_that_uses_or_touches_context() Point being, is the underlying and general "issue" here, really bound to migrations, or is it something intrinsic of lazy context switch? I'm saying it's the latter. That being said, sure it makes sense to assume that, if we migrated the vCPU from pCPU1 to pCPU2, it's highly unlikely that it will resume the execution on pCPU1, and hence there is no point in leaving its context there, and we could very well sync. It's a reasonable best-effort measure, but can we rely on it for correctness? I don't think we can. And generalizing the idea enough that we could then rely on it for correctness, tends to be close enough to not doing lazy context switch at all, I think. > The problem is with tasklet / softirq > (and hence also RCU) work. > Yes. > Tasklets already take care of this by > calling sync_local_execstate() before calling the handler. But > for softirqs this isn't really an option; I'm surprised to see that > tasklet code does this independently of what kind of tasklet that > is. > Good point. Weird indeed. > Softirq tasklets aren't used very often, so I wonder if we have > a latent bug here. Otoh, if this was actually fine, adding a similar > call to rcu_do_batch() (or its caller) would be an option, I think. > We can have a look. What about the effect on performance, though? I mean, assuming that lazy context switch does a good job, with respect to that, by avoiding synching in enough case where it is actually not necessary, how do things change if we start to sync at any softirq, even when the handler would have not required that? Regards, Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |