[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths
>>> On 09.11.17 at 15:16, <raistlin@xxxxxxxx> wrote: > Ah, yes, my bad! What if I take vcpu_migrate() out of the above exec- > trace (which is what I wanted to do in my email already)? > > pCPU1 > ===== > current == vCPU1 > context_switch(next == idle) > !! __context_switch() is skipped > anything_that_uses_or_touches_context() > > Point being, is the underlying and general "issue" here, really bound > to migrations, or is it something intrinsic of lazy context switch? I'm > saying it's the latter. The general issue doesn't require vcpu_migrate(), I agree. The specific VMX issue here does, though. Thing is - I'm not convinced there's a general issue here in the first place: Asynchronous code isn't supposed to modify state behind the back of synchronous code. It just so happens that VMX code is structured to violate that assumption when PML is in use. > That being said, sure it makes sense to assume that, if we migrated the > vCPU from pCPU1 to pCPU2, it's highly unlikely that it will resume the > execution on pCPU1, and hence there is no point in leaving its context > there, and we could very well sync. It's a reasonable best-effort > measure, but can we rely on it for correctness? I don't think we can. We can't right now, but code (from an abstract pov at least) could be structured so we could rely on it. > And generalizing the idea enough that we could then rely on it for > correctness, tends to be close enough to not doing lazy context switch > at all, I think. I don't think so, no - we could still leave state in hardware in anticipation that no other non-idle vCPU would be scheduled on the local CPU. That's something that ought to help in particular pinned vCPU-s. >> The problem is with tasklet / softirq >> (and hence also RCU) work. >> > Yes. > >> Tasklets already take care of this by >> calling sync_local_execstate() before calling the handler. But >> for softirqs this isn't really an option; I'm surprised to see that >> tasklet code does this independently of what kind of tasklet that >> is. >> > Good point. Weird indeed. I've added an item to my todo list to see whether I can figure whether this is an okay thing to do. >> Softirq tasklets aren't used very often, so I wonder if we have >> a latent bug here. Otoh, if this was actually fine, adding a similar >> call to rcu_do_batch() (or its caller) would be an option, I think. >> > We can have a look. > > What about the effect on performance, though? > > I mean, assuming that lazy context switch does a good job, with respect > to that, by avoiding synching in enough case where it is actually not > necessary, how do things change if we start to sync at any softirq, > even when the handler would have not required that? I wasn't suggesting this for softirqs, but only (if at all) for RCU. But yes, there would a performance hit from this; not sure how small or large. But as you can see, for tasklets the hit is taken unconditionally. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |