[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths



On Thu, 2017-11-09 at 03:17 -0700, Jan Beulich wrote:
> > > > On 09.11.17 at 10:54, <raistlin@xxxxxxxx> wrote:
> > 
> > On Tue, 2017-11-07 at 14:24 +0000, Igor Druzhinin wrote:
> > > Perhaps I should improve my diagram:
> > > 
> > > pCPU1: vCPUx of domain X -> migrate to pCPU2 -> switch to idle
> > > context
> > > -> RCU callbacks -> vcpu_destroy(vCPUy of domain Y) ->
> > > vmx_vcpu_disable_pml() -> vmx_vmcs_clear() (VMCS is trashed at this
> > > point on pCPU1)
> > > 
> > > pCPU2: context switch into vCPUx -> vCPUx.is_running = 1 -> TLB flush
> > > from context switch to clean TLB on pCPU1
> > > 
> > 
> > Sorry, there must be something I'm missing (or misunderstanding).
> > 
> > What is this code that checks is_running and triggers the TLB flush?
> 
> I don't see where Igor said is_running is being checked around a
> TLB flush. The TLB flush itself is what happens first thing in
> context_switch() (and it's really using the TLB flush interface to
> mainly effect the state flush, with the TLB flush being an implied
> side effect; I've already got a series of further patches to make
> this less implicit).
> 
> > But, more important, how come you are context switching to something
> > that has is_running == 1 ? That should not be possible.
> 
> That's not what Igor's diagram says - it's indicating the fact that
> is_running is being set to 1 in the process of context switching
> into vCPUx.

Jan, Dario,

Igor was referring to the following situation:


pCPU1                                   pCPU2
=====                                   =====
current == vCPU1
context_switch(next == idle)
!! __context_switch() is skipped
vcpu_migrate(vCPU1)
RCU callbacks
vmx_vcpu_destroy()
vmx_vcpu_disable_pml()
current_vmcs = 0

                                        schedule(next == vCPU1)
                                        vCPU1->is_running = 1;
                                        context_switch(next == vCPU1)
                                        flush_tlb_mask(&dirty_mask);
                                    
                                <--- IPI

__sync_local_execstate()
__context_switch(prev == vCPU1)
vmx_ctxt_switch_from(vCPU1)
vCPU1->is_running == 1
!! vmx_vmcs_reload() is skipped


I hope that this better illustrates the root cause.

-- 
Thanks,
Sergey
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.