[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/vpmu: fix race-condition in vpmu_load
On 22.09.2022 15:35, Tamas K Lengyel wrote: > On Tue, Sep 20, 2022 at 2:27 PM Tamas K Lengyel <tamas.k.lengyel@xxxxxxxxx> > wrote: > >> >> >> On Tue, Sep 20, 2022 at 2:12 PM Boris Ostrovsky < >> boris.ostrovsky@xxxxxxxxxx> wrote: >> >>> >>> On 9/20/22 10:54 AM, Jan Beulich wrote: >>>> On 20.09.2022 16:26, Boris Ostrovsky wrote: >>>>> On 9/20/22 4:01 AM, Jan Beulich wrote: >>>>>> On 20.09.2022 00:42, Boris Ostrovsky wrote: >>>>>>> It is saving vpmu data from current pcpu's MSRs for a remote vcpu so >>> @v >>>>>>> in vmx_find_msr() is not @current: >>>>>>> >>>>>>> vpmu_load() >>>>>>> ... >>>>>>> prev = per_cpu(last_vcpu, pcpu); >>>>>>> vpmu_save_force(prev) >>>>>>> core2_vpmu_save() >>>>>>> __core2_vpmu_save() >>>>>>> vmx_read_guest_msr() >>>>>>> vmx_find_msr() >>>>>>> >>>>>>> >>>>>>> The call to vmx_find_msr() was introduced by 755087eb9b10c. I wonder >>> though whether >>>>>>> this call is needed when code path above is executed (i.e. when we >>> are saving >>>>>>> remove vcpu) >>>>>> How could it not be needed? We need to obtain the guest value. The >>>>>> thing I don't understand is why this forced saving is necessary, >>>>>> when context_switch() unconditionally calls vpmu_switch_from(). >>>>> >>>>> IIRC the logic is: >>>>> >>>>> 1. vcpuA runs on pcpu0 >>>>> 2. vcpuA is de-scheduled and is selected to run on pcpu1. It has not >>> yet called vpmu_load() from pcpu1 >>>> The calling of vpmu_load() shouldn't matter here. What does matter is >>>> that vpmu_save() was necessarily called already. >>> >>> >>> I thought we don't always save MSRs on context switch because we optimize >>> for case when the same vcpu gets scheduled again. But I am not sure I see >>> this now that I am looking at the code. >>> >> >> I see context_switch calling vpmu_save_from before __context_switch, but >> if that did actually save the vPMU state it would have reset >> VPMU_CONTEXT_LOADED, so by the time vpmu_load calls vpmu_save_force it >> would have just bailed before hitting the ASSERT failure in vmx_find_msrs. >> The context must still be loaded at that point (I'm trying to verify right >> now by adding some printks). >> > > OK, Boris was correct above, MSRs are not saved on context switch > automatically because of that optimization. VPMU_CONTEXT_SAVE isn't set, so > the only thing vpmu_switch_from does is set global control to 0 in case it > was a PV vCPU (see core2_vpmu_save checks for both VPMU_CONTEXT_SAVE and > VPMU_CONTEXT_LOADED) and vpmu_switch_from doesn't set VPMU_CONTEXT_SAVE. So > for HVM vCPUs it does nothing, that's why we still see the context still > loaded when we get to vpmu_load. > > It would be a simple enough change to make vpmu_switch_from always save the > context and then we could get rid of vpmu_load trying to do it later and > getting into that ASSERT failure. Thoughts? I'd much prefer that over e.g. the vCPU-pausing approach. I also think vpmu_save() is misnamed if it might not really save anything. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |