[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] Regression with vcpu runstate info and XEN_RUNSTATE_UPDATE
On 23.09.2019 11:56, Jürgen Groß wrote: > On 23.09.19 11:51, Jan Beulich wrote: >> On 23.09.2019 11:42, Jürgen Groß wrote: >>> On 16.09.19 17:44, Jan Beulich wrote: >>>> On 16.09.2019 16:50, Andrew Cooper wrote: >>>>> After a complicated investigation, it turns out that c/s 2529c850ea48 >>>>> broke xc_vcpu_getinfo(). >>>>> >>>>> The bug looks as if it is in vcpu_runstate_get(), which doesn't account >>>>> for XEN_RUNSTATE_UPDATE and calculating a wildly inappropriate delta. >>>>> Ultimately, the result of XEN_DOMCTL_getvcpuinfo ends up very >>>>> occasionally with op->u.getvcpuinfo.cpu_time being wrong by 1 << 63. >>>>> >>>>> Given some of the callers of vcpu_runstate_get(), I don't think it is >>>>> reasonable to pause the VCPU while reading the runstate info. However, >>>>> it is also unclear whether waiting for XEN_RUNSTATE_UPDATE to drop in >>>>> vcpu_runstate_get() is safe either. >>>> >>>> First and foremost I'm wondering whether simply masking off >>>> XEN_RUNSTATE_UPDATE in vcpu_runstate_get() wouldn't be an >>>> option. The assumption of the feature as a whole is for the >>>> high bit to never be set in an actual time value, after all. >>>> >>>> The other option I'd see is for vcpu_runstate_get() to gain >>>> a boolean return type by which it would indicate to >>>> interested callers whether the latching of the values >>>> happened while an update was in progress elsewhere. Callers >>>> needing to consume the potentially incorrect result could >>>> then choose to wait or schedule a hypercall continuation. >>>> >>>> The 3rd option (less desirable imo not the least because it >>>> would require touching all callers) would be for the function >>>> to gain a parameter telling it whether to spin until >>>> XEN_RUNSTATE_UPDATE is observed clear. >>> >>> And the 4th option would be to let vcpu_runstate_get() operate on a >>> local runstate copy in order to avoid setting XEN_RUNSTATE_UPDATE in >>> the "official" runstate info of the vcpu. >> >> But it already does - first thing it does is a memcpy() from the >> "official" instance to a caller provided buffer. (What is >> confusing, at least to me, is that the lock gets dropped last, >> when everything after the memcpy() already acts on the copy only.) > > Oh, that was my fault here: I meant to let update_runstate_area() > operate on a local copy, of course. Ah, I see. It was my understanding that by setting the flag in the "official" instance, internal consumers could (if they cared) also avoid acting on inconsistent / in-flight data. Or was the current solution chosen exclusively because it was easiest to set the flag in the master instance, and then copy from there? Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |