[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] Regression with vcpu runstate info and XEN_RUNSTATE_UPDATE
On 23.09.19 12:12, Jan Beulich wrote: On 23.09.2019 11:56, Jürgen Groß wrote:On 23.09.19 11:51, Jan Beulich wrote:On 23.09.2019 11:42, Jürgen Groß wrote:On 16.09.19 17:44, Jan Beulich wrote:On 16.09.2019 16:50, Andrew Cooper wrote:After a complicated investigation, it turns out that c/s 2529c850ea48 broke xc_vcpu_getinfo(). The bug looks as if it is in vcpu_runstate_get(), which doesn't account for XEN_RUNSTATE_UPDATE and calculating a wildly inappropriate delta. Ultimately, the result of XEN_DOMCTL_getvcpuinfo ends up very occasionally with op->u.getvcpuinfo.cpu_time being wrong by 1 << 63. Given some of the callers of vcpu_runstate_get(), I don't think it is reasonable to pause the VCPU while reading the runstate info. However, it is also unclear whether waiting for XEN_RUNSTATE_UPDATE to drop in vcpu_runstate_get() is safe either.First and foremost I'm wondering whether simply masking off XEN_RUNSTATE_UPDATE in vcpu_runstate_get() wouldn't be an option. The assumption of the feature as a whole is for the high bit to never be set in an actual time value, after all. The other option I'd see is for vcpu_runstate_get() to gain a boolean return type by which it would indicate to interested callers whether the latching of the values happened while an update was in progress elsewhere. Callers needing to consume the potentially incorrect result could then choose to wait or schedule a hypercall continuation. The 3rd option (less desirable imo not the least because it would require touching all callers) would be for the function to gain a parameter telling it whether to spin until XEN_RUNSTATE_UPDATE is observed clear.And the 4th option would be to let vcpu_runstate_get() operate on a local runstate copy in order to avoid setting XEN_RUNSTATE_UPDATE in the "official" runstate info of the vcpu.But it already does - first thing it does is a memcpy() from the "official" instance to a caller provided buffer. (What is confusing, at least to me, is that the lock gets dropped last, when everything after the memcpy() already acts on the copy only.)Oh, that was my fault here: I meant to let update_runstate_area() operate on a local copy, of course.Ah, I see. It was my understanding that by setting the flag in the "official" instance, internal consumers could (if they cared) also avoid acting on inconsistent / in-flight data. Or was the current solution chosen exclusively because it was easiest to set the flag in the master instance, and then copy from there? Yes. The flag is meant only to be of interest for the guest reading the runstate area of another vcpu. Juergen _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |