[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Regression with vcpu runstate info and XEN_RUNSTATE_UPDATE



On 23.09.2019 11:56, Jürgen Groß wrote:
> On 23.09.19 11:51, Jan Beulich wrote:
>> On 23.09.2019 11:42, Jürgen Groß wrote:
>>> On 16.09.19 17:44, Jan Beulich wrote:
>>>> On 16.09.2019 16:50, Andrew Cooper wrote:
>>>>> After a complicated investigation, it turns out that c/s 2529c850ea48
>>>>> broke xc_vcpu_getinfo().
>>>>>
>>>>> The bug looks as if it is in vcpu_runstate_get(), which doesn't account
>>>>> for XEN_RUNSTATE_UPDATE and calculating a wildly inappropriate delta.
>>>>> Ultimately, the result of XEN_DOMCTL_getvcpuinfo ends up very
>>>>> occasionally with op->u.getvcpuinfo.cpu_time being wrong by 1 << 63.
>>>>>
>>>>> Given some of the callers of vcpu_runstate_get(), I don't think it is
>>>>> reasonable to pause the VCPU while reading the runstate info.  However,
>>>>> it is also unclear whether waiting for XEN_RUNSTATE_UPDATE to drop in
>>>>> vcpu_runstate_get() is safe either.
>>>>
>>>> First and foremost I'm wondering whether simply masking off
>>>> XEN_RUNSTATE_UPDATE in vcpu_runstate_get() wouldn't be an
>>>> option. The assumption of the feature as a whole is for the
>>>> high bit to never be set in an actual time value, after all.
>>>>
>>>> The other option I'd see is for vcpu_runstate_get() to gain
>>>> a boolean return type by which it would indicate to
>>>> interested callers whether the latching of the values
>>>> happened while an update was in progress elsewhere. Callers
>>>> needing to consume the potentially incorrect result could
>>>> then choose to wait or schedule a hypercall continuation.
>>>>
>>>> The 3rd option (less desirable imo not the least because it
>>>> would require touching all callers) would be for the function
>>>> to gain a parameter telling it whether to spin until
>>>> XEN_RUNSTATE_UPDATE is observed clear.
>>>
>>> And the 4th option would be to let vcpu_runstate_get() operate on a
>>> local runstate copy in order to avoid setting XEN_RUNSTATE_UPDATE in
>>> the "official" runstate info of the vcpu.
>>
>> But it already does - first thing it does is a memcpy() from the
>> "official" instance to a caller provided buffer. (What is
>> confusing, at least to me, is that the lock gets dropped last,
>> when everything after the memcpy() already acts on the copy only.)
> 
> Oh, that was my fault here: I meant to let update_runstate_area()
> operate on a local copy, of course.

Ah, I see. It was my understanding that by setting the flag in the
"official" instance, internal consumers could (if they cared) also
avoid acting on inconsistent / in-flight data. Or was the current
solution chosen exclusively because it was easiest to set the flag
in the master instance, and then copy from there?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.