[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/5] public/sysctl: Clarify usage of pm_{px,cx}_stat



On 20.05.2025 16:25, Ross Lagerwall wrote:
> On Tue, May 13, 2025 at 3:43 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>
>> On 12.05.2025 16:46, Ross Lagerwall wrote:
>>> --- a/xen/include/public/sysctl.h
>>> +++ b/xen/include/public/sysctl.h
>>> @@ -215,23 +215,51 @@ typedef struct pm_px_val pm_px_val_t;
>>>  DEFINE_XEN_GUEST_HANDLE(pm_px_val_t);
>>>
>>>  struct pm_px_stat {
>>> -    uint8_t total;        /* total Px states */
>>> -    uint8_t usable;       /* usable Px states */
>>> -    uint8_t last;         /* last Px state */
>>> -    uint8_t cur;          /* current Px state */
>>> -    XEN_GUEST_HANDLE_64(uint64) trans_pt;   /* Px transition table */
>>> +    /*
>>> +     * IN: Number of elements in pt, number of rows/columns in trans_pt
>>> +     *     (PMSTAT_get_pxstat)
>>> +     * OUT: total Px states (PMSTAT_get_max_px, PMSTAT_get_pxstat)
>>> +     */
>>> +    uint8_t total;
>>
>> The part for this field ought to go in patch 1, as already indicated there.
>>
>>> +    uint8_t usable;       /* OUT: usable Px states (PMSTAT_get_pxstat) */
>>> +    uint8_t last;         /* OUT: last Px state (PMSTAT_get_pxstat) */
>>> +    uint8_t cur;          /* OUT: current Px state (PMSTAT_get_pxstat) */
>>> +    /*
>>> +     * OUT: Px transition table. This should have total * total elements.
>>> +     *      This will not be copied if it is smaller than the hypervisor's
>>> +     *      Px transition table. (PMSTAT_get_pxstat)
>>> +     */
>>> +    XEN_GUEST_HANDLE_64(uint64) trans_pt;
>>> +    /* OUT: This should have total elements (PMSTAT_get_pxstat) */
>>>      XEN_GUEST_HANDLE_64(pm_px_val_t) pt;
>>
>> As also indicated there, the same constraint as for trans_pt applies to this
>> output buffer, just that it's having only one dimension.
>>
>>>  };
>>>
>>>  struct pm_cx_stat {
>>> -    uint32_t nr;    /* entry nr in triggers & residencies, including C0 */
>>> -    uint32_t last;  /* last Cx state */
>>> -    uint64_aligned_t idle_time;                 /* idle time from boot */
>>> -    XEN_GUEST_HANDLE_64(uint64) triggers;    /* Cx trigger counts */
>>> -    XEN_GUEST_HANDLE_64(uint64) residencies; /* Cx residencies */
>>> -    uint32_t nr_pc;                          /* entry nr in pc[] */
>>> -    uint32_t nr_cc;                          /* entry nr in cc[] */
>>>      /*
>>> +     * IN:  Number of elements in triggers, residencies (PMSTAT_get_cxstat)
>>> +     * OUT: entry nr in triggers & residencies, including C0
>>> +     *      (PMSTAT_get_cxstat, PMSTAT_get_max_cx)
>>> +     */
>>> +    uint32_t nr;
>>> +    uint32_t last;  /* OUT: last Cx state (PMSTAT_get_cxstat) */
>>> +    /* OUT: idle time from boot (PMSTAT_get_cxstat)*/
>>> +    uint64_aligned_t idle_time;
>>> +    /* OUT: Cx trigger counts, nr elements (PMSTAT_get_cxstat) */
>>> +    XEN_GUEST_HANDLE_64(uint64) triggers;
>>> +    /* OUT: Cx residencies, nr elements (PMSTAT_get_cxstat) */
>>> +    XEN_GUEST_HANDLE_64(uint64) residencies;
>>> +    /*
>>> +     * IN: entry nr in pc[] (PMSTAT_get_cxstat)
>>> +     * OUT: Index of highest non-zero entry set in pc[] (PMSTAT_get_cxstat)
>>> +     */
>>> +    uint32_t nr_pc;
>>> +    /*
>>> +     * IN: entry nr in cc[] (PMSTAT_get_cxstat)
>>> +     * OUT: Index of highest non-zero entry set in cc[] (PMSTAT_get_cxstat)
>>> +     */
>>
>> For both of these, it's not "highest non-zero" but, according to ...
>>
>>> +    uint32_t nr_cc;
>>> +    /*
>>> +     * OUT: (PMSTAT_get_cxstat)
>>>       * These two arrays may (and generally will) have unused slots; slots 
>>> not
>>>       * having a corresponding hardware register will not be written by the
>>>       * hypervisor. It is therefore up to the caller to put a suitable 
>>> sentinel
>>
>> ... this comment, "highest written by hypervisor". They're also not "index 
>> of",
>> but "one higher than the index of" (i.e. counts, not indexes).
>>
> 
> Looking at this again, I don't think that matches what Xen does (nor
> does my previous attempt). The code in question:
> 
> #define PUT_xC(what, n) do { \
>         if ( stat->nr_##what >= n && \
>              copy_to_guest_offset(stat->what, n - 1, &hw_res.what##n, 1) ) \
>             return -EFAULT; \
>         if ( hw_res.what##n ) \
>             nr_##what = n; \
>     } while ( 0 )

Right. I should have inserted "that could be" in my reply.

> Xen will copy all the hardware registers that it knows about (regardless
> of whether the hardware actually has them) and will return in nr_pc /
> nr_cc the index + 1 of the highest non-zero entry it _would_ have
> written if there is sufficient space.

Which is kind of bogus when the last (or more) of those would merely be
zero.

> I could describe it simply as "OUT: Required size of cc[]" ?

Maybe append something like "..., for all known to Xen entries to be
written"? Provided we don't want to change the behavior anyway.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.