[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 19 Dec 2023 16:24:02 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 19 Dec 2023 15:24:11 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 19.12.2023 15:36, Roger Pau Monné wrote:
> On Mon, Dec 18, 2023 at 03:39:55PM +0100, Jan Beulich wrote:
>> ..., at least as reasonably feasible without making a check hook
>> mandatory (in particular strict vs relaxed/zero-extend length checking
>> can't be done early this way).
>>
>> Note that only one of the two uses of "real" hvm_load() is accompanied
>> with a "checking" one. The other directly consumes hvm_save() output,
>> which ought to be well-formed. This means that while input data related
>> checks don't need repeating in the "load" function when already done by
>> the "check" one (albeit assertions to this effect may be desirable),
>> domain state related checks (e.g. has_xyz(d)) will be required in both
>> places.
>>
>> Suggested-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> Now that this re-arranges hvm_load() anyway, wouldn't it be better to
>> down the vCPU-s ahead of calling arch_hvm_load() (which is now easy to
>> arrange for)?
> 
> Seems OK to me.

As is, or with the suggested adjustment, or either way?

>> Do we really need all the copying involved in use of _hvm_read_entry()
>> (backing hvm_load_entry()? Zero-extending loads are likely easier to
>> handle that way, but for strict loads all we gain is a reduced risk of
>> unaligned accesses (compared to simply pointing into h->data[]).
> 
> I do feel it's safer to copy the data so the checks are done against
> what's loaded.  Albeit hvm_load() is already using hvm_get_entry().

The comment is about individual handlers, not hvm_load() itself. In some
cases - when copying directly into guest state structures - the copying
makes sense. In other cases, where there is separate copying anyway,
things could be done with less duplication of data (see hpet_load(),
which was already converted; along those lines hvm_load() itself also
was already switched away from the original copying).

Checking in any event can't be done against what _is_ loaded (as during
checking we want to avoid altering guest state); it'll always be only
against what is going to be loaded. The difference would be whether we
check data in the incoming buffer or in a copy of that data in a local
variable on the stack. But that applies to checking done in the load
hooks only anyway; the cases with split off check handlers should never
need to do any copying.

>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -379,8 +379,12 @@ long arch_do_domctl(
>>          if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) 
>> != 0 )
>>              goto sethvmcontext_out;
>>  
>> +        ret = hvm_load(d, false, &c);
>> +        if ( ret )
>> +            goto sethvmcontext_out;
>> +
>>          domain_pause(d);
>> -        ret = hvm_load(d, &c);
>> +        ret = hvm_load(d, true, &c);
> 
> Now that the check has been done ahead, do we want to somehow assert
> that this cannot fail?  AIUI that's the expectation.

We certainly can't until all checking was moved out of the load handlers.
And even then I think there are still cases where load might produce an
error. (In fact I would have refused a little more strongly to folding
the prior hvm_check() into hvm_load() if indeed a separate hvm_load()
could have ended up returning void in the long run.)

>> @@ -275,12 +281,10 @@ int hvm_save(struct domain *d, hvm_domai
>>      return 0;
>>  }
>>  
>> -int hvm_load(struct domain *d, hvm_domain_context_t *h)
>> +int hvm_load(struct domain *d, bool real, hvm_domain_context_t *h)
> 
> Maybe the 'real' parameter should instead be an enum:
> 
> enum hvm_load_action {
>     CHECK,
>     LOAD,
> };
> int hvm_load(struct domain *d, enum hvm_load_action action,
>              hvm_domain_context_t *h);

Hmm, yes, it could. I'm not a fan of enums for boolean-like things,
though.

> Otherwise a comment might be warranted about how 'real' affects the
> logic in the function.

I can certainly add a comment immediately ahead of the function:

/*
 * @real = false requests checking of the incoming state, while @real = true
 * requests actual loading, which will then assume that checking was already
 * done or is unnecessary.
 */

>> @@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai
>>      if ( !hdr )
>>          return -ENODATA;
>>  
>> -    rc = arch_hvm_load(d, hdr);
>> -    if ( rc )
>> -        return rc;
>> +    rc = arch_hvm_check(d, hdr);
> 
> Shouldn't this _check function only be called when real == false?

Possibly. In v4 I directly transformed what I had in v3:

    ASSERT(!arch_hvm_check(d, hdr));

I.e. it is now the call above plus ...

>> +    if ( real )
>> +    {
>> +        struct vcpu *v;
>> +
>> +        ASSERT(!rc);

... this assertion. Really the little brother of the call site assertion
you're asking for (see above).

>> +        arch_hvm_load(d, hdr);
>>  
>> -    /* Down all the vcpus: we only re-enable the ones that had state saved. 
>> */
>> -    for_each_vcpu(d, v)
>> -        if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>> -            vcpu_sleep_nosync(v);
>> +        /*
>> +         * Down all the vcpus: we only re-enable the ones that had state
>> +         * saved.
>> +         */
>> +        for_each_vcpu(d, v)
>> +            if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>> +                vcpu_sleep_nosync(v);
>> +    }
>> +    else if ( rc )
>> +        return rc;
>>  
>>      for ( ; ; )
>>      {
>> +        const char *name;
>> +        hvm_load_handler load;
>> +
>>          if ( h->size - h->cur < sizeof(struct hvm_save_descriptor) )
>>          {
>>              /* Run out of data */
>>              printk(XENLOG_G_ERR
>>                     "HVM%d restore: save did not end with a null entry\n",
>>                     d->domain_id);
>> +            ASSERT(!real);
>>              return -ENODATA;
>>          }
>>  
>>          /* Read the typecode of the next entry  and check for the 
>> end-marker */
>>          desc = (struct hvm_save_descriptor *)(&h->data[h->cur]);
>> -        if ( desc->typecode == 0 )
>> +        if ( desc->typecode == HVM_SAVE_CODE(END) )
>> +        {
>> +            /* Reset cursor for hvm_load(, true, ). */
>> +            if ( !real )
>> +                h->cur = 0;
>>              return 0;
>> +        }
>>  
>>          /* Find the handler for this entry */
>> -        if ( (desc->typecode > HVM_SAVE_CODE_MAX) ||
>> -             ((handler = hvm_sr_handlers[desc->typecode].load) == NULL) )
>> +        if ( desc->typecode >= ARRAY_SIZE(hvm_sr_handlers) ||
>> +             !(name = hvm_sr_handlers[desc->typecode].name) ||
>> +             !(load = hvm_sr_handlers[desc->typecode].load) )
>>          {
>>              printk(XENLOG_G_ERR "HVM%d restore: unknown entry typecode 
>> %u\n",
>>                     d->domain_id, desc->typecode);
> 
> The message is not very accurate here, it does fail when the typecode
> is unknown, but also fails when such typecode has no name or load
> function setup.

Yes and no, and it's not changing in this patch. Are you suggesting I should
change it despite being unrelated? If so, there not being a name (which is
the new check I'm adding) still suggests the code is unknown. There not being
a load handler really indicates a bug in Xen (yet no reason to e.g. BUG() in
that case, the failed loading will hopefully be noticeable enough).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.