Xen project Mailing List

Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Tue, 19 Dec 2023 16:24:02 +0100

Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Tue, 19 Dec 2023 15:24:11 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 19.12.2023 15:36, Roger Pau Monné wrote: > On Mon, Dec 18, 2023 at 03:39:55PM +0100, Jan Beulich wrote: >> ..., at least as reasonably feasible without making a check hook >> mandatory (in particular strict vs relaxed/zero-extend length checking >> can't be done early this way). >> >> Note that only one of the two uses of "real" hvm_load() is accompanied >> with a "checking" one. The other directly consumes hvm_save() output, >> which ought to be well-formed. This means that while input data related >> checks don't need repeating in the "load" function when already done by >> the "check" one (albeit assertions to this effect may be desirable), >> domain state related checks (e.g. has_xyz(d)) will be required in both >> places. >> >> Suggested-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> >> --- >> Now that this re-arranges hvm_load() anyway, wouldn't it be better to >> down the vCPU-s ahead of calling arch_hvm_load() (which is now easy to >> arrange for)? > > Seems OK to me. As is, or with the suggested adjustment, or either way? >> Do we really need all the copying involved in use of _hvm_read_entry() >> (backing hvm_load_entry()? Zero-extending loads are likely easier to >> handle that way, but for strict loads all we gain is a reduced risk of >> unaligned accesses (compared to simply pointing into h->data[]). > > I do feel it's safer to copy the data so the checks are done against > what's loaded. Albeit hvm_load() is already using hvm_get_entry(). The comment is about individual handlers, not hvm_load() itself. In some cases - when copying directly into guest state structures - the copying makes sense. In other cases, where there is separate copying anyway, things could be done with less duplication of data (see hpet_load(), which was already converted; along those lines hvm_load() itself also was already switched away from the original copying). Checking in any event can't be done against what _is_ loaded (as during checking we want to avoid altering guest state); it'll always be only against what is going to be loaded. The difference would be whether we check data in the incoming buffer or in a copy of that data in a local variable on the stack. But that applies to checking done in the load hooks only anyway; the cases with split off check handlers should never need to do any copying. >> --- a/xen/arch/x86/domctl.c >> +++ b/xen/arch/x86/domctl.c >> @@ -379,8 +379,12 @@ long arch_do_domctl( >> if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) >> != 0 ) >> goto sethvmcontext_out; >> >> + ret = hvm_load(d, false, &c); >> + if ( ret ) >> + goto sethvmcontext_out; >> + >> domain_pause(d); >> - ret = hvm_load(d, &c); >> + ret = hvm_load(d, true, &c); > > Now that the check has been done ahead, do we want to somehow assert > that this cannot fail? AIUI that's the expectation. We certainly can't until all checking was moved out of the load handlers. And even then I think there are still cases where load might produce an error. (In fact I would have refused a little more strongly to folding the prior hvm_check() into hvm_load() if indeed a separate hvm_load() could have ended up returning void in the long run.) >> @@ -275,12 +281,10 @@ int hvm_save(struct domain *d, hvm_domai >> return 0; >> } >> >> -int hvm_load(struct domain *d, hvm_domain_context_t *h) >> +int hvm_load(struct domain *d, bool real, hvm_domain_context_t *h) > > Maybe the 'real' parameter should instead be an enum: > > enum hvm_load_action { > CHECK, > LOAD, > }; > int hvm_load(struct domain *d, enum hvm_load_action action, > hvm_domain_context_t *h); Hmm, yes, it could. I'm not a fan of enums for boolean-like things, though. > Otherwise a comment might be warranted about how 'real' affects the > logic in the function. I can certainly add a comment immediately ahead of the function: /* * @real = false requests checking of the incoming state, while @real = true * requests actual loading, which will then assume that checking was already * done or is unnecessary. */ >> @@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai >> if ( !hdr ) >> return -ENODATA; >> >> - rc = arch_hvm_load(d, hdr); >> - if ( rc ) >> - return rc; >> + rc = arch_hvm_check(d, hdr); > > Shouldn't this _check function only be called when real == false? Possibly. In v4 I directly transformed what I had in v3: ASSERT(!arch_hvm_check(d, hdr)); I.e. it is now the call above plus ... >> + if ( real ) >> + { >> + struct vcpu *v; >> + >> + ASSERT(!rc); ... this assertion. Really the little brother of the call site assertion you're asking for (see above). >> + arch_hvm_load(d, hdr); >> >> - /* Down all the vcpus: we only re-enable the ones that had state saved. >> */ >> - for_each_vcpu(d, v) >> - if ( !test_and_set_bit(_VPF_down, &v->pause_flags) ) >> - vcpu_sleep_nosync(v); >> + /* >> + * Down all the vcpus: we only re-enable the ones that had state >> + * saved. >> + */ >> + for_each_vcpu(d, v) >> + if ( !test_and_set_bit(_VPF_down, &v->pause_flags) ) >> + vcpu_sleep_nosync(v); >> + } >> + else if ( rc ) >> + return rc; >> >> for ( ; ; ) >> { >> + const char *name; >> + hvm_load_handler load; >> + >> if ( h->size - h->cur < sizeof(struct hvm_save_descriptor) ) >> { >> /* Run out of data */ >> printk(XENLOG_G_ERR >> "HVM%d restore: save did not end with a null entry\n", >> d->domain_id); >> + ASSERT(!real); >> return -ENODATA; >> } >> >> /* Read the typecode of the next entry and check for the >> end-marker */ >> desc = (struct hvm_save_descriptor *)(&h->data[h->cur]); >> - if ( desc->typecode == 0 ) >> + if ( desc->typecode == HVM_SAVE_CODE(END) ) >> + { >> + /* Reset cursor for hvm_load(, true, ). */ >> + if ( !real ) >> + h->cur = 0; >> return 0; >> + } >> >> /* Find the handler for this entry */ >> - if ( (desc->typecode > HVM_SAVE_CODE_MAX) || >> - ((handler = hvm_sr_handlers[desc->typecode].load) == NULL) ) >> + if ( desc->typecode >= ARRAY_SIZE(hvm_sr_handlers) || >> + !(name = hvm_sr_handlers[desc->typecode].name) || >> + !(load = hvm_sr_handlers[desc->typecode].load) ) >> { >> printk(XENLOG_G_ERR "HVM%d restore: unknown entry typecode >> %u\n", >> d->domain_id, desc->typecode); > > The message is not very accurate here, it does fail when the typecode > is unknown, but also fails when such typecode has no name or load > function setup. Yes and no, and it's not changing in this patch. Are you suggesting I should change it despite being unrelated? If so, there not being a name (which is the new check I'm adding) still suggests the code is unknown. There not being a load handler really indicates a bug in Xen (yet no reason to e.g. BUG() in that case, the failed loading will hopefully be noticeable enough). Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.