
Re: [Xen-devel] [PATCH 1/4] x86/domctl: Implement XEN_DOMCTL_{get, set}_vcpu_msrs



On 05/06/14 13:46, Jan Beulich wrote:
>>>> On 04.06.14 at 19:26, <andrew.cooper3@xxxxxxxxxx> wrote:
>> Despite my 'Reviewed-by' tag on c/s 65e3554908 "x86/PV: support data
>> breakpoint extension registers", I have re-evaluated my position as far as 
>> the hypercall interface is concerned.
>>
>> Previously, for the sake of not modifying the migration code in libxc,
>> XEN_DOMCTL_get_ext_vcpucontext would jump through hoops to return -ENOBUFS if
>> and only if MSRs were in use and no buffer was present.
>>
>> This is fragile, and awkward from a toolstack point of view when actually
>> sending MSR content in the migration stream.  It also complicates fixing a
>> further race condition between querying the number of MSRs for a vcpu and
>> the vcpu touching a new one.
>>
>> As this code is still only in staging, take this opportunity to redesign the
> s/staging/unstable/

Oops - yes.

>
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -1341,6 +1341,132 @@ long arch_do_domctl(
>>      }
>>      break;
>>  
>> +    case XEN_DOMCTL_get_vcpu_msrs:
>> +    case XEN_DOMCTL_set_vcpu_msrs:
>> +    {
>> +        struct xen_domctl_vcpu_msrs *vmsrs = &domctl->u.vcpu_msrs;
>> +        struct xen_domctl_vcpu_msr msr;
>> +        struct vcpu *v;
>> +        uint32_t nr_msrs = 0;
>> +
>> +        ret = -ESRCH;
>> +        if ( (vmsrs->vcpu >= d->max_vcpus) ||
>> +             ((v = d->vcpu[vmsrs->vcpu]) == NULL) )
>> +            break;
>> +
>> +        ret = -EINVAL;
>> +        if ( (v == current) || /* no vcpu_pause() */
>> +             !is_pv_domain(d) )
>> +            break;
>> +
>> +        /* Count maximum number of optional msrs. */
>> +        if ( boot_cpu_has(X86_FEATURE_DBEXT) )
>> +            nr_msrs += 4;
>> +
>> +        if ( domctl->cmd == XEN_DOMCTL_get_vcpu_msrs )
>> +        {
>> +            /* NULL guest handle is a request for max size. */
>> +            if ( guest_handle_is_null(vmsrs->msrs) ||
>> +                 (vmsrs->msr_count < nr_msrs) )
>> +            {
>> +                vmsrs->msr_count = nr_msrs;
>> +                ret = guest_handle_is_null(vmsrs->msrs) ? 0 : -ENOBUFS;
> I don't think you should be failing "get" if there is enough space in
> the provided buffer to store the actually used number of MSRs. That
> way the caller may make a first call with a few (rather than none at
> all) entries, and grow the buffer only if this wasn't sufficient.

I am not sure I agree.  The MSRs are unordered in the buffer, which the
caller cannot control, and issuing the hypercall again with a larger
buffer will simply rewrite it from the start.

The sole user of this hypercall needs to ensure that all MSRs are
retrieved; otherwise VM corruption will occur.  Permitting a partial get
would also make the return value ambiguous for a caller which issues the
hypercall a single time with a guessed buffer size, although I suspect
that case is of less interest.
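
For context, the intended usage from the toolstack side is the two-call
pattern sketched below.  This is only a rough sketch of a hypothetical
caller, not the actual libxc code: issue_domctl() stands in for whatever
hypercall wrapper ends up being used, and the guest-handle/bounce-buffer
plumbing is elided.

    /*
     * Sketch (assumption, not the real libxc implementation):
     * 1) query the feature-derived maximum MSR count with a NULL buffer,
     * 2) allocate that many slots and fetch everything in one call.
     */
    struct xen_domctl domctl = {
        .cmd    = XEN_DOMCTL_get_vcpu_msrs,
        .domain = domid,
        .u.vcpu_msrs.vcpu      = vcpu,
        .u.vcpu_msrs.msr_count = 0,
        /* NULL guest handle + zero count: ask Xen for the maximum. */
    };

    if ( issue_domctl(&domctl) )                /* hypothetical wrapper */
        return -1;

    uint32_t max = domctl.u.vcpu_msrs.msr_count;
    xen_domctl_vcpu_msr_t *buf = calloc(max, sizeof(*buf));

    set_xen_guest_handle(domctl.u.vcpu_msrs.msrs, buf);  /* bouncing elided */
    domctl.u.vcpu_msrs.msr_count = max;

    rc = issue_domctl(&domctl);
    /*
     * On success, msr_count is rewritten to the number of entries actually
     * returned; -ENOBUFS is not expected here, because the first call
     * reported the feature-derived maximum rather than the in-use count.
     */
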

>
>> +                }
>> +
>> +                vcpu_unpause(v);
>> +
>> +                /* Check we didn't lie to userspace then overflow the buffer */
>> +                BUG_ON(i > nr_msrs);
>> +                vmsrs->msr_count = i;
>> +            }
>> +
>> +            copyback = 1;
>> +        }
>> +        else
>> +        {
>> +            ret = -EINVAL;
>> +            if ( vmsrs->msr_count > nr_msrs )
>> +                break;
> Similarly I don't think you should fail the operation simply based on an
> overly large count. Who knows when or why it may make sense for the
> caller to specify multiple entries with the same index.
>
> But then again the way you do it we have a statically determined
> maximum and don't need to worry about preemption here until a
> really large number of MSRs would get handled.

For now, this prevents looping to the end of a guest-controlled
uint32_t, as per some of the concerns in XSA-77.

If, in the future, a scenario arises where reloading the same MSR
several times becomes valid, the logic can be revisited.
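
To make that concrete: the only well-formed input a caller can construct
today has at most one entry per supported MSR, so msr_count can never
legitimately exceed the feature-derived maximum (4 with X86_FEATURE_DBEXT),
and the bound keeps the copy_from_guest loop to a handful of iterations.
An illustrative caller-side array (hypothetical values, not real code):

    /* Hypothetical input for XEN_DOMCTL_set_vcpu_msrs: one entry per MSR,
     * .reserved left zero, and values confined to the low 32 bits. */
    xen_domctl_vcpu_msr_t msrs[] = {
        { .index = MSR_AMD64_DR0_ADDRESS_MASK, .value = 0x00000001 },
        { .index = MSR_AMD64_DR1_ADDRESS_MASK, .value = 0x00000003 },
    };
    vmsrs.msr_count = sizeof(msrs) / sizeof(msrs[0]);  /* 2 <= nr_msrs */
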

>
>> +
>> +            vcpu_pause(v);
>> +
>> +            for ( i = 0; i < vmsrs->msr_count; ++i )
>> +            {
>> +                ret = -EFAULT;
>> +                if ( copy_from_guest_offset(&msr, vmsrs->msrs, i, 1) )
>> +                    break;
>> +
>> +                ret = -EINVAL;
>> +                if ( msr.reserved )
>> +                    break;
>> +
>> +                switch ( msr.index )
>> +                {
>> +                case MSR_AMD64_DR0_ADDRESS_MASK:
>> +                    if ( !boot_cpu_has(X86_FEATURE_DBEXT) ||
>> +                         (msr.value >> 32) )
>> +                        break;
>> +                    v->arch.pv_vcpu.dr_mask[0] = msr.value;
>> +                    continue;
>> +
>> +                case MSR_AMD64_DR1_ADDRESS_MASK ...
>> +                    MSR_AMD64_DR3_ADDRESS_MASK:
>> +                    if ( !boot_cpu_has(X86_FEATURE_DBEXT) ||
>> +                         (msr.value >> 32) )
>> +                        break;
>> +                    msr.index -= MSR_AMD64_DR1_ADDRESS_MASK - 1;
>> +                    v->arch.pv_vcpu.dr_mask[msr.index] = msr.value;
>> +                    continue;
>> +                }
>> +                break;
>> +            }
>> +
>> +            vcpu_unpause(v);
>> +
>> +            if ( i == vmsrs->msr_count )
>> +                ret = 0;
> else {
>     vmsrs->msr_count = i
>     copyback = 1;
> }
>
> to at once give the caller an indication which slot failed?
>
> Jan

Ok - seems useful as a debugging measure.
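
Concretely, with that folded in, the tail of the set path would look
roughly like the sketch below (a sketch of the suggestion, not the final
committed code):

            vcpu_unpause(v);

            if ( i == vmsrs->msr_count )
                ret = 0;
            else
            {
                /* Report the slot which failed back to the caller. */
                vmsrs->msr_count = i;
                copyback = 1;
            }
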

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
