Re: [Xen-devel] [PATCH v6 01/10] xen: vnuma topology and subop hypercalls
On Wed, Jul 23, 2014 at 10:06 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>> On 18.07.14 at 07:50, <ufimtseva@xxxxxxxxx> wrote:
>> +static int vnuma_alloc(struct vnuma_info **vnuma,
>> +                       unsigned int nr_vnodes,
>> +                       unsigned int nr_vcpus,
>> +                       unsigned int dist_size)
>> +{
>> +    struct vnuma_info *v;
>> +
>> +    if ( vnuma && *vnuma )
>> +        return -EINVAL;
>> +
>> +    v = *vnuma;
>> +    /*
>> +     * check if any of xmallocs exeeds PAGE_SIZE.
>> +     * If yes, consider it as an error for now.
>> +     */
>> +    if ( nr_vnodes > PAGE_SIZE / sizeof(nr_vnodes) ||
>> +        nr_vcpus > PAGE_SIZE / sizeof(nr_vcpus) ||
>> +        nr_vnodes > PAGE_SIZE / sizeof(struct vmemrange) ||
>> +        dist_size > PAGE_SIZE / sizeof(dist_size) )
>

Hi Jan,

Thank you for your review.

> Three of the four checks are rather bogus - the types of the
> variables just happen to match the types of the respective
> array elements. Best to switch all of them to sizeof(*v->...).
> Plus I'm not sure about the dist_size check - in its current shape
> it's redundant with the nr_vnodes one (and really the function
> parameter seems pointless, i.e. could be calculated here), and
> it's questionable whether limiting that table against PAGE_SIZE
> isn't too restrictive. Also indentation seems broken here.

I agree on the distance table memory allocation limit. With the
current interface the maximum vdistance table dimension will
effectively be 256 after the nr_vnodes size check, i.e. the maximum
number of vNUMA nodes, so the vdistance table will need to allocate
two 4K pages. Would that be viewed as a potential candidate for the
list of hypercalls affected by XSA-77?

>
>> +static long vnuma_fallback(const struct domain *d,
>> +                           struct vnuma_info **vnuma)
>> +{
>> +    struct vnuma_info *v;
>> +    long ret;
>> +
>> +
>> +    /* Will not destroy vNUMA here, destroy before calling this. */
>> +    if ( vnuma && *vnuma )
>> +        return -EINVAL;
>> +
>> +    v = *vnuma;
>> +    ret = vnuma_alloc(&v, 1, d->max_vcpus, 1);
>> +    if ( ret )
>> +        return ret;
>> +
>> +    v->vmemrange[0].start = 0;
>> +    v->vmemrange[0].end = d->max_pages << PAGE_SHIFT;
>
> Didn't we settle on using inclusive ranges to avoid problems at the
> address space end? Or was that in the context of some other series
> (likely by someone else)?

Yes, I think it was from Arianna.
http://lists.xen.org/archives/html/xen-devel/2014-04/msg02641.html

>
>> +long vnuma_init(const struct xen_domctl_vnuma *u_vnuma,
>> +                const struct domain *d,
>> +                struct vnuma_info **dst)
>> +{
>> +    unsigned int dist_size, nr_vnodes = 0;
>> +    long ret;
>> +    struct vnuma_info *v = NULL;
>> +
>> +    ret = -EINVAL;
>> +
>> +    /* If vNUMA topology already set, just exit. */
>> +    if ( !u_vnuma || *dst )
>> +        return ret;
>> +
>> +    nr_vnodes = u_vnuma->nr_vnodes;
>> +
>> +    if ( nr_vnodes == 0 )
>> +        return ret;
>> +
>> +    if ( nr_vnodes > (UINT_MAX / nr_vnodes) )
>> +        return ret;
>> +
>> +    dist_size = nr_vnodes * nr_vnodes;
>> +
>> +    ret = vnuma_alloc(&v, nr_vnodes, d->max_vcpus, dist_size);
>> +    if ( ret )
>> +        return ret;
>> +
>> +    /* On failure, set only one vNUMA node and its success. */
>> +    ret = 0;
>
> Pointless - just use "return 0" below.
>
>> +
>> +    if ( copy_from_guest(v->vdistance, u_vnuma->vdistance, dist_size) )
>> +        goto vnuma_onenode;
>> +    if ( copy_from_guest(v->vmemrange, u_vnuma->vmemrange, nr_vnodes) )
>> +        goto vnuma_onenode;
>> +    if ( copy_from_guest(v->vcpu_to_vnode, u_vnuma->vcpu_to_vnode,
>> +        d->max_vcpus) )
>
> Indentation.
>
>> +        goto vnuma_onenode;
>> +    if ( copy_from_guest(v->vnode_to_pnode, u_vnuma->vnode_to_pnode,
>> +        nr_vnodes) )
>
> Again.
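
Coming back to the sizeof() remark above: the reworked check could
look roughly like the sketch below. It only uses the field names
already present in this patch, and it computes the distance table
size locally instead of taking dist_size as a parameter, as you
suggest. Just a sketch, not compiled:

    /*
     * Sketch: derive the element sizes from the destination arrays
     * (sizeof() does not evaluate its operand, so using v before the
     * allocations is fine), and size the distance table from
     * nr_vnodes * nr_vnodes rather than from a separate parameter.
     */
    if ( nr_vnodes > PAGE_SIZE / sizeof(*v->vnode_to_pnode) ||
         nr_vcpus > PAGE_SIZE / sizeof(*v->vcpu_to_vnode) ||
         nr_vnodes > PAGE_SIZE / sizeof(*v->vmemrange) ||
         nr_vnodes * nr_vnodes > PAGE_SIZE / sizeof(*v->vdistance) )
        return -EINVAL;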
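
And if we switch to inclusive ranges as in Arianna's thread, I assume
the fallback node would become something like (again only a sketch):

    v->vmemrange[0].start = 0;
    /* Inclusive end: last byte of guest memory, not one past it. */
    v->vmemrange[0].end = (d->max_pages << PAGE_SHIFT) - 1;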
>
>> +        goto vnuma_onenode;
>> +
>> +    v->nr_vnodes = nr_vnodes;
>> +    *dst = v;
>> +
>> +    return ret;
>> +
>> +vnuma_onenode:
>
> Labels are to be indented by at least one space.
>
>> +    case XEN_DOMCTL_setvnumainfo:
>> +    {
>> +        struct vnuma_info *v = NULL;
>> +
>> +        ret = -EFAULT;
>> +        if ( guest_handle_is_null(op->u.vnuma.vdistance) ||
>> +             guest_handle_is_null(op->u.vnuma.vmemrange) ||
>> +             guest_handle_is_null(op->u.vnuma.vcpu_to_vnode) ||
>> +             guest_handle_is_null(op->u.vnuma.vnode_to_pnode) )
>> +            return ret;
>> +
>> +        ret = -EINVAL;
>> +
>> +        ret = vnuma_init(&op->u.vnuma, d, &v);
>> +        if ( ret < 0 || v == NULL )
>
> So when v == NULL you return success (ret being 0)? That second
> check is either pointless (could become an ASSERT()) or needs proper
> handling.
>
>> +            break;
>> +
>> +        /* overwrite vnuma for domain */
>> +        if ( !d->vnuma )
>> +            vnuma_destroy(d->vnuma);
>> +
>> +        domain_lock(d);
>> +        d->vnuma = v;
>> +        domain_unlock(d);
>
> domain_lock()? What does this guard against? (We generally aim at
> removing uses of domain_lock() rather than adding new ones.)
>
>> +    case XENMEM_get_vnumainfo:
>> +    {
>> +        struct vnuma_topology_info topology;
>> +        struct domain *d;
>> +        unsigned int dom_vnodes = 0;
>
> Pointless initializer.
>
>> +
>> +        /*
>> +         * guest passes nr_vnodes and nr_vcpus thus
>> +         * we know how much memory guest has allocated.
>> +         */
>> +        if ( copy_from_guest(&topology, arg, 1) ||
>> +             guest_handle_is_null(topology.vmemrange.h) ||
>> +             guest_handle_is_null(topology.vdistance.h) ||
>> +             guest_handle_is_null(topology.vcpu_to_vnode.h) )
>> +            return -EFAULT;
>> +
>> +        if ( (d = rcu_lock_domain_by_any_id(topology.domid)) == NULL )
>> +            return -ESRCH;
>> +
>> +        rc = -EOPNOTSUPP;
>> +        if ( d->vnuma == NULL )
>> +            goto vnumainfo_out;
>> +
>> +        if ( d->vnuma->nr_vnodes == 0 )
>> +            goto vnumainfo_out;
>
> Can this second condition validly (other than due to a race) be true if
> the first one wasn't? (And of course there's synchronization missing
> here, to avoid the race.)

My idea of using the domain_lock/rcu_lock_domain_by_any_id pair was to
avoid that race. I used domain_lock in the domctl hypercall when a
domain's vnuma pointer is being set, while XENMEM_get_vnumainfo reads
the values and holds the reader lock of that domain. As setting vnuma
happens only once while the domain boots, domain_lock seemed to be ok
here. Would a spinlock be more appropriate?

>
>> +
>> +        dom_vnodes = d->vnuma->nr_vnodes;
>> +
>> +        /*
>> +         * guest nr_cpus and nr_nodes may differ from domain vnuma config.
>> +         * Check here guest nr_nodes and nr_cpus to make sure we dont
>> +         * overflow.
>> +         */
>> +        rc = -ENOBUFS;
>> +        if ( topology.nr_vnodes < dom_vnodes ||
>> +             topology.nr_vcpus < d->max_vcpus )
>> +            goto vnumainfo_out;
>
> You ought to be passing back the needed values in the structure fields.

Understood.

>
>> +
>> +        rc = -EFAULT;
>> +
>> +        if ( copy_to_guest(topology.vmemrange.h, d->vnuma->vmemrange,
>> +                           dom_vnodes) != 0 )
>> +            goto vnumainfo_out;
>> +
>> +        if ( copy_to_guest(topology.vdistance.h, d->vnuma->vdistance,
>> +                           dom_vnodes * dom_vnodes) != 0 )
>> +            goto vnumainfo_out;
>> +
>> +        if ( copy_to_guest(topology.vcpu_to_vnode.h, d->vnuma->vcpu_to_vnode,
>> +                           d->max_vcpus) != 0 )
>> +            goto vnumainfo_out;
>> +
>> +        topology.nr_vnodes = dom_vnodes;
>
> And why not topology.nr_vcpus?

Yes, I missed that one.

>
>> +
>> +        if ( copy_to_guest(arg, &topology, 1) != 0 )
>
> __copy_to_guest() please.
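
To make the locking question more concrete: one possible shape would
be a dedicated per-domain lock around the pointer update and around
the reads in XENMEM_get_vnumainfo. This is purely hypothetical - the
vnuma_lock field below does not exist in this patch - and just meant
to illustrate what I have in mind:

    /* Hypothetical: a dedicated d->vnuma_lock instead of domain_lock(). */
    spin_lock(&d->vnuma_lock);
    vnuma_destroy(d->vnuma);    /* free any previously set topology */
    d->vnuma = v;
    spin_unlock(&d->vnuma_lock);

The reader side would take the same lock while copying the topology
out to the guest.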
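
Putting the last few comments together (passing the needed sizes back
on -ENOBUFS, filling in nr_vcpus as well, and using __copy_to_guest()),
the tail of the XENMEM_get_vnumainfo handler might end up looking
something like this. Only a sketch: the copy_to_guest() calls for the
arrays are elided, and I am assuming the vnumainfo_out path just does
the rcu_unlock_domain():

        rc = -ENOBUFS;
        if ( topology.nr_vnodes < dom_vnodes ||
             topology.nr_vcpus < d->max_vcpus )
        {
            /* Tell the guest how much space it actually needs. */
            topology.nr_vnodes = dom_vnodes;
            topology.nr_vcpus = d->max_vcpus;
            if ( __copy_to_guest(arg, &topology, 1) )
                rc = -EFAULT;
            goto vnumainfo_out;
        }

        /* ... copy_to_guest() of vmemrange / vdistance / vcpu_to_vnode ... */

        topology.nr_vnodes = dom_vnodes;
        topology.nr_vcpus = d->max_vcpus;

        rc = __copy_to_guest(arg, &topology, 1) ? -EFAULT : 0;

 vnumainfo_out:
        rcu_unlock_domain(d);
        break;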
>
> Jan

--
Elena