
Re: [Xen-devel] [PATCH v7 7/9] ioreq-server: add support for multiple servers



>>> On 09.05.14 at 10:40, <paul.durrant@xxxxxxxxxx> wrote:
> +static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
> +{
> +    unsigned int i = gmfn - d->arch.hvm_domain.ioreq_gmfn.base;
> +
> +    clear_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);

Perhaps worth an ASSERT() on "i" not being out of range, to avoid
hard-to-debug memory corruption?
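
Something along these lines perhaps (just a sketch, assuming the mask
is a single unsigned long as elsewhere in the patch):

    static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
    {
        unsigned int i = gmfn - d->arch.hvm_domain.ioreq_gmfn.base;

        /* Catch callers passing a gmfn outside the reserved window. */
        ASSERT(i < sizeof(d->arch.hvm_domain.ioreq_gmfn.mask) * 8);
        clear_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
    }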

> +static int hvm_access_cf8(
> +    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
> +{
> +    struct domain *d = current->domain;
> +
> +    if ( bytes != 4 )
> +        return X86EMUL_UNHANDLEABLE;
> +
> +    BUG_ON(port < 0xcf8);

If this check is to be kept at all, it would imo better be an ASSERT().
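I.e. simply

    ASSERT(port >= 0xcf8);

(if I'm not mistaken, the port I/O dispatch code wouldn't invoke the
handler with an out-of-range port anyway).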

> +    port -= 0xcf8;

Stale statement - "port" isn't being used anywhere below.

> +
> +    if ( dir == IOREQ_WRITE )
> +    {
> +        d->arch.hvm_domain.pci_cf8 = *val;
> +
> +        /* We always need to fall through to the catch all emulator */
> +        return X86EMUL_UNHANDLEABLE;
> +    }
> +    else
> +    {
> +        *val = d->arch.hvm_domain.pci_cf8;
> +        return X86EMUL_OKAY;

Is this correct irrespective of the value of the high (enable) bit? You
don't really know what non-config-space accesses mean (including whether
they have any side effects), so I guess you'd be better off forwarding
that case too.

And in the end I wonder whether it wouldn't be better to forward all
accesses, with this function solely serving the purpose of latching
the last written value.
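
I.e., roughly (just a sketch, keeping your names; whether to still
reject odd-sized writes would then be a separate question):

    static int hvm_access_cf8(
        int dir, uint32_t port, uint32_t bytes, uint32_t *val)
    {
        struct domain *d = current->domain;

        /* Latch the last value written; leave all emulation to the DM. */
        if ( dir == IOREQ_WRITE && bytes == 4 )
            d->arch.hvm_domain.pci_cf8 = *val;

        return X86EMUL_UNHANDLEABLE;
    }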

> -static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
> +static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
> +                                      bool_t is_default)
>  {
>      struct domain *d = s->domain;
> -    unsigned long pfn;
> +    unsigned long ioreq_pfn, bufioreq_pfn;
>      int rc;
>  
> -    pfn = d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
> -    rc = hvm_map_ioreq_page(s, 0, pfn);
> +    if ( is_default ) {
> +        ioreq_pfn = d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
> +        bufioreq_pfn = d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN];
> +    } else {

I hate to say it, but: coding style - braces go on their own lines.
And there's more of this further down.
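
I.e., for this and the similar cases:

    if ( is_default )
    {
        ioreq_pfn = d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
        bufioreq_pfn = d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN];
    }
    else
    {
        ...
    }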

> +static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s, 
> +                                            bool_t is_default)
> +{
> +    int i;

And for variables like this I already asked before that they be made
unsigned (as is properly done e.g. in hvm_ioreq_server_free_rangesets()).

> +    int rc;
> +
> +    if ( is_default )
> +        goto done;
> +
> +    for ( i = 0; i < NR_IO_RANGE_TYPES; i++ ) {
> +        char *name;
> +
> +        rc = asprintf(&name, "ioreq_server %d %s", s->id,
> +                      (i == HVMOP_IO_RANGE_PORT) ? "port" :
> +                      (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
> +                      (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> +                      "");
> +        if ( rc )
> +            goto fail;
> +
> +        s->range[i] = rangeset_new(s->domain, name,
> +                                   RANGESETF_prettyprint_hex);
> +
> +        xfree(name);
> +
> +        rc = -ENOMEM;
> +        if ( !s->range[i] )
> +            goto fail;
> +
> +        rangeset_limit(s->range[i], MAX_NR_IO_RANGES);

Just as a remark/question: perhaps ultimately this limit should become
a configurable one, which would also allow the limits for the three
kinds of ranges to differ?

> +static ioservid_t next_ioservid(struct domain *d)
> +{
> +    struct hvm_ioreq_server *s;
> +    static ioservid_t id;
> +
> +    ASSERT(spin_is_locked(&d->arch.hvm_domain.ioreq_server.lock));
> +
> +    id++;
> +
> + again:
> +    /* Check for uniqueness */
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if (id == s->id)
> +        {
> +            id++;
> +            goto again;

Please put the label ahead of the pre-loop increment, avoiding the
duplication.

And what's worse - without latching the value of "id" into a local
variable (storing back only the final result), your uniqueness check
doesn't work, because the lock only guards against concurrent
accesses for a single domain. Once fixed, perhaps also add a brief
comment explaining why the remaining not fully serialized access to
"id" is not a problem?

And of course, if you're going to stay with this system-wide static
variable, you'd need to clarify why you think this is not a - however
small - information leak (of the side channel kind).

> @@ -757,25 +942,35 @@ static int hvm_create_ioreq_server(struct domain *d, domid_t domid)
>          goto fail1;
>  
>      domain_pause(d);
> -    spin_lock(&d->arch.hvm_domain.ioreq_server_lock);
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
>  
>      rc = -EEXIST;
> -    if ( d->arch.hvm_domain.ioreq_server != NULL )
> +    if ( is_default && d->arch.hvm_domain.default_ioreq_server != NULL )
>          goto fail2;
>  
> -    rc = hvm_ioreq_server_init(s, d, domid);
> +    rc = hvm_ioreq_server_init(s, d, domid, is_default,
> +                               next_ioservid(d));
>      if ( rc )
> -        goto fail2;
> +        goto fail3;
> +
> +    list_add(&s->list_entry,
> +             &d->arch.hvm_domain.ioreq_server.list);
> +    d->arch.hvm_domain.ioreq_server.count++;

Do you really need that field? The only place it's being read seems to
be to determine whether there's more than one server. That surely can
also be determined by looking at .list.
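
E.g. the check in hvm_select_ioreq_server() could become something like
this (open-coded; a list_is_singular()-style helper, if we have or want
to add one, would read better):

    /* No server beside (at most) the default one registered? */
    if ( d->arch.hvm_domain.ioreq_server.list.next ==
         d->arch.hvm_domain.ioreq_server.list.prev ||
         (p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO) )
        return d->arch.hvm_domain.default_ioreq_server;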

> +static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> +                                            uint32_t type, uint64_t start, uint64_t end)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    rc = -ENOENT;
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        if ( s->id == id )
> +        {
> +            struct rangeset *r;
> +
> +            switch ( type )
> +            {
> +            case HVMOP_IO_RANGE_PORT:
> +            case HVMOP_IO_RANGE_MEMORY:
> +            case HVMOP_IO_RANGE_PCI:
> +                r = s->range[type];
> +                break;
> +
> +            default:
> +                r = NULL;
> +                break;
> +            }
> +
> +            rc = -EINVAL;
> +            if ( !r )
> +                break;
> +
> +            rc = rangeset_add_range(r, start, end);

Is it not an error for the range to perhaps already be there ...

> +static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
> +                                                uint32_t type, uint64_t start, uint64_t end)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    rc = -ENOENT;
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        if ( s->id == id )
> +        {
> +            struct rangeset *r;
> +
> +            switch ( type )
> +            {
> +            case HVMOP_IO_RANGE_PORT:
> +            case HVMOP_IO_RANGE_MEMORY:
> +            case HVMOP_IO_RANGE_PCI:
> +                r = s->range[type];
> +                break;
> +
> +            default:
> +                r = NULL;
> +                break;
> +            }
> +
> +            rc = -EINVAL;
> +            if ( !r )
> +                break;
> +
> +            rc = rangeset_remove_range(r, start, end);

... or to not be there here?
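
If so, something along these lines could be used in the two functions
(sketch; iirc rangeset_overlaps_range() and rangeset_contains_range()
exist for exactly this kind of check):

    /* In hvm_map_io_range_to_ioreq_server(): */
    rc = -EEXIST;
    if ( rangeset_overlaps_range(r, start, end) )
        break;
    rc = rangeset_add_range(r, start, end);

    /* In hvm_unmap_io_range_from_ioreq_server(): */
    rc = -ENOENT;
    if ( !rangeset_contains_range(r, start, end) )
        break;
    rc = rangeset_remove_range(r, start, end);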

> +static int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        bool_t is_default = ( s == d->arch.hvm_domain.default_ioreq_server);

Stray blank after "(" - also again further down.

> +static void hvm_destroy_all_ioreq_servers(struct domain *d)
> +{
> +    struct hvm_ioreq_server *s, *next;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    list_for_each_entry_safe ( s,
> +                               next,
> +                               &d->arch.hvm_domain.ioreq_server.list,
> +                               list_entry )
> +    {
> +        bool_t is_default = ( s == d->arch.hvm_domain.default_ioreq_server);
> +
> +        domain_pause(d);

This gets called only during domain teardown - what's the point of
the domain_pause()?

> +static struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
> +                                                        ioreq_t *p)
> +{
> +#define CF8_BDF(cf8)     (((cf8) & 0x00ffff00) >> 8)
> +#define CF8_ADDR_LO(cf8) ((cf8) & 0x000000fc)
> +#define CF8_ADDR_HI(cf8) (((cf8) & 0x0f000000) >> 16)

Any reason you exclude the low two bits here, despite separately
or-ing them back in below?

> +#define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
> +
> +    struct hvm_ioreq_server *s;
> +    uint32_t cf8;
> +    uint8_t type;
> +    uint64_t addr;
> +
> +    if ( d->arch.hvm_domain.ioreq_server.count <= 1 ||
> +         (p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO) )
> +        return d->arch.hvm_domain.default_ioreq_server;
> +
> +    cf8 = d->arch.hvm_domain.pci_cf8;
> +        
> +    if ( p->type == IOREQ_TYPE_PIO &&
> +         (p->addr & ~3) == 0xcfc &&
> +         CF8_ENABLED(cf8) )
> +    {
> +        uint32_t sbdf;
> +
> +        /* PCI config data cycle */
> +
> +        sbdf = HVMOP_PCI_SBDF(0,
> +                              PCI_BUS(CF8_BDF(cf8)),
> +                              PCI_SLOT(CF8_BDF(cf8)),
> +                              PCI_FUNC(CF8_BDF(cf8)));
> +
> +        type = IOREQ_TYPE_PCI_CONFIG;
> +        addr = ((uint64_t)sbdf << 32) |
> +               CF8_ADDR_HI(cf8) |
> +               CF8_ADDR_LO(cf8) |
> +               (p->addr & 3);
> +    }
> +    else
> +    {
> +        type = p->type;

I know I asked this before, and iirc your answer was "no": Is
IOREQ_TYPE_PCI_CONFIG coming in here valid?

> +        addr = p->addr;
> +    }
> +
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        struct rangeset *r;
> +
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
> +        BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
> +        BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
> +        r = s->range[type];
> +
> +        switch ( type )
> +        {
> +        case IOREQ_TYPE_PIO:
> +        case IOREQ_TYPE_COPY:
> +            if ( rangeset_contains_singleton(r, addr) )

While the PCI check below certainly can be a singleton one, I don't
think you can use one here: in order for a request to be forwarded, the
full range spanned by the access should be owned by the respective
server. And yes, consideration of how to deal with accesses split
across servers is going to be needed, I'm afraid.
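
For the non-rep case something like this would seem to be the minimum
(sketch; rep accesses, as well as "addr + size" wrapping, would need
further thought):

    case IOREQ_TYPE_PIO:
    case IOREQ_TYPE_COPY:
        /* The server must own the entire range the access spans. */
        if ( rangeset_contains_range(r, addr, addr + p->size - 1) )
            ...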

> +static int hvmop_create_ioreq_server(
> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_create_ioreq_server_t) uop)
> +{
> +    struct domain *curr_d = current->domain;
> +    xen_hvm_create_ioreq_server_t op;
> +    struct domain *d;
> +    int rc;
> +
> +    if ( copy_from_guest(&op, uop, 1) )
> +        return -EFAULT;
> +
> +    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
> +    if ( rc != 0 )
> +        return rc;
> +
> +    rc = -EINVAL;
> +    if ( !is_hvm_domain(d) )
> +        goto out;
> +
> +    rc = xsm_hvm_ioreq_server(XSM_DM_PRIV, d);

This call is indistinguishable from the four other ones further down.
You ought to let XSM know which operation it really is that you're
about to perform.
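
E.g. by handing the sub-op through to the hook (sketch only - the dummy
and Flask implementations of xsm_hvm_ioreq_server() would of course
need to gain the extra parameter):

    rc = xsm_hvm_ioreq_server(XSM_DM_PRIV, d, HVMOP_create_ioreq_server);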

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

