
Re: [Xen-devel] [PATCH v7 7/9] ioreq-server: add support for multiple servers



>>> On 09.05.14 at 10:40, <paul.durrant@xxxxxxxxxx> wrote:
> +static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
> +{
> +    unsigned int i = gmfn - d->arch.hvm_domain.ioreq_gmfn.base;
> +
> +    clear_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);

Perhaps worth an ASSERT() on "i" not being out of range, to avoid
hard-to-debug memory corruption?
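
Something along these lines perhaps (just a sketch, assuming the mask
is a single unsigned long as elsewhere in the patch):

    static void hvm_free_ioreq_gmfn(struct domain *d, unsigned long gmfn)
    {
        unsigned int i = gmfn - d->arch.hvm_domain.ioreq_gmfn.base;

        /* Catch callers passing a gmfn outside the reserved window. */
        ASSERT(i < sizeof(d->arch.hvm_domain.ioreq_gmfn.mask) * 8);
        clear_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
    }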

> +static int hvm_access_cf8(
> +    int dir, uint32_t port, uint32_t bytes, uint32_t *val)
> +{
> +    struct domain *d = current->domain;
> +
> +    if ( bytes != 4 )
> +        return X86EMUL_UNHANDLEABLE;
> +
> +    BUG_ON(port < 0xcf8);

If this check is to be kept at all, it would imo better be an ASSERT().
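I.e. simply

    ASSERT(port >= 0xcf8);

(if I'm not mistaken, the port I/O dispatch code wouldn't invoke the
handler with an out-of-range port anyway).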

> +    port -= 0xcf8;

Stale statement - "port" isn't being used anywhere below.

> +
> +    if ( dir == IOREQ_WRITE )
> +    {
> +        d->arch.hvm_domain.pci_cf8 = *val;
> +
> +        /* We always need to fall through to the catch all emulator */
> +        return X86EMUL_UNHANDLEABLE;
> +    }
> +    else
> +    {
> +        *val = d->arch.hvm_domain.pci_cf8;
> +        return X86EMUL_OKAY;

Is this correct irrespective of the value of the high (enable) bit? You
don't really know what non-config-space accesses mean (including whether
they have any side effects), so I guess you'd be better off forwarding
that case too.

And in the end I wonder whether it wouldn't be better to forward all
accesses, with this function solely serving the purpose of latching
the last written value.
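
I.e., roughly (just a sketch, keeping your names; whether to still
reject odd-sized writes would then be a separate question):

    static int hvm_access_cf8(
        int dir, uint32_t port, uint32_t bytes, uint32_t *val)
    {
        struct domain *d = current->domain;

        /* Latch the last value written; leave all emulation to the DM. */
        if ( dir == IOREQ_WRITE && bytes == 4 )
            d->arch.hvm_domain.pci_cf8 = *val;

        return X86EMUL_UNHANDLEABLE;
    }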

> -static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s)
> +static int hvm_ioreq_server_map_pages(struct hvm_ioreq_server *s,
> +                                      bool_t is_default)
>  {
>      struct domain *d = s->domain;
> -    unsigned long pfn;
> +    unsigned long ioreq_pfn, bufioreq_pfn;
>      int rc;
>  
> -    pfn = d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
> -    rc = hvm_map_ioreq_page(s, 0, pfn);
> +    if ( is_default ) {
> +        ioreq_pfn = d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
> +        bufioreq_pfn = d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN];
> +    } else {

I hate to say it, but: coding style - braces go on their own lines.
And there's more of this further down.
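
I.e., for this and the similar cases:

    if ( is_default )
    {
        ioreq_pfn = d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
        bufioreq_pfn = d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN];
    }
    else
    {
        ...
    }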

> +static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s, 
> +                                            bool_t is_default)
> +{
> +    int i;

And for variables like this I already asked before that they be made
unsigned (as is properly done e.g. in hvm_ioreq_server_free_rangesets()).

> +    int rc;
> +
> +    if ( is_default )
> +        goto done;
> +
> +    for ( i = 0; i < NR_IO_RANGE_TYPES; i++ ) {
> +        char *name;
> +
> +        rc = asprintf(&name, "ioreq_server %d %s", s->id,
> +                      (i == HVMOP_IO_RANGE_PORT) ? "port" :
> +                      (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
> +                      (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> +                      "");
> +        if ( rc )
> +            goto fail;
> +
> +        s->range[i] = rangeset_new(s->domain, name,
> +                                   RANGESETF_prettyprint_hex);
> +
> +        xfree(name);
> +
> +        rc = -ENOMEM;
> +        if ( !s->range[i] )
> +            goto fail;
> +
> +        rangeset_limit(s->range[i], MAX_NR_IO_RANGES);

Just as a remark/question: perhaps ultimately this limit should become
a configurable one, which would also allow the limits for the three
kinds of ranges to differ?

> +static ioservid_t next_ioservid(struct domain *d)
> +{
> +    struct hvm_ioreq_server *s;
> +    static ioservid_t id;
> +
> +    ASSERT(spin_is_locked(&d->arch.hvm_domain.ioreq_server.lock));
> +
> +    id++;
> +
> + again:
> +    /* Check for uniqueness */
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if (id == s->id)
> +        {
> +            id++;
> +            goto again;

Please put the label ahead of the pre-loop increment, avoiding the
duplication.

And what's worse - without latching the value of "id" into a local
variable (storing back only the final result), your uniqueness check
doesn't work, because the lock only guards against concurrent
accesses for a single domain. Once fixed, perhaps also add a brief
comment explaining why the remaining not fully serialized access to
"id" is not a problem?

And of course, if you're going to stay with this system-wide static
variable, you'd need to clarify why you think this is not a - however
small - information leak (of the side channel kind).

> @@ -757,25 +942,35 @@ static int hvm_create_ioreq_server(struct domain *d, domid_t domid)
>          goto fail1;
>  
>      domain_pause(d);
> -    spin_lock(&d->arch.hvm_domain.ioreq_server_lock);
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
>  
>      rc = -EEXIST;
> -    if ( d->arch.hvm_domain.ioreq_server != NULL )
> +    if ( is_default && d->arch.hvm_domain.default_ioreq_server != NULL )
>          goto fail2;
>  
> -    rc = hvm_ioreq_server_init(s, d, domid);
> +    rc = hvm_ioreq_server_init(s, d, domid, is_default,
> +                               next_ioservid(d));
>      if ( rc )
> -        goto fail2;
> +        goto fail3;
> +
> +    list_add(&s->list_entry,
> +             &d->arch.hvm_domain.ioreq_server.list);
> +    d->arch.hvm_domain.ioreq_server.count++;

Do you really need that field? The only place it's being read seems to
be to determine whether there's more than one server. That surely can
also be determined by looking at .list.
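
E.g. the check in hvm_select_ioreq_server() could become something like
this (open-coded; a list_is_singular()-style helper, if we have or want
to add one, would read better):

    /* No server beside (at most) the default one registered? */
    if ( d->arch.hvm_domain.ioreq_server.list.next ==
         d->arch.hvm_domain.ioreq_server.list.prev ||
         (p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO) )
        return d->arch.hvm_domain.default_ioreq_server;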

> +static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> +                                            uint32_t type, uint64_t start, uint64_t end)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    rc = -ENOENT;
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        if ( s->id == id )
> +        {
> +            struct rangeset *r;
> +
> +            switch ( type )
> +            {
> +            case HVMOP_IO_RANGE_PORT:
> +            case HVMOP_IO_RANGE_MEMORY:
> +            case HVMOP_IO_RANGE_PCI:
> +                r = s->range[type];
> +                break;
> +
> +            default:
> +                r = NULL;
> +                break;
> +            }
> +
> +            rc = -EINVAL;
> +            if ( !r )
> +                break;
> +
> +            rc = rangeset_add_range(r, start, end);

Is it not an error for the range to perhaps already be there ...

> +static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
> +                                                uint32_t type, uint64_t start, uint64_t end)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    rc = -ENOENT;
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        if ( s->id == id )
> +        {
> +            struct rangeset *r;
> +
> +            switch ( type )
> +            {
> +            case HVMOP_IO_RANGE_PORT:
> +            case HVMOP_IO_RANGE_MEMORY:
> +            case HVMOP_IO_RANGE_PCI:
> +                r = s->range[type];
> +                break;
> +
> +            default:
> +                r = NULL;
> +                break;
> +            }
> +
> +            rc = -EINVAL;
> +            if ( !r )
> +                break;
> +
> +            rc = rangeset_remove_range(r, start, end);

... or to not be there here?
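
If so, something along these lines could be used in the two functions
(sketch; iirc rangeset_overlaps_range() and rangeset_contains_range()
exist for exactly this kind of check):

    /* In hvm_map_io_range_to_ioreq_server(): */
    rc = -EEXIST;
    if ( rangeset_overlaps_range(r, start, end) )
        break;
    rc = rangeset_add_range(r, start, end);

    /* In hvm_unmap_io_range_from_ioreq_server(): */
    rc = -ENOENT;
    if ( !rangeset_contains_range(r, start, end) )
        break;
    rc = rangeset_remove_range(r, start, end);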

> +static int hvm_all_ioreq_servers_add_vcpu(struct domain *d, struct vcpu *v)
> +{
> +    struct hvm_ioreq_server *s;
> +    int rc;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        bool_t is_default = ( s == d->arch.hvm_domain.default_ioreq_server);

Stray blank after "(" - also again further down.

> +static void hvm_destroy_all_ioreq_servers(struct domain *d)
> +{
> +    struct hvm_ioreq_server *s, *next;
> +
> +    spin_lock(&d->arch.hvm_domain.ioreq_server.lock);
> +
> +    list_for_each_entry_safe ( s,
> +                               next,
> +                               &d->arch.hvm_domain.ioreq_server.list,
> +                               list_entry )
> +    {
> +        bool_t is_default = ( s == d->arch.hvm_domain.default_ioreq_server);
> +
> +        domain_pause(d);

This gets called only during domain teardown - what's the point of
the domain_pause()?

> +static struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
> +                                                        ioreq_t *p)
> +{
> +#define CF8_BDF(cf8)     (((cf8) & 0x00ffff00) >> 8)
> +#define CF8_ADDR_LO(cf8) ((cf8) & 0x000000fc)
> +#define CF8_ADDR_HI(cf8) (((cf8) & 0x0f000000) >> 16)

Any reason you exclude the low two bits here, despite separately
or-ing them back in below?

> +#define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
> +
> +    struct hvm_ioreq_server *s;
> +    uint32_t cf8;
> +    uint8_t type;
> +    uint64_t addr;
> +
> +    if ( d->arch.hvm_domain.ioreq_server.count <= 1 ||
> +         (p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO) )
> +        return d->arch.hvm_domain.default_ioreq_server;
> +
> +    cf8 = d->arch.hvm_domain.pci_cf8;
> +        
> +    if ( p->type == IOREQ_TYPE_PIO &&
> +         (p->addr & ~3) == 0xcfc &&
> +         CF8_ENABLED(cf8) )
> +    {
> +        uint32_t sbdf;
> +
> +        /* PCI config data cycle */
> +
> +        sbdf = HVMOP_PCI_SBDF(0,
> +                              PCI_BUS(CF8_BDF(cf8)),
> +                              PCI_SLOT(CF8_BDF(cf8)),
> +                              PCI_FUNC(CF8_BDF(cf8)));
> +
> +        type = IOREQ_TYPE_PCI_CONFIG;
> +        addr = ((uint64_t)sbdf << 32) |
> +               CF8_ADDR_HI(cf8) |
> +               CF8_ADDR_LO(cf8) |
> +               (p->addr & 3);
> +    }
> +    else
> +    {
> +        type = p->type;

I know I asked this before, and iirc your answer was "no": Is
IOREQ_TYPE_PCI_CONFIG coming in here valid?

> +        addr = p->addr;
> +    }
> +
> +    list_for_each_entry ( s,
> +                          &d->arch.hvm_domain.ioreq_server.list,
> +                          list_entry )
> +    {
> +        struct rangeset *r;
> +
> +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> +            continue;
> +
> +        BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
> +        BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
> +        BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
> +        r = s->range[type];
> +
> +        switch ( type )
> +        {
> +        case IOREQ_TYPE_PIO:
> +        case IOREQ_TYPE_COPY:
> +            if ( rangeset_contains_singleton(r, addr) )

While the PCI check below certainly can be a singleton one, I don't
think you can use one here: in order for a request to be forwarded, the
full range spanned by the access should be owned by the respective
server. And yes, consideration of how to deal with accesses split
across servers is going to be needed, I'm afraid.
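
For the non-rep case something like this would seem to be the minimum
(sketch; rep accesses, as well as "addr + size" wrapping, would need
further thought):

    case IOREQ_TYPE_PIO:
    case IOREQ_TYPE_COPY:
        /* The server must own the entire range the access spans. */
        if ( rangeset_contains_range(r, addr, addr + p->size - 1) )
            ...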

> +static int hvmop_create_ioreq_server(
> +    XEN_GUEST_HANDLE_PARAM(xen_hvm_create_ioreq_server_t) uop)
> +{
> +    struct domain *curr_d = current->domain;
> +    xen_hvm_create_ioreq_server_t op;
> +    struct domain *d;
> +    int rc;
> +
> +    if ( copy_from_guest(&op, uop, 1) )
> +        return -EFAULT;
> +
> +    rc = rcu_lock_remote_domain_by_id(op.domid, &d);
> +    if ( rc != 0 )
> +        return rc;
> +
> +    rc = -EINVAL;
> +    if ( !is_hvm_domain(d) )
> +        goto out;
> +
> +    rc = xsm_hvm_ioreq_server(XSM_DM_PRIV, d);

This call is indistinguishable from the four other ones further down.
You ought to let XSM know which operation it really is that you're
about to perform.
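
E.g. by handing the sub-op through to the hook (sketch only - the dummy
and Flask implementations of xsm_hvm_ioreq_server() would of course
need to gain the extra parameter):

    rc = xsm_hvm_ioreq_server(XSM_DM_PRIV, d, HVMOP_create_ioreq_server);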

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

