
Re: [Xen-devel] [PATCH v4 2/9] x86/mmcfg: add handlers for the PVH Dom0 MMCFG areas



>>> Roger Pau Monne <roger.pau@xxxxxxxxxx> 06/30/17 5:02 PM >>>
> @@ -1041,6 +1043,24 @@ static int __init pvh_setup_acpi(struct domain *d, paddr_t start_info)
>      return 0;
>  }
>  
> +int __init pvh_setup_mmcfg(struct domain *d)

Didn't I point out that __init can't be correct here, and instead this
needs to be __hwdom_init? I can see that the only current caller is
__init, but that merely suggests there is a second call missing.

> +{
> +    unsigned int i;
> +    int rc;
> +
> +    for ( i = 0; i < pci_mmcfg_config_num; i++ )
> +    {
> +        rc = register_vpci_mmcfg_handler(d, pci_mmcfg_config[i].address,
> +                                         pci_mmcfg_config[i].start_bus_number,
> +                                         pci_mmcfg_config[i].end_bus_number,
> +                                         pci_mmcfg_config[i].pci_segment);
> +        if ( rc )
> +            return rc;

I would make this a best effort thing, i.e. issue a log message upon
failure but continue the loop. There's a good chance Dom0 will still
be able to come up.

> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -261,11 +261,11 @@ void register_g2m_portio_handler(struct domain *d)
>  static int vpci_access_check(unsigned int reg, unsigned int len)
>  {
>      /* Check access size. */
> -    if ( len != 1 && len != 2 && len != 4 )
> +    if ( len != 1 && len != 2 && len != 4 && len != 8 )
>          return -EINVAL;
>  
> -    /* Check if access crosses a double-word boundary. */
> -    if ( (reg & 3) + len > 4 )
> +    /* Check if access crosses a double-word boundary or it's not aligned. */
> +    if ( (len <= 4 && (reg & 3) + len > 4) || (len == 8 && (reg & 3) != 0) )
>          return -EINVAL;

For one I suppose you mean "& 7" in the 8-byte case. And then I don't
understand why you permit mis-aligned 2-byte writes, but not mis-aligned
4-byte ones as long as they fall within a quad-word. Any such asymmetry
needs at least a comment.

> @@ -398,6 +398,188 @@ void register_vpci_portio_handler(struct domain *d)
>      handler->ops = &vpci_portio_ops;
>  }
>  
> +struct hvm_mmcfg {
> +    paddr_t addr;
> +    size_t size;

paddr_t and size_t don't really fit together, most notably on 32-bit.
As I don't think any individual range can possibly be 4Gb or larger, I
think unsigned int would suffice here.

> +    unsigned int bus;
> +    unsigned int segment;

Depending on how many instances of this structure we expect, it may be
worthwhile to limit these two to 8 and 16 bits respectively.

> +/* Handlers to trap PCI ECAM config accesses. */

An "ECAM" did survive here.

> +static const struct hvm_mmcfg *vpci_mmcfg_find(struct domain *d,
> +                                               unsigned long addr)

paddr_t (to match the structure field)

> +static void vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
> +                                   unsigned long addr, unsigned int *bus,

Same here (and it seems more below). Also, just like in patch 1,
perhaps return the register by value rather than via indirection.

> +                                   unsigned int *slot, unsigned int *func,
> +                                   unsigned int *reg)
> +{
> +    addr -= mmcfg->addr;
> +    *bus = ((addr >> 20) & 0xff) + mmcfg->bus;
> +    *slot = (addr >> 15) & 0x1f;
> +    *func = (addr >> 12) & 0x7;
> +    *reg = addr & 0xfff;

Iirc there already was a comment to use manifest constants or macros
here.

> +static int vpci_mmcfg_accept(struct vcpu *v, unsigned long addr)
> +{
> +    struct domain *d = v->domain;
> +    bool found;
> +
> +    vpci_lock(d);
> +    found = vpci_mmcfg_find(d, addr);
> +    vpci_unlock(d);

The latest here I wonder whether the lock wouldn't better be an r/w one.

> +static int vpci_mmcfg_read(struct vcpu *v, unsigned long addr,
> +                           unsigned int len, unsigned long *data)

uint64_t * (to be 32-bit compatible)

> +{
> +    struct domain *d = v->domain;
> +    const struct hvm_mmcfg *mmcfg;
> +    unsigned int bus, slot, func, reg;
> +
> +    vpci_lock(d);
> +    mmcfg = vpci_mmcfg_find(d, addr);
> +    if ( !mmcfg )
> +    {
> +        vpci_unlock(d);
> +        return X86EMUL_OKAY;
> +    }
> +
> +    vpci_mmcfg_decode_addr(mmcfg, addr, &bus, &slot, &func, &reg);
> +
> +    if ( vpci_access_check(reg, len) )
> +    {
> +        vpci_unlock(d);
> +        return X86EMUL_OKAY;
> +    }
> +
> +    pcidevs_lock();
> +    if ( len == 8 )
> +    {
> +        /*
> +         * According to the PCIe 3.1A specification:
> +         *  - Configuration Reads and Writes must usually be DWORD or smaller
> +         *    in size.
> +         *  - Because Root Complex implementations are not required to support
> +         *    accesses to a RCRB that cross DW boundaries [...] software
> +         *    should take care not to cause the generation of such accesses
> +         *    when accessing a RCRB unless the Root Complex will support the
> +         *    access.
> +         *  Xen however supports 8byte accesses by splitting them into two
> +         *  4byte accesses.
> +         */
> +        *data = vpci_read(mmcfg->segment, bus, slot, func, reg, 4);
> +        *data |= (uint64_t)vpci_read(mmcfg->segment, bus, slot, func,
> +                                     reg + 4, 4) << 32;
> +    }
> +    else
> +        *data = vpci_read(mmcfg->segment, bus, slot, func, reg, len);

I think it would be preferable to avoid the else, by merging this and
the first of the other two reads.

> +    pcidevs_unlock();
> +    vpci_unlock(d);

Question on lock order (should have gone into the patch 1 reply already,
but I had thought of this only after sending): Is it really a good idea
to nest this way? The pcidevs lock is covering quite large regions at
times, so the risk of a lock order violation seems non-negligible even
if there may be none right now. Furthermore the new uses of the pcidevs
lock you introduce would seem to make it quite desirable to make that
one an r/w one too. Otoh that's a recursive one, so it'll be non-trivial
to convert ...

> +int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,

__hwdom_init

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

