[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 1/9] xen/vpci: introduce basic handlers to trap accesses to the PCI config space



>>> On 29.05.17 at 17:05, <roger.pau@xxxxxxxxxx> wrote:
> On Mon, May 29, 2017 at 08:16:41AM -0600, Jan Beulich wrote:
>> >>> On 29.05.17 at 14:57, <roger.pau@xxxxxxxxxx> wrote:
>> > On Fri, May 19, 2017 at 05:35:47AM -0600, Jan Beulich wrote:
>> >> >>> On 27.04.17 at 16:35, <roger.pau@xxxxxxxxxx> wrote:
>> >> > +int xen_vpci_add_register(struct pci_dev *pdev, vpci_read_t 
>> >> > read_handler,
>> 
>> Btw., I only now notice this further strange xen_ prefix here.
> 
> I assume this should be vpci_*, (dropping the xen_ prefix uniformly).

Yes please.

>> >> > +                          vpci_write_t write_handler, unsigned int 
>> >> > offset,
>> >> > +                          unsigned int size, void *data)
>> >> > +{
>> >> > +    struct rb_node **new, *parent;
>> >> > +    struct vpci_register *r;
>> >> > +
>> >> > +    /* Some sanity checks. */
>> >> > +    if ( (size != 1 && size != 2 && size != 4) || offset >= 0xFFF ||
>> >> 
>> >> Off by one again in the offset check.
>> > 
>> > Fixed to be > 0xfff. Should this maybe be added to pci_regs.h as
>> > PCI_MAX_REGISTER? 
>> 
>> Could be done, but then we need one for base and one for
>> extended config space. May want to check whether Linux has
>> invented some good names for these by now.
> 
> pci_regs.h from Linux now has:
> 
> /*
>  * Conventional PCI and PCI-X Mode 1 devices have 256 bytes of
>  * configuration space.  PCI-X Mode 2 and PCIe devices have 4096 bytes of
>  * configuration space.
>  */
> #define PCI_CFG_SPACE_SIZE    256
> #define PCI_CFG_SPACE_EXP_SIZE        4096
> 
> At the top. Do you think those defines are fine for importing?

Sure.

> (this
> was introduced by cc10385b6fde3, but I don't think importing this in a
> more formal way makes sense).

Agreed.

>> >> > +/* Helper macros for the read/write handlers. */
>> >> > +#define GENMASK_BYTES(e, s) GENMASK((e) * 8, (s) * 8)
>> >> 
>> >> What do e and s stand for here?
>> > 
>> > e = end, s = start (in bytes).
>> 
>> And where do you start counting. Having seen the rest of the
>> series I'm actually rather unconvinced use these macros results
>> in better code - to me, plain hex numbers are quite a bit easier
>> to read as long as the number of digits doesn't go meaningfully
>> beyond 10 or so.
>> 
>> >> > +#define SHIFT_RIGHT_BYTES(d, o) d >>= (o) * 8
>> >> 
>> >> And at least o here?
>> > 
>> > o = offset (in bytes)
>> 
>> I think simply writhing the shift expression is once again more
>> clear to the reader than using a macro which is longer to read
>> and type and which has semantics which aren't immediately
>> clear from its name.
>> 
>> > I can rename ADD_RESULT to APPEND_RESULT or something more descriptive
>> > if you think it's going to make it easier to understand.
>> 
>> I'd prefer if the name "merge" appeared in that name - I don't see
>> this being usable strictly only to append to either side of a value.
> 
> OK MERGE_RESULT or MERGE_REGISTER maybe? (and the rest removed).

Either name is fine with me, with a slight preference to the former.

>> >> > +static int vpci_write_helper(struct pci_dev *pdev,
>> >> > +                             const struct vpci_register *r, unsigned 
>> >> > int size,
>> >> > +                             unsigned int offset, uint32_t data)
>> >> > +{
>> >> > +    union vpci_val val = { .double_word = data };
>> >> > +    int rc;
>> >> > +
>> >> > +    ASSERT(size <= r->size);
>> >> > +    if ( size != r->size )
>> >> > +    {
>> >> > +        rc = r->read(pdev, r->offset, &val, r->priv_data);
>> >> > +        if ( rc )
>> >> > +            return rc;
>> >> > +        val.double_word &= ~GENMASK_BYTES(size + offset, offset);
>> >> > +        data &= GENMASK_BYTES(size, 0);
>> >> > +        val.double_word |= data << (offset * 8);
>> >> > +    }
>> >> > +
>> >> > +    return r->write(pdev, r->offset, val, r->priv_data);
>> >> > +}
>> >> 
>> >> I'm not sure that writing back the value read is correct in all cases
>> >> (think of write-only or rw1c registers or even offsets where reads
>> >> and writes access different registers altogether). I think the write
>> >> handlers will need to be made capable of dealing with partial writes.
>> > 
>> > That seems to be what pciback does fro writes: read, merge value,
>> > write back (drivers/xen/xen-pciback/conf_space.c
>> > xen_pcibk_config_write):
>> > 
>> > err = conf_space_read(dev, cfg_entry, field_start,
>> >                  &tmp_val);
>> > if (err)
>> >    break;
>> > 
>> > tmp_val = merge_value(tmp_val, value, get_mask(size),
>> >                  offset - field_start);
>> > 
>> > err = conf_space_write(dev, cfg_entry, field_start,
>> >                   tmp_val);
>> > 
>> > I would prefer to do it this way in order to avoid adding more
>> > complexity to the handlers themselves. So far I haven't found any such
>> > registers (rw1c) in the PCI config space, do you have references to
>> > any of them?
>> 
>> The status register.
> 
> But the status register is not going to be trapped, not by Dom0 or
> DomUs? None of the registers that I've emulated for the header, or the
> capabilities behave this way, and adding such and offset would
> greatly increase the complexity of each handler IMHO.
> 
> Maybe it would be easier to add a flag to mark rw1c registers as such,
> if that's ever needed? (and avoid the read in that case)

Well, yes, that's how qemu does it. Of course with the caveat that
you need to mark individual bits (again as qemu does). Of course,
as long as no such register is being emulated, leaving a prominent
comment would probably be an acceptable alternative to coding
all this right away.

>> >> > +/* Helpers for locking/unlocking. */
>> >> > +#define vpci_lock(d) spin_lock(&(d)->arch.hvm_domain.vpci_lock)
>> >> > +#define vpci_unlock(d) spin_unlock(&(d)->arch.hvm_domain.vpci_lock)
>> >> > +#define vpci_locked(d) spin_is_locked(&(d)->arch.hvm_domain.vpci_lock)
>> >> 
>> >> While for the code layering you don't need recursive locks, did you
>> >> consider using them nevertheless so that spin_is_locked() return
>> >> values are actually meaningful for your purposes?
>> > 
>> > I'm not sure I follow, spin_is_locked already return meaningful values
>> > for my purpose AFAICT.
>> 
>> For non-recursive locks this tells you whether _any_ CPU holds
>> the lock, yet normally you want to know whether the CPU you
>> run on does.
> 
> Indeed, so if I make the lock recursive spin_is_locked is going to
> return whether the current CPU holds the lock. That's kind of
> counter-intuitive.

Counter-intuitive or not, it's a result of non-recursive locks being
more slim (and hence slightly faster). And as long as
spin_is_locked() is being used in ASSERT()s only, it's better than
no sanity checking at all.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.