[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 02/11] vpci: introduce basic handlers to trap accesses to the PCI config space



On Thu, Sep 21, 2017 at 02:36:25PM +0100, Wei Liu wrote:
> On Tue, Sep 19, 2017 at 04:29:27PM +0100, Roger Pau Monne wrote:
> > This functionality is going to reside in vpci.c (and the corresponding
> > vpci.h header), and should be arch-agnostic. The handlers introduced
> > in this patch setup the basic functionality required in order to trap
> > accesses to the PCI config space, and allow decoding the address and
> > finding the corresponding handler that should handle the access
> > (although no handlers are implemented).
> > 
> > Note that the traps to the PCI IO ports registers (0xcf8/0xcfc) are
> > setup inside of a x86 HVM file, since that's not shared with other
> > arches.
> > 
> > A new XEN_X86_EMU_VPCI x86 domain flag is added in order to signal Xen
> > whether a domain should use the newly introduced vPCI handlers, this
> > is only enabled for PVH Dom0 at the moment.
> > 
> > A very simple user-space test is also provided, so that the basic
> > functionality of the vPCI traps can be asserted. This has been proven
> > quite helpful during development, since the logic to handle partial
> > accesses or accesses that expand across multiple registers is not
> > trivial.
> > 
> > The handlers for the registers are added to a linked list that's keep
> > sorted at all times. Both the read and write handlers support accesses
> > that expand across multiple emulated registers and contain gaps not
> > emulated.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> I am afraid I don't know much about PCI so I can't do meaningful review
> of this patch.
> 
> The change to libxl looks good to me so for that bit:
> 
> Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>

Thanks.

> > ---
> > Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> > Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> > Cc: Jan Beulich <jbeulich@xxxxxxxx>
> > Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > Cc: Paul Durrant <paul.durrant@xxxxxxxxxx>
> > ---
> > Changes since v5:
> >  - Use a spinlock per pci device.
> >  - Use the recently introduced pci_sbdf_t type.
> >  - Fix test harness to use the right handler type and the newly
> >    introduced lock.
> >  - Move the position of the vpci sections in the linker scripts.
> >  - Constify domain and pci_dev in vpci_{read/write}.
> >  - Fix typos in comments.
> >  - Use _XEN_VPCI_H_ as header guard.
> > 
> > Changes since v4:
> > * User-space test harness:
> >  - Do not redirect the output of the test.
> >  - Add main.c and emul.h as dependencies of the Makefile target.
> >  - Use the same rule to modify the vpci and list headers.
> >  - Remove underscores from local macro variables.
> >  - Add _check suffix to the test harness multiread function.
> >  - Change the value written by every different size in the multiwrite
> >    test.
> >  - Use { } to initialize the r16 and r20 arrays (instead of { 0 }).
> >  - Perform some of the read checks with the local variable directly.
> >  - Expand some comments.
> >  - Implement a dummy rwlock.
> > * Hypervisor code:
> >  - Guard the linker script changes with CONFIG_HAS_PCI.
> >  - Rename vpci_access_check to vpci_access_allowed and make it return
> >    bool.
> >  - Make hvm_pci_decode_addr return the register as return value.
> >  - Use ~3 instead of 0xfffc to remove the register offset when
> >    checking accesses to IO ports.
> >  - s/head/prev in vpci_add_register.
> >  - Add parentheses around & in vpci_add_register.
> >  - Fix register removal.
> >  - Change the BUGs in vpci_{read/write}_hw helpers to
> >    ASSERT_UNREACHABLE.
> >  - Make merge_result static and change the computation of the mask to
> >    avoid using a uint64_t.
> >  - Modify vpci_read to only read from hardware the not-emulated gaps.
> >  - Remove the vpci_val union and use a uint32_t instead.
> >  - Change handler read type to return a uint32_t instead of modifying
> >    a variable passed by reference.
> >  - Constify the data opaque parameter of read handlers.
> >  - Change the size parameter of the vpci_{read/write} functions to
> >    unsigned int.
> >  - Place the array of initialization handlers in init.rodata or
> >    .rodata depending on whether late-hwdom is enabled.
> >  - Remove the pci_devs lock, assume the Dom0 is well behaved and won't
> >    remove the device while trying to access it.
> >  - Change the recursive spinlock into a rw lock for performance
> >    reasons.
> > 
> > Changes since v3:
> > * User-space test harness:
> >  - Fix spaces in container_of macro.
> >  - Implement a dummy locking functions.
> >  - Remove 'current' macro make current a pointer to the statically
> >    allocated vpcu.
> >  - Remove unneeded parentheses in the pci_conf_readX macros.
> >  - Fix the name of the write test macro.
> >  - Remove the dummy EXPORT_SYMBOL macro (this was needed by the RB
> >    code only).
> >  - Import the max macro.
> >  - Test all possible read/write size combinations with all possible
> >    emulated register sizes.
> >  - Introduce a test for register removal.
> > * Hypervisor code:
> >  - Use a sorted list in order to store the config space handlers.
> >  - Remove some unneeded 'else' branches.
> >  - Make the IO port handlers always return X86EMUL_OKAY, and set the
> >    data to all 1's in case of read failure (write are simply ignored).
> >  - In hvm_select_ioreq_server reuse local variables when calling
> >    XEN_DMOP_PCI_SBDF.
> >  - Store the pointers to the initialization functions in the .rodata
> >    section.
> >  - Do not ignore the return value of xen_vpci_add_handlers in
> >    setup_one_hwdom_device.
> >  - Remove the vpci_init macro.
> >  - Do not hide the pointers inside of the vpci_{read/write}_t
> >    typedefs.
> >  - Rename priv_data to private in vpci_register.
> >  - Simplify checking for register overlap in vpci_register_cmp.
> >  - Check that the offset and the length match before removing a
> >    register in xen_vpci_remove_register.
> >  - Make vpci_read_hw return a value rather than storing it in a
> >    pointer passed by parameter.
> >  - Handler dispatcher functions vpci_{read/write} no longer return an
> >    error code, errors on reads/writes should be treated like hardware
> >    (writes ignored, reads return all 1's or garbage).
> >  - Make sure pcidevs is locked before calling pci_get_pdev_by_domain.
> >  - Use a recursive spinlock for the vpci lock, so that spin_is_locked
> >    checks that the current CPU is holding the lock.
> >  - Make the code less error-chatty by removing some of the printk's.
> >  - Pass the slot and the function as separate parameters to the
> >    handler dispatchers (instead of passing devfn).
> >  - Allow handlers to be registered with either a read or write
> >    function only, the missing handler will be replaced by a dummy
> >    handler (writes ignored, reads return 1's).
> >  - Introduce PCI_CFG_SPACE_* defines from Linux.
> >  - Simplify the handler dispatchers by removing the recursion, now the
> >    dispatchers iterate over the list of sorted handlers and call them
> >    in order.
> >  - Remove the GENMASK_BYTES, SHIFT_RIGHT_BYTES and ADD_RESULT macros,
> >    and instead provide a merge_result function in order to merge a
> >    register output into a partial result.
> >  - Rename the fields of the vpci_val union to u8/u16/u32.
> >  - Remove the return values from the read/write handlers, errors
> >    should be handled internally and signaled as would be done on
> >    native hardware.
> >  - Remove the usage of the GENMASK macro.
> > 
> > Changes since v2:
> >  - Generalize the PCI address decoding and use it for IOREQ code also.
> > 
> > Changes since v1:
> >  - Allow access to cross a word-boundary.
> >  - Add locking.
> >  - Add cleanup to xen_vpci_add_handlers in case of failure.
> > ---
> >  .gitignore                        |   3 +
> >  tools/libxl/libxl_x86.c           |   2 +-
> >  tools/tests/Makefile              |   1 +
> >  tools/tests/vpci/Makefile         |  37 ++++
> >  tools/tests/vpci/emul.h           | 133 +++++++++++
> >  tools/tests/vpci/main.c           | 308 ++++++++++++++++++++++++++
> >  xen/arch/arm/xen.lds.S            |  12 +
> >  xen/arch/x86/domain.c             |  18 +-
> >  xen/arch/x86/hvm/hvm.c            |   2 +
> >  xen/arch/x86/hvm/io.c             | 103 +++++++++
> >  xen/arch/x86/setup.c              |   3 +-
> >  xen/arch/x86/xen.lds.S            |  12 +
> >  xen/drivers/Makefile              |   2 +-
> >  xen/drivers/passthrough/pci.c     |   9 +-
> >  xen/drivers/vpci/Makefile         |   1 +
> >  xen/drivers/vpci/vpci.c           | 450 
> > ++++++++++++++++++++++++++++++++++++++
> >  xen/include/asm-x86/domain.h      |   1 +
> >  xen/include/asm-x86/hvm/io.h      |   3 +
> >  xen/include/public/arch-x86/xen.h |   5 +-
> >  xen/include/xen/pci.h             |   3 +
> >  xen/include/xen/pci_regs.h        |   8 +
> >  xen/include/xen/vpci.h            |  54 +++++
> >  22 files changed, 1161 insertions(+), 9 deletions(-)
> >  create mode 100644 tools/tests/vpci/Makefile
> >  create mode 100644 tools/tests/vpci/emul.h
> >  create mode 100644 tools/tests/vpci/main.c
> >  create mode 100644 xen/drivers/vpci/Makefile
> >  create mode 100644 xen/drivers/vpci/vpci.c
> >  create mode 100644 xen/include/xen/vpci.h
> > 
> > diff --git a/.gitignore b/.gitignore
> > index cc16649457..1c670b27d7 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -240,6 +240,9 @@ tools/tests/regression/build/*
> >  tools/tests/regression/downloads/*
> >  tools/tests/mem-sharing/memshrtool
> >  tools/tests/mce-test/tools/xen-mceinj
> > +tools/tests/vpci/list.h
> > +tools/tests/vpci/vpci.[hc]
> > +tools/tests/vpci/test_vpci
> >  tools/xcutils/lsevtchn
> >  tools/xcutils/readnotes
> >  tools/xenbackendd/_paths.h
> > diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
> > index 455f6f0bed..dd7fc78a99 100644
> > --- a/tools/libxl/libxl_x86.c
> > +++ b/tools/libxl/libxl_x86.c
> > @@ -11,7 +11,7 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
> >      if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM) {
> >          if (d_config->b_info.device_model_version !=
> >              LIBXL_DEVICE_MODEL_VERSION_NONE) {
> > -            xc_config->emulation_flags = XEN_X86_EMU_ALL;
> > +            xc_config->emulation_flags = (XEN_X86_EMU_ALL & 
> > ~XEN_X86_EMU_VPCI);
> >          } else if (libxl_defbool_val(d_config->b_info.u.hvm.apic)) {
> >              /*
> >               * HVM guests without device model may want
> > diff --git a/tools/tests/Makefile b/tools/tests/Makefile
> > index 7162945121..f6942a93fb 100644
> > --- a/tools/tests/Makefile
> > +++ b/tools/tests/Makefile
> > @@ -13,6 +13,7 @@ endif
> >  SUBDIRS-$(CONFIG_X86) += x86_emulator
> >  SUBDIRS-y += xen-access
> >  SUBDIRS-y += xenstore
> > +SUBDIRS-$(CONFIG_HAS_PCI) += vpci
> >  
> >  .PHONY: all clean install distclean uninstall
> >  all clean distclean: %: subdirs-%
> > diff --git a/tools/tests/vpci/Makefile b/tools/tests/vpci/Makefile
> > new file mode 100644
> > index 0000000000..e45fcb5cd9
> > --- /dev/null
> > +++ b/tools/tests/vpci/Makefile
> > @@ -0,0 +1,37 @@
> > +XEN_ROOT=$(CURDIR)/../../..
> > +include $(XEN_ROOT)/tools/Rules.mk
> > +
> > +TARGET := test_vpci
> > +
> > +.PHONY: all
> > +all: $(TARGET)
> > +
> > +.PHONY: run
> > +run: $(TARGET)
> > +   ./$(TARGET)
> 
> Is such target useful? It will possibly break if you invoke it outside
> of this directory.

Seems to work fine if I do:

gmake -C tools/tests/vpci run

> > +
> > +$(TARGET): vpci.c vpci.h list.h main.c emul.h
> > +   $(HOSTCC) -g -o $@ vpci.c main.c
> > +
> > +.PHONY: clean
> > +clean:
> > +   rm -rf $(TARGET) *.o *~ vpci.h vpci.c list.h
> > +
> > +.PHONY: distclean
> > +distclean: clean
> > +
> > +.PHONY: install
> > +install:
> > +
> > +vpci.c: $(XEN_ROOT)/xen/drivers/vpci/vpci.c
> > +   # Trick the compiler so it doesn't complain about missing symbols
> > +   sed -e '/#include/d' \
> > +       -e '1s;^;#include "emul.h"\
> > +                vpci_register_init_t *const __start_vpci_array[1]\;\
> > +                vpci_register_init_t *const __end_vpci_array[1]\;\
> > +                ;' <$< >$@
> 
> This is prone to breakage. I think it would be better to just have a
> local stub file. Or maybe define it in your main.c?

I could add this to main.c, I haven't done so because the above it's
done for the linking to succeed, but it's not used in any way by the
test harness itself, or any of the functions called by it.

> > +
> > +int
> > +main(int argc, char **argv)
> > +{
> > +    /* Index storage by offset. */
> > +    uint32_t r0 = 0xdeadbeef;
> > +    uint8_t r5 = 0xef;
> > +    uint8_t r6 = 0xbe;
> > +    uint8_t r7 = 0xef;
> > +    uint16_t r12 = 0x8696;
> > +    uint8_t r16[4] = { };
> > +    uint16_t r20[2] = { };
> 
> Need to put 0 in the brackets.

Jan requested to remove explicitly setting it to 0:

https://marc.info/?l=xen-devel&m=149995672426469

The end result is the same, when doing initialization non-set fields
get set to 0.

> The code seems rather self-contained. Maybe we can write a fuzzer for
> it?

Probably, but I don't think we can expect the fuzzer to set the
handlers, in which case we would have to pre-populate the PCI space
with some handlers in order to perform the fuzzing, which kinds of
limits it's usefulness.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.