[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4] xen-pciback: optionally allow interrupt enable flag writes



On Wed, Jan 15, 2020 at 02:46:29AM +0100, Marek Marczykowski-Górecki wrote:
> QEMU running in a stubdom needs to be able to set INTX_DISABLE, and the
> MSI(-X) enable flags in the PCI config space. This adds an attribute
> 'allow_interrupt_control' which when set for a PCI device allows writes
> to this flag(s). The toolstack will need to set this for stubdoms.
> When enabled, guest (stubdomain) will be allowed to set relevant enable
> flags, but only one at a time - i.e. it refuses to enable more than one
> of INTx, MSI, MSI-X at a time.
> 
> This functionality is needed only for config space access done by device
> model (stubdomain) serving a HVM with the actual PCI device. It is not
> necessary and unsafe to enable direct access to those bits for PV domain
> with the device attached. For PV domains, there are separate protocol
> messages (XEN_PCI_OP_{enable,disable}_{msi,msix}) for this purpose.
> Those ops in addition to setting enable bits, also configure MSI(-X) in
> dom0 kernel - which is undesirable for PCI passthrough to HVM guests.
> 
> This should not introduce any new security issues since a malicious
> guest (or stubdom) can already generate MSIs through other ways, see
> [1] page 8. Additionally, when qemu runs in dom0, it already have direct
> access to those bits.
> 
> This is the second iteration of this feature. First was proposed as a
> direct Xen interface through a new hypercall, but ultimately it was
> rejected by the maintainer, because of mixing pciback and hypercalls for
> PCI config space access isn't a good design. Full discussion at [2].
> 
> [1]: 
> https://invisiblethingslab.com/resources/2011/Software%20Attacks%20on%20Intel%20VT-d.pdf
> [2]: https://xen.markmail.org/thread/smpgpws4umdzizze
> 
> [part of the commit message and sysfs handling]
> Signed-off-by: Simon Gaiser <simon@xxxxxxxxxxxxxxxxxxxxxx>
> [the rest]
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>

Some minor nits below, but LGTM:

Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

> ---
> Changes in v4:
>  - fix incorrect variable used
>  - don't enable INTx when already enabled
> Changes in v3:
>  - return bitmap (or negative error) from
>    xen_pcibk_get_interrupt_type(), to implicitly handle cases when
>    multiple interrupt types are already enabled - disallow enabling in
>    that case
>  - add documentation
> Changes in v2:
>  - introduce xen_pcibk_get_interrupt_type() to deduplicate current
>    INTx/MSI/MSI-X state check
>  - fix checking MSI/MSI-X state on devices not supporting it
> ---
>  .../ABI/testing/sysfs-driver-pciback          | 13 +++
>  drivers/xen/xen-pciback/conf_space.c          | 36 ++++++++
>  drivers/xen/xen-pciback/conf_space.h          |  7 ++
>  .../xen/xen-pciback/conf_space_capability.c   | 88 +++++++++++++++++++
>  drivers/xen/xen-pciback/conf_space_header.c   | 19 ++++
>  drivers/xen/xen-pciback/pci_stub.c            | 66 ++++++++++++++
>  drivers/xen/xen-pciback/pciback.h             |  1 +
>  7 files changed, 230 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback 
> b/Documentation/ABI/testing/sysfs-driver-pciback
> index 6a733bfa37e6..566a11f2c12f 100644
> --- a/Documentation/ABI/testing/sysfs-driver-pciback
> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
> @@ -11,3 +11,16 @@ Description:
>                  #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
>                  will allow the guest to read and write to the configuration
>                  register 0x0E.
> +
> +What:           /sys/bus/pci/drivers/pciback/allow_interrupt_control
> +Date:           Jan 2020
> +KernelVersion:  5.5
> +Contact:        xen-devel@xxxxxxxxxxxxxxxxxxxx
> +Description:
> +                List of devices which can have interrupt control flag (INTx,
> +                MSI, MSI-X) set by a connected guest. It is meant to be set
> +                only when the guest is a stubdomain hosting device model 
> (qemu)
> +                and the actual device is assigned to a HVM. It is not safe
> +                (similar to permissive attribute) to set for a devices 
> assigned
> +                to a PV guest. The device is automatically removed from this
> +                list when the connected pcifront terminates.
> diff --git a/drivers/xen/xen-pciback/conf_space.c 
> b/drivers/xen/xen-pciback/conf_space.c
> index 60111719b01f..7697001e8ffc 100644
> --- a/drivers/xen/xen-pciback/conf_space.c
> +++ b/drivers/xen/xen-pciback/conf_space.c
> @@ -286,6 +286,42 @@ int xen_pcibk_config_write(struct pci_dev *dev, int 
> offset, int size, u32 value)
>       return xen_pcibios_err_to_errno(err);
>  }
>  
> +int xen_pcibk_get_interrupt_type(struct pci_dev *dev)

const for *dev.

> +{
> +     int err;
> +     u16 val;
> +     int ret = 0;
> +
> +     err = pci_read_config_word(dev, PCI_COMMAND, &val);
> +     if (err)
> +             return err;
> +     if (!(val & PCI_COMMAND_INTX_DISABLE))
> +             ret |= INTERRUPT_TYPE_INTX;
> +
> +     /* Do not trust dev->msi(x)_enabled here, as enabling could be done
> +      * bypassing the pci_*msi* functions, by the qemu.
> +      */
> +     if (dev->msi_cap) {
> +             err = pci_read_config_word(dev,
> +                             dev->msi_cap + PCI_MSI_FLAGS,
> +                             &val);
> +             if (err)
> +                     return err;
> +             if (val & PCI_MSI_FLAGS_ENABLE)
> +                     ret |= INTERRUPT_TYPE_MSI;
> +     }
> +     if (dev->msix_cap) {
> +             err = pci_read_config_word(dev,
> +                             dev->msix_cap + PCI_MSIX_FLAGS,
> +                             &val);
> +             if (err)
> +                     return err;
> +             if (val & PCI_MSIX_FLAGS_ENABLE)
> +                     ret |= INTERRUPT_TYPE_MSIX;
> +     }
> +     return ret;
> +}
> +
>  void xen_pcibk_config_free_dyn_fields(struct pci_dev *dev)
>  {
>       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(dev);
> diff --git a/drivers/xen/xen-pciback/conf_space.h 
> b/drivers/xen/xen-pciback/conf_space.h
> index 22db630717ea..6ba6aa26dcee 100644
> --- a/drivers/xen/xen-pciback/conf_space.h
> +++ b/drivers/xen/xen-pciback/conf_space.h
> @@ -65,6 +65,11 @@ struct config_field_entry {
>       void *data;
>  };
>  
> +#define INTERRUPT_TYPE_NONE 0
> +#define INTERRUPT_TYPE_INTX 1
> +#define INTERRUPT_TYPE_MSI  2
> +#define INTERRUPT_TYPE_MSIX 4

Nit: those being a bitmap I would write them as:

#define INTERRUPT_TYPE_NONE (1<<0)
#define INTERRUPT_TYPE_INTX (1<<1)
#define INTERRUPT_TYPE_MSI  (1<<2)
#define INTERRUPT_TYPE_MSIX (1<<3)

And would place them together with the function prototype below where
they are used.

> +
>  extern bool xen_pcibk_permissive;
>  
>  #define OFFSET(cfg_entry) 
> ((cfg_entry)->base_offset+(cfg_entry)->field->offset)
> @@ -126,4 +131,6 @@ int xen_pcibk_config_capability_init(void);
>  int xen_pcibk_config_header_add_fields(struct pci_dev *dev);
>  int xen_pcibk_config_capability_add_fields(struct pci_dev *dev);
>  
> +int xen_pcibk_get_interrupt_type(struct pci_dev *dev);
> +
>  #endif                               /* __XEN_PCIBACK_CONF_SPACE_H__ */
> diff --git a/drivers/xen/xen-pciback/conf_space_capability.c 
> b/drivers/xen/xen-pciback/conf_space_capability.c
> index e5694133ebe5..d3a846119974 100644
> --- a/drivers/xen/xen-pciback/conf_space_capability.c
> +++ b/drivers/xen/xen-pciback/conf_space_capability.c
> @@ -189,6 +189,84 @@ static const struct config_field caplist_pm[] = {
>       {}
>  };
>  
> +static struct msi_msix_field_config {
> +     u16 enable_bit; /* bit for enabling MSI/MSI-X */
> +     int int_type; /* interrupt type for exclusiveness check */

unsigned int would be a better fit here, since you will never store a
negative value AFAICT.

> +} msi_field_config = {
> +     .enable_bit = PCI_MSI_FLAGS_ENABLE,
> +     .int_type = INTERRUPT_TYPE_MSI,
> +}, msix_field_config = {
> +     .enable_bit = PCI_MSIX_FLAGS_ENABLE,
> +     .int_type = INTERRUPT_TYPE_MSIX,
> +};
> +
> +static void *msi_field_init(struct pci_dev *dev, int offset)
> +{
> +     return &msi_field_config;
> +}
> +
> +static void *msix_field_init(struct pci_dev *dev, int offset)
> +{
> +     return &msix_field_config;
> +}
> +
> +static int msi_msix_flags_write(struct pci_dev *dev, int offset, u16 
> new_value,
> +                             void *data)
> +{
> +     int err;
> +     u16 old_value;
> +     const struct msi_msix_field_config *field_config = data;
> +     const struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(dev);
> +
> +     if (xen_pcibk_permissive || dev_data->permissive)
> +             goto write;
> +
> +     err = pci_read_config_word(dev, offset, &old_value);
> +     if (err)
> +             return err;
> +
> +     if (new_value == old_value)
> +             return 0;
> +
> +     if (!dev_data->allow_interrupt_control ||
> +         (new_value ^ old_value) & ~field_config->enable_bit)
> +             return PCIBIOS_SET_FAILED;
> +
> +     if (new_value & field_config->enable_bit) {
> +             /* don't allow enabling together with other interrupt types */
> +             int int_type = xen_pcibk_get_interrupt_type(dev);

A newline here would make this easier to read I think.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.