|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] pci: a workaround for nonstandard PCI devices whose PBA shares
On Wed, Apr 04, 2018 at 11:29:39PM +0800, Chao Gao wrote:
> ... the same page with other registers which are not relevant to MSI-X. Xen
> marks pages where PBA resides as read-only. When assigning such devices to
> guest, device driver writes MSI-X irrelevant registers on those pages would
> lead to an EPT violation and the guest is destroyed because no handler is
> registered for those address range. In order to make guest capable to use such
> kind of devices, trapping very frequent write accesses is not a good idea for
> it would significantly impact the performance.
>
> This patch provides a workaround with caveat. Specifically, an option is
> introduced to specify a list of devices. For those devices, Xen doesn't
> control the access right to pages where PBA resides. Hence, guest device
> driver is able to write those pages and functions well. Note that adding an
> untrusted device to this option may endanger security of the entire system.
>
> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
> ---
> docs/misc/xen-command-line.markdown | 10 +++++++++
> xen/arch/x86/msi.c | 7 ++++--
> xen/drivers/passthrough/pci.c | 45
> +++++++++++++++++++++++++++++++++++--
> xen/include/asm-x86/msi.h | 1 +
> 4 files changed, 59 insertions(+), 4 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.markdown
> b/docs/misc/xen-command-line.markdown
> index b353352..e382513 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1423,6 +1423,16 @@ Defaults to booting secondary processors.
>
> > Default: `on`
>
> +### pba\_quirk
pba_write_allowed would be better, pba_quirk is too generic IMO.
> +> `= List of [<seg>:]<bus>:<device>.<function>
> +
> +Specify a list of SBDF of devices. When assigning devices in this list to
> +guest, reading or writing the page where MSI-X PBA resides are allowed.
> +This option provides a workaround for nonstandard PCI devices whose
> +MSI-X PBA shares the same 4K-byte page with other registers. Note that
> +adding an untrusted device to this option would undermine security of
> +the entire system.
> +
> ### pci
> > `= {no-}serr | {no-}perr`
>
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index 5567990..2abf2cf 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -992,7 +992,9 @@ static int msix_capability_init(struct pci_dev *dev,
> if ( rangeset_add_range(mmio_ro_ranges, msix->table.first,
> msix->table.last) )
> WARN();
> - if ( rangeset_add_range(mmio_ro_ranges, msix->pba.first,
> +
> + if ( !msix->pba_quirk_enabled &&
> + rangeset_add_range(mmio_ro_ranges, msix->pba.first,
> msix->pba.last) )
> WARN();
This will work fine as long as the PBA is not in the same page as the
MSI-X table. In such case you will also need changes to QEMU (see
pci_msix_write), so that writes to the memory in the same page as the
MSI-X/PBA tables are forwarded to the underlying hardware.
You should add least add something like:
if ( msix->pba_quirk_enabled &&
msix->table.first <= msix->pba.last &&
msix->pba.first <= msix->table.last )
{
printk("PBA write not allowed to dev %04x:%02x:%02x.%u due to MSI-X table
overlap\n");
return -ENXIO;
}
Or similar if the QEMU side is not fixed.
Note that in order to fix the QEMU side you would probably have to add
a flag to xl 'pci' config option and pass it to both QEMU and Xen.
>
> @@ -1139,7 +1141,8 @@ static void _pci_cleanup_msix(struct arch_msix *msix)
> if ( rangeset_remove_range(mmio_ro_ranges, msix->table.first,
> msix->table.last) )
> WARN();
> - if ( rangeset_remove_range(mmio_ro_ranges, msix->pba.first,
> + if ( !msix->pba_quirk_enabled &&
> + rangeset_remove_range(mmio_ro_ranges, msix->pba.first,
> msix->pba.last) )
> WARN();
> }
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index 1db69d5..cd765ef 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -184,6 +184,38 @@ static int __init parse_phantom_dev(const char *str)
> }
> custom_param("pci-phantom", parse_phantom_dev);
>
> +static struct pba_quirk_dev {
> + uint32_t sbdf;
> +} pba_quirk_devs[8];
We have a sbdf type now, see 514f58.
Also, I would prefer that you use a list here. I know it's not likely
to have a huge number of devices in the system that require this
quirk, but I also see no reason to limit this to 8 (or any other
arbitrary value).
> +static unsigned int nr_pba_quirk_devs;
> +
> +static int __init parse_pba_quirk(const char *str)
> +{
> + unsigned int seg, bus, dev, func;
> +
> + for ( ; ; )
> + {
> + if ( nr_pba_quirk_devs >= ARRAY_SIZE(pba_quirk_devs) )
> + return -E2BIG;
> +
> + str = parse_pci(str, &seg, &bus, &dev, &func);
> + if ( !str )
> + return -EINVAL;
> +
> + pba_quirk_devs[nr_pba_quirk_devs++].sbdf = PCI_SBDF(seg, bus, dev,
> + func);
> + if ( *str == ',' )
> + str++;
> + else if ( !*str )
> + break;
> + else
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +custom_param("pba_quirk", parse_pba_quirk);
> +
> static u16 __read_mostly command_mask;
> static u16 __read_mostly bridge_ctl_mask;
>
> @@ -300,6 +332,7 @@ static void check_pdev(const struct pci_dev *pdev)
>
> static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
> {
> + unsigned int i;
> struct pci_dev *pdev;
>
> list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
> @@ -328,6 +361,16 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg,
> u8 bus, u8 devfn)
> }
> spin_lock_init(&msix->table_lock);
> pdev->msix = msix;
> +
> + for ( i = 0; i < nr_pba_quirk_devs; i++ )
> + {
> + if ( pba_quirk_devs[i].sbdf == PCI_SBDF3(pseg->nr, bus, devfn) )
> + {
> + pdev->msix->pba_quirk_enabled = true;
> + printk(XENLOG_WARNING "Enable PBA quirk for
> %04x:%02x:%02x.%u\n",
> + pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
> + }
You could do this as:
if ( pba_quirk_devs[i].sbdf != PCI_SBDF3(pseg->nr, bus, devfn) )
continue;
pdev->msix->pba_quirk_enabled = true;
printk(XENLOG_WARNING "Enable PBA quirk for %04x:%02x:%02x.%u\n",
pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
break;
Which removes one level of indentation (and also exits the loop as
soon as a match is found).
Thanks, Roger.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |