|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Monday, March 9, 2020 7:09 PM
>
> Containing still in flight DMA was introduced to work around certain
> devices / systems hanging hard upon hitting a "not-present" IOMMU fault.
> Passing through (such) devices (on such systems) is inherently insecure
> (as guests could easily arrange for IOMMU faults of any kind to occur).
> Defaulting to a mode where admins may not even become aware of issues
> with devices can be considered undesirable. Therefore convert this mode
> of operation to an optional one, not one enabled by default.
>
> This involves resurrecting code commit ea38867831da ("x86 / iommu: set
> up a scratch page in the quarantine domain") did remove, in a slightly
> extended and abstracted fashion. Here, instead of reintroducing a pretty
> pointless use of "goto" in domain_context_unmap(), and instead of making
> the function (at least temporarily) inconsistent, take the opportunity
> and replace the other similarly pointless "goto" as well.
>
> In order to key the re-instated bypasses off of there (not) being a root
> page table this further requires moving the allocate_domain_resources()
> invocation from reassign_device() to amd_iommu_setup_domain_device()
> (or
> else reassign_device() would allocate a root page table anyway); this is
> benign to the second caller of the latter function.
>
> Take the opportunity and also limit the control to builds supporting
> PCI.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> I'm happy to take better suggestions to replace the "full" command line
> option and Kconfig prompt tokens. I don't think though that "fault" and
> "write-fault" are really suitable there.
I think we may just allow both r/w access to scratch page for such bogus
device, which may make 'full' more reasonable since we now fully
contain in-fly DMAs. I'm not sure about the value of keeping write-fault
alone for such devices (just because one observed his specific device only
has problem with read-fault).
alternatively I also thought about whether whitelisting the problematic
devices through another option (e.g. nofault=b:d:f) could provide more
value. In concept any IOMMU page table (dom0, dom_io or domU)
for such bogus device should not include invalid entry, even when
quarantine is not specified. However I'm not sure whether it's worthy of
going so far...
> ---
> This patch contextually depends on "[PATCH v2 0/5] IOMMU: restrict
> visibility/scope if certain variables".
> ---
> v3: IOMMU_quarantine_basic -> IOMMU_quarantine_fault,
> IOMMU_quarantine_full -> IOMMU_quarantine_write_fault. Kconfig
> option (choice) to select default. Limit to HAS_PCI.
> v2: Don't use true/false. Introduce QUARANTINE_SKIP() (albeit I'm not
> really convinced this is an improvement). Add comment.
>
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1238,7 +1238,7 @@ detection of systems known to misbehave
> > Default: `new` unless directed-EOI is supported
>
> ### iommu
> - = List of [ <bool>, verbose, debug, force, required, quarantine,
> + = List of [ <bool>, verbose, debug, force, required, quarantine[=full],
> sharept, intremap, intpost, crash-disable,
> snoop, qinval, igfx, amd-iommu-perdev-intremap,
> dom0-{passthrough,strict} ]
> @@ -1276,11 +1276,15 @@ boolean (e.g. `iommu=no`) can override t
> will prevent Xen from booting if IOMMUs aren't discovered and enabled
> successfully.
>
> -* The `quarantine` boolean can be used to control Xen's behavior when
> - de-assigning devices from guests. If enabled (the default), Xen always
> - quarantines such devices; they must be explicitly assigned back to Dom0
> - before they can be used there again. If disabled, Xen will only
> - quarantine devices the toolstack hass arranged for getting quarantined.
> +* The `quarantine` option can be used to control Xen's behavior when
> + de-assigning devices from guests. If set to true (the default), Xen
> + always quarantines such devices; they must be explicitly assigned back
> + to Dom0 before they can be used there again. If set to "full", still
> + active DMA will additionally be directed to a "sink" page. If set to
> + false, Xen will only quarantine devices the toolstack has arranged for
> + getting quarantined.
> +
> + This option is only valid on builds supporting PCI.
>
> * The `sharept` boolean controls whether the IOMMU pagetables are
> shared
> with the CPU-side HAP pagetables, or allocated separately. Sharing
> --- a/xen/drivers/passthrough/Kconfig
> +++ b/xen/drivers/passthrough/Kconfig
> @@ -28,3 +28,31 @@ endif
>
> config IOMMU_FORCE_PT_SHARE
> bool
> +
> +choice
> + prompt "IOMMU device quarantining default behavior"
> + depends on HAS_PCI
> + default IOMMU_QUARANTINE_BASIC
> + ---help---
> + When a PCI device is assigned to an untrusted domain, it is possible
> + for that domain to program the device to DMA to an arbitrary
> address.
> + The IOMMU is used to protect the host from malicious DMA by
> making
> + sure that the device addresses can only target memory assigned to
> the
> + guest. However, when the guest domain is torn down, assigning the
> + device back to the hardware domain would allow any in-flight DMA
> to
> + potentially target critical host data. To avoid this, quarantining
> + shold be enabled. Quarantining can be done in two ways: In its
> basic
> + form, all in-flight DMA will simply be forced to encounter IOMMU
> + faults. Since there are systems where doing so can cause host
> + lockup, an alternative form is available where writes to memory will
> + be made fault, but reads will be directed to a dummy page. The
> + implication here is that such reads will go unnoticed, i.e. an admin
> + may not become aware of the underlying problem.
> +
> + config IOMMU_QUARANTINE_NONE
> + bool "none"
> + config IOMMU_QUARANTINE_BASIC
> + bool "basic"
> + config IOMMU_QUARANTINE_FULL
> + bool "full"
> +endchoice
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -25,6 +25,9 @@
> #include "iommu.h"
> #include "../ats.h"
>
> +/* dom_io is used as a sentinel for quarantined devices */
> +#define QUARANTINE_SKIP(d) ((d) == dom_io && !dom_iommu(d)-
> >arch.root_table)
> +
> static bool_t __read_mostly init_done;
>
> static const struct iommu_init_ops _iommu_init_ops;
> @@ -82,18 +85,35 @@ int get_dma_requestor_id(uint16_t seg, u
> return req_id;
> }
>
> -static void amd_iommu_setup_domain_device(
> +static int __must_check allocate_domain_resources(struct domain_iommu
> *hd)
> +{
> + int rc;
> +
> + spin_lock(&hd->arch.mapping_lock);
> + rc = amd_iommu_alloc_root(hd);
> + spin_unlock(&hd->arch.mapping_lock);
> +
> + return rc;
> +}
> +
> +static int __must_check amd_iommu_setup_domain_device(
> struct domain *domain, struct amd_iommu *iommu,
> uint8_t devfn, struct pci_dev *pdev)
> {
> struct amd_iommu_dte *table, *dte;
> unsigned long flags;
> - int req_id, valid = 1;
> + int req_id, valid = 1, rc;
> u8 bus = pdev->bus;
> - const struct domain_iommu *hd = dom_iommu(domain);
> + struct domain_iommu *hd = dom_iommu(domain);
> +
> + if ( QUARANTINE_SKIP(domain) )
> + return 0;
> +
> + BUG_ON(!hd->arch.paging_mode || !iommu->dev_table.buffer);
>
> - BUG_ON( !hd->arch.root_table || !hd->arch.paging_mode ||
> - !iommu->dev_table.buffer );
> + rc = allocate_domain_resources(hd);
> + if ( rc )
> + return rc;
>
> if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
> valid = 0;
> @@ -148,6 +168,8 @@ static void amd_iommu_setup_domain_devic
>
> amd_iommu_flush_iotlb(devfn, pdev,
> INV_IOMMU_ALL_PAGES_ADDRESS, 0);
> }
> +
> + return 0;
> }
>
> int __init acpi_ivrs_init(void)
> @@ -217,17 +239,6 @@ int amd_iommu_alloc_root(struct domain_i
> return 0;
> }
>
> -static int __must_check allocate_domain_resources(struct domain_iommu
> *hd)
> -{
> - int rc;
> -
> - spin_lock(&hd->arch.mapping_lock);
> - rc = amd_iommu_alloc_root(hd);
> - spin_unlock(&hd->arch.mapping_lock);
> -
> - return rc;
> -}
> -
> int amd_iommu_get_paging_mode(unsigned long entries)
> {
> int level = 1;
> @@ -291,6 +302,9 @@ static void amd_iommu_disable_domain_dev
> int req_id;
> u8 bus = pdev->bus;
>
> + if ( QUARANTINE_SKIP(domain) )
> + return;
> +
> BUG_ON ( iommu->dev_table.buffer == NULL );
> req_id = get_dma_requestor_id(iommu->seg, PCI_BDF2(bus, devfn));
> table = iommu->dev_table.buffer;
> @@ -337,7 +351,6 @@ static int reassign_device(struct domain
> {
> struct amd_iommu *iommu;
> int bdf, rc;
> - struct domain_iommu *t = dom_iommu(target);
>
> bdf = PCI_BDF2(pdev->bus, pdev->devfn);
> iommu = find_iommu_for_device(pdev->seg, bdf);
> @@ -358,11 +371,10 @@ static int reassign_device(struct domain
> pdev->domain = target;
> }
>
> - rc = allocate_domain_resources(t);
> + rc = amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
> if ( rc )
> return rc;
>
> - amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
> AMD_IOMMU_DEBUG("Re-assign %04x:%02x:%02x.%u from dom%d to
> dom%d\n",
> pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
> source->domain_id, target->domain_id);
> @@ -519,8 +531,7 @@ static int amd_iommu_add_device(u8 devfn
> spin_unlock_irqrestore(&iommu->lock, flags);
> }
>
> - amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
> - return 0;
> + return amd_iommu_setup_domain_device(pdev->domain, iommu, devfn,
> pdev);
> }
>
> static int amd_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -31,9 +31,24 @@ bool_t __initdata iommu_enable = 1;
> bool_t __read_mostly iommu_enabled;
> bool_t __read_mostly force_iommu;
> bool_t __read_mostly iommu_verbose;
> -bool __read_mostly iommu_quarantine = true;
> bool_t __read_mostly iommu_crash_disable;
>
> +#define IOMMU_quarantine_none 0 /* aka false */
> +#define IOMMU_quarantine_fault 1 /* aka true */
> +#define IOMMU_quarantine_write_fault 2
> +#ifdef CONFIG_HAS_PCI
> +uint8_t __read_mostly iommu_quarantine =
> +# if defined(CONFIG_IOMMU_QUARANTINE_NONE)
> + IOMMU_quarantine_none;
> +# elif defined(CONFIG_IOMMU_QUARANTINE_BASIC)
> + IOMMU_quarantine_fault;
> +# elif defined(CONFIG_IOMMU_QUARANTINE_FULL)
> + IOMMU_quarantine_write_fault;
> +# endif
> +#else
> +# define iommu_quarantine IOMMU_quarantine_none
> +#endif /* CONFIG_HAS_PCI */
> +
> static bool __hwdom_initdata iommu_hwdom_none;
> bool __hwdom_initdata iommu_hwdom_strict;
> bool __read_mostly iommu_hwdom_passthrough;
> @@ -68,8 +83,12 @@ static int __init parse_iommu_param(cons
> else if ( (val = parse_boolean("force", s, ss)) >= 0 ||
> (val = parse_boolean("required", s, ss)) >= 0 )
> force_iommu = val;
> +#ifdef CONFIG_HAS_PCI
> else if ( (val = parse_boolean("quarantine", s, ss)) >= 0 )
> iommu_quarantine = val;
> + else if ( ss == s + 15 && !strncmp(s, "quarantine=full", 15) )
> + iommu_quarantine = IOMMU_quarantine_write_fault;
> +#endif
> #ifdef CONFIG_X86
> else if ( (val = parse_boolean("igfx", s, ss)) >= 0 )
> iommu_igfx = val;
> @@ -448,7 +467,7 @@ static int __init iommu_quarantine_init(
> dom_io->options |= XEN_DOMCTL_CDF_iommu;
>
> rc = iommu_domain_init(dom_io, 0);
> - if ( rc )
> + if ( rc || iommu_quarantine < IOMMU_quarantine_write_fault )
> return rc;
>
> if ( !hd->platform_ops->quarantine_init )
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -41,6 +41,9 @@
> #include "vtd.h"
> #include "../ats.h"
>
> +/* dom_io is used as a sentinel for quarantined devices */
> +#define QUARANTINE_SKIP(d) ((d) == dom_io && !dom_iommu(d)-
> >arch.pgd_maddr)
> +
> struct mapped_rmrr {
> struct list_head list;
> u64 base, end;
> @@ -1294,6 +1297,9 @@ int domain_context_mapping_one(
> int agaw, rc, ret;
> bool_t flush_dev_iotlb;
>
> + if ( QUARANTINE_SKIP(domain) )
> + return 0;
> +
> ASSERT(pcidevs_locked());
> spin_lock(&iommu->lock);
> maddr = bus_to_context_maddr(iommu, bus);
> @@ -1548,6 +1554,9 @@ int domain_context_unmap_one(
> int iommu_domid, rc, ret;
> bool_t flush_dev_iotlb;
>
> + if ( QUARANTINE_SKIP(domain) )
> + return 0;
> +
> ASSERT(pcidevs_locked());
> spin_lock(&iommu->lock);
>
> @@ -1609,7 +1618,7 @@ static int domain_context_unmap(struct d
> {
> struct acpi_drhd_unit *drhd;
> struct vtd_iommu *iommu;
> - int ret = 0;
> + int ret;
> u8 seg = pdev->seg, bus = pdev->bus, tmp_bus, tmp_devfn, secbus;
> int found = 0;
>
> @@ -1625,14 +1634,12 @@ static int domain_context_unmap(struct d
> printk(VTDPREFIX "d%d:Hostbridge: skip %04x:%02x:%02x.%u
> unmap\n",
> domain->domain_id, seg, bus,
> PCI_SLOT(devfn), PCI_FUNC(devfn));
> - if ( !is_hardware_domain(domain) )
> - return -EPERM;
> - goto out;
> + return is_hardware_domain(domain) ? 0 : -EPERM;
>
> case DEV_TYPE_PCIe_BRIDGE:
> case DEV_TYPE_PCIe2PCI_BRIDGE:
> case DEV_TYPE_LEGACY_PCI_BRIDGE:
> - goto out;
> + return 0;
>
> case DEV_TYPE_PCIe_ENDPOINT:
> if ( iommu_debug )
> @@ -1676,10 +1683,12 @@ static int domain_context_unmap(struct d
> dprintk(XENLOG_ERR VTDPREFIX,
> "d%d:unknown(%u): %04x:%02x:%02x.%u\n",
> domain->domain_id, pdev->type,
> seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
> - ret = -EINVAL;
> - goto out;
> + return -EINVAL;
> }
>
> + if ( QUARANTINE_SKIP(domain) )
> + return ret;
> +
> /*
> * if no other devices under the same iommu owned by this domain,
> * clear iommu in iommu_bitmap and clear domain_id in domid_bitmp
> @@ -1705,16 +1714,12 @@ static int domain_context_unmap(struct d
>
> iommu_domid = domain_iommu_domid(domain, iommu);
> if ( iommu_domid == -1 )
> - {
> - ret = -EINVAL;
> - goto out;
> - }
> + return -EINVAL;
>
> clear_bit(iommu_domid, iommu->domid_bitmap);
> iommu->domid_map[iommu_domid] = 0;
> }
>
> -out:
> return ret;
> }
>
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -53,7 +53,9 @@ static inline bool_t dfn_eq(dfn_t x, dfn
> }
>
> extern bool_t iommu_enable, iommu_enabled;
> -extern bool force_iommu, iommu_quarantine, iommu_verbose;
> +extern bool force_iommu, iommu_verbose;
> +/* Boolean except for the specific purposes of
> drivers/passthrough/iommu.c. */
> +extern uint8_t iommu_quarantine;
>
> #ifdef CONFIG_X86
> extern enum __packed iommu_intremap {
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |