[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/4] x86/MSI-X: be more careful during teardown



On 10/03/15 16:27, Jan Beulich wrote:
> When a device gets detached from a guest, pciback will clear its
> command register, thus disabling both memory and I/O decoding. The
> disabled memory decoding, however, has an effect on the MSI-X table
> accesses the hypervisor does: These won't have the intended effect
> anymore. Even worse, for PCIe devices (but not SR-IOV virtual
> functions) such accesses may (will?) be treated as Unsupported
> Requests, causing respective errors to be surfaced, potentially in the
> form of NMIs that may be fatal to the hypervisor or Dom0 is different
> ways. Hence rather than carrying out these accesses, we should avoid
> them where we can, and use alternative (e.g. PCI config space based)
> mechanisms to achieve at least the same effect.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> Backporting note (largely to myself):
>    Depends on (not yet backported) commit 061eebe0e "x86/MSI: drop
>    workaround for insecure Dom0 kernels" (due to re-use of struct
>    arch_msix's warned field).
>
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -121,6 +121,27 @@ static void msix_put_fixmap(struct arch_
>      spin_unlock(&msix->table_lock);
>  }
>  
> +static bool_t memory_decoded(const struct pci_dev *dev)
> +{
> +    u8 bus, slot, func;
> +
> +    if ( !dev->info.is_virtfn )
> +    {
> +        bus = dev->bus;
> +        slot = PCI_SLOT(dev->devfn);
> +        func = PCI_FUNC(dev->devfn);
> +    }
> +    else
> +    {
> +        bus = dev->info.physfn.bus;
> +        slot = PCI_SLOT(dev->info.physfn.devfn);
> +        func = PCI_FUNC(dev->info.physfn.devfn);
> +    }
> +
> +    return !!(pci_conf_read16(dev->seg, bus, slot, func, PCI_COMMAND) &
> +              PCI_COMMAND_MEMORY);
> +}
> +

This check is racy against anyone who can write to the command register,
which includes dom0 and other pcpus in Xen.  There does not appear to be
any exclusion between Xen emulating a control register write on one cpu
and changing irq affinities on another.

As a result, this check does not actually protect against accessing the
MSI-X bar while memory decoding is disabled.  As a downside, it puts an
expensive config space access on moderately frequent codepaths.

One issue we have just identified pertains to dom0 resetting a device
and Xen falling over a UR which has been escalated to fatal, most likely
because an in-progress MSI-X interrupt migration.  There does not appear
to be sufficient synchronisation available in the interface for a dom0
to even cooperatively perform a device reset with Xen.

The more I consider this and related problems, the more I am thinking
that the only longterm solution is to have a full PCI implementation in
Xen, and to prevent any unmediated access, including from dom0.  Xen
need not gain much (any?) more device-specific knowledge, but needs to
gain the ability to properly mediate all config updates, and synchronise
resets against other users of the device.  I do not suggest this
lightly; I realise that it is a huge quantity of work.

~Andrew


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.