[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/3] x86/pci: Fix racy accesses to MSI-X Control register


  • To: David Vrabel <dvrabel@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 12 Dec 2022 18:04:22 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=yAef7jgG5HuSLH7kh4g4SB4CLihc+yHiqlE1uVmyxfM=; b=FShIP0YGl5rnXXN9xDQqCq+ym5yHobzhuI5IMgumWE5CqZd6u6cgIrwkYi+UEaYu+7bW/GZ47HjevdpBRb6wiKDivkfsuhi9GZuXR4Fhs1ZILBy6SWxLbX+HjwmGZ3PjVqmz6HE7+6NJdIOGIlhLuQB3FYK1rBqnmNT+T1Rl6JqUZ2rNLnXKEfkZXRNX4Warzemi4oA/vKqZy5pQRlyFMRFbd6arLR7qqNLlq+QSdeNe+MMcyXPLHvTJUOQ/PH2gwZT/nrDkp/wsnxkes9jlLNYfd2gcYZCy/QNNDSCSmwLwNDX/ObS1jAmoJPfWbDVRmvCSG83TX6ErtzbMVFEI0A==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=eGU/ca6AZNYuSRxlh0ap7eDCcbfy7zAODC/WaolCLEHNHm2Cq6w8oiFI7lu/4GPzmpp4g1bvXKR5K/16xROQKh2PCdK9F7iCyqwUdDujddbYZziUj+KSIAleg4MUaTcjpsaTSOYEFUnHWVnP8hDmk/JClyN27wsHuqES4fJ59I/6TWQ4poA7O/Hqcb649qzdv507tu8x1ziD43voFyFFkuPF99A99JWylRMD2500oyuWKD+LaCWbsaYjboI0e9FFEtdLDvMoinrBrOvju/dM0Kviz2D3++3kX1U4MaDg0XVbt4ke8DNhweKM7U2ZNKa3vMQFNDqL4TaMvrcRj8tMqw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, David Vrabel <dvrabel@xxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 12 Dec 2022 17:04:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 10.11.2022 17:59, David Vrabel wrote:
> Concurrent access the the MSI-X control register are not serialized
> with a suitable lock. For example, in msix_capability_init() access
> use the pcidevs_lock() but some calls to msi_set_mask_bit() use the
> interrupt descriptor lock.
> 
> This can lead to MSI-X being incorrectly disabled and subsequent
> failures due to msix_memory_decoded() calls that check for MSI-X being
> enabled.
> 
> This was seen with some non-compliant hardware that gated MSI-X
> messages on the per-vector mask bit only (i.e., the MSI-X Enable bit
> and Function Mask bits in the MSI-X Control register were ignored). An
> interrupt (with a pending move) for vector 0 would occur while vector
> 1 was being initialized in msix_capability_init(). Updates the the
> Control register would race and the vector 1 initialization would
> intermittently fail with -ENXIO.
> 
> Typically a race between initializing a vector and another vector
> being moved doesn't occur because:
> 
> 1. Racy Control accesses only occur when MSI-X is (guest) disabled
> 
> 2  Hardware should only raise interrupts when MSI-X is enabled and unmasked.
> 
> 3. Xen always sets Function Mask when temporarily enabling MSI-X.
> 
> But there may be other race conditions depending on hardware and guest
> driver behaviour (e.g., disabling MSI-X when a IRQ has a pending move
> on another PCPU).
> 
> Fix this by:
> 
> 1. Tracking the host and guest enable state in a similar way to the
>    host and guest maskall state. Note that since multiple CPUs can be
>    updating different vectors concurrently, a counter is needed for
>    the host enable state.
> 
> 2. Add a new lock for serialize the Control read/modify/write
>    sequence.
> 
> 3. Wrap the above in two helper functions (msix_update_lock(), and
>    msix_update_unlock()), which bracket any MSI-X register updates
>    that require MSI-X to be (temporarily) enabled.
> 
> Signed-off-by: David Vrabel <dvrabel@xxxxxxxxxxxx>

Just a couple of mechanical / style comments, at least for now:

> SIM: https://t.corp.amazon.com/P63914633
> 
> CR: https://code.amazon.com/reviews/CR-79020945

These tags shouldn't be here, unless they have a meaning in the
"upstream" context.

> --- a/xen/arch/x86/include/asm/msi.h
> +++ b/xen/arch/x86/include/asm/msi.h
> @@ -237,7 +237,10 @@ struct arch_msix {
>      int table_refcnt[MAX_MSIX_TABLE_PAGES];
>      int table_idx[MAX_MSIX_TABLE_PAGES];
>      spinlock_t table_lock;
> +    spinlock_t control_lock;
>      bool host_maskall, guest_maskall;
> +    uint16_t host_enable;

If you want to keep this more narrow than "unsigned int", then please
add a BUILD_BUG_ON() against NR_CPUS, so the need to update the field
can be easily noticed (in some perhaps distant future).

> +static void msix_update_unlock(struct pci_dev *dev, unsigned int pos, 
> uint16_t control)
> +{
> +    uint16_t new_control;
> +    unsigned long flags;
> +
> +    spin_lock_irqsave(&dev->msix->control_lock, flags);
> +
> +    dev->msix->host_enable--;
> +
> +    new_control = control & ~(PCI_MSIX_FLAGS_ENABLE | 
> PCI_MSIX_FLAGS_MASKALL);
> +
> +    if ( dev->msix->host_enable || dev->msix->guest_enable )
> +        new_control |= PCI_MSIX_FLAGS_ENABLE;
> +    if ( dev->msix->host_maskall || dev->msix->guest_maskall || 
> dev->msix->host_enable )
> +        new_control |= PCI_MSIX_FLAGS_MASKALL;

In particular this use of "host_enable" suggests the field wants to be
named differently: It makes no sense to derive MASKALL from ENABLE
without it being clear (from the name) that the field is an init-time
override only.

> @@ -299,11 +350,16 @@ static void msix_set_enable(struct pci_dev *dev, int 
> enable)
>      pos = pci_find_cap_offset(seg, bus, slot, func, PCI_CAP_ID_MSIX);
>      if ( pos )
>      {
> +        spin_lock_irq(&dev->msix->control_lock);
> +
> +        dev->msix->host_enable = false;

You explicitly say this isn't a boolean, so the initializer would better
be 0.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.