Re: [PATCH v7 01/12] vpci: introduce per-domain lock to protect vpci structure
On Tue, Jun 13, 2023 at 10:32:26AM +0000, Volodymyr Babchuk wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>
> Introduce a per-domain read/write lock that guards the presence of vpci,
> so we are sure there are no accesses to the contents of the vpci struct
> while it is absent. This lock can be used (and in a few cases is used
> right away) so that vpci removal can be performed while holding the lock
> in write mode. Previously such removal could race with vpci_read for
> example.
>
> 1. Per-domain's vpci_rwlock is used to protect pdev->vpci structure
> from being removed.
>
> 2. Writing the command register and ROM BAR register may trigger
> modify_bars to run, which in turn may access multiple pdevs while
> checking for the existing BAR's overlap. The overlapping check, if done
> under the read lock, requires vpci->lock to be acquired on both devices
> being compared, which may produce a deadlock. It is not possible to
> upgrade read lock to write lock in such a case. So, in order to prevent
> the deadlock, check which registers are going to be written and acquire
> the lock in the appropriate mode from the beginning.
>
> All other code, which doesn't lead to pdev->vpci destruction and does not
> access multiple pdevs at the same time, can still use a combination of the
> read lock and pdev->vpci->lock.
>
> 3. Optimize the detection of whether a ROM BAR write requires the write
> lock by caching the offset of the ROM BAR register in
> vpci->header->rom_reg, which depends on the header's type.
>
> 4. Reduce the locked region in vpci_remove_device, as it is now possible
> to set pdev->vpci to NULL early, right after the write lock is acquired.
>
> 5. Reduce locked region in vpci_add_handlers as it is possible to
> initialize many more fields of the struct vpci before assigning it to
> pdev->vpci.
>
> 6. vpci_{add|remove}_register are required to be called with the write lock
> held, but it is not feasible to add an assertion there as that would
> require a struct domain to be passed in. So, add a comment about this
> requirement to these and other functions with equivalent constraints.
>
> 7. Drop const qualifier where the new rwlock is used and this is appropriate.
>
> 8. Do not call process_pending_softirqs with any locks held. For that,
> unlock prior to the call and re-acquire the locks afterwards. After
> re-acquiring the lock there is no need to check if pdev->vpci exists:
> - in apply_map because of the context in which it is called (no race
> condition possible)
> - for MSI/MSI-X debug code because it is called at the end of the
> pdev->vpci access and no further access to pdev->vpci is made
>
> 9. Check for !pdev->vpci in vpci_{read|write} after acquiring the lock,
> and if so, allow reading or writing the hardware register directly. This
> is acceptable as we only deal with Dom0 as of now. Once DomU support is
> added, writes will need to be ignored and reads return all 0's for the
> guests, while Dom0 can still access the registers directly.
>
> 10. Introduce pcidevs_trylock, so there is a possibility to attempt to
> acquire the pcidevs lock without blocking.
>
> 11. Use the pcidevs lock around for_each_pdev and pci_get_pdev_by_domain
> while accessing pdevs in vpci code.
>
> 12. This is based on the discussion at [1].
>
> [1] https://lore.kernel.org/all/20220204063459.680961-4-andr2000@xxxxxxxxx/
>
> Suggested-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Suggested-by: Jan Beulich <jbeulich@xxxxxxxx>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
Thanks.
I haven't looked in full detail, but I'm afraid there's an ABBA
deadlock with the per-domain vpci lock and the pcidevs lock in
modify_bars() vs vpci_add_handlers() and vpci_remove_device().
I've made some comments below.
> ---
> This was checked on x86: with and without PVH Dom0.
> ---
> xen/arch/x86/hvm/vmsi.c | 7 +++
> xen/common/domain.c | 3 +
> xen/drivers/passthrough/pci.c | 5 ++
> xen/drivers/vpci/header.c | 52 +++++++++++++++++
> xen/drivers/vpci/msi.c | 25 +++++++-
> xen/drivers/vpci/msix.c | 51 +++++++++++++---
> xen/drivers/vpci/vpci.c | 107 +++++++++++++++++++++++++---------
> xen/include/xen/pci.h | 1 +
> xen/include/xen/sched.h | 3 +
> xen/include/xen/vpci.h | 6 ++
> 10 files changed, 223 insertions(+), 37 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> index 3cd4923060..f188450b1b 100644
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -895,6 +895,9 @@ int vpci_msix_arch_print(const struct vpci_msix *msix)
> {
> unsigned int i;
>
> + ASSERT(rw_is_locked(&msix->pdev->domain->vpci_rwlock));
> + ASSERT(pcidevs_locked());
> +
> for ( i = 0; i < msix->max_entries; i++ )
> {
> const struct vpci_msix_entry *entry = &msix->entries[i];
> @@ -913,7 +916,11 @@ int vpci_msix_arch_print(const struct vpci_msix *msix)
> struct pci_dev *pdev = msix->pdev;
>
> spin_unlock(&msix->pdev->vpci->lock);
> + pcidevs_unlock();
> + read_unlock(&pdev->domain->vpci_rwlock);
> process_pending_softirqs();
> + read_lock(&pdev->domain->vpci_rwlock);
> + pcidevs_lock();
Why do you need the pcidevs_lock here? Isn't it enough to have the
per-domain rwlock taken in read mode in order to prevent devices from
being de-assigned from the domain?
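If the read lock is sufficient, the dance around
process_pending_softirqs() could be reduced to (untested sketch, using
the names already present in the patch):

    spin_unlock(&msix->pdev->vpci->lock);
    read_unlock(&pdev->domain->vpci_rwlock);
    process_pending_softirqs();
    read_lock(&pdev->domain->vpci_rwlock);
    /* NB: we assume that pdev cannot go away for an alive domain. */
    if ( !pdev->vpci || !spin_trylock(&pdev->vpci->lock) )
        return -EBUSY;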
> /* NB: we assume that pdev cannot go away for an alive domain. */
> if ( !pdev->vpci || !spin_trylock(&pdev->vpci->lock) )
> return -EBUSY;
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 6a440590fe..a4640acb62 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -622,6 +622,9 @@ struct domain *domain_create(domid_t domid,
>
> #ifdef CONFIG_HAS_PCI
> INIT_LIST_HEAD(&d->pdev_list);
> +#ifdef CONFIG_HAS_VPCI
> + rwlock_init(&d->vpci_rwlock);
> +#endif
> #endif
>
> /* All error paths can depend on the above setup. */
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index 07d1986d33..0afcb59af0 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -57,6 +57,11 @@ void pcidevs_lock(void)
> spin_lock_recursive(&_pcidevs_lock);
> }
>
> +int pcidevs_trylock(void)
> +{
> + return spin_trylock_recursive(&_pcidevs_lock);
> +}
> +
> void pcidevs_unlock(void)
> {
> spin_unlock_recursive(&_pcidevs_lock);
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index ec2e978a4e..23b5223adf 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -152,12 +152,14 @@ bool vpci_process_pending(struct vcpu *v)
> if ( rc == -ERESTART )
> return true;
>
> + read_lock(&v->domain->vpci_rwlock);
> spin_lock(&v->vpci.pdev->vpci->lock);
> /* Disable memory decoding unconditionally on failure. */
> modify_decoding(v->vpci.pdev,
> rc ? v->vpci.cmd & ~PCI_COMMAND_MEMORY : v->vpci.cmd,
> !rc && v->vpci.rom_only);
> spin_unlock(&v->vpci.pdev->vpci->lock);
> + read_unlock(&v->domain->vpci_rwlock);
I think you likely want to expand the scope of the domain vpci lock
here in order to also protect the data in v->vpci, so that vPCI device
removal can clear this data if required without racing with
vpci_process_pending(). Otherwise you need to pause the domain when
removing vPCI.
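A rough sketch of what I mean (untested, and assuming the removal path
clears v->vpci while holding the lock in write mode):

    bool vpci_process_pending(struct vcpu *v)
    {
        bool ret = false;

        read_lock(&v->domain->vpci_rwlock);

        if ( v->vpci.mem )
        {
            /* ... existing rangeset processing and modify_decoding() ... */
        }

        read_unlock(&v->domain->vpci_rwlock);

        return ret;
    }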
>
> rangeset_destroy(v->vpci.mem);
> v->vpci.mem = NULL;
> @@ -181,8 +183,20 @@ static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
> struct map_data data = { .d = d, .map = true };
> int rc;
>
> + ASSERT(rw_is_write_locked(&d->vpci_rwlock));
> +
> while ( (rc = rangeset_consume_ranges(mem, map_range, &data)) == -ERESTART )
> + {
> + /*
> + * It's safe to drop and reacquire the lock in this context
> + * without risking pdev disappearing because devices cannot be
> + * removed until the initial domain has been started.
> + */
> + write_unlock(&d->vpci_rwlock);
> process_pending_softirqs();
> + write_lock(&d->vpci_rwlock);
> + }
> +
> rangeset_destroy(mem);
> if ( !rc )
> modify_decoding(pdev, cmd, false);
> @@ -213,6 +227,7 @@ static void defer_map(struct domain *d, struct pci_dev *pdev,
> raise_softirq(SCHEDULE_SOFTIRQ);
> }
>
> +/* This must hold domain's vpci_rwlock in write mode. */
Why must it be in write mode?
Is this to avoid ABBA deadlocks, as not taking the per-domain lock in
write mode would then force us to take each pdev->vpci->lock in order
to prevent concurrent modifications of remote devices?
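For reference, the interleaving I would be worried about if the lock
was only taken in read mode and the overlap check had to take the
remote device's vpci->lock (hypothetical, not what the patch does):

    CPU#0: modify_bars(pdevA)         CPU#1: modify_bars(pdevB)
      read_lock(&d->vpci_rwlock);       read_lock(&d->vpci_rwlock);
      spin_lock(&pdevA->vpci->lock);    spin_lock(&pdevB->vpci->lock);
      spin_lock(&pdevB->vpci->lock);    spin_lock(&pdevA->vpci->lock);

Each CPU then waits forever for the spin lock the other already holds.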
> static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> {
> struct vpci_header *header = &pdev->vpci->header;
> @@ -287,6 +302,7 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> * Check for overlaps with other BARs. Note that only BARs that are
> * currently mapped (enabled) are checked for overlaps.
> */
> + pcidevs_lock();
This can be problematic I'm afraid, as it's a guest-controlled path
that allows applying unrestricted contention on the pcidevs lock. I'm
unsure whether multiple guests would be able to trigger the watchdog
if given enough devices/vcpus to perform enough simultaneous calls to
modify_bars().
Maybe you could expand the per-domain vpci lock to also protect
modifications of domain->pdev_list?
IOW: kind of a per-domain pdev_lock, as sketched below.
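A minimal sketch of what I mean (untested; pdev_lock is a hypothetical
new field, not part of this patch):

    struct domain {
        ...
    #ifdef CONFIG_HAS_PCI
        struct list_head pdev_list;
        /*
         * Protects pdev_list, and could subsume vpci_rwlock, so that
         * modify_bars() can walk the list without the pcidevs lock.
         */
        rwlock_t pdev_lock;
    #endif
        ...
    };

The for_each_pdev() loop in modify_bars() would then be covered by the
per-domain lock the function already holds in write mode, keeping the
contention per-domain instead of global.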
> for_each_pdev ( pdev->domain, tmp )
> {
> if ( tmp == pdev )
> @@ -326,10 +342,12 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
> printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: %d\n",
> start, end, rc);
> rangeset_destroy(mem);
> + pcidevs_unlock();
> return rc;
> }
> }
> }
> + pcidevs_unlock();
>
> ASSERT(dev);
>
> @@ -482,6 +500,8 @@ static int cf_check init_bars(struct pci_dev *pdev)
> struct vpci_bar *bars = header->bars;
> int rc;
>
> + ASSERT(rw_is_write_locked(&pdev->domain->vpci_rwlock));
> +
> switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f )
> {
> case PCI_HEADER_TYPE_NORMAL:
> @@ -570,6 +590,8 @@ static int cf_check init_bars(struct pci_dev *pdev)
> }
> }
>
> + ASSERT(!header->rom_reg);
> +
> /* Check expansion ROM. */
> rc = pci_size_mem_bar(pdev->sbdf, rom_reg, &addr, &size, PCI_BAR_ROM);
> if ( rc > 0 && size )
> @@ -586,12 +608,42 @@ static int cf_check init_bars(struct pci_dev *pdev)
> 4, rom);
> if ( rc )
> rom->type = VPCI_BAR_EMPTY;
> +
> + header->rom_reg = rom_reg;
> }
>
> return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, cmd, false) : 0;
> }
> REGISTER_VPCI_INIT(init_bars, VPCI_PRIORITY_MIDDLE);
>
> +static bool overlap(unsigned int r1_offset, unsigned int r1_size,
> + unsigned int r2_offset, unsigned int r2_size)
> +{
> + /* Return true if there is an overlap. */
> + return r1_offset < r2_offset + r2_size && r2_offset < r1_offset + r1_size;
Hm, we already have a vpci_register_cmp(), which does a very similar
comparison. I wonder if it would be possible to somehow use that
here.
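AFAICT overlap() is the same check as the "registers overlap" case of
vpci_register_cmp(), so something like the following could work
(untested, and would require making vpci_register_cmp() non-static or
moving overlap() into vpci.c):

    static bool overlap(unsigned int r1_offset, unsigned int r1_size,
                        unsigned int r2_offset, unsigned int r2_size)
    {
        const struct vpci_register r1 = { .offset = r1_offset, .size = r1_size };
        const struct vpci_register r2 = { .offset = r2_offset, .size = r2_size };

        /* vpci_register_cmp() returns 0 iff the two registers overlap. */
        return vpci_register_cmp(&r1, &r2) == 0;
    }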
> +}
> +
> +bool vpci_header_need_write_lock(const struct pci_dev *pdev,
> + unsigned int start, unsigned int size)
> +{
> + /*
> + * Writing the command register and ROM BAR register may trigger
> + * modify_bars to run, which in turn may access multiple pdevs while
> + * checking for the existing BAR's overlap. The overlapping check, if done
> + * under the read lock, requires vpci->lock to be acquired on both devices
> + * being compared, which may produce a deadlock. At the same time it is not
> + * possible to upgrade read lock to write lock in such a case.
> + * Check which registers are going to be written and return true if lock
> + * needs to be acquired in write mode.
> + */
> + if ( overlap(start, size, PCI_COMMAND, 2) ||
> + (pdev->vpci->header.rom_reg &&
> + overlap(start, size, pdev->vpci->header.rom_reg, 4)) )
> + return true;
> +
> + return false;
> +}
> +
> /*
> * Local variables:
> * mode: C
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index 8f2b59e61a..e7d1f916a0 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -190,6 +190,8 @@ static int cf_check init_msi(struct pci_dev *pdev)
> uint16_t control;
> int ret;
>
> + ASSERT(rw_is_write_locked(&pdev->domain->vpci_rwlock));
> +
> if ( !pos )
> return 0;
>
> @@ -265,7 +267,7 @@ REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
>
> void vpci_dump_msi(void)
> {
> - const struct domain *d;
> + struct domain *d;
>
> rcu_read_lock(&domlist_read_lock);
> for_each_domain ( d )
> @@ -277,6 +279,15 @@ void vpci_dump_msi(void)
>
> printk("vPCI MSI/MSI-X d%d\n", d->domain_id);
>
> + if ( !read_trylock(&d->vpci_rwlock) )
> + continue;
> +
> + if ( !pcidevs_trylock() )
> + {
> + read_unlock(&d->vpci_rwlock);
> + continue;
> + }
> +
> for_each_pdev ( d, pdev )
> {
> const struct vpci_msi *msi;
> @@ -318,14 +329,22 @@ void vpci_dump_msi(void)
> * holding the lock.
> */
> printk("unable to print all MSI-X entries: %d\n", rc);
> - process_pending_softirqs();
> - continue;
> + goto pdev_done;
> }
> }
>
> spin_unlock(&pdev->vpci->lock);
> + pdev_done:
> + pcidevs_unlock();
> + read_unlock(&d->vpci_rwlock);
> +
> process_pending_softirqs();
> +
> + read_lock(&d->vpci_rwlock);
> + pcidevs_lock();
> }
> + pcidevs_unlock();
> + read_unlock(&d->vpci_rwlock);
> }
> rcu_read_unlock(&domlist_read_lock);
> }
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index 25bde77586..b5a7dfdf9c 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -143,6 +143,7 @@ static void cf_check control_write(
> pci_conf_write16(pdev->sbdf, reg, val);
> }
>
> +/* This must hold domain's vpci_rwlock in write mode. */
> static struct vpci_msix *msix_find(const struct domain *d, unsigned long addr)
> {
> struct vpci_msix *msix;
> @@ -163,7 +164,13 @@ static struct vpci_msix *msix_find(const struct domain *d, unsigned long addr)
>
> static int cf_check msix_accept(struct vcpu *v, unsigned long addr)
> {
> - return !!msix_find(v->domain, addr);
> + int rc;
> +
> + read_lock(&v->domain->vpci_rwlock);
> + rc = !!msix_find(v->domain, addr);
> + read_unlock(&v->domain->vpci_rwlock);
> +
> + return rc;
> }
>
> static bool access_allowed(const struct pci_dev *pdev, unsigned long addr,
> @@ -358,21 +365,34 @@ static int adjacent_read(const struct domain *d, const struct vpci_msix *msix,
> static int cf_check msix_read(
> struct vcpu *v, unsigned long addr, unsigned int len, unsigned long *data)
> {
> - const struct domain *d = v->domain;
> - struct vpci_msix *msix = msix_find(d, addr);
> + struct domain *d = v->domain;
> + struct vpci_msix *msix;
> const struct vpci_msix_entry *entry;
> unsigned int offset;
>
> *data = ~0ul;
>
> + read_lock(&d->vpci_rwlock);
> +
> + msix = msix_find(d, addr);
> if ( !msix )
> + {
> + read_unlock(&d->vpci_rwlock);
> return X86EMUL_RETRY;
> + }
>
> if ( adjacent_handle(msix, addr) )
> - return adjacent_read(d, msix, addr, len, data);
> + {
> + int rc = adjacent_read(d, msix, addr, len, data);
> + read_unlock(&d->vpci_rwlock);
> + return rc;
> + }
>
> if ( !access_allowed(msix->pdev, addr, len) )
> + {
> + read_unlock(&d->vpci_rwlock);
> return X86EMUL_OKAY;
> + }
>
> spin_lock(&msix->pdev->vpci->lock);
> entry = get_entry(msix, addr);
> @@ -404,6 +424,7 @@ static int cf_check msix_read(
> break;
> }
> spin_unlock(&msix->pdev->vpci->lock);
> + read_unlock(&d->vpci_rwlock);
>
> return X86EMUL_OKAY;
> }
> @@ -491,19 +512,32 @@ static int adjacent_write(const struct domain *d, const struct vpci_msix *msix,
> static int cf_check msix_write(
> struct vcpu *v, unsigned long addr, unsigned int len, unsigned long data)
> {
> - const struct domain *d = v->domain;
> - struct vpci_msix *msix = msix_find(d, addr);
> + struct domain *d = v->domain;
> + struct vpci_msix *msix;
> struct vpci_msix_entry *entry;
> unsigned int offset;
>
> + read_lock(&d->vpci_rwlock);
> +
> + msix = msix_find(d, addr);
> if ( !msix )
> + {
> + read_unlock(&d->vpci_rwlock);
> return X86EMUL_RETRY;
> + }
>
> if ( adjacent_handle(msix, addr) )
> - return adjacent_write(d, msix, addr, len, data);
> + {
> + int rc = adjacent_write(d, msix, addr, len, data);
> + read_unlock(&d->vpci_rwlock);
> + return rc;
> + }
>
> if ( !access_allowed(msix->pdev, addr, len) )
> + {
> + read_unlock(&d->vpci_rwlock);
> return X86EMUL_OKAY;
> + }
>
> spin_lock(&msix->pdev->vpci->lock);
> entry = get_entry(msix, addr);
> @@ -579,6 +613,7 @@ static int cf_check msix_write(
> break;
> }
> spin_unlock(&msix->pdev->vpci->lock);
> + read_unlock(&d->vpci_rwlock);
>
> return X86EMUL_OKAY;
> }
> @@ -665,6 +700,8 @@ static int cf_check init_msix(struct pci_dev *pdev)
> struct vpci_msix *msix;
> int rc;
>
> + ASSERT(rw_is_write_locked(&pdev->domain->vpci_rwlock));
> +
> msix_offset = pci_find_cap_offset(pdev->seg, pdev->bus, slot, func,
> PCI_CAP_ID_MSIX);
> if ( !msix_offset )
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 652807a4a4..1270174e78 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -38,20 +38,32 @@ extern vpci_register_init_t *const __end_vpci_array[];
>
> void vpci_remove_device(struct pci_dev *pdev)
> {
> - if ( !has_vpci(pdev->domain) || !pdev->vpci )
> + struct vpci *vpci;
> +
> + if ( !has_vpci(pdev->domain) )
> return;
>
> - spin_lock(&pdev->vpci->lock);
> + write_lock(&pdev->domain->vpci_rwlock);
> + if ( !pdev->vpci )
> + {
> + write_unlock(&pdev->domain->vpci_rwlock);
> + return;
> + }
> +
> + vpci = pdev->vpci;
> + pdev->vpci = NULL;
> + write_unlock(&pdev->domain->vpci_rwlock);
> +
> - while ( !list_empty(&pdev->vpci->handlers) )
> + while ( !list_empty(&vpci->handlers) )
> {
> - struct vpci_register *r = list_first_entry(&pdev->vpci->handlers,
> + struct vpci_register *r = list_first_entry(&vpci->handlers,
> struct vpci_register,
> node);
>
> list_del(&r->node);
> xfree(r);
> }
> - spin_unlock(&pdev->vpci->lock);
> +
> - if ( pdev->vpci->msix )
> + if ( vpci->msix )
> {
> unsigned int i;
> @@ -61,29 +73,33 @@ void vpci_remove_device(struct pci_dev *pdev)
> - if ( pdev->vpci->msix->table[i] )
> - iounmap(pdev->vpci->msix->table[i]);
> + if ( vpci->msix->table[i] )
> + iounmap(vpci->msix->table[i]);
> }
> - xfree(pdev->vpci->msix);
> - xfree(pdev->vpci->msi);
> - xfree(pdev->vpci);
> - pdev->vpci = NULL;
> + xfree(vpci->msix);
> + xfree(vpci->msi);
> + xfree(vpci);
> }
>
> int vpci_add_handlers(struct pci_dev *pdev)
> {
> + struct vpci *vpci;
> unsigned int i;
> int rc = 0;
>
> if ( !has_vpci(pdev->domain) )
> return 0;
>
> + vpci = xzalloc(struct vpci);
> + if ( !vpci )
> + return -ENOMEM;
> +
> + INIT_LIST_HEAD(&vpci->handlers);
> + spin_lock_init(&vpci->lock);
> +
> + write_lock(&pdev->domain->vpci_rwlock);
I think the usage of the vpci per-domain lock here and in
vpci_remove_device() creates an ABBA deadlock situation with the usage
of it in modify_bars().
Both vpci_add_handlers() and vpci_remove_device() get called with the
pcidevs lock taken and take the per-domain vpci lock afterwards, while
modify_bars() does the inverse locking: it gets called with the
per-domain vpci lock taken and then takes the pcidevs lock inside the
function.
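To illustrate the inverted lock order described above:

    vpci_add_handlers() /
    vpci_remove_device() callers:     modify_bars():
      pcidevs_lock();                   write_lock(&d->vpci_rwlock);
      write_lock(&d->vpci_rwlock);      pcidevs_lock();

One path ends up holding the pcidevs lock while waiting for the
rwlock, while the other holds the rwlock and waits for the pcidevs
lock.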
> +
> /* We should not get here twice for the same device. */
> ASSERT(!pdev->vpci);
>
> - pdev->vpci = xzalloc(struct vpci);
> - if ( !pdev->vpci )
> - return -ENOMEM;
> -
> - INIT_LIST_HEAD(&pdev->vpci->handlers);
> - spin_lock_init(&pdev->vpci->lock);
> + pdev->vpci = vpci;
>
> for ( i = 0; i < NUM_VPCI_INIT; i++ )
> {
> @@ -91,6 +107,7 @@ int vpci_add_handlers(struct pci_dev *pdev)
> if ( rc )
> break;
> }
> + write_unlock(&pdev->domain->vpci_rwlock);
>
> if ( rc )
> vpci_remove_device(pdev);
> @@ -139,6 +156,7 @@ uint32_t cf_check vpci_hw_read32(
> return pci_conf_read32(pdev->sbdf, reg);
> }
>
> +/* This must hold domain's vpci_rwlock in write mode. */
> int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
> vpci_write_t *write_handler, unsigned int offset,
> unsigned int size, void *data)
> @@ -162,8 +180,6 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
> r->offset = offset;
> r->private = data;
>
> - spin_lock(&vpci->lock);
If you remove the lock here we need to assert that the per-domain vpci
lock is taken in write mode.
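AFAICT adding such an assertion would require the function to gain
access to the owner domain, e.g. by passing the pdev instead of the
vpci struct (hypothetical prototype change, untested):

    int vpci_add_register(const struct pci_dev *pdev,
                          vpci_read_t *read_handler,
                          vpci_write_t *write_handler, unsigned int offset,
                          unsigned int size, void *data)
    {
        struct vpci *vpci = pdev->vpci;

        ASSERT(rw_is_write_locked(&pdev->domain->vpci_rwlock));
        /* ... rest of the function unchanged ... */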
> -
> /* The list of handlers must be kept sorted at all times. */
> list_for_each ( prev, &vpci->handlers )
> {
> @@ -175,25 +191,23 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
> break;
> if ( cmp == 0 )
> {
> - spin_unlock(&vpci->lock);
> xfree(r);
> return -EEXIST;
> }
> }
>
> list_add_tail(&r->node, prev);
> - spin_unlock(&vpci->lock);
>
> return 0;
> }
>
> +/* This must hold domain's vpci_rwlock in write mode. */
> int vpci_remove_register(struct vpci *vpci, unsigned int offset,
> unsigned int size)
> {
> const struct vpci_register r = { .offset = offset, .size = size };
> struct vpci_register *rm;
>
> - spin_lock(&vpci->lock);
> list_for_each_entry ( rm, &vpci->handlers, node )
> {
> int cmp = vpci_register_cmp(&r, rm);
> @@ -205,14 +219,12 @@ int vpci_remove_register(struct vpci *vpci, unsigned int offset,
> if ( !cmp && rm->offset == offset && rm->size == size )
> {
> list_del(&rm->node);
> - spin_unlock(&vpci->lock);
> xfree(rm);
> return 0;
> }
> if ( cmp <= 0 )
> break;
> }
> - spin_unlock(&vpci->lock);
Same here about the per-domain lock being taken.
>
> return -ENOENT;
> }
> @@ -320,7 +332,7 @@ static uint32_t merge_result(uint32_t data, uint32_t new, unsigned int size,
>
> uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> {
> - const struct domain *d = current->domain;
> + struct domain *d = current->domain;
> const struct pci_dev *pdev;
> const struct vpci_register *r;
> unsigned int data_offset = 0;
> @@ -333,10 +345,18 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size)
> }
>
> /* Find the PCI dev matching the address. */
> + pcidevs_lock();
> pdev = pci_get_pdev(d, sbdf);
> - if ( !pdev || !pdev->vpci )
> + pcidevs_unlock();
I think it's worth looking into expanding the per-domain vpci lock to
a pdevs lock or similar, so that it also protects pdev_list if
possible and we can avoid taking the pcidevs lock here.
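I.e. something like the following here (sketch, assuming such an
expanded per-domain lock also covers d->pdev_list):

    /* Find the PCI dev matching the address. */
    read_lock(&d->vpci_rwlock);
    pdev = pci_get_pdev(d, sbdf);
    if ( !pdev || !pdev->vpci )
    {
        read_unlock(&d->vpci_rwlock);
        return vpci_read_hw(sbdf, reg, size);
    }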
Thanks, Roger.