Re: [PATCH] vpci: introduce per-domain lock to protect vpci structure
On 14.02.22 13:25, Roger Pau Monné wrote:
> On Mon, Feb 14, 2022 at 11:15:27AM +0000, Oleksandr Andrushchenko wrote:
>>
>> On 14.02.22 13:11, Roger Pau Monné wrote:
>>> On Mon, Feb 14, 2022 at 10:53:43AM +0000, Oleksandr Andrushchenko wrote:
>>>> On 14.02.22 12:34, Roger Pau Monné wrote:
>>>>> On Mon, Feb 14, 2022 at 09:36:39AM +0000, Oleksandr Andrushchenko wrote:
>>>>>> On 11.02.22 13:40, Roger Pau Monné wrote:
>>>>>>> +
>>>>>>>>>>           for ( i = 0; i < msix->max_entries; i++ )
>>>>>>>>>>           {
>>>>>>>>>>               const struct vpci_msix_entry *entry = &msix->entries[i];
>>>>>>>>> Since this function is now called with the per-domain rwlock read
>>>>>>>>> locked it's likely not appropriate to call process_pending_softirqs
>>>>>>>>> while holding such lock (check below).
>>>>>>>> You are right, as it is possible that:
>>>>>>>>
>>>>>>>> process_pending_softirqs -> vpci_process_pending -> read_lock
>>>>>>>>
>>>>>>>> Even more, vpci_process_pending may also
>>>>>>>>
>>>>>>>> read_unlock -> vpci_remove_device -> write_lock
>>>>>>>>
>>>>>>>> in its error path. So, any invocation of process_pending_softirqs
>>>>>>>> must not hold d->vpci_rwlock at least.
>>>>>>>>
>>>>>>>> And also we need to check that pdev->vpci was not removed
>>>>>>>> in between or *re-created*
>>>>>>>>> We will likely need to re-iterate over the list of pdevs assigned to
>>>>>>>>> the domain and assert that the pdev is still assigned to the same
>>>>>>>>> domain.
>>>>>>>> So, do you mean a pattern like the below should be used at all
>>>>>>>> places where we need to call process_pending_softirqs?
>>>>>>>>
>>>>>>>> read_unlock
>>>>>>>> process_pending_softirqs
>>>>>>>> read_lock
>>>>>>>> pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
>>>>>>>> if ( pdev && pdev->vpci && is_the_same_vpci(pdev->vpci) )
>>>>>>>> <continue processing>
>>>>>>> Something along those lines. You likely need to continue iterating using
>>>>>>> for_each_pdev.
>>>>>> How do we tell if pdev->vpci is the same? Jan has already brought
>>>>>> this question before [1] and I was about to use some ID for that purpose:
>>>>>> pdev->vpci->id = d->vpci_id++ and then we use pdev->vpci->id for checks
>>>>> Given this is a debug message I would be OK with just doing the
>>>>> minimal checks to prevent Xen from crashing (ie: pdev->vpci exists)
>>>>> and that the resume MSI entry is not past the current limit. Otherwise
>>>>> just print a message and move on to the next device.
>>>> Agree, I see no big issue (probably) if we are not able to print
>>>>
>>>> How about this one:
>>>>
>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>>>> index 809a6b4773e1..50373f04da82 100644
>>>> --- a/xen/drivers/vpci/header.c
>>>> +++ b/xen/drivers/vpci/header.c
>>>> @@ -171,10 +171,31 @@ static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
>>>>                               struct rangeset *mem, uint16_t cmd)
>>>>  {
>>>>      struct map_data data = { .d = d, .map = true };
>>>> +    pci_sbdf_t sbdf = pdev->sbdf;
>>>>      int rc;
>>>>
>>>> +    ASSERT(rw_is_write_locked(&pdev->domain->vpci_rwlock));
>>>> +
>>>>      while ( (rc = rangeset_consume_ranges(mem, map_range, &data)) == -ERESTART )
>>>> +    {
>>>> +
>>>> +        /*
>>>> +         * process_pending_softirqs may trigger vpci_process_pending which
>>>> +         * may need to acquire pdev->domain->vpci_rwlock in read mode.
>>>> +         */
>>>> +        write_unlock(&pdev->domain->vpci_rwlock);
>>>>          process_pending_softirqs();
>>>> +        write_lock(&pdev->domain->vpci_rwlock);
>>>> +
>>>> +        /* Check if pdev still exists and vPCI was not removed or re-created. */
>>>> +        if ( pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn) != pdev )
>>>> +            if ( vpci is NOT the same )
>>>> +            {
>>>> +                rc = 0;
>>>> +                break;
>>>> +            }
>>>> +    }
>>>> +
>>>>      rangeset_destroy(mem);
>>>>      if ( !rc )
>>>>          modify_decoding(pdev, cmd, false);
>>>>
>>>> This one also wants process_pending_softirqs to run so it *might*
>>>> want pdev and vpci checks. But at the same time apply_map runs
>>>> at ( system_state < SYS_STATE_active ), so defer_map won't be
>>>> running yet, thus no vpci_process_pending is possible yet (in the
>>>> sense that it has nothing to do yet). So, I think we just need:
>>>>
>>>> write_unlock(&pdev->domain->vpci_rwlock);
>>>> process_pending_softirqs();
>>>> write_lock(&pdev->domain->vpci_rwlock);
>>>>
>>>> and this should be enough
>>> Given the context apply_map is called from (dom0 specific init code),
>>> there's no need to check for the pdev to still exist, or whether vpci
>>> has been recreated, as it's not possible. Just add a comment to
>>> explicitly note that the context of the function is special, and thus
>>> there's no possibility of either the device or vpci going away.
>> Does it really need write_unlock/write_lock given the context?...
> I think it's bad practice to call process_pending_softirqs while
> holding any locks. This is a very specific context so it's likely fine
> to not drop the lock, but would still seem incorrect to me.
Ok
>
>> I think it doesn't as there is no chance defer_map is called, thus
>> process_pending_softirqs -> vpci_process_pending -> read_lock
> Indeed, there's no chance of that because process_pending_softirqs
> will never try to do a scheduling operation that would result in our
> context being scheduled out.

while ( (rc = rangeset_consume_ranges(mem, map_range, &data)) == -ERESTART )
{
    /*
     * FIXME: Given the context apply_map is called from (dom0 specific
     * init code at system_state < SYS_STATE_active) it is not strictly
     * required that pdev->domain->vpci_rwlock is unlocked before calling
     * process_pending_softirqs as there is no contention possible between
     * this code and vpci_process_pending trying to acquire the lock in
     * read mode. But running process_pending_softirqs with any lock held
     * doesn't seem to be a good practice, so drop the lock and re-acquire
     * it right away.
     */
    write_unlock(&pdev->domain->vpci_rwlock);
    process_pending_softirqs();
    write_lock(&pdev->domain->vpci_rwlock);
}

Will this be good enough?

>
> Thanks, Roger.
>
Thank you,
Oleksandr
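For illustration, the unlock/re-validate pattern discussed above, applied to an
MSI-X dump-style loop, could look roughly like the sketch below. This is only a
hypothetical outline, not part of the patch: it assumes the per-domain
d->vpci_rwlock introduced by this series, the pci_get_pdev_by_domain() lookup
and the msix fields quoted in the thread, and an arbitrary softirq-processing
interval of 64 entries.

/*
 * Hypothetical sketch only: dump MSI-X entries of a device assigned to d,
 * periodically processing softirqs. The per-domain vPCI lock is dropped
 * around process_pending_softirqs() and the device is re-validated after
 * re-acquiring it, since it may have been removed or its vPCI state
 * removed/re-created in the meantime.
 */
static void dump_msix(struct domain *d, pci_sbdf_t sbdf)
{
    const struct pci_dev *pdev;
    unsigned int i = 0;

    read_lock(&d->vpci_rwlock);

    pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);

    while ( pdev && pdev->vpci && pdev->vpci->msix &&
            i < pdev->vpci->msix->max_entries )
    {
        const struct vpci_msix_entry *entry = &pdev->vpci->msix->entries[i++];

        printk("  entry %u: addr %#" PRIx64 "\n", i - 1, entry->addr);

        if ( !(i % 64) )
        {
            /* Drop the lock before processing softirqs, as discussed above. */
            read_unlock(&d->vpci_rwlock);
            process_pending_softirqs();
            read_lock(&d->vpci_rwlock);

            /*
             * Look the pdev up again: the device may no longer be assigned
             * to d, or pdev->vpci may have been removed or re-created while
             * the lock was dropped. The loop condition re-checks vpci, msix
             * and the entry index against the (possibly new) limit.
             */
            pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
        }
    }

    read_unlock(&d->vpci_rwlock);
}

Re-fetching the pdev after re-acquiring the lock covers both the device being
removed and vPCI being re-created, and re-checking the index against
max_entries matches the "not past the current limit" check suggested above.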