[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] [RFC] vpci: allow BAR write while mapped
On 3/13/25 13:48, Roger Pau Monné wrote: > On Wed, Mar 12, 2025 at 03:50:17PM -0400, Stewart Hildebrand wrote: >> Xen vPCI refuses BAR writes if the BAR is mapped in p2m. If firmware >> initialized the BAR to a bad address, Linux will try to write a new >> address to the BAR without disabling memory decoding. Allow the write >> by updating p2m right away in the vPCI BAR write handler. > > IIRC it's only 32bit BARs that Linux will attempt to reposition > without toggling memory decoding off. This matches my observations as well. > For 64bit BARs it will in > general (unless pci_dev->mmio_always_on is set) toggle memory decoding > off and then update the BAR registers. > >> >> Resolves: https://gitlab.com/xen-project/xen/-/issues/197 >> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@xxxxxxx> >> --- >> RFC: Currently the deferred mapping machinery supports only map or >> unmap, not both. It might be better to rework the mapping machinery >> to support unmap-then-map operations, but please let me know your >> thoughts. >> RFC: This patch has not yet made an attempt to distinguish between >> 32-bit and 64-bit writes. It probably should. >> --- >> xen/drivers/vpci/header.c | 65 +++++++++++++++++++++++++++++++-------- >> 1 file changed, 53 insertions(+), 12 deletions(-) >> >> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c >> index ef6c965c081c..66adb2183cfe 100644 >> --- a/xen/drivers/vpci/header.c >> +++ b/xen/drivers/vpci/header.c >> @@ -173,7 +173,7 @@ static void modify_decoding(const struct pci_dev *pdev, >> uint16_t cmd, >> ASSERT_UNREACHABLE(); >> } >> >> -bool vpci_process_pending(struct vcpu *v) >> +static bool process_pending(struct vcpu *v, bool need_lock) >> { >> struct pci_dev *pdev = v->vpci.pdev; >> struct vpci_header *header = NULL; >> @@ -182,12 +182,14 @@ bool vpci_process_pending(struct vcpu *v) >> if ( !pdev ) >> return false; >> >> - read_lock(&v->domain->pci_lock); >> + if ( need_lock ) >> + read_lock(&v->domain->pci_lock); > > The addition of need_lock would better be done in a pre-patch. > >> >> if ( !pdev->vpci || (v->domain != pdev->domain) ) >> { >> v->vpci.pdev = NULL; >> - read_unlock(&v->domain->pci_lock); >> + if ( need_lock ) >> + read_unlock(&v->domain->pci_lock); >> return false; >> } >> >> @@ -209,17 +211,20 @@ bool vpci_process_pending(struct vcpu *v) >> >> if ( rc == -ERESTART ) >> { >> - read_unlock(&v->domain->pci_lock); >> + if ( need_lock ) >> + read_unlock(&v->domain->pci_lock); >> return true; >> } >> >> if ( rc ) >> { >> - spin_lock(&pdev->vpci->lock); >> + if ( need_lock ) >> + spin_lock(&pdev->vpci->lock); >> /* Disable memory decoding unconditionally on failure. */ >> modify_decoding(pdev, v->vpci.cmd & ~PCI_COMMAND_MEMORY, >> false); >> - spin_unlock(&pdev->vpci->lock); >> + if ( need_lock ) >> + spin_unlock(&pdev->vpci->lock); >> >> /* Clean all the rangesets */ >> for ( i = 0; i < ARRAY_SIZE(header->bars); i++ ) >> @@ -228,7 +233,8 @@ bool vpci_process_pending(struct vcpu *v) >> >> v->vpci.pdev = NULL; >> >> - read_unlock(&v->domain->pci_lock); >> + if ( need_lock ) >> + read_unlock(&v->domain->pci_lock); >> >> if ( !is_hardware_domain(v->domain) ) >> domain_crash(v->domain); >> @@ -238,15 +244,23 @@ bool vpci_process_pending(struct vcpu *v) >> } >> v->vpci.pdev = NULL; >> >> - spin_lock(&pdev->vpci->lock); >> + if ( need_lock ) >> + spin_lock(&pdev->vpci->lock); >> modify_decoding(pdev, v->vpci.cmd, v->vpci.rom_only); >> - spin_unlock(&pdev->vpci->lock); >> + if ( need_lock ) >> + spin_unlock(&pdev->vpci->lock); >> >> - read_unlock(&v->domain->pci_lock); >> + if ( need_lock ) >> + read_unlock(&v->domain->pci_lock); >> >> return false; >> } >> >> +bool vpci_process_pending(struct vcpu *v) >> +{ >> + return process_pending(v, true); >> +} >> + >> static int __init apply_map(struct domain *d, const struct pci_dev *pdev, >> uint16_t cmd) >> { >> @@ -565,6 +579,8 @@ static void cf_check bar_write( >> { >> struct vpci_bar *bar = data; >> bool hi = false; >> + bool reenable = false; >> + uint32_t cmd = 0; >> >> ASSERT(is_hardware_domain(pdev->domain)); >> >> @@ -585,10 +601,31 @@ static void cf_check bar_write( >> { >> /* If the value written is the current one avoid printing a >> warning. */ >> if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) ) >> + { >> gprintk(XENLOG_WARNING, >> - "%pp: ignored BAR %zu write while mapped\n", >> + "%pp: allowing BAR %zu write while mapped\n", >> &pdev->sbdf, bar - pdev->vpci->header.bars + hi); > > If Xen now handles BARs writes with memory decoding enabled the > message can be removed. It's only purpose was to signal this missing > handling. OK >> - return; >> + ASSERT(rw_is_write_locked(&pdev->domain->pci_lock)); >> + ASSERT(spin_is_locked(&pdev->vpci->lock)); >> + reenable = true; >> + cmd = pci_conf_read16(pdev->sbdf, PCI_COMMAND); >> + /* >> + * Write-while-mapped: unmap the old BAR in p2m. We want this to >> + * finish right away since the deferral machinery only supports >> + * unmap OR map, not unmap-then-remap. Ultimately, it probably >> would >> + * be better to defer the write-while-mapped case just like >> regular >> + * BAR writes (but still only allow it for 32-bit BAR writes). >> + */ >> + /* Disable memory decoding */ >> + modify_bars(pdev, cmd & ~PCI_COMMAND_MEMORY, false); > > I think if the guest explicitly avoids toggling memory decoding Xen > should also to the same, and not touch the bit. OK >> + /* Call process pending here to ensure P2M operations are done >> */ >> + while ( process_pending(current, false) ) >> + { >> + /* Pre-empted, try again */ >> + } > > I'm afraid this is not how I would expect this to be done. We > explicitly do the p2m modifications in a deferred context to avoid > long running operations. We should continue to do so to perform this > unmap + map operation. > > I think you need to introduce a way to queue an operation that will do > a map + unmap in the deferred context processing, or signal that after > the currently queued operation finishes a new call to modify_bars() > should be issued. Yep, this makes sense. Will do. > > It would be nice if we had a more generic way to queue guest vCPU p2m > (map and unmap) operations, but that's likely to require a much better > interface than what we currently have. > > Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |