Re: [PATCH v2 2/3] x86/hvm: Allow writes to registers on the same page as MSI-X table
On Tue, Mar 28, 2023 at 01:28:44PM +0200, Roger Pau Monné wrote:
> On Sat, Mar 25, 2023 at 03:49:23AM +0100, Marek Marczykowski-Górecki wrote:
> > Some devices (notably the Intel Wifi 6 AX210 card) keep auxiliary registers
> > on the same page as the MSI-X table. The device model (especially one in a
> > stubdomain) cannot really handle those, as direct writes to that page are
> > refused (the page is on the mmio_ro_ranges list). Instead, add an internal
> > ioreq server that handles those writes.
> >
> > Doing this requires correlating the write location with the guest's view
> > of the MSI-X table address. Since QEMU doesn't map the MSI-X table to the
> > guest, it requires msixtbl_entry->gtable, which is HVM-only. A similar
> > feature for PV would need to be done separately.
> >
> > This can also be used to read the Pending Bit Array if it lives on the same
> > page, so QEMU doesn't need /dev/mem access at all (especially helpful
> > with lockdown enabled in dom0). If the PBA lives on another page, QEMU will
> > map it to the guest directly.
> > If the PBA lives on the same page, forbid writes and log a message.
> > Technically, writes outside of the PBA could be allowed, but at this moment
> > the precise location of the PBA isn't saved.
> >
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > ---
> > v2:
> > - adjust commit message
> > - pass struct domain to msixtbl_page_handler_get_hwaddr()
> > - reduce local variables used only once
> > - log a warning if a write is forbidden because the MSI-X table and PBA
> >   live on the same page
> > - do not passthrough unaligned accesses
> > - handle accesses both before and after MSI-X table
> > ---
> > xen/arch/x86/hvm/vmsi.c | 154 +++++++++++++++++++++++++++++++++
> > xen/arch/x86/include/asm/msi.h | 5 ++
> > xen/arch/x86/msi.c | 38 ++++++++
> > 3 files changed, 197 insertions(+)
> >
> > diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> > index 9c82bf9b4ec2..9293009a4075 100644
> > --- a/xen/arch/x86/hvm/vmsi.c
> > +++ b/xen/arch/x86/hvm/vmsi.c
> > @@ -438,6 +438,152 @@ static const struct hvm_io_ops msixtbl_mmio_ops = {
> > .write = _msixtbl_write,
> > };
> >
> > +static void __iomem *msixtbl_page_handler_get_hwaddr(
> > + const struct domain *d,
> > + uint64_t address,
> > + bool write)
> > +{
> > + const struct pci_dev *pdev = NULL;
> > + const struct msixtbl_entry *entry;
> > + int adj_type;
>
> unsigned AFAICT.
Ok.
> > +
> > + rcu_read_lock(&msixtbl_rcu_lock);
> > + /*
> > + * Check if it's on the same page as the end of the MSI-X table, but
> > + * outside of the table itself.
> > + */
> > + list_for_each_entry( entry, &d->arch.hvm.msixtbl_list, list )
> > + {
> > + if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable) &&
> > + address < entry->gtable )
> > + {
> > + adj_type = ADJ_IDX_FIRST;
> > + pdev = entry->pdev;
> > + break;
> > + }
> > + if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable + entry->table_len) &&
>
> Should be entry->gtable + entry->table_len - 1, or else you are
> off-by-one.
Ok.
> > + address >= entry->gtable + entry->table_len )
> > + {
> > + adj_type = ADJ_IDX_LAST;
> > + pdev = entry->pdev;
> > + break;
> > + }
> > + }
> > + rcu_read_unlock(&msixtbl_rcu_lock);
> > +
> > + if ( !pdev )
> > + return NULL;
> > +
> > + ASSERT(pdev->msix);
> > +
> > + if ( !pdev->msix->adj_access_table_idx[adj_type] )
> > + {
> > + gdprintk(XENLOG_WARNING,
> > + "Page for adjacent MSI-X table access not initialized for %pp\n",
> > + pdev);
> > +
> > + return NULL;
> > + }
> > +
> > + /* If PBA lives on the same page too, forbid writes. */
> > + if ( write && (
> > + (adj_type == ADJ_IDX_LAST &&
> > + pdev->msix->table.last == pdev->msix->pba.first) ||
> > + (adj_type == ADJ_IDX_FIRST &&
> > + pdev->msix->table.first == pdev->msix->pba.last)) )
> > + {
> > + gdprintk(XENLOG_WARNING,
> > + "MSI-X table and PBA of %pp live on the same page, "
> > + "writing to other registers there is not implemented\n",
> > + pdev);
>
> Don't you actually need to check that the passed address falls into the
> PBA array? PBA can be sharing the same page as the MSI-X table, but
> that doesn't imply there aren't holes also not used by either the PBA
> or the MSI-X table in such page.
I don't know the exact location of the PBA, so I'm rejecting writes just
in case they would hit the PBA (see the commit message).
> > + return NULL;
> > + }
> > +
> > + return fix_to_virt(pdev->msix->adj_access_table_idx[adj_type]) +
> > + (address & (PAGE_SIZE - 1));
>
> You can use PAGE_OFFSET().
Ok.
> > +
> > +}
> > +
> > +static bool cf_check msixtbl_page_accept(
> > + const struct hvm_io_handler *handler, const ioreq_t *r)
> > +{
> > + ASSERT(r->type == IOREQ_TYPE_COPY);
> > +
> > + return msixtbl_page_handler_get_hwaddr(
> > + current->domain, r->addr, r->dir == IOREQ_WRITE);
>
> I think you want to accept it also if it's a write to the PBA, and
> just drop it. You should always pass write=false and then drop it in
> msixtbl_page_write() if it falls in the PBA region (but still return
> X86EMUL_OKAY).
I don't want to interfere with msixtbl_mmio_page_ops; this handler is
only about accesses that don't hit the actual MSI-X structures.
> > +}
> > +
> > +static int cf_check msixtbl_page_read(
> > + const struct hvm_io_handler *handler,
> > + uint64_t address, uint32_t len, uint64_t *pval)
>
> Why use hvm_io_ops and not hvm_mmio_ops? You only care about MMIO
> accesses here.
I followed how msixtbl_mmio_ops are registered. Should that also be
changed to hvm_mmio_ops?
>
> > +{
> > + void __iomem *hwaddr;
>
> const
>
> I would also initialize *pval = ~0ul for safety.
When returning X86EMUL_OKAY, then I agree.
> > +
> > + if ( address & (len - 1) )
> > + return X86EMUL_UNHANDLEABLE;
>
> You can use IS_ALIGNED for clarity. In my fix for this for vPCI Jan
> asked to split unaligned accesses into 1byte ones, but I think for
> guests it's better to just drop them unless there's a specific case
> that we need to handle.
>
> Also you should return X86EMUL_OKAY here, as the address is handled
> here, but the current way to handle it is to drop the access.
>
> Printing a debug message can be helpful in case unaligned accesses
> need to be implemented in order to support some device.
>
> > +
> > + hwaddr = msixtbl_page_handler_get_hwaddr(
> > + current->domain, address, false);
> > +
> > + if ( !hwaddr )
> > + return X86EMUL_UNHANDLEABLE;
>
> Maybe X86EMUL_RETRY, since the page was there in the accept handler.
How does X86EMUL_RETRY work? Is it trying to find the handler again?
> > +
> > + switch ( len )
> > + {
> > + case 1:
> > + *pval = readb(hwaddr);
> > + break;
> > + case 2:
> > + *pval = readw(hwaddr);
> > + break;
> > + case 4:
> > + *pval = readl(hwaddr);
> > + break;
> > + case 8:
> > + *pval = readq(hwaddr);
> > + break;
>
> Nit: we usually add a newline after every break;
Ok.
> > + default:
> > + return X86EMUL_UNHANDLEABLE;
>
> I would rather use ASSERT_UNREACHABLE() here and fallthrough to the
> return below. Seeing such size is likely an indication of something
> else gone very wrong, better to just terminate the access at once.
>
> > + }
> > + return X86EMUL_OKAY;
> > +}
> > +
> > +static int cf_check msixtbl_page_write(
> > + const struct hvm_io_handler *handler,
> > + uint64_t address, uint32_t len, uint64_t val)
> > +{
> > + void __iomem *hwaddr = msixtbl_page_handler_get_hwaddr(
> > + current->domain, address, true);
> > +
>
> You don't seem to check whether the access is aligned here?
>
> Otherwise I have mostly the same comments as in msixtbl_page_read().
Ok.
> > + if ( !hwaddr )
> > + return X86EMUL_UNHANDLEABLE;
> > +
> > + switch ( len ) {
> > + case 1:
> > + writeb(val, hwaddr);
> > + break;
> > + case 2:
> > + writew(val, hwaddr);
> > + break;
> > + case 4:
> > + writel(val, hwaddr);
> > + break;
> > + case 8:
> > + writeq(val, hwaddr);
> > + break;
> > + default:
> > + return X86EMUL_UNHANDLEABLE;
> > + }
> > + return X86EMUL_OKAY;
> > +
> > +}
> > +
> > +static const struct hvm_io_ops msixtbl_mmio_page_ops = {
> > + .accept = msixtbl_page_accept,
> > + .read = msixtbl_page_read,
> > + .write = msixtbl_page_write,
> > +};
> > +
> > static void add_msixtbl_entry(struct domain *d,
> > struct pci_dev *pdev,
> > uint64_t gtable,
> > @@ -593,6 +739,14 @@ void msixtbl_init(struct domain *d)
> > handler->type = IOREQ_TYPE_COPY;
> > handler->ops = &msixtbl_mmio_ops;
> > }
> > +
> > + /* passthrough access to other registers on the same page */
> > + handler = hvm_next_io_handler(d);
> > + if ( handler )
> > + {
> > + handler->type = IOREQ_TYPE_COPY;
> > + handler->ops = &msixtbl_mmio_page_ops;
> > + }
> > }
> >
> > void msixtbl_pt_cleanup(struct domain *d)
> > diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h
> > index a53ade95c9ad..d13cf1c1f873 100644
> > --- a/xen/arch/x86/include/asm/msi.h
> > +++ b/xen/arch/x86/include/asm/msi.h
> > @@ -207,6 +207,10 @@ struct msg_address {
> > PCI_MSIX_ENTRY_SIZE + \
> > (~PCI_MSIX_BIRMASK & (PAGE_SIZE - 1)))
> >
> > +/* indexes in adj_access_table_idx[] below */
> > +#define ADJ_IDX_FIRST 0
> > +#define ADJ_IDX_LAST 1
> > +
> > struct arch_msix {
> > unsigned int nr_entries, used_entries;
> > struct {
> > @@ -214,6 +218,7 @@ struct arch_msix {
> > } table, pba;
> > int table_refcnt[MAX_MSIX_TABLE_PAGES];
> > int table_idx[MAX_MSIX_TABLE_PAGES];
> > + int adj_access_table_idx[2];
> > spinlock_t table_lock;
> > bool host_maskall, guest_maskall;
> > domid_t warned;
> > diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> > index d0bf63df1def..680853f84685 100644
> > --- a/xen/arch/x86/msi.c
> > +++ b/xen/arch/x86/msi.c
> > @@ -961,6 +961,34 @@ static int msix_capability_init(struct pci_dev *dev,
> > domain_crash(d);
> > /* XXX How to deal with existing mappings? */
> > }
> > +
> > + /*
> > + * If the MSI-X table doesn't start at the page boundary, map the first page for
> > + * passthrough accesses.
> > + */
>
> I think you should initialize
> msix->adj_access_table_idx[ADJ_IDX_{FIRST,LAST}] to -1?
>
> > + if ( table_paddr & (PAGE_SIZE - 1) )
> > + {
> > + int idx = msix_get_fixmap(msix, table_paddr, table_paddr);
> > +
> > + if ( idx >= 0 )
> > + msix->adj_access_table_idx[ADJ_IDX_FIRST] = idx;
> > + else
> > + gprintk(XENLOG_ERR, "Failed to map first MSI-X table page: %d\n", idx);
> > + }
> > + /*
> > + * If the MSI-X table doesn't span full page(s), map the last page for
> > + * passthrough accesses.
> > + */
> > + if ( (table_paddr + msix->nr_entries * PCI_MSIX_ENTRY_SIZE) & (PAGE_SIZE - 1) )
> > + {
> > + uint64_t entry_paddr = table_paddr + msix->nr_entries * PCI_MSIX_ENTRY_SIZE;
> > + int idx = msix_get_fixmap(msix, table_paddr, entry_paddr);
> > +
> > + if ( idx >= 0 )
> > + msix->adj_access_table_idx[ADJ_IDX_LAST] = idx;
> > + else
> > + gprintk(XENLOG_ERR, "Failed to map last MSI-X table page: %d\n", idx);
> > + }
> > }
> > WARN_ON(msix->table.first != (table_paddr >> PAGE_SHIFT));
> > ++msix->used_entries;
> > @@ -1090,6 +1118,16 @@ static void _pci_cleanup_msix(struct arch_msix *msix)
> > WARN();
> > msix->table.first = 0;
> > msix->table.last = 0;
> > + if ( msix->adj_access_table_idx[ADJ_IDX_FIRST] )
> > + {
> > + msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_FIRST]);
> > + msix->adj_access_table_idx[ADJ_IDX_FIRST] = 0;
> > + }
> > + if ( msix->adj_access_table_idx[ADJ_IDX_LAST] )
> > + {
> > + msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_LAST]);
> > + msix->adj_access_table_idx[ADJ_IDX_LAST] = 0;
>
> Isn't 0 a valid idx? You should check for >= 0 and also set
> to -1 once the fixmap entry has been put.
I rely here on msix using a specific range of fixmap indexes
(FIX_MSIX_TO_RESERV_BASE - FIX_MSIX_TO_RESERV_END), which doesn't start
at 0. For this reason I also keep unused entries at 0 (no explicit
initialization needed).
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab