|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 3/7] x86/hvm: Allow access to registers on the same page as MSI-X table
On 13.03.2024 16:16, Marek Marczykowski-Górecki wrote:
> Some devices (notably Intel Wifi 6 AX210 card) keep auxiliary registers
> on the same page as MSI-X table. Device model (especially one in
> stubdomain) cannot really handle those, as direct writes to that page is
> refused (page is on the mmio_ro_ranges list). Instead, extend
> msixtbl_mmio_ops to handle such accesses too.
>
> Doing this, requires correlating read/write location with guest
> of MSI-X table address. Since QEMU doesn't map MSI-X table to the guest,
> it requires msixtbl_entry->gtable, which is HVM-only. Similar feature
> for PV would need to be done separately.
>
> This will be also used to read Pending Bit Array, if it lives on the same
> page, making QEMU not needing /dev/mem access at all (especially helpful
> with lockdown enabled in dom0). If PBA lives on another page, QEMU will
> map it to the guest directly.
> If PBA lives on the same page, discard writes and log a message.
> Technically, writes outside of PBA could be allowed, but at this moment
> the precise location of PBA isn't saved, and also no known device abuses
> the spec in this way (at least yet).
>
> To access those registers, msixtbl_mmio_ops need the relevant page
> mapped. MSI handling already has infrastructure for that, using fixmap,
> so try to map first/last page of the MSI-X table (if necessary) and save
> their fixmap indexes. Note that msix_get_fixmap() does reference
> counting and reuses existing mapping, so just call it directly, even if
> the page was mapped before. Also, it uses a specific range of fixmap
> indexes which doesn't include 0, so use 0 as default ("not mapped")
> value - which simplifies code a bit.
>
> GCC 12.2.1 gets confused about 'desc' variable:
>
> arch/x86/hvm/vmsi.c: In function ‘msixtbl_range’:
> arch/x86/hvm/vmsi.c:553:8: error: ‘desc’ may be used uninitialized
> [-Werror=maybe-uninitialized]
> 553 | if ( desc )
> | ^
> arch/x86/hvm/vmsi.c:537:28: note: ‘desc’ was declared here
> 537 | const struct msi_desc *desc;
> | ^~~~
>
> It's conditional initialization is actually correct (in the case where
> it isn't initialized, function returns early), but to avoid
> build failure initialize it explicitly to NULL anyway.
>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
Sadly there are further more or less cosmetic issues. Plus, as indicated
before, I'm not really happy for us to gain all of this extra code. In
the end I may eventually give an R-b not including the usually implied
A-b, to indicate the code (then) looks okay to me but I still want
someone else to actually ack it to allow it going in.
> +static int adjacent_read(
> + unsigned int fixmap_idx,
> + paddr_t address, unsigned int len, uint64_t *pval)
> +{
> + const void __iomem *hwaddr;
> +
> + *pval = ~0UL;
> +
> + ASSERT(fixmap_idx != ADJACENT_DISCARD_WRITE);
Why only one of the special values? And before you add the other here:
Why not simply ASSERT(fixmap_idx <= FIX_MSIX_IO_RESERV_END)? (Could of
course bound at the other end, too, i.e. against FIX_MSIX_IO_RESERV_BASE.)
> + hwaddr = fix_to_virt(fixmap_idx) + PAGE_OFFSET(address);
> +
> + switch ( len )
> + {
> + case 1:
> + *pval = readb(hwaddr);
> + break;
> +
> + case 2:
> + *pval = readw(hwaddr);
> + break;
> +
> + case 4:
> + *pval = readl(hwaddr);
> + break;
> +
> + case 8:
> + *pval = readq(hwaddr);
> + break;
> +
> + default:
> + ASSERT_UNREACHABLE();
Misra demands "break;" to be here for release builds. In fact I wonder
why "*pval = ~0UL;" isn't put here, too. Question of course is whether
in such a case a true error indicator wouldn't be yet better.
> + }
> + return X86EMUL_OKAY;
> +}
Like in adjacent_handle(): Blank line please ahead of the function's main
"return".
> +static int adjacent_write(
> + unsigned int fixmap_idx,
> + paddr_t address, unsigned int len, uint64_t val)
> +{
> + void __iomem *hwaddr;
> +
> + if ( fixmap_idx == ADJACENT_DISCARD_WRITE )
> + return X86EMUL_OKAY;
Similar assert as suggested above below here then, too?
> @@ -622,12 +808,15 @@ void msix_write_completion(struct vcpu *v)
> v->arch.hvm.hvm_io.msix_snoop_gpa )
> {
> unsigned int token = hvmemul_cache_disable(v);
> - const struct msi_desc *desc;
> + const struct msi_desc *desc = NULL;
> + const struct msixtbl_entry *entry;
> uint32_t data;
>
> rcu_read_lock(&msixtbl_rcu_lock);
> - desc = msixtbl_addr_to_desc(msixtbl_find_entry(v, snoop_addr),
> - snoop_addr);
> + entry = msixtbl_find_entry(v, snoop_addr);
> + if ( entry && snoop_addr >= entry->gtable &&
> + snoop_addr < entry->gtable + entry->table_len )
Nit: Unexpected / unusual indentation. If you really want the two snoop_addr
to line up, then
if ( entry &&
snoop_addr >= entry->gtable &&
snoop_addr < entry->gtable + entry->table_len )
> --- a/xen/arch/x86/include/asm/msi.h
> +++ b/xen/arch/x86/include/asm/msi.h
> @@ -208,6 +208,10 @@ struct msg_address {
> PCI_MSIX_ENTRY_SIZE + \
> (~PCI_MSIX_BIRMASK & (PAGE_SIZE - 1)))
>
> +/* Indexes in adj_access_idx[] below */
> +#define ADJ_IDX_FIRST 0
> +#define ADJ_IDX_LAST 1
These may better live ...
> @@ -215,6 +219,7 @@ struct arch_msix {
> } table, pba;
> int table_refcnt[MAX_MSIX_TABLE_PAGES];
> int table_idx[MAX_MSIX_TABLE_PAGES];
> + unsigned int adj_access_idx[2];
... right next to this.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |