
Re: [PATCH v4 1/2] x86/mm: add API for marking only part of a MMIO page read only



On Tue, Jun 11, 2024 at 03:15:42PM +0200, Marek Marczykowski-Górecki wrote:
> On Tue, Jun 11, 2024 at 02:55:22PM +0200, Roger Pau Monné wrote:
> > On Tue, Jun 11, 2024 at 01:38:35PM +0200, Marek Marczykowski-Górecki wrote:
> > > On Tue, Jun 11, 2024 at 12:40:49PM +0200, Roger Pau Monné wrote:
> > > > On Wed, May 22, 2024 at 05:39:03PM +0200, Marek Marczykowski-Górecki wrote:
> > > > > +    if ( !entry )
> > > > > +    {
> > > > > +        /* iter == NULL marks a newly allocated entry */
> > > > > +        iter = NULL;
> > > > > +        entry = xzalloc(struct subpage_ro_range);
> > > > > +        if ( !entry )
> > > > > +            return -ENOMEM;
> > > > > +        entry->mfn = mfn;
> > > > > +    }
> > > > > +
> > > > > +    for ( i = offset_s; i <= offset_e; i += MMIO_RO_SUBPAGE_GRAN )
> > > > > +    {
> > > > > +        bool oldbit = __test_and_set_bit(i / MMIO_RO_SUBPAGE_GRAN,
> > > > > +                                        entry->ro_elems);
> > > > > +        ASSERT(!oldbit);
> > > > > +    }
> > > > > +
> > > > > +    if ( !iter )
> > > > > +        list_add(&entry->list, &subpage_ro_ranges);
> > > > > +
> > > > > +    return iter ? 1 : 0;
> > > > > +}
> > > > > +
> > > > > +/* This needs subpage_ro_lock already taken */
> > > > > +static void __init subpage_mmio_ro_remove_page(
> > > > > +    mfn_t mfn,
> > > > > +    unsigned int offset_s,
> > > > > +    unsigned int offset_e)
> > > > > +{
> > > > > +    struct subpage_ro_range *entry = NULL, *iter;
> > > > > +    unsigned int i;
> > > > > +
> > > > > +    list_for_each_entry(iter, &subpage_ro_ranges, list)
> > > > > +    {
> > > > > +        if ( mfn_eq(iter->mfn, mfn) )
> > > > > +        {
> > > > > +            entry = iter;
> > > > > +            break;
> > > > > +        }
> > > > > +    }
> > > > > +    if ( !entry )
> > > > > +        return;
> > > > > +
> > > > > +    for ( i = offset_s; i <= offset_e; i += MMIO_RO_SUBPAGE_GRAN )
> > > > > +        __clear_bit(i / MMIO_RO_SUBPAGE_GRAN, entry->ro_elems);
> > > > > +
> > > > > +    if ( !bitmap_empty(entry->ro_elems, PAGE_SIZE / MMIO_RO_SUBPAGE_GRAN) )
> > > > > +        return;
> > > > > +
> > > > > +    list_del(&entry->list);
> > > > > +    if ( entry->mapped )
> > > > > +        iounmap(entry->mapped);
> > > > > +    xfree(entry);
> > > > > +}
> > > > > +
> > > > > +int __init subpage_mmio_ro_add(
> > > > > +    paddr_t start,
> > > > > +    size_t size)
> > > > > +{
> > > > > +    mfn_t mfn_start = maddr_to_mfn(start);
> > > > > +    paddr_t end = start + size - 1;
> > > > > +    mfn_t mfn_end = maddr_to_mfn(end);
> > > > > +    unsigned int offset_end = 0;
> > > > > +    int rc;
> > > > > +    bool subpage_start, subpage_end;
> > > > > +
> > > > > +    ASSERT(IS_ALIGNED(start, MMIO_RO_SUBPAGE_GRAN));
> > > > > +    ASSERT(IS_ALIGNED(size, MMIO_RO_SUBPAGE_GRAN));
> > > > > +    if ( !IS_ALIGNED(size, MMIO_RO_SUBPAGE_GRAN) )
> > > > > +        size = ROUNDUP(size, MMIO_RO_SUBPAGE_GRAN);
> > > > > +
> > > > > +    if ( !size )
> > > > > +        return 0;
> > > > > +
> > > > > +    if ( mfn_eq(mfn_start, mfn_end) )
> > > > > +    {
> > > > > +        /* Both starting and ending parts handled at once */
> > > > > +        subpage_start = PAGE_OFFSET(start) || PAGE_OFFSET(end) != PAGE_SIZE - 1;
> > > > > +        subpage_end = false;
> > > > 
> > > > Given the intended usage of this, don't we want to limit it to a
> > > > single page?  So that PFN_DOWN(start + size) == PFN_DOWN(start), as
> > > > that would simplify the logic here?
> > > 
> > > I have considered that, but I haven't found anything in the spec
> > > mandating that the XHCI DbC registers not cross a page boundary.
> > > Currently (on the system I test this on) they don't cross a page
> > > boundary, but I don't want to assume extra constraints - to avoid
> > > issues like before (on the older system I tested, the DbC registers
> > > didn't share a page with other registers, but on newer hardware they
> > > do).
> > 
> > Oh, from our conversation at XenSummit I got the impression the debug
> > registers were always at the same position.  Looking at patch 2/2, it
> > seems you only need to block access to a single register.  Are XHCI
> > registers size-aligned?  That would guarantee the register doesn't
> > cross a page boundary (as long as it is <= 4096 bytes in size).
> 
> It's a couple of registers (one "extended capability"), see
> `struct dbc_reg` in xhci-dbc.c.

That looks to be an awful lot of individual registers...
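
For reference, going by the Debug Capability register layout in the XHCI
spec (section 7.6.8), the block is roughly the following - my own sketch
from the spec, with my own field names, not necessarily matching
`struct dbc_reg` in xhci-dbc.c:

    /* Sketch of the XHCI Debug Capability register block (XHCI spec
     * 7.6.8), 64 bytes in total.  Field names are mine and may differ
     * from struct dbc_reg. */
    struct dbc_regs_sketch {
        uint32_t id;      /* DCID: capability ID / next capability pointer */
        uint32_t db;      /* DCDB: doorbell */
        uint32_t erstsz;  /* DCERSTSZ: event ring segment table size */
        uint32_t rsvd1;
        uint64_t erstba;  /* DCERSTBA: event ring segment table base */
        uint64_t erdp;    /* DCERDP: event ring dequeue pointer */
        uint32_t ctrl;    /* DCCTRL: control */
        uint32_t st;      /* DCST: status */
        uint32_t portsc;  /* DCPORTSC: port status and control */
        uint32_t rsvd2;
        uint64_t cp;      /* DCCP: debug capability context pointer */
        uint32_t ddi1;    /* DCDDI1: device descriptor info 1 */
        uint32_t ddi2;    /* DCDDI2: device descriptor info 2 */
    };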

> Its location is discovered at startup
> (device presents a linked-list of capabilities in one of its BARs).
> The spec talks only about alignment of individual registers, not the
> whole group...

Never mind then, I had the expectation we could get away with a single
page, but that doesn't look to be the case.

I assume the spec doesn't mention anything about the BAR where the
capabilities reside having a size <= 4KiB.
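
FWIW, the caller-side usage I'd expect is along these lines (just a
sketch, not code from the series; dbc_regs_phys, dbc_regs_size and
dbc_mark_regs_ro are hypothetical names for what patch 2/2 computes from
the BAR plus the extended capability offset):

    /* Hedged sketch of a caller.  The range may cross a page boundary,
     * which is why the API takes an arbitrary byte range rather than a
     * single MFN plus offsets. */
    static int __init dbc_mark_regs_ro(paddr_t dbc_regs_phys,
                                       size_t dbc_regs_size)
    {
        int rc = subpage_mmio_ro_add(dbc_regs_phys, dbc_regs_size);

        if ( rc )
            printk(XENLOG_WARNING
                   "Cannot mark DbC registers read-only: %d\n", rc);

        return rc;
    }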

> > > > > +            if ( !addr )
> > > > > +            {
> > > > > +                gprintk(XENLOG_ERR,
> > > > > +                        "Failed to map page for MMIO write at 0x%"PRI_mfn"%03x\n",
> > > > > +                        mfn_x(mfn), offset);
> > > > > +                return;
> > > > > +            }
> > > > > +
> > > > > +            switch ( len )
> > > > > +            {
> > > > > +            case 1:
> > > > > +                writeb(*(const uint8_t*)data, addr);
> > > > > +                break;
> > > > > +            case 2:
> > > > > +                writew(*(const uint16_t*)data, addr);
> > > > > +                break;
> > > > > +            case 4:
> > > > > +                writel(*(const uint32_t*)data, addr);
> > > > > +                break;
> > > > > +            case 8:
> > > > > +                writeq(*(const uint64_t*)data, addr);
> > > > > +                break;
> > > > > +            default:
> > > > > +                /* mmio_ro_emulated_write() already validated the size */
> > > > > +                ASSERT_UNREACHABLE();
> > > > > +                goto write_ignored;
> > > > > +            }
> > > > > +            return;
> > > > > +        }
> > > > > +    }
> > > > > +    /* Do not print message for pages without any writable parts. */
> > > > > +}
> > > > > +
> > > > > +#ifdef CONFIG_HVM
> > > > > +bool subpage_mmio_write_accept(mfn_t mfn, unsigned long gla)
> > > > > +{
> > > > > +    unsigned int offset = PAGE_OFFSET(gla);
> > > > > +    const struct subpage_ro_range *entry;
> > > > > +
> > > > > +    list_for_each_entry(entry, &subpage_ro_ranges, list)
> > > > > +        if ( mfn_eq(entry->mfn, mfn) &&
> > > > > +             !test_bit(offset / MMIO_RO_SUBPAGE_GRAN, entry->ro_elems) )
> > > > > +        {
> > > > > +            /*
> > > > > +             * We don't know the write size at this point yet, so it
> > > > > +             * could be an unaligned write, but accept it here anyway
> > > > > +             * and deal with it later.
> > > > > +             */
> > > > > +            return true;
> > > > 
> > > > For accesses that fall into the RO region, I think you need to accept
> > > > them here and just terminate them?  I see no point in propagating
> > > > them further in hvm_hap_nested_page_fault().
> > > 
> > > If a write hits an R/O region on a page that also has writable
> > > regions, the handling should be the same as for a page covered only
> > > by mmio_ro_ranges.  This is what the patch does.
> > > There may be an opportunity to simplify the mmio_ro_ranges handling
> > > somewhere, but I don't think it belongs in this patch.
> > 
> > Maybe worth adding a comment that the logic here intends to deal only
> > with the RW bits of a page that's otherwise RO, and that by not
> > handling the RO regions the intention is that those are dealt with
> > just like fully RO pages.
> 
> I can extend the comment, but I assumed it's kinda implied already (if
> nothing else, by the function name).

Well, at this point we know the write is not going to make it to host
memory.  The only reason not to handle the access here is to unify the
consequences for a guest of writing to a RO address.
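
To spell out the flow I have in mind (a sketch only, not a verbatim
quote of the hvm_hap_nested_page_fault() hunk):

    /* Sketch: a write fault on a direct MMIO page is emulated when it
     * hits a writable chunk; otherwise it is left to the generic r/o
     * handling, i.e. the same path as a fully read-only page. */
    if ( (p2mt == p2m_mmio_direct) && npfec.write_access && npfec.present &&
         subpage_mmio_write_accept(mfn, gla) &&
         hvm_emulate_one_mmio(mfn_x(mfn), gla) == X86EMUL_OKAY )
    {
        rc = 1;  /* handled: write forwarded to the device */
        goto out_put_gfn;
    }
    /* otherwise: fall through, treated like a fully R/O page */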

> > I guess there's some message printed when attempting to write to a RO
> > page that you would also like to print here?
> 
> If an HVM domain writes to an R/O area, it is crashed, so you will get
> a message.  This applies to both fully R/O and partially R/O pages.  PV
> doesn't go through subpage_mmio_write_accept().

Oh, crashing the domain is more strict than I was expecting.

Thanks, Roger.



 

