Re: [Xen-devel] Writable page tables questions

On Mon, 2015-01-05 at 17:28 +0000, Andrew Cooper wrote:
> On 04/01/2015 17:17, Junji Zhi wrote:
> > Hi,
> >
> > I'm Junji, a newbie in Xen and hoping I can contribute to the
> > community one day. I have a few questions regarding the writable page
> > tables, while reading The Definitive Guide to the Xen Hypervisor by
> > David Chisnall:
> >
> > 1. Writable page tables is one Xen memory assist technique, applied to
> > paravirtualized guests ONLY. HVM does not apply. Correct?
> >
> > 2. According to the book, when a guest wants to modify its page table,
> > it triggers a trap into the hypervisor and it does a few steps:
> >
> > (1) it invalidates a PTE that points to the page containing the page
> > table. Is my understanding correct?
> >
> > Q: What does "invalidate" really mean here? Does it mean simply
> > flipping a bit in the PTE of the page table, or removing the PTE
> > completely?

At least clearing the present bit; what happens to the other bits in the
PTE is up to the implementation, I think.

> > Does it also need to invalidate the TLB entry?

Yes, I think so, else the CPU might subsequently use a stale mapping.
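
To make "clearing the present bit" concrete, here is a minimal sketch in
Python. It only models the bit manipulation: on x86 the Present flag is
bit 0 of a page table entry, and the example entry value and the helper
names (invalidate, is_present) are made up for illustration. The TLB flush
that has to accompany the change is a CPU operation and is not modelled.

```python
PTE_PRESENT = 1 << 0  # x86: bit 0 of a page table entry is the Present flag


def invalidate(pte: int) -> int:
    """Clear only the Present bit; the frame address and other flags stay."""
    return pte & ~PTE_PRESENT


def is_present(pte: int) -> bool:
    """Return True if the entry would be used by the hardware page walk."""
    return bool(pte & PTE_PRESENT)


# Hypothetical entry: a frame address plus some low flag bits, Present set.
pte = 0x12345000 | 0x067
cleared = invalidate(pte)

print(hex(pte), is_present(pte))
print(hex(cleared), is_present(cleared))
```

An entry treated this way still carries its old frame address, which is
why the other bits can be left to the implementation.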

> > (2) then the control goes back to the guest and it can write/read the
> > page table now.
> >
> > (3) The book's words pasted: "When an address referenced by the newly
> > invalidated page directory entry is referenced (read or write), a page
> > fault occurs. "
> >
> > Q: The description of step (3) is confusing. What does it mean by "an
> > address referenced by the newly invalidated page directory entry is
> > referenced"? Does it mean the case when the guest code is accessing an
> > virtual address that needs to search the invalidated page table for
> > translation?

Yes, it means when something tries to access memory which would have
been mapped by the PT page which was removed in (1).

> I do not have the Chisnall book to hand at the moment, so cannot comment
> as to the exact text in it.
> However, looking at the code as it exists today,
> XENFEAT_writable_page_tables (there is a typo in the ABI) is strictly
> only offered to HVM guests, and not to PV guests.

XENFEAT_writable_page_tables is different from "out of sync" PT updates,
which is what Junji (and the book) seems to be referring to.

I don't know if modern Xen still does this for PV (I think it still does
for shadow mode HVM under at least some circumstances) but at one
point in time (presumably when the book was written) Xen would handle an
emulated write to a r/o page table page by:
      * unhooking it from the higher level PTs which referenced it and
        flushing TLBs
      * mapping the PT page itself r/w (contrary to Xen's usual invariant
        that it be mapped r/o)

At which point any subsequent writes to the now out-of-sync PT page can
just happen without trapping. This is safe because after the unhook the
PT is not part of any cr3 and the invariant is not violated (the guest
doesn't really know this is happening, for all it knows all writes are
still being emulated).

At some point something would try to access the memory which would be
mapped by the out-of-sync PT page, and Xen would, in the page fault
handler:
      * make all the mappings r/o again (+ TLB flush)
      * validate all the entries in the page
      * rehook it into the higher level PTs which should reference it
At which point the mappings are available again and Xen's invariants
hold.
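
The unhook / write freely / fault / revalidate / rehook cycle described
above can be sketched as a toy model in Python. This is not Xen code:
PTPage, ToyHypervisor, the allowed-frames check and the counters are all
hypothetical names invented for the illustration, and raising an error on
a failed validation is an assumption about what "validate" might do.

```python
from dataclasses import dataclass, field

PRESENT = 1  # toy Present flag in bit 0, as on x86


@dataclass
class PTPage:
    entries: list = field(default_factory=lambda: [0] * 4)
    writable: bool = False  # usual invariant: PT pages are mapped r/o
    hooked: bool = True     # referenced from a higher level PT


class ToyHypervisor:
    """Hypothetical model of out-of-sync PT handling; names are not Xen's."""

    def __init__(self, allowed_frames):
        self.allowed = set(allowed_frames)  # frames this guest may map
        self.traps = 0
        self.tlb_flushes = 0

    def guest_write(self, pt, idx, value):
        if not pt.writable:
            # First write hits a r/o PT page and traps: unhook, flush, map r/w.
            self.traps += 1
            pt.hooked = False
            self.tlb_flushes += 1
            pt.writable = True
        # Writes to an already r/w (out-of-sync) page land without trapping.
        pt.entries[idx] = value

    def guest_access(self, pt, idx):
        if not pt.hooked:
            # Page fault: make the page r/o again, flush, validate, rehook.
            self.traps += 1
            pt.writable = False
            self.tlb_flushes += 1
            for entry in pt.entries:
                if entry & PRESENT and (entry >> 12) not in self.allowed:
                    raise PermissionError("entry maps a frame the guest does not own")
            pt.hooked = True
        return pt.entries[idx]


xen = ToyHypervisor(allowed_frames={0x10, 0x11})
pt = PTPage()
for i, frame in enumerate([0x10, 0x11]):
    xen.guest_write(pt, i, (frame << 12) | PRESENT)  # only the first write traps
value = xen.guest_access(pt, 0)  # faults once, then the page is back in sync
print(xen.traps, xen.tlb_flushes, hex(value))
```

In this model a batch of writes costs two traps and two flushes however
many entries are updated, which is the trade-off the scheme is betting on.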

The TLB flushes involved in the above are reasonably expensive. IIRC Xen
flip-flopped a bit (years ago now) on whether it is worthwhile doing
this or not, which is why I'm not sure if it still does or not.

This is all different from XENFEAT_writable_page_tables that you talk
about, which is where the guest is informed that it is not obliged to
make the regular mappings r/o in the first place, i.e. to ignore Xen's
invariant completely.
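
For contrast, the feature flag amounts to a one-off check by the guest,
roughly as in the Python sketch below. The bit index 0 is what I recall
from Xen's public features header and should be treated as an assumption,
as should the idea of handling the feature set as a plain integer bitmap;
has_feature and pt_mapping_mode are hypothetical helpers, not Xen or
guest kernel APIs.

```python
# Assumed bit index; verify against Xen's public features header.
XENFEAT_WRITABLE_PAGE_TABLES = 0


def has_feature(bitmap: int, feature: int) -> bool:
    """Test one feature bit in a bitmap reported by the hypervisor."""
    return bool((bitmap >> feature) & 1)


def pt_mapping_mode(bitmap: int) -> str:
    """Decide how the guest maps its own page table pages."""
    if has_feature(bitmap, XENFEAT_WRITABLE_PAGE_TABLES):
        return "rw"  # guest need not keep PT pages r/o
    return "ro"      # guest must uphold the r/o invariant itself


print(pt_mapping_mode(0b01))
print(pt_mapping_mode(0b10))
```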

