[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
>>> On 21.01.16 at 14:17, <andrew.cooper3@xxxxxxxxxx> wrote: > On 21/01/16 12:59, Jan Beulich wrote: >>> >>>>> For the PTE bits, _PAGE_GNTTAB (bit 62) is used exclusively in debug >>>>> build (so there is a guest observable difference between running on a >>>>> debug and a non-debug Xen), and the comment beside it even identifies >>>>> that it breaks BSD guests. PTE bits 62:59 used by hardware if CR4.PKE >>>>> is set. Currently this means that we are not able to support Protection >>>>> Key for PV guests (although this restriction technically only applies to >>>>> debug builds of the hypervisor). >>>>> >>>>> The other PTE bit used by Xen is _PAGE_GUEST_KERNEL (bit 52). This bit >>>>> is used to notice when a 64bit PV guest attempts to override the fixup >>>>> Xen applies to its PTEs. Xen unilaterally sets _PAGE_GLOBAL for user >>>>> pages, and clears _PAGE_GLOBAL for supervisor mappings, setting >>>>> _PAGE_USER in both cases as the PV kernel runs in ring3. The only thing >>>>> _PAGE_GUEST_KERNEL is used for is to notice when the kernel deliberately >>>>> tries to create a _PAGE_GUEST_KERNEL|_PAGE_GLOBAL, at which point a >>>>> warning is logged and the kernel overridden. >>>>> >>>>> >>>>> Neither of the used PTE bits exist in the Xen public ABI. Neither of >>>>> them serve a purpose other than a debugging aid. >>>>> >>>>> I propose hiding them behind CONFIG_PV_PTE_DEBUG and declaring an ABI of >>>>> "all bits available for guest use". >>>> And a kernel using any of the conflicting bits would then become >>>> unusable on a hypervisor with that debug option enabled? I'd >>>> rather see us document the state things are in... >>> _PAGE_GNTMAP is already states: >>> >>> /* >>> * Debug option: Ensure that granted mappings are not implicitly unmapped. >>> * WARNING: This will need to be disabled to run OSes that use the spare PTE >>> * bits themselves (e.g., *BSD). >>> */ >> But that's (assuming the use of the two bits were spelled out) a >> guest OS not fully playing by the spec. To me, "available" PTE bits >> being shared implies that some of them may be claimed by Xen, >> while others may be claimed by guests. You're right that this needs >> to be written down, but I don't think we need to go as far as >> forbidding Xen to use any of them. And even less so should we >> preclude their use for any purpose going forward. > > I do not want to make an ABI which mandates the use of certain bits by Xen. Agreed, but that doesn't match what I said: We also shouldn't make an ABI precluding use of some of the bits by Xen. >> In the end, with (as it seems) not much effort this could even be >> made dynamic: A guest could advertise which of the bits it doesn't >> use, and then Xen could pick two of them for the two purposes it >> currently needs them for. Should a guest leave no or just one bit >> available, the debugging aid could then be disabled. > > This is unnecessarily complicated. It only helps going forward, and > leaves Xen with substantially more complicated PTE handling. Substantially? Instead of using a constant, we'd use a per-domain variable. Not much of a complication afaict. > What happens if at some point later, we try to boot a PV guest which has > a different PTE layout to what Xen has chosen? We definitely can't > update every PTE with a newly-chosen layout. Just to re-iterate: A per domain selection of bit mask(s). >>> I was intending to have CONFIG_PV_PTE_DEBUG as an EXPERT option, >>> disabled by default even in debug builds. >>> >>> There should not be an ABI difference between release and "normal" debug >>> builds. >> Well, I see your point, but as said above I'm not convinced >> disabling all that code is the right solution. In fact, what you >> propose is not far away from removing that code altogether. > > The two bits are only used for specialised debugging. They should be > relegated to people doing specific debugging, and not interfere with the > overwhelming majority of cases where Xen doesn't need to use any > software available PTE bits. Your repeated claim that _PAGE_GUEST_KERNEL is purely debugging only makes we wonder how you would mean to adjust adjust_guest_l1e() with that flag gone (most notably the last of its if()-s). Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |