Re: [Xen-devel] Xen PV PTE ABI (or lack thereof)
>>> On 20.01.16 at 21:10, <andrew.cooper3@xxxxxxxxxx> wrote:
> First of all, SMEP and SMAP. 32bit PV guests are subject to Xen's
> SMEP/SMAP choices, because of running in ring 1.
>
> SMAP in particular is problematic because older Linux guests do fall
> foul of it; they don't understand what a SMAP pagefault is, and enter an
> infinite loop of pagefaults. SMEP is also problematic because it breaks
> any guest wishing to use a shared address space between kernel and
> user. (I had some fun getting the test framework to function until I
> twigged what was happening).
>
> Both of these are regressions; older guests relying on existing
> behaviour cease to function on newer hardware/Xen despite identical
> settings.

And for both of them there simply should be a way for the guest to
state whether it's compatible (which should be the case for anything
we can't deal with completely transparently to guests).

> For the PTE bits, _PAGE_GNTTAB (bit 62) is used exclusively in debug
> build (so there is a guest observable difference between running on a
> debug and a non-debug Xen), and the comment beside it even identifies
> that it breaks BSD guests. PTE bits 62:59 used by hardware if CR4.PKE
> is set. Currently this means that we are not able to support Protection
> Key for PV guests (although this restriction technically only applies to
> debug builds of the hypervisor).
>
> The other PTE bit used by Xen is _PAGE_GUEST_KERNEL (bit 52). This bit
> is used to notice when a 64bit PV guest attempts to override the fixup
> Xen applies to its PTEs. Xen unilaterally sets _PAGE_GLOBAL for user
> pages, and clears _PAGE_GLOBAL for supervisor mappings, setting
> _PAGE_USER in both cases as the PV kernel runs in ring3. The only thing
> _PAGE_GUEST_KERNEL is used for is to notice when the kernel deliberately
> tries to create a _PAGE_GUEST_KERNEL|_PAGE_GLOBAL, at which point a
> warning is logged and the kernel overridden.
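
As a rough illustration of the fixup described in the quoted paragraph above (an editorial sketch, not Xen's actual code and not part of either author's message): the bit positions for _PAGE_USER, _PAGE_GLOBAL and bit 52 follow the x86 PTE layout and the text, while the function name `fixup_guest_pte`, the `PAGE_*` macro spellings, and the assumption that a "supervisor mapping" is one the guest wrote without the user bit are all hypothetical.

```c
#include <stdint.h>

/* Architectural x86 PTE flag bits. */
#define PAGE_PRESENT      (1ULL << 0)
#define PAGE_RW           (1ULL << 1)
#define PAGE_USER         (1ULL << 2)
#define PAGE_GLOBAL       (1ULL << 8)
/* Software-available bit the message calls _PAGE_GUEST_KERNEL (bit 52). */
#define PAGE_GUEST_KERNEL (1ULL << 52)

/*
 * Hypothetical helper: adjust a PTE written by a 64bit PV guest.
 * Returns the adjusted PTE; *warn is set to 1 when the guest asked for
 * a global kernel mapping, which is overridden.
 */
static uint64_t fixup_guest_pte(uint64_t pte, int *warn)
{
    *warn = 0;

    /* Non-present entries are left untouched. */
    if (!(pte & PAGE_PRESENT))
        return pte;

    if (pte & PAGE_USER) {
        /* Guest user mapping: the global bit is forced on. */
        pte |= PAGE_GLOBAL;
    } else {
        /* Guest kernel mapping: note an attempt to make it global ... */
        if (pte & PAGE_GLOBAL)
            *warn = 1;
        /* ... then clear the global bit, tag the entry as guest-kernel,
         * and set the user bit because the PV kernel runs in ring 3. */
        pte &= ~PAGE_GLOBAL;
        pte |= PAGE_GUEST_KERNEL | PAGE_USER;
    }
    return pte;
}
```

In this sketch a guest kernel mapping written with the global bit comes back with that bit cleared and `*warn` set, which corresponds to the "warning is logged and the kernel overridden" behaviour described in the quote.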
>
> Neither of the used PTE bits exist in the Xen public ABI. Neither of
> them serve a purpose other than a debugging aid.
>
> I propose hiding them behind CONFIG_PV_PTE_DEBUG and declaring an ABI of
> "all bits available for guest use".

And a kernel using any of the conflicting bits would then become
unusable on a hypervisor with that debug option enabled? I'd rather
see us document the state things are in...

> The other question is what we do when it comes to %cr4 and PV guests.
>
> The current SMAP issue is a blocker for XenServer, and I have some nasty
> logic to fix up behind the guests back. I have only just discovered the
> SMEP issue, but it is still a regression (again, nothing states that a
> PV guest must have a split address space; segmentation is a perfectly
> valid option in 32bit guests). The PK issue is one which shouldn't be
> an issue for us to implement in PV guests.
>
> I am leaning towards allowing a toolstack to permit a PV guest to be
> able to play with a few more CR4 bits. We can't give a guest kernel
> complete carte blanche, because of the security implications. However,
> we do already context switch CR4 for PV guests, so a few extra bits on
> a "nominated safe" domain is no extra hassle.

Sounds reasonable.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
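
For illustration only, the CR4 idea discussed in the thread (a per-domain set of bits the toolstack lets a PV guest control, applied when CR4 is context switched) could be sketched as below. This is an editorial sketch, not Xen code: the function name `pv_effective_cr4` and the notion of a single `allowed` mask per domain are assumptions, and only the SMEP, SMAP and PKE bit positions (20, 21 and 22) come from the x86 architecture.

```c
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define X86_CR4_SMEP (1ULL << 20)
#define X86_CR4_SMAP (1ULL << 21)
#define X86_CR4_PKE  (1ULL << 22)

/*
 * Hypothetical helper: compute the CR4 value to load when switching to
 * a PV guest.
 *   host_cr4  - the value Xen would otherwise run the guest with
 *   guest_cr4 - the value the guest kernel asked for
 *   allowed   - bits the toolstack has nominated as guest-controllable
 * Bits outside 'allowed' always come from the host value, so the guest
 * never gains control of anything it was not explicitly granted.
 */
static uint64_t pv_effective_cr4(uint64_t host_cr4, uint64_t guest_cr4,
                                 uint64_t allowed)
{
    return (host_cr4 & ~allowed) | (guest_cr4 & allowed);
}
```

With `allowed` set to 0 the guest runs with exactly the host's choices, which matches the behaviour described for 32bit PV guests today; granting `X86_CR4_SMAP` alone would let an older guest that does not understand SMAP faults run with it cleared while SMEP stays under Xen's control.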