Re: [Xen-devel] [PATCH] x86/ctxt-switch: Document and improve GDT handling
On 05/07/2019 14:56, Jan Beulich wrote:
> On 05.07.2019 15:36, Andrew Cooper wrote:
>> On 05/07/2019 11:00, Jan Beulich wrote:
>>> On 04.07.2019 19:57, Andrew Cooper wrote:
>>>> Also, it should now be very obvious to people that Xen's current GDT
>>>> handling for non-PV vcpus is a recipe for subtle bugs, if we ever
>>>> manage to execute a stray mov/pop %sreg instruction.  We really ought
>>>> to have Xen's regular GDT in an area where slots 0-13 are either
>>>> mapped to the zero page, or not present, so we don't risk loading a
>>>> non-faulting garbage selector.
>>> Well, there's certainly room for improvement, but loading a stray
>>> selector seems a pretty unlikely event to happen, and the respective
>>> code having got slipped in without anyone noticing.  Other than in
>>> context switching code I don't think there are many places at all
>>> where we write to the selector registers.
>> There are however many places where we write some bytes into a stub
>> and then execute them.
>>
>> This isn't a security issue.  There aren't any legitimate codepaths
>> for which this is a problem, but there are plenty of cascade failures
>> where this is liable to make a bad situation worse in weird,
>> hard-to-debug ways.
>>
>> Not to mention that for security hardening purposes, we should be
>> using a RO mapping to combat sgdt or fixed-ABI knowledge from an
>> attacker.
> Okay, I can see how SGDT can reveal undue knowledge to an attacker,
> even if they can't use the address directly, but only to infer further
> things (hence UMIP's existence).  But I'm having trouble seeing how a
> r/o mapped GDT adds much security.  Could you expand on this?

If an attacker has some kind of write gadget already in Xen, knowing
where a writeable mapping to the GDT is makes their life very easy in
terms of getting code execution.  RO mappings are a defence-in-depth
measure.

>> And on that note...
>> nothing really updates the full GDT via the perdomain mappings, so I
>> think that can already move to being RO.  This does depend on the
>> fact that no one has used segmented virtual memory since long before
>> Xen was a thing.  We can trap and emulate the setting of A bits, and
>> I bet that path will never get hit, even with old PV guests.
> Well, you could go the first step here and map the Xen page r/o right
> away.  There we don't even need to fear A bits needing to get set,
> since we control the contents.
>
>>>> @@ -1718,15 +1737,12 @@ static void __context_switch(void)
>>>>  
>>>>      psr_ctxt_switch_to(nd);
>>>>  
>>>> -    gdt = !is_pv_32bit_domain(nd) ? per_cpu(gdt_table, cpu) :
>>>> -                                    per_cpu(compat_gdt_table, cpu);
>>>> -
>>>>      if ( need_full_gdt(nd) )
>>>> -        write_full_gdt_ptes(gdt, n);
>>>> +        update_xen_slot_in_full_gdt(n, cpu);
>>>>  
>>>>      if ( need_full_gdt(pd) &&
>>>>           ((p->vcpu_id != n->vcpu_id) || !need_full_gdt(nd)) )
>>>> -        load_default_gdt(gdt, cpu);
>>>> +        load_default_gdt(cpu);
>>> From looking at this transformation I cannot see how, as said in the
>>> description and as expressed by removing the gdt parameter from
>>> load_default_gdt(), the gdt having got passed in here would always
>>> have been per_cpu(gdt_table, cpu).  It pretty clearly was the compat
>>> one for nd being 32-bit PV.  What am I missing?
>> To be perfectly honest, I wrote "how it {does,should} logically
>> work", then adjusted the code.
>>
>>> Or is the description perhaps instead meaning to say that it doesn't
>>> _need_ to be the compat one that we load here, as in case it is, the
>>> subsequent load_full_gdt() will replace it again anyway?
>> lgdt is an expensive operation.  I hadn't even spotted that we are
>> doing it twice on that path.  There is surely some room for
>> improvement here as well.
> Well, the double load had to be added for a very simple reason:
> While we're switching page tables, the GDT mapping has to remain
> intact.
> Hence the dependency upon the two vCPU IDs (not) matching.
> IOW I don't currently see room for improvement here.

Urgh.  That is a nasty corner case.  We really don't want to be doing
two lgdt's, but it is currently unavoidable when we're switching with
p->vcpu_id > nd->max_vcpus, as the current gdtr over p's full GDT will
become not-present when n has fewer vcpus.

For now, I'm tempted to go with an answer of "it doesn't need to be the
compat GDT".  This is a rabbit hole I really don't have time to follow
atm.

(Bring on the non-flat address space days, where all of this complexity
can disappear totally.)

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel