Re: [Xen-devel] [PATCH 2/4] x86: suppress SMAP and SMEP while running 32-bit PV guest code
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Friday, March 4, 2016 7:28 PM
> To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Wu, Feng <feng.wu@xxxxxxxxx>;
> Keir Fraser <keir@xxxxxxx>
> Subject: [PATCH 2/4] x86: suppress SMAP and SMEP while running 32-bit PV
> guest code
>
> Since such guests' kernel code runs in ring 1, their memory accesses,
> at the paging layer, are supervisor mode ones, and hence subject to
> SMAP/SMEP checks. Such guests cannot be expected to be aware of those
> two features though (and so far we also don't expose the respective
> feature flags), and hence may suffer page faults they cannot deal with.
>
> While the placement of the re-enabling slightly weakens the intended
> protection, it was selected such that 64-bit paths would remain
> unaffected where possible. At the expense of a further performance hit
> the re-enabling could be put right next to the CLACs.
>
> Note that this introduces a number of extra TLB flushes - CR4.SMEP
> transitioning from 0 to 1 always causes a flush, and it transitioning
> from 1 to 0 may also do.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -67,6 +67,8 @@ boolean_param("smep", opt_smep);
>  static bool_t __initdata opt_smap = 1;
>  boolean_param("smap", opt_smap);
>
> +unsigned long __read_mostly cr4_smep_smap_mask;
> +
>  /* Boot dom0 in pvh mode */
>  static bool_t __initdata opt_dom0pvh;
>  boolean_param("dom0pvh", opt_dom0pvh);
> @@ -1335,6 +1337,8 @@ void __init noreturn __start_xen(unsigne
>      if ( cpu_has_smap )
>          set_in_cr4(X86_CR4_SMAP);
>
> +    cr4_smep_smap_mask = mmu_cr4_features & (X86_CR4_SMEP | X86_CR4_SMAP);
> +
>      if ( cpu_has_fsgsbase )
>          set_in_cr4(X86_CR4_FSGSBASE);
>
> @@ -1471,7 +1475,10 @@ void __init noreturn __start_xen(unsigne
>       * copy_from_user().
>       */
>      if ( cpu_has_smap )
> +    {
> +        cr4_smep_smap_mask &= ~X86_CR4_SMAP;

You change 'cr4_smep_smap_mask' here ...

>          write_cr4(read_cr4() & ~X86_CR4_SMAP);
> +    }
>
>      printk("%sNX (Execute Disable) protection %sactive\n",
>             cpu_has_nx ? XENLOG_INFO : XENLOG_WARNING "Warning: ",
> @@ -1488,7 +1495,10 @@ void __init noreturn __start_xen(unsigne
>          panic("Could not set up DOM0 guest OS");
>
>      if ( cpu_has_smap )
> +    {
>          write_cr4(read_cr4() | X86_CR4_SMAP);
> +        cr4_smep_smap_mask |= X86_CR4_SMAP;

... and here. I wonder whether it is because Domain 0 can actually start
running between the two places? Otherwise I don't think the changes in the
above two places are needed, right?
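Just to check my understanding of how the mask is consumed: I read the rest
of the series as re-enabling the two features from this mask on the paths
back into hypervisor context, roughly like the pseudo-C below. The real code
is assembly, and the function name and shape here are only my guess:

/*
 * Rough pseudo-C of how I understand cr4_smep_smap_restore is meant to
 * behave; the actual implementation is assembly, so the name and structure
 * below are only illustrative.
 */
static void cr4_smep_smap_restore_sketch(void)
{
    unsigned long cr4 = read_cr4();

    /* Re-enable whichever of CR4.SMEP/CR4.SMAP were selected at boot. */
    if ( (cr4 & cr4_smep_smap_mask) != cr4_smep_smap_mask )
        write_cr4(cr4 | cr4_smep_smap_mask);
}

If that reading is right, I am not sure what could reach such a path between
the two updates above, hence my question.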
>
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -434,6 +434,7 @@ ENTRY(dom_crash_sync_extable)
>
>  ENTRY(common_interrupt)
>          SAVE_ALL CLAC
> +        SMEP_SMAP_RESTORE
>          movq  %rsp,%rdi
>          callq do_IRQ
>          jmp   ret_from_intr
> @@ -454,13 +455,64 @@ ENTRY(page_fault)
>  GLOBAL(handle_exception)
>          SAVE_ALL CLAC
>  handle_exception_saved:
> +        GET_CURRENT(%rbx)
>          testb $X86_EFLAGS_IF>>8,UREGS_eflags+1(%rsp)
>          jz    exception_with_ints_disabled
> -        sti
> +
> +.Lsmep_smap_orig:
> +        jmp   0f
> +        .if 0 // GAS bug (affecting at least 2.22 ... 2.26)
> +        .org .Lsmep_smap_orig + (.Lsmep_smap_alt_end - .Lsmep_smap_alt), 0xcc
> +        .else
> +        // worst case: rex + opcode + modrm + 4-byte displacement
> +        .skip (1 + 1 + 1 + 4) - 2, 0xcc
> +        .endif
> +        .pushsection .altinstr_replacement, "ax"
> +.Lsmep_smap_alt:
> +        mov   VCPU_domain(%rbx),%rax
> +.Lsmep_smap_alt_end:
> +        .section .altinstructions, "a"
> +        altinstruction_entry .Lsmep_smap_orig, .Lsmep_smap_alt, \
> +                             X86_FEATURE_SMEP, \
> +                             (.Lsmep_smap_alt_end - .Lsmep_smap_alt), \
> +                             (.Lsmep_smap_alt_end - .Lsmep_smap_alt)
> +        altinstruction_entry .Lsmep_smap_orig, .Lsmep_smap_alt, \
> +                             X86_FEATURE_SMAP, \
> +                             (.Lsmep_smap_alt_end - .Lsmep_smap_alt), \
> +                             (.Lsmep_smap_alt_end - .Lsmep_smap_alt)
> +        .popsection
> +
> +        testb $3,UREGS_cs(%rsp)
> +        jz    0f
> +        cmpb  $0,DOMAIN_is_32bit_pv(%rax)
> +        je    0f
> +        call  cr4_smep_smap_restore
> +        /*
> +         * An NMI or #MC may occur between clearing CR4.SMEP and CR4.SMAP in

Do you mean "before" when you typed "between" above?

> +         * compat_restore_all_guest and it actually returning to guest
> +         * context, in which case the guest would run with the two features
> +         * enabled. The only bad that can happen from this is a kernel mode
> +         * #PF which the guest doesn't expect. Rather than trying to make the
> +         * NMI/#MC exit path honor the intended CR4 setting, simply check
> +         * whether the wrong CR4 was in use when the #PF occurred, and exit
> +         * back to the guest (which will in turn clear the two CR4 bits) to
> +         * re-execute the instruction. If we get back here, the CR4 bits
> +         * should then be found clear (unless another NMI/#MC occurred at
> +         * exactly the right time), and we'll continue processing the
> +         * exception as normal.
> +         */

As Andrew mentioned in another mail, this scenario is a little tricky.
Could you please give a more detailed description of how the NMI/#MC
affects the execution flow, maybe with some code in the explanation?
My rough understanding so far is sketched below. Thanks a lot!
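To make sure we are talking about the same window, this is how I currently
read the race, in pseudo-C; the real paths are in entry.S, and
iret_to_guest() below is just a placeholder name I made up:

/*
 * Pseudo-C sketch of my reading of the race described in the comment;
 * not actual code from this series.
 */
void compat_restore_all_guest_sketch(void)
{
    /* Drop SMEP/SMAP before resuming the 32-bit PV guest. */
    write_cr4(read_cr4() & ~(X86_CR4_SMEP | X86_CR4_SMAP));

    /*
     * An NMI or #MC taken right here re-enters Xen; its entry path (the
     * new SMEP_SMAP_RESTORE) turns the two bits back on, and its exit
     * path returns to this point without clearing them again.
     */

    iret_to_guest();   /* placeholder: the guest kernel now runs with
                        * SMEP/SMAP still set, until a kernel-mode #PF
                        * bounces it back through this path, which clears
                        * the bits once more */
}

Is that the scenario the comment is describing?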
Thanks,
Feng

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel