
Re: [Xen-devel] [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen




> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
> Sent: Wednesday, May 07, 2014 6:26 PM
> To: Wu, Feng
> Cc: xen-devel@xxxxxxxxxxxxx; JBeulich@xxxxxxxx; Tian, Kevin; Dong, Eddie;
> Nakajima, Jun; ian.campbell@xxxxxxxxxx
> Subject: Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention
> (SMAP) for Xen
> 
> On 07/05/14 09:19, Feng Wu wrote:
> > Supervisor Mode Access Prevention (SMAP) is a new security
> > feature disclosed by Intel, please refer to the following
> > document:
> >
> > http://software.intel.com/sites/default/files/319433-014.pdf
> >
> > If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
> > to linear addresses that are accessible in user mode. If CPL < 3,
> > SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
> > applies to all supervisor-mode data accesses (these are implicit
> > supervisor accesses) regardless of the value of EFLAGS.AC.
> >
> > This patch enables SMAP in Xen to prevent Xen hypervisor from
> > accessing pv guest data, whose translation paging-structure
> > entries' U/S flags are all set.
> >
> > Signed-off-by: Feng Wu <feng.wu@xxxxxxxxx>
> > Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> > ---
> >  docs/misc/xen-command-line.markdown |  7 +++++
> >  xen/arch/x86/setup.c                | 27 +++++++++++++++++
> >  xen/arch/x86/traps.c                | 58 +++++++++++++++++++++++++++----------
> >  xen/include/asm-x86/cpufeature.h    |  1 +
> >  xen/include/asm-x86/domain.h        |  6 ++--
> >  5 files changed, 82 insertions(+), 17 deletions(-)
> >
> > diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> > index 7dc938b..a7ac53d 100644
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -912,6 +912,13 @@ Set the serial transmit buffer size.
> >
> >  Flag to enable Supervisor Mode Execution Protection
> >
> > +### smap
> 
> ### smap (Intel)
> 
> I know it is inconsistent, but I am trying to get the vendor and arch
> specific command line arguments identified.
> 
> > +> `= <boolean>`
> > +
> > +> Default: `true`
> > +
> > +Flag to enable Supervisor Mode Access Prevention
> > +
> >  ### snb\_igd\_quirk
> >  > `= <boolean>`
> >
> > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> > index 2e30701..29b22a1 100644
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
> >  static bool_t __initdata disable_smep;
> >  invbool_param("smep", disable_smep);
> >
> > +/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
> > +static bool_t __initdata disable_smap;
> > +invbool_param("smap", disable_smap);
> > +
> >  /* **** Linux config option: propagated to domain0. */
> >  /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
> >  /* "acpi=force":  Override the disable blacklist.                   */
> > @@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> >      if ( cpu_has_smep )
> >          set_in_cr4(X86_CR4_SMEP);
> >
> > +    if ( disable_smap )
> > +        setup_clear_cpu_cap(X86_FEATURE_SMAP);
> > +    if ( cpu_has_smap )
> > +        set_in_cr4(X86_CR4_SMAP);
> 
> There is a "pushq $0; popf" in __high_start which will clear AC for the
> BSP and all APs when booting.
> 
> Is that considered a sufficient guarantee of the state of eflags for
> booting purposes?  If not, you probably want an explicit clac() here and
> somewhere in start_secondary().

I think that should be enough to guarantee the state of EFLAGS. Would you prefer
that I add clac() in these two places? Thanks!
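
For reference, if we did add explicit calls, I would expect the helpers to look
roughly like the sketch below. This is illustrative only and not part of this
patch; the raw opcode bytes are an assumption, used so the code would still
assemble with binutils that lack the CLAC/STAC mnemonics:

    /* Illustrative sketch: explicit EFLAGS.AC helpers, no-ops on CPUs
     * without SMAP support. */
    static inline void clac(void)
    {
        /* CLAC (0f 01 ca): clear EFLAGS.AC, re-arming SMAP protection. */
        if ( boot_cpu_has(X86_FEATURE_SMAP) )
            asm volatile ( ".byte 0x0f,0x01,0xca" : : : "memory" );
    }

    static inline void stac(void)
    {
        /* STAC (0f 01 cb): set EFLAGS.AC, temporarily permitting
         * supervisor accesses to user-accessible mappings. */
        if ( boot_cpu_has(X86_FEATURE_SMAP) )
            asm volatile ( ".byte 0x0f,0x01,0xcb" : : : "memory" );
    }

A clac() built like this could then be placed early in __start_xen() and in
start_secondary() if the "pushq $0; popf" in __high_start is not considered a
sufficient guarantee.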

> 
> > +
> >      if ( cpu_has_fsgsbase )
> >          set_in_cr4(X86_CR4_FSGSBASE);
> >
> > @@ -1384,6 +1393,21 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> >          printk(XENLOG_WARNING
> >                 "Multiple initrd candidates, picking module #%u\n",
> >                 initrdidx);
> > +    /*
> > +     * Temporarily clear SMAP bit in CR4 to allow user-page accesses in
> > +     * construct_dom0(). This looks a little hackish, but using stac()/clac()
> > +     * to do this is not appropriate here: Since construct_dom0() calls
> > +     * copy_from_user(), so we cannot wrap construct_dom0() as a whole in
> > +     * stac()/clac() - the AC bit remaining cleared after copy_from_user()
> > +     * would induce problems in construct_dom0().
> > +     *
> > +     * On the other hand, if we used stac()/clac() in construct_dom0(), we
> > +     * would need to carefully handle some corner cases.
> > +     *
> > +     * So we clear SMAP before construct_dom0() and enable it back afterwards.
> > +    */
> 
> Alignment error on the end of the comment.  Furthermore, I feel it is
> more verbose than it needs to be.  May I suggest:
> 
> "Temporarily clear SMAP in CR4 to allow user-accesses in
> construct_dom0().  This saves a large number of corner case interactions
> with copy_from_user()"
> 
> ~Andrew

Sure, I will simplify the comment as you suggested.
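
For clarity, with the shortened comment the call site would read roughly as
below (a sketch only; construct_dom0()'s arguments are elided, and the code
otherwise mirrors what the patch already does):

    /*
     * Temporarily clear SMAP in CR4 to allow user-accesses in
     * construct_dom0().  This saves a large number of corner case
     * interactions with copy_from_user().
     */
    if ( cpu_has_smap )
        write_cr4(read_cr4() & ~X86_CR4_SMAP);

    if ( construct_dom0(/* ... existing arguments ... */) != 0 )
        panic("Could not set up DOM0 guest OS");

    if ( cpu_has_smap )
        write_cr4(read_cr4() | X86_CR4_SMAP);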

> 
> > +    if ( cpu_has_smap )
> > +        write_cr4(read_cr4() & ~X86_CR4_SMAP);
> >
> >      /*
> >       * We're going to setup domain0 using the module(s) that we stashed
> safely
> > @@ -1395,6 +1419,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> >                          bootstrap_map, cmdline) != 0)
> >          panic("Could not set up DOM0 guest OS");
> >
> > +    if ( cpu_has_smap )
> > +        write_cr4(read_cr4() | X86_CR4_SMAP);
> > +
> >      /* Scrub RAM that is still free and so may go to an unprivileged domain. */
> >      scrub_heap_pages();
> >
> > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> > index 19c96d5..ee203cb 100644
> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -1183,11 +1183,12 @@ static int handle_gdt_ldt_mapping_fault(
> >  enum pf_type {
> >      real_fault,
> >      smep_fault,
> > +    smap_fault,
> >      spurious_fault
> >  };
> >
> >  static enum pf_type __page_fault_type(
> > -    unsigned long addr, unsigned int error_code)
> > +    unsigned long addr, const struct cpu_user_regs *regs)
> >  {
> >      unsigned long mfn, cr3 = read_cr3();
> >      l4_pgentry_t l4e, *l4t;
> > @@ -1195,6 +1196,7 @@ static enum pf_type __page_fault_type(
> >      l2_pgentry_t l2e, *l2t;
> >      l1_pgentry_t l1e, *l1t;
> >      unsigned int required_flags, disallowed_flags, page_user;
> > +    unsigned int error_code = regs->error_code;
> >
> >      /*
> >       * We do not take spurious page faults in IRQ handlers as we do not
> > @@ -1263,19 +1265,37 @@ static enum pf_type __page_fault_type(
> >      page_user &= l1e_get_flags(l1e);
> >
> >  leaf:
> > -    /*
> > -     * Supervisor Mode Execution Protection (SMEP):
> > -     * Disallow supervisor execution from user-accessible mappings
> > -     */
> > -    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
> > -         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
> > -        return smep_fault;
> > +    if ( page_user )
> > +    {
> > +        unsigned long cr4 = read_cr4();
> > +        /*
> > +         * Supervisor Mode Execution Prevention (SMEP):
> > +         * Disallow supervisor execution from user-accessible mappings
> > +         */
> > +        if ( (cr4 & X86_CR4_SMEP) &&
> > +             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
> > +            return smep_fault;
> > +
> > +        /*
> > +         * Supervisor Mode Access Prevention (SMAP):
> > +         * Disallow supervisor access user-accessible mappings
> > +         * A fault is considered as an SMAP violation if the following
> > +         * conditions are true:
> > +         *   - X86_CR4_SMAP is set in CR4
> > +         *   - A user page is being accessed
> > +         *   - CPL=3 or X86_EFLAGS_AC is clear
> > +         *   - Page fault in kernel mode
> > +         */
> > +        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode) &&
> > +             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
> > +            return smap_fault;
> > +    }
> >
> >      return spurious_fault;
> >  }
> >
> >  static enum pf_type spurious_page_fault(
> > -    unsigned long addr, unsigned int error_code)
> > +    unsigned long addr, const struct cpu_user_regs *regs)
> >  {
> >      unsigned long flags;
> >      enum pf_type pf_type;
> > @@ -1285,7 +1305,7 @@ static enum pf_type spurious_page_fault(
> >       * page tables from becoming invalid under our feet during the walk.
> >       */
> >      local_irq_save(flags);
> > -    pf_type = __page_fault_type(addr, error_code);
> > +    pf_type = __page_fault_type(addr, regs);
> >      local_irq_restore(flags);
> >
> >      return pf_type;
> > @@ -1380,8 +1400,14 @@ void do_page_fault(struct cpu_user_regs *regs)
> >
> >      if ( unlikely(!guest_mode(regs)) )
> >      {
> > -        pf_type = spurious_page_fault(addr, error_code);
> > -        BUG_ON(pf_type == smep_fault);
> > +        pf_type = spurious_page_fault(addr, regs);
> > +        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
> > +        {
> > +            console_start_sync();
> > +            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ? 'E' : 'A');
> > +            fatal_trap(TRAP_page_fault, regs);
> > +        }
> > +
> >          if ( pf_type != real_fault )
> >              return;
> >
> > @@ -1407,10 +1433,12 @@ void do_page_fault(struct cpu_user_regs *regs)
> >
> >      if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
> >      {
> > -        pf_type = spurious_page_fault(addr, error_code);
> > -        if ( pf_type == smep_fault )
> > +        pf_type = spurious_page_fault(addr, regs);
> > +        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
> >          {
> > -            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
> > +            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
> > +                   current, (pf_type == smep_fault) ? 'E' : 'A');
> > +
> >              domain_crash(current->domain);
> >          }
> >          if ( pf_type != real_fault )
> > diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> > index 20881c0..8014241 100644
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -190,6 +190,7 @@
> >  #define cpu_has_fsgsbase   boot_cpu_has(X86_FEATURE_FSGSBASE)
> >
> >  #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
> > +#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
> >  #define cpu_has_fpu_sel         (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
> >
> >  #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) \
> > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> > index c5c266f..abf55fb 100644
> > --- a/xen/include/asm-x86/domain.h
> > +++ b/xen/include/asm-x86/domain.h
> > @@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, unsigned long guest_cr4);
> >      (((v)->arch.pv_vcpu.ctrlreg[4]                          \
> >        | (mmu_cr4_features                                   \
> >           & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
> > -            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
> > +            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
> > +            X86_CR4_FSGSBASE))                              \
> >        | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
> >       & ~X86_CR4_DE)
> >  #define real_cr4_to_pv_guest_cr4(c)                         \
> >      ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
> > -             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
> > +             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
> > +             X86_CR4_FSGSBASE | X86_CR4_SMAP))
> >
> >  void domain_cpuid(struct domain *d,
> >                    unsigned int  input,

Thanks,
Feng
