[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v1 1/6] x86: Add support for STAC/CLAC instructions




> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Tuesday, April 15, 2014 4:40 PM
> To: Wu, Feng
> Cc: Ian.Campbell@xxxxxxxxxx; Dong, Eddie; Nakajima, Jun;
> xen-devel@xxxxxxxxxxxxx
> Subject: Re: [PATCH v1 1/6] x86: Add support for STAC/CLAC instructions
> 
> >>> On 15.04.14 at 15:01, <feng.wu@xxxxxxxxx> wrote:
> > --- a/xen/include/asm-x86/x86_64/asm_defns.h
> > +++ b/xen/include/asm-x86/x86_64/asm_defns.h
> 
> These changes should go into xen/include/asm-x86/asm_defns.h; the
> x86-64 one should get merged into the previously common one sooner
> or later, so let's not force that merging patch to be bigger than it needs
> to be.
> 
> > @@ -228,4 +228,74 @@
> __asm__(                                        \
> >  # define _ASM_EX(p) #p "-."
> >  #endif
> >
> > +/* "Raw" instruction opcodes */
> > +#define __ASM_CLAC      .byte 0x0f,0x01,0xca
> > +#define __ASM_STAC      .byte 0x0f,0x01,0xcb
> > +
> > +/* Indirect stringification.  Doing two levels allows the parameter to be a
> > + * macro itself.  For example, compile with -DFOO=bar, __stringify(FOO)
> > + * converts to "bar".
> > + */
> > +#define __stringify_1(x...)     #x
> > +#define __stringify(x...)       __stringify_1(x)
> 
> Please use xen/stringify.h rather than repeating its defintions.
> 
> > +
> > +#ifdef __ASSEMBLY__
> > +#define X86_FEATURE_SMAP    (7*32+20)
> 
> No - how would be spot this if we were to re-arrange the feature
> array? Either use (or make usable) the definitions in cpufeature.h,
> or propagate the necessary values through asm-offsets.h.
> 
> > +#define ASM_STAC                                         \
> > +        pushq %rax;                                      \
> > +        leaq boot_cpu_data(%rip),%rax;                   \
> > +        btl $X86_FEATURE_SMAP-7*32, CPUINFO86_leaf7_features(%rax);
> \
> > +        jnc 881f;                                        \
> > +        movq %cr4,%rax;                                  \
> > +        testl $X86_CR4_SMAP,%eax;                        \
> > +        jz 881f;                                         \
> > +        __ASM_STAC;                                      \
> > +881:    popq %rax
> > +
> > +#define ASM_CLAC                                         \
> > +        pushq %rax;                                      \
> > +        leaq boot_cpu_data(%rip),%rax;                   \
> > +        btl $X86_FEATURE_SMAP-7*32, CPUINFO86_leaf7_features(%rax);
> \
> > +        jnc 881f;                                        \
> > +        movq %cr4,%rax;                                  \
> > +        testl $X86_CR4_SMAP,%eax;                        \
> > +        jz 881f;                                         \
> > +        __ASM_CLAC;                                      \
> > +881:    popq %rax
> 
> The only difference between the two macros appears to be the final
> instruction - please define one macro with an argument, and then
> make the two definitions here simple wrappers around that macro.
> 
> That said, the macro contents itself is horrible too: A control register
> access and two conditional branches in code intended to be used in
> fast paths? Definitely not an option. Even the simplest possible
> solution - adding a global flag to be checked here - would already be
> questionable. Hence I think you should at least consider porting over
> proper instruction patching abstraction from Linux.
> 

Jan, I did some investigation about how to handle this two instructions
in Linux, basically, it uses the alternatives mechanism to handle these
kind of cases. Let's take the following definition of ASM_STAC in Linux for 
example:

#define ASM_CLAC                                                        \
        661: ASM_NOP3 ;                                                 \
        .pushsection .altinstr_replacement, "ax" ;                      \
        662: __ASM_CLAC ;                                               \
        .popsection ;                                                   \
        .pushsection .altinstructions, "a" ;                            \
        altinstruction_entry 661b, 662b, X86_FEATURE_SMAP, 3, 3 ;       \
        .popsection

ASM_CLAC is defined as NOP by default, it puts the real CLAC instruction in 
section "altinstr_replacement" and
the needed information to " altinstructions " section, which is useful to 
replace the default
definition by the alternative one. Here is the routine call path: start_kernel 
() --> check_bugs() --> alternative_instructions().

In function alternative_instructions(), it will check the related features in 
CPU, if it exists, the alternative definition will
overwrite the default one. So there is no conditional branches after this 
replacement when the Macro is being used.

Do you think we need to port this whole mechanism to Xen to support CLAC/STAC? 
I am not sure if it is a little overkilled.

BTW, from the Linux implementation, I think we don't need to check the 'cr4' 
for the macros, we just need
to check whether the feature exists in the CPU. So is it acceptable to use the 
original code by eliminating the cr4 check?
Thanks a lot in advance!

> > +#else
> > +#define ASM_STAC                                         \
> > +        "\npushq %%rax\n\t"                              \
> > +        "leaq boot_cpu_data(%%rip),%%rax\n\t"            \
> > +        "btl $" __stringify(X86_FEATURE_SMAP) "-7*32,"   \
> > +        __stringify(CPUINFO86_leaf7_features) "(%%rax)\n\t"  \
> > +        "jnc 881f\n\t"                                   \
> > +        "movq %%cr4,%%rax\n\t"                           \
> > +        "testl $" __stringify(X86_CR4_SMAP) ",%%eax\n\t" \
> > +        "jz 881f\n\t"                                    \
> > +         __stringify(__ASM_STAC) "\n\t"                  \
> > +"881:    popq %%rax"
> > +
> > +#define ASM_CLAC                                         \
> > +        "\npushq %%rax\n\t"                              \
> > +        "leaq boot_cpu_data(%%rip),%%rax\n\t"            \
> > +        "btl $" __stringify(X86_FEATURE_SMAP) "-7*32,"   \
> > +        __stringify(CPUINFO86_leaf7_features) "(%%rax)\n\t"  \
> > +        "jnc 881f\n\t"                                   \
> > +        "movq %%cr4,%%rax\n\t"                           \
> > +        "testl $" __stringify(X86_CR4_SMAP) ",%%eax\n\t" \
> > +        "jz 881f\n\t"                                    \
> > +         __stringify(__ASM_CLAC) "\n\t"                  \
> > +"881:   popq %%rax"
> 
> All the same applies to these of course.
> 
> Jan

Thanks,
Feng

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.