
Re: [Xen-devel] [PATCH 6/6] mini-os/x86-64 entry: check against nested events and try to fix up



On 03/08/2013 01:30 PM, Xu Zhang wrote:
> +# [How we do the fixup]. We want to merge the current stack frame with the
> +# just-interrupted frame. How we do this depends on where in the critical
> +# region the interrupted handler was executing, and so how many saved
> +# registers are in each frame. We do this quickly using the lookup table
> +# 'critical_fixup_table'. For each byte offset in the critical region, it
> +# provides the number of bytes which have already been popped from the
> +# interrupted stack frame. This is the number of bytes from the current stack
> +# that we need to copy to the end of the previous activation frame so that
> +# we can continue, as if the nested event had never occurred, at label 11
> +# in the old activation frame.
> +critical_region_fixup:
> +             addq $critical_fixup_table - scrit, %rax
> +             movzbq (%rax),%rax    # %rax contains num bytes popped
> +             mov  %rsp,%rsi
> +             add  %rax,%rsi        # %rsi points at end of src region
> +
> +             movq RSP(%rsp),%rdi   # acquire interrupted %rsp from current stack frame
> +                                   # %rdi points at end of dst region
> +             mov  %rax,%rcx
> +             shr  $3,%rcx          # convert bytes into count of 64-bit entities
> +             je   16f              # skip loop if nothing to copy
> +15:          subq $8,%rsi          # pre-decrementing copy loop
> +             subq $8,%rdi
> +             movq (%rsi),%rax
> +             movq %rax,(%rdi)
> +             loop 15b
> +16:          movq %rdi,%rsp        # final %rdi is top of merged stack
> +             andb $KERNEL_CS_MASK,CS(%rsp)      # CS on stack might have changed
> +             jmp  11b
> +
> +
> +/* Nested event fixup look-up table */
> +critical_fixup_table:
> +     .byte 0x00,0x00,0x00                    # XEN_TEST_PENDING(%rsi)
> +     .byte 0x00,0x00,0x00,0x00,0x00,0x00     # jnz    14f
> +     .byte 0x00,0x00,0x00,0x00               # mov    (%rsp),%r15
> +     .byte 0x00,0x00,0x00,0x00,0x00          # mov    0x8(%rsp),%r14
> +     .byte 0x00,0x00,0x00,0x00,0x00          # mov    0x10(%rsp),%r13
> +     .byte 0x00,0x00,0x00,0x00,0x00          # mov    0x18(%rsp),%r12
> +     .byte 0x00,0x00,0x00,0x00,0x00          # mov    0x20(%rsp),%rbp
> +     .byte 0x00,0x00,0x00,0x00,0x00          # mov    0x28(%rsp),%rbx
> +     .byte 0x00,0x00,0x00,0x00               # add    $0x30,%rsp
> +     .byte 0x30,0x30,0x30,0x30               # mov    (%rsp),%r11
> +     .byte 0x30,0x30,0x30,0x30,0x30          # mov    0x8(%rsp),%r10
> +     .byte 0x30,0x30,0x30,0x30,0x30          # mov    0x10(%rsp),%r9
> +     .byte 0x30,0x30,0x30,0x30,0x30          # mov    0x18(%rsp),%r8
> +     .byte 0x30,0x30,0x30,0x30,0x30          # mov    0x20(%rsp),%rax
> +     .byte 0x30,0x30,0x30,0x30,0x30          # mov    0x28(%rsp),%rcx
> +     .byte 0x30,0x30,0x30,0x30,0x30          # mov    0x30(%rsp),%rdx
> +     .byte 0x30,0x30,0x30,0x30,0x30          # mov    0x38(%rsp),%rsi
> +     .byte 0x30,0x30,0x30,0x30,0x30          # mov    0x40(%rsp),%rdi
> +     .byte 0x30,0x30,0x30,0x30               # add    $0x50,%rsp
> +     .byte 0x80,0x80,0x80,0x80               # testl  $NMI_MASK,2*8(%rsp)
> +     .byte 0x80,0x80,0x80,0x80
> +     .byte 0x80,0x80                         # jnz    2f
> +     .byte 0x80,0x80,0x80,0x80               # testb  $1,(xen_features+XENFEAT_supervisor_mode_kernel)
> +     .byte 0x80,0x80,0x80,0x80
> +     .byte 0x80,0x80                         # jnz    1f
> +     .byte 0x80,0x80,0x80,0x80,0x80          # orb    $3,1*8(%rsp)
> +     .byte 0x80,0x80,0x80,0x80,0x80          # orb    $3,4*8(%rsp)
> +     .byte 0x80,0x80                         # iretq
> +     .byte 0x80,0x80,0x80,0x80               # andl   $~NMI_MASK, 16(%rsp)
> +     .byte 0x80,0x80,0x80,0x80
> +     .byte 0x80,0x80                         # pushq  $\flag
> +     .byte 0x78,0x78,0x78,0x78,0x78          # jmp    hypercall_page + (__HYPERVISOR_iret * 32)
> +     .byte 0x00,0x00,0x00,0x00               # XEN_LOCKED_BLOCK_EVENTS(%rsi)
> +     .byte 0x00,0x00,0x00                    # mov    %rsp,%rdi
> +     .byte 0x00,0x00,0x00,0x00,0x00          # jmp    11b
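
For readers following the fixup logic, here is a hypothetical C model of the
merge described in the comment above (a sketch only, not the actual code:
merge_frames, cur_sp, intr_sp, and popped_bytes are made-up names, and the
real code operates on the live stack in bytes):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of the frame merge: 'popped_bytes' is the value read from
 * critical_fixup_table, i.e. how many bytes the interrupted handler had
 * already popped from its frame.  Copy that many bytes from the top of
 * the current (nested) frame to just below the interrupted frame's
 * %rsp, effectively re-pushing the registers the interrupted handler
 * had already consumed, and return the new stack pointer.
 */
static uint64_t *merge_frames(uint64_t *cur_sp,   /* current %rsp */
                              uint64_t *intr_sp,  /* interrupted %rsp */
                              size_t popped_bytes)
{
    size_t n = popped_bytes / 8;   /* shr $3,%rcx: qword count */
    uint64_t *src = cur_sp + n;    /* %rsi: end of src region */
    uint64_t *dst = intr_sp;       /* %rdi: end of dst region */

    while (n--)                    /* pre-decrementing copy loop (15:) */
        *--dst = *--src;

    return dst;                    /* final %rdi is top of merged stack */
}
```

After the merge, the assembly reloads %rsp with the returned value and jumps
back to label 11, so the saved registers are unwound again as if the nested
event had never happened.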

This looks super-fragile.  The original Xen-linux kernel code had a
similar kind of fixup table, but I went to some lengths to make it as
simple and robust as possible in the pvops kernels.  See the comment in
xen-asm_32.S:

 * Because the nested interrupt handler needs to deal with the current
 * stack state in whatever form it's in, we keep things simple by only
 * using a single register which is pushed/popped on the stack.

64-bit pvops Linux always uses the iret hypercall, so the issue is moot
there.  (In principle a nested kernel interrupt could avoid the iret,
but it wasn't obvious that the extra complexity was worth it.)

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

