|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 6/6] mini-os/x86-64 entry: check against nested events and try to fix up
On 03/12/2013 07:42 PM, Xu Zhang wrote:
> On 03/09/2013 04:44 PM, Jeremy Fitzhardinge wrote:
>> On 03/08/2013 01:30 PM, Xu Zhang wrote:
>>> +# [How we do the fixup]. We want to merge the current stack frame
>>> with the
>>> +# just-interrupted frame. How we do this depends on where in the
>>> critical
>>> +# region the interrupted handler was executing, and so how many saved
>>> +# registers are in each frame. We do this quickly using the lookup
>>> table
>>> +# 'critical_fixup_table'. For each byte offset in the critical
>>> region, it
>>> +# provides the number of bytes which have already been popped from the
>>> +# interrupted stack frame. This is the number of bytes from the
>>> current stack
>>> +# that we need to copy at the end of the previous activation frame
>>> so that
>>> +# we can continue as if we've never even reached 11 running in the old
>>> +# activation frame.
>>> +critical_region_fixup:
>>> + addq $critical_fixup_table - scrit, %rax
>>> + movzbq (%rax),%rax # %rax contains num bytes popped
>>> + mov %rsp,%rsi
>>> + add %rax,%rsi # %esi points at end of src region
>>> +
>>> + movq RSP(%rsp),%rdi # acquire interrupted %rsp from
>>> current stack frame
>>> + # %edi points at end of dst region
>>> + mov %rax,%rcx
>>> + shr $3,%rcx # convert bytes into count of 64-bit
>>> entities
>>> + je 16f # skip loop if nothing to copy
>>> +15: subq $8,%rsi # pre-decrementing copy loop
>>> + subq $8,%rdi
>>> + movq (%rsi),%rax
>>> + movq %rax,(%rdi)
>>> + loop 15b
>>> +16: movq %rdi,%rsp # final %rdi is top of merged stack
>>> + andb $KERNEL_CS_MASK,CS(%rsp) # CS on stack might have
>>> changed
>>> + jmp 11b
>>> +
>>> +
>>> +/* Nested event fixup look-up table*/
>>> +critical_fixup_table:
>>> + .byte 0x00,0x00,0x00 # XEN_TEST_PENDING(%rsi)
>>> + .byte 0x00,0x00,0x00,0x00,0x00,0x00 # jnz 14f
>>> + .byte 0x00,0x00,0x00,0x00 # mov (%rsp),%r15
>>> + .byte 0x00,0x00,0x00,0x00,0x00 # mov 0x8(%rsp),%r14
>>> + .byte 0x00,0x00,0x00,0x00,0x00 # mov 0x10(%rsp),%r13
>>> + .byte 0x00,0x00,0x00,0x00,0x00 # mov 0x18(%rsp),%r12
>>> + .byte 0x00,0x00,0x00,0x00,0x00 # mov 0x20(%rsp),%rbp
>>> + .byte 0x00,0x00,0x00,0x00,0x00 # mov 0x28(%rsp),%rbx
>>> + .byte 0x00,0x00,0x00,0x00 # add $0x30,%rsp
>>> + .byte 0x30,0x30,0x30,0x30 # mov (%rsp),%r11
>>> + .byte 0x30,0x30,0x30,0x30,0x30 # mov 0x8(%rsp),%r10
>>> + .byte 0x30,0x30,0x30,0x30,0x30 # mov 0x10(%rsp),%r9
>>> + .byte 0x30,0x30,0x30,0x30,0x30 # mov 0x18(%rsp),%r8
>>> + .byte 0x30,0x30,0x30,0x30,0x30 # mov 0x20(%rsp),%rax
>>> + .byte 0x30,0x30,0x30,0x30,0x30 # mov 0x28(%rsp),%rcx
>>> + .byte 0x30,0x30,0x30,0x30,0x30 # mov 0x30(%rsp),%rdx
>>> + .byte 0x30,0x30,0x30,0x30,0x30 # mov 0x38(%rsp),%rsi
>>> + .byte 0x30,0x30,0x30,0x30,0x30 # mov 0x40(%rsp),%rdi
>>> + .byte 0x30,0x30,0x30,0x30 # add $0x50,%rsp
>>> + .byte 0x80,0x80,0x80,0x80 # testl
>>> $NMI_MASK,2*8(%rsp)
>>> + .byte 0x80,0x80,0x80,0x80
>>> + .byte 0x80,0x80 # jnz 2f
>>> + .byte 0x80,0x80,0x80,0x80 # testb
>>> $1,(xen_features+XENFEAT_supervisor_mode_kernel)
>>> + .byte 0x80,0x80,0x80,0x80
>>> + .byte 0x80,0x80 # jnz 1f
>>> + .byte 0x80,0x80,0x80,0x80,0x80 # orb $3,1*8(%rsp)
>>> + .byte 0x80,0x80,0x80,0x80,0x80 # orb $3,4*8(%rsp)
>>> + .byte 0x80,0x80 # iretq
>>> + .byte 0x80,0x80,0x80,0x80 # andl $~NMI_MASK,
>>> 16(%rsp)
>>> + .byte 0x80,0x80,0x80,0x80
>>> + .byte 0x80,0x80 # pushq $\flag
>>> + .byte 0x78,0x78,0x78,0x78,0x78 # jmp hypercall_page
>>> + (__HYPERVISOR_iret * 32)
>>> + .byte 0x00,0x00,0x00,0x00 #
>>> XEN_LOCKED_BLOCK_EVENTS(%rsi)
>>> + .byte 0x00,0x00,0x00 # mov %rsp,%rdi
>>> + .byte 0x00,0x00,0x00,0x00,0x00 # jmp 11b
>> This looks super-fragile. The original Xen-linux kernel code had a
>> similar kind of fixup table, but I went to some lengths to make it as
>> simple and robust as possible in the pvops kernels. See the comment in
>> xen-asm_32.S:
>>
>> * Because the nested interrupt handler needs to deal with the current
>> * stack state in whatever form its in, we keep things simple by only
>> * using a single register which is pushed/popped on the stack.
>>
>> 64-bit pvops Linux always uses the iret hypercall, so the issue is moot
>> there. (In principle a nested kernel interrupt could avoid the iret,
>> but it wasn't obvious that the extra complexity was worth it.)
>>
>> J
>>
>
> Thanks very much for you feedbacks.
>
> I guess the concern is that using a chunky look-up table like this is
> somewhat complicated and would be less easier to maintain? Samuel made
> some suggestions, making it less "fragile" to some extent:
>
> Also, it would be good to check against critical section size change, in
> case somebody e.g. changes a value, or a macro like XEN_PUT_VCPU_INFO.
> For instance, stuff right after the table:
>
> .if (ecrit-scrit) != (critical_fixup_table_end -
> critical_fixup_table)
> .error "The critical has changed, the fixup table needs updating"
> .endif
That would catch some problems, but not all. I worry that the whole
construction is too error-prone, and its too easy for some later
unsuspecting developer to come in and introduce a subtle bug that only
occurs under very particular timing conditions.
> I read through Xen-Linux 32bit implementation for this issue. If I
> understand it right, the use of look-up table is avoided by leaving
> RESTORE_ALL/REST out of the critical section. So when a nested event
> came, you always certain about the stack frame layout and how many
> bytes to copy over when fixing up (in Xen-Linux 32bit, PT_EIP/4 - 1).
> And the register that would be clobbered in order to read and write
> event mask (%eax in Xen-Linux 32bit case) shall be saved on stack.
> This time we only need to worry about this one register instead of
> many when fixing up the stack frame.
Right.
> Basically both approaches are correct to me and do the same job.
> Xen-Linux 32bit is definitely a more neat solution. However, one thing
> to notice is that mini-os x86 32bit also uses a fixup table.
I don't have mini-os in front of me at the moment, but I wouldn't be
surprised if the original Xen-Linux and mini-os share heritage in this area.
> I guess we could apply these patches as a temporary solution to make
> sure everything is correct and consistent, and having following
> patches to do the refinement.
That makes sense to me if the refinement is a large amount of new work
on top of what you've already done, but I don't think that's obviously
true. Given that you're familiar with the code and the problem domain,
I expect you could knock out a clean solution pretty quickly.
I guess you're interested in this path for 64-bit as well, because a
same ring-ring interrupt return is much more likely in minios.
J
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |