
Re: [Xen-devel] [PATCH 1/4] VMX: streamline entry.S code

>>> On 26.08.13 at 12:44, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> +ENTRY(vmx_asm_vmexit_handler)
>>          push %rdi
>>          push %rsi
>>          push %rdx
>>          push %rcx
>>          push %rax
>> +        mov  %cr2,%rax
> I presume this is a long latency instruction.  Do you have a source of
> numbers for this? (more for interest, as I can easily accept that it
> would be a longer operation than the surrounding ones)

I don't think I ever saw any precise numbers for control register
accesses for anything newer than 386es or maybe 486es. But
control register accesses are slow, not least because they are
generally implemented in microcode.
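To illustrate, the early %cr2 read in the quoted hunk overlaps its latency with the surrounding (independent) pushes rather than stalling a consumer that immediately follows it. A sketch only, mirroring the hunk above; the comments are mine:

```asm
        push %rdi
        push %rsi
        push %rdx
        push %rcx
        push %rax
        mov  %cr2,%rax        # long-latency read starts executing here...
        push %r8              # ...while these independent pushes proceed,
        push %r9              # hiding the read's latency instead of
        push %r10             # leaving it on the critical path
```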

>>          push %r8
>>          push %r9
>>          push %r10
>>          push %r11
>>          push %rbx
>> +        GET_CURRENT(%rbx)
> This seems a little less obvious.  I presume you are just breaking a
> true read-after-write data hazard on %rbx?

No, this is to hide the latency between loading %rbx and its first
use in the address of a memory access.
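That is, hoisting GET_CURRENT() gives the load time to complete before %rbx is first dereferenced. A sketch of the idea; the final instruction is a hypothetical consumer, not taken from the patch:

```asm
        push %rbx
        GET_CURRENT(%rbx)     # load of the current vCPU pointer issues here
        push %r12
        push %r13
        push %r14
        push %r15             # independent pushes cover the load-to-use latency
        mov  %rax,FIELD(%rbx) # hypothetical first use of %rbx as a base address
```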

>> -.globl vmx_asm_do_vmentry
>> -vmx_asm_do_vmentry:
> If you move the ENTRY(vmx_asm_do_vmentry) up from below, you should be
> able to completely drop the jmp in it.

That would be possible, at the expense of added padding. I prefer
it the way it is now, as vmx_asm_do_vmentry is not performance
critical (it is used exactly once per HVM vCPU).

>  However...
>> +.Lvmx_do_vmentry:
>>          call vmx_intr_assist
>>          call nvmx_switch_guest
>> -        GET_CURRENT(%rbx)
>> -        cli
> The movement of this cli indicates a possible issue.
> If we have softirqs pending, we jump to .Lvmx_process_softirqs, which
> calls do_softirq, and then jumps back to the top of .Lvmx_do_vmentry,
> which reruns the top of do_vmentry with interrupts now enabled.

That was this way already before. The "cli" got moved only past
some address calculation (which clearly doesn't need to be done
with interrupts disabled).

> First of all, I can't see anything in vmx_intr_assist or
> nvmx_switch_guest which should require calling them multiple times on
> a vmentry.  They are also expecting to be called with interrupts
> disabled (although I can't spot anything obvious in the call path
> which would be affected).

And both of these functions had been called before disabling
interrupts already, so the patch doesn't change that either.

>> -        cmpb $0,VCPU_vmx_launched(%rbx)
>>          pop  %r15
>>          pop  %r14
>>          pop  %r13
>>          pop  %r12
>>          pop  %rbp
>> +        mov  %rax,%cr2
>> +        cmpb $0,VCPU_vmx_launched(%rbx)
> Again, I presume the move of "mov %rax,%cr2" is about the %rax data hazard?

The %cr2 write's move is indeed debatable - I tried to get it farther
away from the producer of the data in %rax, but it's not clear
whether that's very useful. The second purpose was to get
something interleaved with the many "pop"s, so that the CPU can
keep more than just its memory load ports busy. If this is
controversial, I'm fine with undoing the change.
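For reference, the interleaving described here is visible in the quoted hunk itself; only the comments below are mine:

```asm
        pop  %r15
        pop  %r14
        pop  %r13
        pop  %r12
        pop  %rbp
        mov  %rax,%cr2        # interleaved with the pops: uses a
                              # different execution port while the pops
                              # occupy the memory load ports
        cmpb $0,VCPU_vmx_launched(%rbx)
```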

