Re: x86 instruction emulation backstory?
On 14/04/2023 7:33 pm, Alex Olson wrote:
> I've been digging into VMX internals and I see why MMIO emulation pretty much
> requires x86 instruction emulation. Even the Linux KVM code borrowed Xen's
> emulation...
>
> Thus, I'm trying to understand Xen's x86 emulation implementation...
>
> How was it developed? (x86 instruction handling is incredibly complex!)
>
> Was it originally part of a general purpose x86 emulator?

Xen's emulator (in this form at least) is 18 years old - March 2005

https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=4c5eeec983495e347c6ab3d40a4a70cdbdfce9af

and it was written from scratch, but you can even see in the context for x86/traps.c that emulate_privileged_op() predates that. (We decided to consolidate down to a single instruction decoder/emulator at the point that we were maintaining 4 different ad-hoc ones.)

As for development, it's all there in git log if you want to go looking :).

> It looks like it implements more instructions than just ones that can access
> memory, such as "AAM"? (Why is this)?

All instructions have an implicit memory operand at %rip. The CPU has to fetch the opcode bytes from somewhere... (See Introspection, later)

You've found MMIO, but emulating from a #GP fault was also an important use case even back then. PV guest kernels execute in Ring1 (32bit) or Ring3 (64bit), and therefore cannot use CPL0 instructions.

While PV guests ought to use hypercalls for privileged operations, doing so completely is very expensive in an existing codebase that you're trying to port to Xen. Therefore, Xen will emulate in a few faulting conditions, so the guest can e.g. execute RDMSR and have it function correctly (albeit painfully slowly).

More recently, Hypervisor Introspection as a technology opens up a whole load of interesting cases which want emulation.

A lot of introspection boils down to removing permissions behind the scenes (e.g.
making code no-execute, or making data read-only) so violations cause an exit to the hypervisor, and an introspection agent can make a judgement call.

99% of cases are fine, and should proceed. But, how do you do this? You could lift the permissions, but then malware on other vCPUs now has a window of time where it is free to make modifications. So instead you could pause the VM, lift the perms, singlestep the trapping vCPU, restore the perms, and unpause it. But this has terrible performance to start with, and is an O(N^2) perf hit with the number of vCPUs the VM has. In practice, it is *far* cheaper to have Xen emulate the instruction than it is to play with pausing, perms and singlestepping.

But consider the 1% other case where continuing isn't fine. One of the supported options is to "emulate / discard" to try and skip the instruction without making a real state modification. This cannot be done with singlestepping, and has to be done by software somewhere. As Xen already has an emulator, it's very easy to use a set of write_discard() hooks in place of the real ones.

As to the complexity, yes, and in truth, Xen's emulator isn't fully an emulator. We pretty much emulate all the integer instructions, because most of them are very simple, but we do not for the vector instructions.

What we do for vector instructions is better described as decode and replay, where we reconstruct a modified form of the instruction to operate on local state, so we can piece together the overall reads and writes without needing to implement the vector logic itself.

It's also worth saying that for any locked/atomic operations, we have to issue a real instruction too, because that's the only way to get the cache coherency behaviour correct.

It is also worth noting that Xen's emulator isn't complete. Notably, no one has implemented IRET for protected mode yet, or inter-privilege far transfers, and we've got known corner cases (e.g. interrupt shadow with Mov SS) in need of some work.
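
To make the "emulate / discard" point concrete, here is a minimal sketch of swapping a write hook for a discarding one. Everything in it (toy_guest, toy_emulate_ops, toy_write, toy_write_discard, toy_emulate_store8) is a hypothetical name invented for illustration; it is not Xen's actual hook table or signatures, only the general shape of the idea: the emulator core calls through an ops table, so the caller decides whether a write really lands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy types: none of these names are Xen's real API. */
#define GUEST_MEM_SIZE 16

struct toy_guest {
    uint8_t mem[GUEST_MEM_SIZE];   /* stand-in for guest memory */
};

/* Write hook signature: returns 0 on success, -1 on failure. */
typedef int (*toy_write_fn)(struct toy_guest *g, size_t addr,
                            const void *data, size_t len);

struct toy_emulate_ops {
    toy_write_fn write;
};

/* Normal hook: commit the write to guest memory. */
static int toy_write(struct toy_guest *g, size_t addr,
                     const void *data, size_t len)
{
    if (addr > GUEST_MEM_SIZE || len > GUEST_MEM_SIZE - addr)
        return -1;
    memcpy(&g->mem[addr], data, len);
    return 0;
}

/* Discard hook: report success but leave guest memory untouched. */
static int toy_write_discard(struct toy_guest *g, size_t addr,
                             const void *data, size_t len)
{
    (void)g; (void)addr; (void)data; (void)len;
    return 0;
}

static const struct toy_emulate_ops toy_ops_normal  = { .write = toy_write };
static const struct toy_emulate_ops toy_ops_discard = { .write = toy_write_discard };

/* "Emulate" a one-byte store; the core only ever sees the ops table. */
static int toy_emulate_store8(const struct toy_emulate_ops *ops,
                              struct toy_guest *g, size_t addr, uint8_t val)
{
    return ops->write(g, addr, &val, sizeof(val));
}

/* Usage: the same store either vanishes or lands, depending on the ops. */
static void toy_demo(void)
{
    struct toy_guest g;
    memset(&g, 0, sizeof(g));

    /* Discarded: the instruction "completes" but memory is unchanged. */
    assert(toy_emulate_store8(&toy_ops_discard, &g, 3, 0xAA) == 0);
    assert(g.mem[3] == 0);

    /* Normal: the write is committed. */
    assert(toy_emulate_store8(&toy_ops_normal, &g, 3, 0xAA) == 0);
    assert(g.mem[3] == 0xAA);
}
```

The point of the design is that the decode and execute logic is shared; only the table handed to the emulator changes, which is why skipping an instruction without a real state modification is cheap once an emulator already exists.
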

We went through a spate of problems where Windows in particular kept coming up with more and more inventive instructions to use to write into the emulated VGA framebuffer, and we decided that the emulator should be as complete as we can reasonably make it.

A consequence of this is that we have some very interesting and powerful advanced security features.

I hope this helps, or was at least interesting.

~Andrew