Re: [Xen-devel] [RFC 3/4] HVM x86 deprivileged mode: Code for switching into/out of deprivileged mode



On 07/08/15 13:51, Ben Catterall wrote:
> On 06/08/15 21:55, Andrew Cooper wrote:
>> On 06/08/15 17:45, Ben Catterall wrote:
>>> The process to switch into and out of deprivileged mode can be likened to
>>> setjmp/longjmp.
>>>
>>> To enter deprivileged mode, we take a copy of the stack from the guest's
>>> registers up to the current stack pointer. This allows us to restore the
>>> stack when we have finished the deprivileged mode operation, meaning we can
>>> continue execution from that point. This is similar to what would happen on
>>> a context switch.
>>>
>>> To exit deprivileged mode, we copy the stack back, replacing the current
>>> stack. We can then continue execution from where we left off, which will
>>> unwind the stack and free up resources. This method means that we do not
>>> need to change any other code paths and its invocation will be transparent
>>> to callers. This should allow the feature to be more easily deployed to
>>> different parts of Xen.
>>>
>>> Note that this copy of the stack is per-vcpu, but it will contain per-pcpu
>>> data. Extra work is needed to properly migrate vcpus between pcpus.
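
For readers unfamiliar with the analogy, here is a minimal user-space sketch
of the setjmp/longjmp control flow being referred to. It is not code from the
patch: the names resume_point, handed_back, depriv_operation and
run_deprivileged are hypothetical, and it only illustrates "save a resume
point, run something that never returns normally, then continue from the
saved point". The stack copy and the sysret/syscall transitions described
above have no counterpart here.

```c
#include <setjmp.h>

/* Hypothetical saved context, standing in for the state recorded on entry. */
static jmp_buf resume_point;

/* Value handed back across the jump; volatile so it survives longjmp. */
static volatile int handed_back;

/* Stand-in for the deprivileged operation. It never returns normally;
 * it jumps straight back to the saved resume point instead. */
static void depriv_operation(int result)
{
    handed_back = result;
    longjmp(resume_point, 1);
}

/* Save a resume point, run the operation, and continue from the resume
 * point once the operation jumps back. Callers just see a return value. */
int run_deprivileged(int result)
{
    if (setjmp(resume_point) == 0) {
        /* First pass: the context has been saved, start the operation. */
        depriv_operation(result);
    }
    /* Second pass: we arrive here via longjmp from depriv_operation. */
    return handed_back;
}
```

Calling run_deprivileged(42) returns 42 to the caller, which never sees the
jump. That is the sense in which the invocation is transparent to callers.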
>>
>> Under what circumstances do you see there being persistent state in the
>> depriv area between calls, given that the calls are synchronous from VM
>> actions?
>
> I don't know if we can make these synchronous as we need a way to
> interrupt the vcpu if it's spinning for a long time. Otherwise an
> attacker could just spin in depriv and cause a DoS. With that in mind,
> the scheduler may decide to migrate the vcpu whilst it's in depriv
> mode which would mean this per-pcpu data is held in the stack copy
> which is then migrated to another pcpu incorrectly.

If the emulator spins for a sufficient time, it is fine to shoot the
domain.  This is a strict improvement on the current behaviour where a
spinning emulator would shoot the host, via a watchdog timeout.
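
As a rough sketch of that policy, and assuming a time budget checked from
some periodic hook: every name below (toy_domain, DEPRIV_BUDGET_NS,
depriv_check_budget) and the 100 ms figure are hypothetical illustrations,
not existing Xen interfaces or proposed values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical budget for one deprivileged operation: 100 ms. */
#define DEPRIV_BUDGET_NS UINT64_C(100000000)

/* Toy stand-in for a domain; only tracks whether it has been shot. */
struct toy_domain {
    bool crashed;
};

/* Returns true if the operation may keep running. If the budget has been
 * exceeded, marks the domain as crashed (and only that domain; the host
 * is unaffected) and returns false. */
bool depriv_check_budget(struct toy_domain *d,
                         uint64_t start_ns, uint64_t now_ns)
{
    if (now_ns - start_ns <= DEPRIV_BUDGET_NS)
        return true;

    d->crashed = true;
    return false;
}
```

The point of the sketch is the blast radius: the failure is charged to the
offending domain rather than to the whole host.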

As said elsewhere, this kind of DoS is not a very interesting attack
vector.  State handling errors which cause Xen to change the wrong thing
are far more interesting from a guest's point of view.

http://xenbits.xen.org/xsa/advisory-123.html (full host compromise) and
http://xenbits.xen.org/xsa/advisory-108.html (read other guests' data)
are examples of the kind of interesting issue which could potentially be
mitigated with this depriv infrastructure.

>
>>
>>>
>>> The switch to and from deprivileged mode is performed using sysret
>>> and syscall
>>> respectively.
>>
>> I suspect we need to borrow the SS attribute workaround from Linux to
>> make this function reliably on AMD systems.
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=61f01dd941ba9e06d2bf05994450ecc3d61b6b8b
>>
>>
>
> Ah! ok, I'll look into this. Thanks!

Just be aware of it.  Don't spend your time attempting to retrofit it to
Xen.  It is more work than it looks.

~Andrew
