
Re: [Xen-devel] Device model operation hypercall (DMOP, re qemu depriv)



On 15/08/16 11:19, Jan Beulich wrote:
>>>> On 15.08.16 at 11:39, <george.dunlap@xxxxxxxxxx> wrote:
>> On 12/08/16 12:50, Jan Beulich wrote:
>>>>>> On 12.08.16 at 11:44, <george.dunlap@xxxxxxxxxx> wrote:
>>>> On 09/08/16 12:30, Jan Beulich wrote:
>>>>>>>> On 09.08.16 at 12:48, <ian.jackson@xxxxxxxxxxxxx> wrote:
>>>>>> Jan Beulich writes ("Re: Device model operation hypercall (DMOP, re qemu depriv)"):
>>>>>>> Actually, having thought about this some more, and taking this
>>>>>>> together with the expectations of the privcmd driver previously
>>>>>>> outlined, I think this part is problematic: If all the driver is to know
>>>>>>> is the position (within the interface structure) of the target domain
>>>>>>> ID, then any guest handles embedded in the interface structure
>>>>>>> (XEN_HVMCTL_track_dirty_vram only for now) couldn't get
>>>>>>> validated, and hence user mode code would have a way to access
>>>>>>> or modify kernel memory.
>>>>>>
>>>>>> Could the hypervisor know the difference between user and kernel
>>>>>> memory, in principle ?
>>>>>
>>>>> Not without further new hypercalls, as the guest kernel would need
>>>>> to tell Xen what address ranges are kernel vs user (and that implies
>>>>> that any OS wishing to be able to act as Dom0 has a uniform
>>>>> separation of address spaces).
>>>>
>>>> Couldn't Xen tell from the guest pagetables whether the memory being
>>>> accessed was user-mode or kernel mode?
>>>
>>> That would be possible, but would feel like adding heuristics instead
>>> of a proper distinction. Clearly we'd already be in some trouble if
>>> there were cases where some structure doesn't get written to (and
>>> hence could live in user-r/o mapped space), but others would need
>>> to be verified to be user-r/w mapped. A lot of special casing, that is,
>>> and hence a lot of things to get wrong.
>>>
>>> And then there is the problem of calling code being in rings 1 or 2:
>>> Page tables don't guard ring 0 against such, and we don't even have
>>> the notion of selectors (and hence address ranges) bounding
>>> accessible regions. We can't even say we assume all of them to be
>>> %ds-relative, as it would certainly be legitimate for such a structure
>>> to e.g. live on the stack. Of course an option would be to require
>>> the kernel driver to not allow requests from other than ring 3.
>>>
>>> Plus finally - how would we tell interface structures coming from a
>>> kernel invoked hypercall from those originating from user mode?
>>> There would need to be at least some kind of flag then, which the
>>> privcmd driver set, but normal hypercalls originating in the kernel
>>> don't. Or would you envision allowing this DMOP hypercall to be
>>> made only by user mode tools? If so, does stubdom run its qemu in
>>> ring 3 or rather in ring 0?
>>>
>>> [breaking the order of quoting]
>>>> And unless we're positive the guest kernel will never need these
>>>> hypercalls, we probably need a flag that allows kernel-mode pointers.
>>>
>>> Ah, you actually mention that already.
>>>
>>>>>>  (Would it be sufficient to check the starts, or would
>>>>>> the ends need to be checked too?)
>>>>>
>>>>> Both would need to be checked, so the size field(s) would need to
>>>>> be locatable too.
>>>>
>>>> We could have the "fixed" part of the structure contain domid and an
>>>> array of <ptr, len> which the privcmd driver could check.  I don't think
>>>> that would be terrible.
>>>
>>> Doable, yes, but not really nice, especially for the party invoking
>>> the hypercall as well as the backing implementation in Xen (as
>>> opposed to the privcmd driver, for which such a model would likely
>>> work quite well), as they then can't use the normal, simple reading
>>> of structure fields, but instead would need to populate array
>>> elements in the right order.
>>
>> So on the whole, what would be your suggestion for how to solve the
>> userspace-pointer problem?
> 
> Well, none of the options considered so far are really nice or
> readily available. I think the easiest to use for both the caller and
> the implementation of the hypercall would be the auxiliary
> hypercall for a kernel to indicate user/kernel boundaries plus a
> flag on the DMOP one for the kernel mode driver to indicate its
> user mode origin. The main (purely theoretical afaict) downside
> of this is the difficulty of using it in OSes with variable user/kernel
> boundaries.

What about including in the "fixed" part of the hypercall a virtual
address range that all pointers must be in?  That wouldn't even require
a user/kernel flag actually; and could conceivably be used by the caller
(either userspace or kernel space) to thwart certain kinds of potential
attacks.

It would mean changing the copy_guest() macros to take an optional
range argument, but I wouldn't think that would be too intrusive on
the whole.
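
To make the idea concrete, here is a rough sketch of what a
range-checked copy could look like. Everything in it is hypothetical:
struct dmop_va_range, dmop_range_ok() and copy_from_guest_ranged() are
names made up for illustration, not existing Xen interfaces, and a
plain memcpy() stands in for the real guest-access path. It only shows
the bounds check, assuming a half-open [start, end) range supplied in
the fixed part of the hypercall argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical: the virtual address range that would sit in the "fixed"
 * part of the hypercall argument. All embedded pointers must fall
 * inside the half-open interval [start, end).
 */
struct dmop_va_range {
    uintptr_t start; /* first permitted address */
    uintptr_t end;   /* one past the last permitted address */
};

/*
 * Return true iff the buffer [addr, addr + len) lies entirely inside
 * the range. Written so that no intermediate value can overflow.
 */
static bool dmop_range_ok(const struct dmop_va_range *r,
                          uintptr_t addr, size_t len)
{
    assert(r != NULL);          /* caller bug, not untrusted input */

    if (r->start > r->end)
        return false;           /* malformed range */
    if (addr < r->start || addr > r->end)
        return false;           /* buffer starts outside the range */

    /* addr <= r->end here, so the subtraction cannot wrap. */
    return len <= r->end - addr;
}

/*
 * Illustrative stand-in for a range-aware copy_from_guest(): checks
 * both the start and the end of the buffer before copying.
 * Returns 0 on success, -1 if the buffer is not within the range.
 */
static int copy_from_guest_ranged(void *dst, const void *guest_ptr,
                                  size_t len,
                                  const struct dmop_va_range *r)
{
    if (!dmop_range_ok(r, (uintptr_t)guest_ptr, len))
        return -1;

    memcpy(dst, guest_ptr, len); /* assumption: real code would use the
                                    guest-access primitives instead */
    return 0;
}
```

In this sketch the privcmd driver would only need to validate the one
range in the fixed part (e.g. that it lies wholly in user space), and
the hypervisor would then apply the check above to every embedded
handle.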

 -George

