[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 10/15] x86/altp2m: add remaining support routines.



>From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>Sent: Thursday, July 16, 2015 2:34 AM
>
>>>> On 16.07.15 at 11:16, <ravi.sahita@xxxxxxxxx> wrote:
>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>>>Sent: Tuesday, July 14, 2015 7:32 AM
>>>>>> On 14.07.15 at 02:14, <edmund.h.white@xxxxxxxxx> wrote:
>>>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>>>unsigned long gla,
>>>>          if ( npfec.write_access )
>>>>          {
>>>>              paging_mark_dirty(currd, mfn_x(mfn));
>>>> +            /* If p2m is really an altp2m, unlock here to avoid
>>>> + lock
>> ordering
>>>> +             * violation when the change below is propagated from
>>>> + host p2m
>> */
>>>> +            if ( ap2m_active )
>>>> +                __put_gfn(p2m, gfn);
>>>>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>>>> p2m_ram_rw);
>>>
>>>And this won't result in any races?
>>
>> No
>
>To be honest I expected a little more than just "no" here. Now I have to ask -
>why?
>

Yes, I should have described it more than that :-)  so this part of the code is 
handling the log dirty transition of the page, and this page permission 
transition happens always on the hostp2m. Given the way the locking order is 
setup (hostp2m->altp2m-list-lock->altp2m and there was a separate writeup and 
discussion with George on that), at this point in this sequence there is a p2m 
lock (whether it's a hostp2m or altp2m lock depends on the mode of the domain) 
- the reason we have to drop the lock here first is due to what happens next; 
the permission changes in hostp2m will be serially propagated to altp2ms and 
not dropping the lock here would cause a locking order violation. Hope that 
clarifies. 

>>>> --- a/xen/arch/x86/mm/p2m.c
>>>> +++ b/xen/arch/x86/mm/p2m.c
>>>> @@ -2037,6 +2037,391 @@ bool_t
>p2m_switch_vcpu_altp2m_by_id(struct
>>>vcpu *v, uint16_t idx)
>>>>      return rc;
>>>>  }
>>>>
>>>> +void p2m_flush_altp2m(struct domain *d) {
>>>> +    uint16_t i;
>>>> +
>>>> +    altp2m_list_lock(d);
>>>> +
>>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>>> +    {
>>>> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
>>>> +        /* Uninit and reinit ept to force TLB shootdown */
>>>> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
>>>> +        ept_p2m_init(d->arch.altp2m_p2m[i]);
>>>
>>>ept_... in non-EPT code again.
>>>
>>
>> There is no non-EPT altp2m implementation, and this file already
>> includes ept.. callouts for p2m's implemented using EPT's.
>
>The only two calls currently there are ept_p2m_{,un}init(), which need to be
>there with the current code structuring. Everything else that's EPT-specific
>should be abstracted through hooks set by ept_p2m_init().
>

We hear your input on this one - we would like to treat this change as a 
Category 2 change (per the preface message I sent earlier). I hope that's 
acceptable - also we will prioritize this change above the other change you 
suggested on the altp2m data structure reorganization in patch 5/15.

>>>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>>>> +    long rc = -EINVAL;
>>>
>>>Why long (for both variable and function return type)? (More of these
>>>in functions below.)
>>
>> Because the error variable in the code that calls these (in hvm.c) is
>> a long, and you had given feedback earlier to propagate the returns
>> from these functions through that calling code.
>
>I don't see the connection. The function only returns zero or -E...
>values, so why would its return type be "long"?
>

do_hvm_op declares a rc that is of type "long" and hence this returns a "long"


>>>> +long p2m_init_next_altp2m(struct domain *d, uint16_t *idx) {
>>>> +    long rc = -EINVAL;
>>>> +    uint16_t i;
>>>
>>>As in the earlier patch(es) - unsigned int.
>>
>> Ok, but why does it matter? uint16_t will always suffice.
>
>And will always produce worse code.
>

Ok.

>>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>>>> +                                 mfn_t mfn, unsigned int page_order,
>>>> +                                 p2m_type_t p2mt, p2m_access_t
>>>> +p2ma) {
>>>> +    struct p2m_domain *p2m;
>>>> +    p2m_access_t a;
>>>> +    p2m_type_t t;
>>>> +    mfn_t m;
>>>> +    uint16_t i;
>>>> +    bool_t reset_p2m;
>>>> +    unsigned int reset_count = 0;
>>>> +    uint16_t last_reset_idx = ~0;
>>>> +
>>>> +    if ( !altp2m_active(d) )
>>>> +        return;
>>>> +
>>>> +    altp2m_list_lock(d);
>>>> +
>>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>>> +    {
>>>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>>>> +            continue;
>>>> +
>>>> +        p2m = d->arch.altp2m_p2m[i];
>>>> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
>>>> +
>>>> +        reset_p2m = 0;
>>>> +
>>>> +        /* Check for a dropped page that may impact this altp2m */
>>>> +        if ( mfn_x(mfn) == INVALID_MFN &&
>>>> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>>> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
>>>> +            reset_p2m = 1;
>>>
>>>Considering that this looks like an optimization, what's the downside
>>>of possibly having min=0 and max=<end-of-address-space>? I.e.
>>>can there a long latency operation result that's this way a guest can effect?
>>>
>>
>> ... A p2m is a gfn->mfn map, amongst other things. There is a reverse
>> mfn->gfn map, but that is only valid for the host p2m. Unless the
>> remap altp2m hypercall is used, the gfn->mfn map in every altp2m
>> mirrors the gfn->mfn map in the host p2m (or a subset thereof, due to
>> lazy-copy), so handling removal of an mfn from a guest is simple: do a
>> reverse look up for the host p2m and mark the relevant gfn as invalid,
>> then do the same for every altp2m where that gfn is currently valid.
>>
>> Remap changes things: it says take gfn1 and replace ->mfn with the
>> ->mfn of gfn2. Here is where the optimization is used and the  invalidate
>logic is:
>> record the lowest and highest gfn2's that have been used in remap ops;
>> if an mfn is dropped from the hostp2m, for the purposes of altp2m
>> invalidation, see if the gfn derived from the host p2m reverse lookup
>> falls within the range of used gfn2's. If it does, an invalidation is
>> required. Which is why min and max are inited the way they are - hope
>> the explanation clarifies this optimization.
>
>Sadly it doesn't, it just re-states what I already understood and doesn't
>answer the question: What happens if min=0 and max=<end-of-address-
>space>? I.e. can the guest nullify the optimization by careful fiddling issuing
>some of the new hypercalls, and if so will this have any negative impact on the
>hypervisor? I'm asking this from a security standpoint ...
>

To take that exact case, If min=0 and max=<end of address space> then any 
hostp2m change where the first mfn is dropped, will cause all altp2ms to be 
reset even if the mfn dropped doesn't affect altp2ms at all, which wont serve 
as an optimization at all - Hope that clarifies.

>Nor do I find my question answered why max can't be initialized to zero:
>You don't care whether max is a valid GFN when a certain GFN doesn't fall in
>the (then empty) [min, max] range. What am I missing?

Since 0 is a valid GFN so we cannot initialize min or max to 0 - since matching 
this condition (gfn_x(gfn) >= p2m->min_remapped_gfn && gfn_x(gfn) <= 
p2m->max_remapped_gfn) will cause a reset (throw away) of the altp2m to rebuild 
it from the hostp2m. So essentially what is being done here is the range is the 
non-existent set to start with, unless some altp2m changes occur, and then it 
is grown to be the smallest set around the gfns affected.

Ravi

>
>Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.