[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V5] xen/privcmd.c: improve performance of MMAPBATCH_V2.



On Jan 2, 2013, at 11:45 AM, Mats Petersson <mats.petersson@xxxxxxxxxx> wrote:

> On 02/01/13 15:47, Andres Lagar-Cavilla wrote:
>> Hi,
>> On Dec 20, 2012, at 6:32 AM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
>> 
>>> On Thu, 2012-12-20 at 11:00 +0000, Mats Petersson wrote:
>>> 
>>>>>> -       err = 0;
>>>>>> +       err = last_err;
>>>>> This means that if you have 100 failures followed by one success you
>>>>> return success overall. Is that intentional? That doesn't seem right.
>>>> As far as I see, it doesn't mean that. last_err is only set at the
>>>> beginning of the call (to zero) and if there is an error.
>>> Ah, yes I missed that (but it still isn't right, see below).
>>> 
>>>>>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>>>>>> index 0bbbccb..8f86a44 100644
>>>>>> --- a/drivers/xen/privcmd.c
>>>>>> +++ b/drivers/xen/privcmd.c
>>>>> [...]
>>>>>> @@ -283,12 +317,10 @@ static int mmap_batch_fn(void *data, void *state)
>>>>>>         if (xen_feature(XENFEAT_auto_translated_physmap))
>>>>>>                 cur_page = pages[st->index++];
>>>>>> 
>>>>>> -       ret = xen_remap_domain_mfn_range(st->vma, st->va & PAGE_MASK, 
>>>>>> *mfnp, 1,
>>>>>> -                                        st->vma->vm_page_prot, 
>>>>>> st->domain,
>>>>>> -                                        &cur_page);
>>>>>> -
>>>>>> -       /* Store error code for second pass. */
>>>>>> -       *(st->err++) = ret;
>>>>>> +       BUG_ON(nr < 0);
>>>>>> +       ret = xen_remap_domain_mfn_array(st->vma, st->va & PAGE_MASK, 
>>>>>> mfnp, nr,
>>>>>> +                                        st->err, st->vma->vm_page_prot,
>>>>>> +                                        st->domain, &cur_page);
>>>>>> 
>>>>>>         /* And see if it affects the global_error. */
>>>>> The code which follows needs adjustment to cope with the fact that we
>>>>> now batch. I think it needs to walk st->err and set global_error as
>>>>> appropriate. This is related to my comment about the return value of
>>>>> xen_remap_domain_mfn_range too.
>>>> The return value, as commented above, is "either 0 or the last error we
>>>> saw". There should be no need to walk the st->err, as we know if there
>>>> was some error or not.
>>> The code which follows tries to latch having seen ENOENT in preference
>>> to other errors, so you don't need 0 or the last error, you need either
>>> 0, ENOENT if one occurred somewhere in the batch or an error!=ENOENT.
>>> (or maybe its vice versa, either way the last error isn't necessarily
>>> what you need).
>> Yeah the error reporting interface sucks. All I can say in my defense is 
>> that it was that way when I got here.
>> 
>> FWIW EFAULT *takes* precedence over ENOENT.
>> /* If we have not had any EFAULT-like global errors then set the global
>>  * error to -ENOENT if necessary. */
>>    if ((ret == 0) && (state.global_error == -ENOENT))
>>          ret = -ENOENT;
>> 
>> Otherwise, any individual mapping that fails differently from ENOENT does 
>> not affect the global error. That is, in absence of global errors like 
>> EFAULT, and in absence of per-mapping ENOENT return codes, per mapping 
>> errors like EINVAL do not affect the ioctl return code which remains zero.
>> 
>> IIUC Ian has been pointing that your code breaks this last statement.
> Sorry, are you saying the correct behaviour is to return 0 if we have EINVAL 
> or some other error [aside from EFAULT and ENOENT]?
Yes. Note that before we get to actual mapping work we may fail with ENOMEM, 
EINVAL, etc (but your patch does not affect that).

> Or are you saying that my code is broken such that it does this?
I didn't grok the patch as far as being sure whether it does it or not. I'll 
look at your next version.

Thanks
Andres

> 
> I did some fixes, that I believe "does the right thing", before Christmas, 
> but since I knew most people wouldn't be available, I didn't post it. I will 
> try to get round to it before the end of the week.
> 
> --
> Mats
>> 
>> Thanks
>> Andres
>>>>> I think rather than trying to fabricate some sort of meaningful error
>>>>> code for an entire batch xen_remap_domain_mfn_range should just return
>>>>> an indication about whether there were any errors or not and leave it to
>>>>> the caller to figure out the overall result by looking at the err array.
>>>>> 
>>>>> Perhaps return either the number of errors or the number of successes
>>>>> (which turns the following if into either (ret) or (ret < nr)
>>>>> respectively).
>>>> I'm trying to not change how the code above expects things to work.
>>>> Whilst it would be lovely to rewrite the entire code dealing with
>>>> mapping memory, I don't think that's within the scope of my current
>>>> project. And if I don't wish to rewrite all of libxc's memory management
>>> I'm not talking about changing the libxc or ioctl interface, only about
>>> the internal-to-Linux interface between privcmd and mmu.c. In fact I'm
>>> only actually asking you to change the return value of the new function
>>> you are adding to the API.
>>> 
>>>> code, I don't want to alter what values are returned or when. The
>>>> current code follows what WAS happening before my changes - which isn't
>>>> exactly the most fantastic thing, and I think there may actually be bugs
>>>> in there, such as:
>>>>     if (ret < 0) {
>>>>         if (ret == -ENOENT)
>>>>             st->global_error = -ENOENT;
>>>>         else {
>>>>             /* Record that at least one error has happened. */
>>>>             if (st->global_error == 0)
>>>>                 st->global_error = 1;
>>>>         }
>>>>     }
>>>> if we enter this once with -EFAULT, and then after that with -ENOENT,
>>>> global_error will say -ENOENT. I think knowing that we got an EFAULT is
>>>> "higher importance" than ENOENT, but that's how the old code was
>>>> working, and I'm not sure I should change it at this point.
>>> But I think you are changing it, by this behaviour or returning the last
>>> error in the batch which causes this code to behave differently. That's
>>> what I'm asking you to avoid!
>>> 
>>>>>>         if (ret < 0) {
>>>>>> @@ -300,7 +332,7 @@ static int mmap_batch_fn(void *data, void *state)
>>>>>>                                 st->global_error = 1;
>>>>>>                 }
>>>>>>         }
>>>>>> -       st->va += PAGE_SIZE;
>>>>>> +       st->va += PAGE_SIZE * nr;
>>>>>> 
>>>>>>         return 0;
>>>>>>  }
>>>>>> @@ -430,8 +462,8 @@ static long privcmd_ioctl_mmap_batch(void __user 
>>>>>> *udata, int version)
>>>>>>         state.err           = err_array;
>>>>>> 
>>>>>>         /* mmap_batch_fn guarantees ret == 0 */
>>>>>> -       BUG_ON(traverse_pages(m.num, sizeof(xen_pfn_t),
>>>>>> -                            &pagelist, mmap_batch_fn, &state));
>>>>>> +       BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
>>>>>> +                                   &pagelist, mmap_batch_fn, &state));
>>>>> Can we make traverse_pages and _block common by passing a block size
>>>>> parameter?
>>>> Yes of course. Is there much benefit from that? I understand that it's
>>>> less code, but it also makes the original traverse_pages more complex.
>>>> Not convinced it helps much - it's quite a small function, so not much
>>>> extra code. Additionally, all of the callback functions will have to
>>>> deal with an extra parameter (that is probably ignored in all but one
>>>> place).
>>> It depends on what the combined version ends up looking like but in
>>> general I'm in favour of one more generic function over several cloned
>>> and hacked versions of the same thing.
>>> 
>>> Ian.
>>> 
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.