[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V5] xen/privcmd.c: improve performance of MMAPBATCH_V2.



Hi,
On Dec 20, 2012, at 6:32 AM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:

> On Thu, 2012-12-20 at 11:00 +0000, Mats Petersson wrote:
> 
>>>> -       err = 0;
>>>> +       err = last_err;
>>> This means that if you have 100 failures followed by one success you
>>> return success overall. Is that intentional? That doesn't seem right.
>> As far as I see, it doesn't mean that. last_err is only set at the 
>> beginning of the call (to zero) and if there is an error.
> 
> Ah, yes I missed that (but it still isn't right, see below).
> 
>>>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>>>> index 0bbbccb..8f86a44 100644
>>>> --- a/drivers/xen/privcmd.c
>>>> +++ b/drivers/xen/privcmd.c
>>> [...]
>>>> @@ -283,12 +317,10 @@ static int mmap_batch_fn(void *data, void *state)
>>>>         if (xen_feature(XENFEAT_auto_translated_physmap))
>>>>                 cur_page = pages[st->index++];
>>>> 
>>>> -       ret = xen_remap_domain_mfn_range(st->vma, st->va & PAGE_MASK, 
>>>> *mfnp, 1,
>>>> -                                        st->vma->vm_page_prot, st->domain,
>>>> -                                        &cur_page);
>>>> -
>>>> -       /* Store error code for second pass. */
>>>> -       *(st->err++) = ret;
>>>> +       BUG_ON(nr < 0);
>>>> +       ret = xen_remap_domain_mfn_array(st->vma, st->va & PAGE_MASK, 
>>>> mfnp, nr,
>>>> +                                        st->err, st->vma->vm_page_prot,
>>>> +                                        st->domain, &cur_page);
>>>> 
>>>>         /* And see if it affects the global_error. */
>>> The code which follows needs adjustment to cope with the fact that we
>>> now batch. I think it needs to walk st->err and set global_error as
>>> appropriate. This is related to my comment about the return value of
>>> xen_remap_domain_mfn_range too.
>> The return value, as commented above, is "either 0 or the last error we 
>> saw". There should be no need to walk the st->err, as we know if there 
>> was some error or not.
> 
> The code which follows tries to latch having seen ENOENT in preference
> to other errors, so you don't need 0 or the last error, you need either
> 0, ENOENT if one occurred somewhere in the batch or an error!=ENOENT.
> (or maybe its vice versa, either way the last error isn't necessarily
> what you need).

Yeah the error reporting interface sucks. All I can say in my defense is that 
it was that way when I got here.

FWIW EFAULT *takes* precedence over ENOENT.
/* If we have not had any EFAULT-like global errors then set the global
 * error to -ENOENT if necessary. */
   if ((ret == 0) && (state.global_error == -ENOENT))
         ret = -ENOENT;

Otherwise, any individual mapping that fails differently from ENOENT does not 
affect the global error. That is, in absence of global errors like EFAULT, and 
in absence of per-mapping ENOENT return codes, per mapping errors like EINVAL 
do not affect the ioctl return code which remains zero.

IIUC Ian has been pointing that your code breaks this last statement.

Thanks
Andres
> 
>>> I think rather than trying to fabricate some sort of meaningful error
>>> code for an entire batch xen_remap_domain_mfn_range should just return
>>> an indication about whether there were any errors or not and leave it to
>>> the caller to figure out the overall result by looking at the err array.
>>> 
>>> Perhaps return either the number of errors or the number of successes
>>> (which turns the following if into either (ret) or (ret < nr)
>>> respectively).
>> I'm trying to not change how the code above expects things to work. 
>> Whilst it would be lovely to rewrite the entire code dealing with 
>> mapping memory, I don't think that's within the scope of my current 
>> project. And if I don't wish to rewrite all of libxc's memory management 
> 
> I'm not talking about changing the libxc or ioctl interface, only about
> the internal-to-Linux interface between privcmd and mmu.c. In fact I'm
> only actually asking you to change the return value of the new function
> you are adding to the API.
> 
>> code, I don't want to alter what values are returned or when. The 
>> current code follows what WAS happening before my changes - which isn't 
>> exactly the most fantastic thing, and I think there may actually be bugs 
>> in there, such as:
>>     if (ret < 0) {
>>         if (ret == -ENOENT)
>>             st->global_error = -ENOENT;
>>         else {
>>             /* Record that at least one error has happened. */
>>             if (st->global_error == 0)
>>                 st->global_error = 1;
>>         }
>>     }
>> if we enter this once with -EFAULT, and then after that with -ENOENT, 
>> global_error will say -ENOENT. I think knowing that we got an EFAULT is 
>> "higher importance" than ENOENT, but that's how the old code was 
>> working, and I'm not sure I should change it at this point.
> 
> But I think you are changing it, by this behaviour or returning the last
> error in the batch which causes this code to behave differently. That's
> what I'm asking you to avoid!
> 
>> 
>>>>         if (ret < 0) {
>>>> @@ -300,7 +332,7 @@ static int mmap_batch_fn(void *data, void *state)
>>>>                                 st->global_error = 1;
>>>>                 }
>>>>         }
>>>> -       st->va += PAGE_SIZE;
>>>> +       st->va += PAGE_SIZE * nr;
>>>> 
>>>>         return 0;
>>>>  }
>>>> @@ -430,8 +462,8 @@ static long privcmd_ioctl_mmap_batch(void __user 
>>>> *udata, int version)
>>>>         state.err           = err_array;
>>>> 
>>>>         /* mmap_batch_fn guarantees ret == 0 */
>>>> -       BUG_ON(traverse_pages(m.num, sizeof(xen_pfn_t),
>>>> -                            &pagelist, mmap_batch_fn, &state));
>>>> +       BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
>>>> +                                   &pagelist, mmap_batch_fn, &state));
>>> Can we make traverse_pages and _block common by passing a block size
>>> parameter?
>> Yes of course. Is there much benefit from that? I understand that it's 
>> less code, but it also makes the original traverse_pages more complex. 
>> Not convinced it helps much - it's quite a small function, so not much 
>> extra code. Additionally, all of the callback functions will have to 
>> deal with an extra parameter (that is probably ignored in all but one 
>> place).
> 
> It depends on what the combined version ends up looking like but in
> general I'm in favour of one more generic function over several cloned
> and hacked versions of the same thing.
> 
> Ian.
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.