[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC PATCH v2] xen/privcmd: Convert get_user_pages*() to pin_user_pages*()
On 6/24/20 12:41 PM, Souptick Joarder wrote: > On Wed, Jun 24, 2020 at 9:07 PM Boris Ostrovsky > <boris.ostrovsky@xxxxxxxxxx> wrote: >> On 6/23/20 9:36 PM, Souptick Joarder wrote: >>> On Tue, Jun 23, 2020 at 11:11 PM Boris Ostrovsky >>> <boris.ostrovsky@xxxxxxxxxx> wrote: >>>> On 6/23/20 7:58 AM, Souptick Joarder wrote: >>>>> In 2019, we introduced pin_user_pages*() and now we are converting >>>>> get_user_pages*() to the new API as appropriate. [1] & [2] could >>>>> be referred for more information. This is case 5 as per document [1]. >>>>> >>>>> As discussed, pages need to be marked as dirty before unpinned it. >>>>> >>>>> Previously, if lock_pages() end up partially mapping pages, it used >>>>> to return -ERRNO due to which unlock_pages() have to go through >>>>> each pages[i] till *nr_pages* to validate them. This can be avoided >>>>> by passing correct number partially mapped pages & -ERRNO separately >>>>> while returning from lock_pages() due to error. >>>>> With this fix unlock_pages() doesn't need to validate pages[i] till >>>>> *nr_pages* for error scenario. >>>> This should be split into two patches please. The first one will fix the >>>> return value bug (and will need to go to stable branches) and the second >>>> will use new routine to pin pages. >>> Initially I split the patches into 2 commits. But at last moment I >>> figure out that, >>> this bug fix ( better to call coding error, doesn't looks like lead to >>> any runtime bug) is tightly coupled to 2nd commit for >>> pin_user_pages*() conversion, >>> which means we don't need the bug fix patch if we are not converting the >>> API to >>> pin_user_pages*()/ unpin_user_pages_dirty_lock(). That's the reason to >>> clubbed these two >>> commits into a single one. >> >> I am not sure I understand why the two issues are connected. Failure of >> either get_user_pages_fast() or pin_user_pages() will result in the same >> kind of overall ioctl failure, won't it? >> > Sorry, I think, I need to go through the bug again for my clarity. > > My understanding is -> > > First, In case of success lock_pages() returns 0, so what > privcmd_ioctl_dm_op() > will return to user is depend on the return value of HYPERVISOR_dm_op() > and at last all the pages are unpinned. At present I am not clear about the > return value of HYPERVISOR_dm_op(). But this path of code looks to be correct. > > second, incase of failure from lock_pages() will return either -ENOSPC / > -ERRNO > receive from {get|pin|_user_pages_fast() inside for loop. (at that > point there might > be some partially mapped pages as it is mapping inside the loop). Upon > receiving > -ERRNO privcmd_ioctl_dm_op() will pass the same -ERRNO to user (not the > partial > mapped page count). This looks to be correct behaviour from user space. Sigh... I am sorry, you are absolutely correct. It does return -errno on get_user_pages_fast() failure. I don't know what (or whether) I was thinking. (And so Paul, ignore my question) -boris > > The problem I was trying to address is, in the second scenario when > lock_pages() returns > -ERRNO and there are already existing mapped pages. In this scenario, > when unlcok_pages() > is called it will iterate till *nr_pages* to validate and unmap the > pages. But it is supposed to > unmap only the mapped pages which I am trying to address as part of bug fix. > > Let me know if I am able to be in sync with what you are expecting ? > > >> One concern though is that we are changing how user will see this error. > My understanding from above is user will always see right -ERRNO and > will not be impacted. > > Another scenario I was thinking, if {get|pin|_user_pages_fast() return > number of pages pinned > which may be fewer than the number requested. Then lock_pages() > returns 0 to caller > and caller/user space will continue with the assumption that all pages > are pinned correctly. > Is this an expected behaviour as per current code ? > > >> Currently Xen devicemodel (which AFAIK is the only caller) ignores it, >> which is I think is wrong. So another option would be to fix this in Xen >> and continue returning positive number here. I guess it's back to Paul >> again. >> >> >> -boris >> >>
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |