[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 1/7] xen/gnttab: Rework resource acquisition


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Thu, 28 Jan 2021 22:56:46 +0000
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=9DeVbFP0FpIngIVcFZlN10VkFski6dZ90h4Ldz3bZQ4=; b=gKFH1cJ89md7m13QjxzZ8ki3kmfZbZ8Y654ikO0y0upDiwvE1xWstIinA6p1guIxKCGE8Hj/qcrozRZznIPOdy9Ney1FzR2eatmGeTmsAvpiBJMRVydz6GmJpUVJGgP5UoMEvZQ6t+go15OwGR/ZWRsnfwhpksOww3fcqoaPKQS8yhSLvQ2E2+pQVcBahCTKWko+Qg+lTbSwPdBkfw0JT9f7OthySNi8Ion4RVMM1Uvt35Rq5nACtVl5yGMwXk6tlCNvyV8XkJthJa9nBNTST7tDrH2wF/KJPo6BF+lBZe77s69UO6J8CzXKp2qnCBpqEGaSFffIV6qyxlkHoae4Kw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=FdVINZUPEbru/6TH05q8JshZwphVskVG5kmvfiS0fvXj8ZuZlkt6qlt4J1QcowHZc1tVkzvlRTVGduYXqYBNnsh4WErEixKEoCHREa87z4jgFaLxbGnsmuT+VNlkUwZOH/Iw2c2+50lFyL9aIw04R+Id4nfMy5W1OAkJ3WNgzSVOej4P1Y2XCtzKqMQcLTLGUDpg0E57+GOLB8vWg7BgErMKABUHB135JSslRARIUkEuEqJZa/9sB5cWORHd/K1yguc1tEdMF8hHZFgiPumpyea1QQ1J0fFzLeXv8JxGkcNXJuDgjr9MibhAtZ/xo9jJIj1h9hdDfPRzE2ddv/+XTQ==
  • Authentication-results: esa6.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Paul Durrant <paul@xxxxxxx>, Michał Leszczyński <michal.leszczynski@xxxxxxx>, Hubert Jasudowicz <hubert.jasudowicz@xxxxxxx>, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 28 Jan 2021 22:57:00 +0000
  • Ironport-sdr: 2kzkWGxNWVhNsyAQiIXQ8OoD+hxaUkh4x14rKz/4INmouBqr/4Vb81JG5omsSKln7K/QUQt5Px Sg02zWMX8p1PSXPk2sDYEq5aBH+QQmFWhaAG2n5odA8x35G3kggNZFJTEpcGW3XN1BZo8brI60 wMH2Ys64WusdYajIxmuC3+6CRfWy3tyMJhDshWVoxWvr5BD14iQXDd8nk6pQFz7CV+JWw3JQto AE08WkxSJfHUQyLRcZS5l7OUA89L6NioyaN2PWzWRTIlhuZrvT77CpE4v27l+6tQLPdybziPDm kuw=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 18/01/2021 08:23, Jan Beulich wrote:
> On 15.01.2021 17:57, Andrew Cooper wrote:
>> On 15/01/2021 11:56, Jan Beulich wrote:
>>>> +    /* Grow table if necessary. */
>>>> +    grant_write_lock(gt);
>>>> +    rc = -EINVAL;
>>>> +    switch ( id )
>>>> +    {
>>>> +    case XENMEM_resource_grant_table_id_shared:
>>>> +        vaddrs = gt->shared_raw;
>>>> +        rc = gnttab_get_shared_frame_mfn(d, tot_frames - 1, &tmp);
>>> ... this will degenerate (and still cause an error) when frame
>>> is also zero, and will cause undue growing of the table when
>>> frame is non-zero yet not overly large.
>> Urgh, yes - that is why I had the check.
>>
>> In which case I retract my change between v2 and v3 here.
>>
>>> As indicated before, I'm of the clear opinion that here - like
>>> elsewhere - a number of zero frames requested means that no
>>> action be taken at all, and success be returned.
>> The general world we work in (POSIX) agrees with my opinion over yours
>> when it comes to this matter.
> I assume you are referring to mmap()? I ask because I think there
> are numerous counter examples (some even in the C standard):
> malloc() & friends allow for either behavior. memcpy() / memmove()
> ...

This entire infrastructure lives behind an mmap() (or equivalent) system
call.  Other specs are not relevant.

Any request for a zero sized mapping is a bug.  It is either a buggy
caller, or buggy continuation logic.

>> I spent a lot of time and effort getting this logic correct in v2, and I
>> do not have any further time to waste adding complexity to support a
>> non-existent corner case, nor is it reasonable to further delay all the
>> work which is depending on this series.  This entire mess is already too
>> damn complicated, without taking extra complexity.
>>
>> Entertaining the idea of supporting 0 length requests is really not as
>> simple as you seem to think it is, and is a large part of why I'm
>> stubbornly refusing to do so.
> I'd be really happy to be educated of the complications; sadly
> so far you've only claimed ones would exist without actually
> going into sufficient detail. In particular I don't view placing 
>
>     if ( size == 0 )
>         return 0;
>
> suitably early coming anywhere near "complexity". Even more so
> that as per your reply you mean to undo removal of a respective
> check, just that in your version it'll return an error instead
> of success.

I am not being a belligerent arse for the sake of it.  I've got far
better things I ought to be doing with my time.

I spent a lot of time getting this working, and with sufficient testing
to demonstrate it working in practice, particularly in continuation cases.

You've spent a lot of time and effort insisting that I reintroduce
support a fundamentally broken corner case, despite my best attempts to
demonstrate why it will livelock in the hypervisor because of the mess
which is the double looping between the compat and native handlers.

You've also spent a lot of time obfuscating the overflow checks and
breaking them in the process.

You are welcome, in your own time - and not mine, to use the testing
infrastructure I've already provided to discover why the compat mess has
a habit of livelocking in certain continuation cases.  (It won't
actually livelock if you switch to using return 0.  You'll instead hit
the ASSERT_UNREACHABLE() I put in a failsafe path specifically to avoid
bugs in this are turning back into XSAs.)

Combined with the fact that 0 length requests are buggy in all
circumstances, I chose to implement logic which is robust even in the
case of failure, because the compat logic is almost intractable and
borderline untestable because noone runs 32bit kernels in anger these
days.  I can't commit my test infrastructure which spots the problems,
because we obviously can't have a hypercall which panics when the input
buffer doesn't match the test pattern.

I am not inclined to adopt an approach which is fundamentally more
likely to contain security vulnerabilities in a fragile and untestable
area of code.  You are going to have to come up with a far more
compelling argument that "because I want to support 0 length mapping
requests" to change my mind.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.