[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] common/gnttab: Introduce command line feature controls



On 25/08/17 18:36, Juergen Gross wrote:
> On 25/08/17 18:21, George Dunlap wrote:
>> On 08/25/2017 01:31 PM, Jan Beulich wrote:
>>>>>> On 25.08.17 at 14:10, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>> On 25/08/17 10:57, Jan Beulich wrote:
>>>>>>>> On 24.08.17 at 17:16, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>> On 24/08/17 16:01, Juergen Gross wrote:
>>>>>>> On 24/08/17 16:50, Andrew Cooper wrote:
>>>>>>>> --- a/docs/misc/xen-command-line.markdown
>>>>>>>> +++ b/docs/misc/xen-command-line.markdown
>>>>>>>> @@ -868,6 +868,19 @@ Controls EPT related features.
>>>>>>>>  
>>>>>>>>  Specify which console gdbstub should use. See **console**.
>>>>>>>>  
>>>>>>>> +### gnttab
>>>>>>>> +> `= List of [ max_ver:<integer>, transitive ]`
>>>>>>>> +
>>>>>>>> +> Default: `gnttab=max_ver:2,transitive`
>>>>>>>> +
>>>>>>>> +Control various aspects of the grant table behaviour available to 
>>>>>>>> guests.
>>>>>>>> +
>>>>>>>> +* `max_ver` Select the maximum grant table version to offer to 
>>>>>>>> guests.  Valid
>>>>>>>> +version are 1 and 2.
>>>>>>>> +* `transitive` Permit or disallow the use of transitive grants.  Note 
>>>>>>>> that the
>>>>>>>> +use of grant table v2 without transitive grants is an ABI breakage 
>>>>>>>> from the
>>>>>>>> +guests point of view.
>>>>>>> So shouldn't there be a way for the guest to query the support of
>>>>>>> transient grants?
>>>>>> Ideally yes, but how do you suggest doing this in a compatible way?
>>>>>>
>>>>>> All Xen downstreams which haven't backported the eventual transitive
>>>>>> fixes will have this clobber in place, without any query-ability.
>>>>> That workaround should not be used as an argument to not
>>>>> provide a way to query the capability. It was put in place knowing
>>>>> that it would cause problems for (hypothetical) guests using
>>>>> transitive grants.
>>>> I am not objecting to introducing a mechanism if a suitable one can be
>>>> found.
>>>>
>>>> However, the heritage of XSA-226 is a valid reason to not block this
>>>> patch because a mechanism isn't present.
>>> Code submission deadline for 4.10 isn't very far away; we shouldn't
>>> ship a major version with a partial workaround.
>> I'd say we shouldn't ship a major version with a risky, unused feature
>> on by default.
> You are aware that this "unused feature" is part of the public interface
> since about 8 years or so?

Like so many things from Xen's past, it shouldn't have gone in in this
state.  gnttab v2 in particular suffers massively from second system
syndrome, and is far more complicated for both Xen and guests to use
than v1.  The fact that no up-to-date guests use v2 is also a telling
datapoint.

As for this command line patch, I am going to insist on it being a 4.10
blocker, and all 4.$X.$N+1 releases.

We shipped XSA-226 with this workaround, and several downstreams
*really* *are* using it.  Even with hindsight, shipping this patch was
the correct thing to do.  There wasn't a functioning fix for XSA-226
until a week after the embargo, when OSSTest discovered that the
proposed proper fix for transitive grants broke all grant copy
operations for 32bit guests.

I'm also quite disappointed with how the post-embargo handling of
XSA-226 has gone.

Choosing to disable a feature (as opposed to fixing it) is always a
tough decision, but the timeline speaks for itself.  The decision wasn't
taken lightly.

For full transparency, the decision was taken by me (as the only person
on the security team not on holidy, or working part time), 1 week into
the 2 week embargo period, after having worked a full weekend (as well
as some Amazon engineers, hence the special thanks in the CREDITS),
trying to fix crash after crash to do with transitive grant race
conditions, only to find that once the crashes were sorted, a reference
count leak was still present.  There was no obvious fix or end in sight,
and the currently embargoed information was took what was believed to be
a theoretical issue and turn it into a very real and easy-to-exploit
heap corruption issue.  The second part of the XSA-226 fix was only
completed on the morning of public embargo, and still, testing
eventually showed that the first part was actually buggy.

At the moment, all of our downstreams which followed the embargoed
advise will be using these command line options to mitigate the
vulnerability.  The fact that this patch wasn't committed to the stable
trees is bad, because it will cause our downstreams to regress move to a
newer version of Xen and the command line options they set up no longer
have any effect.

With a believed transitive fix now in place, there is certainly room for
arguing over the default to avoid the ABI breakage.

However, all downstreams who have mitigated the vulnerability or reduce
their hypervisor attack surface by setting gnttab=max_ver:1 or
gnttab=no-transitive need this to keep on working going forwards.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.