[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: Norbert Manthey <nmanthey@xxxxxxxxx>
  • Date: Mon, 18 Feb 2019 14:49:45 +0100
  • Autocrypt: addr=nmanthey@xxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFoJQc0BEADM8Z7hB7AnW6ErbSMsYkKh4HLAPfoM+wt7Fd7axHurcOgFJEBOY2gz0isR /EDiGxYyTgxt5PZHJIfra0OqXRbWuLltbjhJACbu35eaAo8UM4/awgtYx3O1UCbIlvHGsYDg kXjF8bBrVjPu0+g55XizX6ot/YPAgmWTdH8qXoLYVZVWJilKlTqpYEVvarSn/BVgCbIsQIps K93sOTN9eJKDSqHvbkgKl9XG3WsZ703431egIpIZpfN0zZtzumdZONb7LiodcFHJ717vvd89 3Hv2bYv8QLSfYsZcSnyU0NVzbPhb1WtaduwXwNmnX1qHJuExzr8EnRT1pyhVSqouxt+xkKbV QD9r+cWLChumg3g9bDLzyrOTlEfAUNxIqbzSt03CRR43dWgfgGiLDcrqC2b1QR886WDpz4ok xX3fdLaqN492s/3c59qCGNG30ebAj8AbV+v551rsfEba+IWTvvoQnbstc6vKJCc2uG8rom5o eHG/bP1Ug2ht6m/0uWRyFq9C27fpU9+FDhb0ZsT4UwOCbthe35/wBZUg72zDpT/h5lm64G6C 0TRqYRgYcltlP705BJafsymmAXOZ1nTCuXnYAB9G9LzZcKKq5q0rP0kp7KRDbniirCUfp7jK VpPCOUEc3tS1RdCCSeWNuVgzLnJdR8W2h9StuEbb7hW4aFhwRQARAQABtCROb3JiZXJ0IE1h bnRoZXkgPG5tYW50aGV5QGFtYXpvbi5kZT6JAj0EEwEIACcFAloJQc0CGyMFCQlmAYAFCwkI BwIGFQgJCgsCBBYCAwECHgECF4AACgkQZ+8yS8zN62ajmQ/6AlChoY5UlnUaH/jgcabyAfUC XayHgCcpL1SoMKvc2rCA8PF0fza3Ep2Sw0idLqC/LyAYbI6gMYavSZsLcsvY6KYAZKeaEriG 7R6cSdrbmRcKpPjwvv4iY6G0DBTeaqfNjGe1ECY8u522LprDQVquysJIf3YaEyxoK/cLSb0c kjzpqI1P9Vh+8BQb5H9gWpakbhFIwbRGHdAF1roT7tezmEshFS0IURJ2ZFEI+ZgWgtl1MBwN sBt65im7x5gDo25h8A5xC9gLXTc4j3tk+3huaZjUJ9mCbtI12djVtspjNvDyUPQ5Mxw2Jwar C3/ZC+Nkb+VlymmErpnEUZNltcq8gsdYND4TlNbZ2JhD0ibiYFQPkyuCVUiVtimXfh6po9Yt OkE0DIgEngxMYfTTx01Zf6iwrbi49eHd/eQQw3zG5nn+yZsEG8UcP1SCrUma8p93KiKOedoL n43kTg4RscdZMjj4v6JkISBcGTR4uotMYP4M0zwjklnFXPmrZ6/E5huzUpH9B7ZIe/SUu8Ur xww/4dN6rfqbNzMxmya8VGlEQZgUMWcck+cPrRLB09ZOk4zq9i/yaHDEZA1HNOfQ9UCevXV5 7seXSX7PCY6WDAdsT3+FuaoQ7UoWN3rdpb+064QKZ0FsHeGzUd7MZtlgU4EKrh25mTSNZYRs nTz2zT/J33e5Ag0EWglBzQEQAKioD1gSELj3Y47NE11oPkzWWdxKZdVr8B8VMu6nVAAGFRSf Dms4ZmwGY27skMmOH2srnZyTfm9FaTKr8RI+71Fh9nfB9PMmwzA7OIY9nD73/HqPywzTTleG MlALmnuY6xFRSDmqmvxDHgWyzB4TgPWt8+hW3+TJKCx2RgLAdSuULZla4lia+NlS8WNRUDGK sFJCCB3BW5I/cocfpBEUqLbbmnPuD9UKpEnFcYWD9YaDNcBTjSc7iDsvtpdrBXg5VETOz/TQ /CmVs9h/5zug8O4bXxHEEJpCAxs4cGKxowBqx/XJfkwdWeo/LdaeR+LRbXvq4A32HSkyj9sV vygwt2OFEk493JGik8qtAA/oPvuqVPJGacxmZ7zKR12c0mnKCHiexFJzFbC7MSiUhhe8nNiM p6Sl6EZmsTUXhV2bd2M12Bqcss3TTJ1AcW04T4HYHVCSxwl0dVfcf3TIaH0BSPiwFxz0FjMk 10umoRvUhYYoYpPFCz8dujXBlfB8q2tnHltEfoi/EIptt1BMNzTYkHKArj8Fwjf6K+nQ3a8p 1cWfkYpA5bRqbhbplzpa0u1Ex0hZk6pka0qcVgqmH31O2OcSsqeKfUfHkzj3Q6dmuwm1je/f HWH9N1gDPEp1RB5bIxPnOG1Z4SNl9oVQJhc4qoJiqbvkciivYcH7u2CBkboFABEBAAGJAiUE GAEIAA8FAloJQc0CGwwFCQlmAYAACgkQZ+8yS8zN62YU9Q//WTnN28aBX1EhDidVho80Ql2b tV1cDRh/vWTcM4qoM8vzW4+F/Ive6wDVAJ7zwAv8F8WPzy+acxtHLkyYk14M6VZ1eSy0kV0+ RZQdQ+nPtlb1MoDKw2N5zhvs8A+WD8xjDIA9i21hQ/BNILUBINuYKyR19448/41szmYIEhuJ R2fHoLzNdXNKWQnN3/NPTuvpjcrkXKJm2k32qfiys9KBcZX8/GpuMCc9hMuymzOr+YlXo4z4 1xarEJoPOQOXnrmxN4Y30/qmf70KHLZ0GQccIm/o/XSOvNGluaYv0ZVJXHoCnYvTbi0eYvz5 OfOcndqLOfboq9kVHC6Yye1DLNGjIVoShJGSsphxOx2ryGjHwhzqDrLiRkV82gh6dUHKxBWd DXfirT8a4Gz/tY9PMxan67aSxQ5ONpXe7g7FrfrAMe91XRTf50G3rHb8+AqZfxZJFrBn+06i p1cthq7rJSlYCqna2FedTUT+tK1hU9O0aK4ZYYcRzuTRxjd4gKAWDzJ1F/MQ12ftrfCAvs7U sVbXv2TndGIleMnheYv1pIrXEm0+sdz5v91l2/TmvkyyWT8s2ksuZis9luh+OubeLxHq090C hfavI9WxhitfYVsfo2kr3EotGG1MnW+cOkCIX68w+3ZS4nixZyJ/TBa7RcTDNr+gjbiGMtd9 pEddsOqYwOs=
  • Cc: Juergen Gross <jgross@xxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Martin Pohlack <mpohlack@xxxxxxxxx>, wipawel@xxxxxxxxx, Julien Grall <julien.grall@xxxxxxx>, David Woodhouse <dwmw@xxxxxxxxxxxx>, "Martin Mazein\(amazein\)" <amazein@xxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Julian Stecklina <jsteckli@xxxxxxxxx>, Bjoern Doebel <doebel@xxxxxxxxx>
  • Delivery-date: Mon, 18 Feb 2019 13:50:05 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 2/15/19 11:34, Jan Beulich wrote:
>>>> On 15.02.19 at 10:55, <nmanthey@xxxxxxxxx> wrote:
>> On 2/13/19 12:50, Jan Beulich wrote:
>>>>>> On 08.02.19 at 14:44, <nmanthey@xxxxxxxxx> wrote:
>>>> Guests can issue grant table operations and provide guest controlled
>>>> data to them. This data is also used for memory loads. To avoid
>>>> speculative out-of-bound accesses, we use the array_index_nospec macro
>>>> where applicable. However, there are also memory accesses that cannot
>>>> be protected by a single array protection, or multiple accesses in a
>>>> row. To protect these, a nospec barrier is placed between the actual
>>>> range check and the access via the block_speculation macro.
>>>>
>>>> As different versions of grant tables use structures of different size,
>>>> and the status is encoded in an array for version 2, speculative
>>>> execution might touch zero-initialized structures of version 2 while
>>>> the table is actually using version 1.
>>> Why zero-initialized? Did I still not succeed demonstrating to you
>>> that speculation along a v2 path can actually overrun v1 arrays,
>>> not just access parts with may still be zero-initialized?
>> I believe a speculative v2 access can touch data that has been written
>> by valid v1 accesses before, zero initialized data, or touch the NULL
>> page. Given the macros for the access I do not believe that a v2 access
>> can touch a page that is located behind a page holding valid v1 data.
> I've given examples before of how I see this to be possible. Would
> you mind going back to one of the instances, and explaining to me
> how you do _not_ see any room for an overrun there? Having
> given examples, I simply don't know how else I can explain this to
> you without knowing at what specific part of the explanation we
> diverge. (And no, I'm not excluding that I'm making up an issue
> where there is none.)
What we want to real out is that the actually use version1, while
speculation might use version2, right? I hope you refer to this example
of your earlier email.

On 1/29/19 16:11, Jan Beulich wrote:
> Let's look at an example: gref 256 points into the middle of
> the first page when using v1 calculations, but at the start
> of the second page when using v2 calculations. Hence, if the
> maximum number of grant frames was 1, we'd overrun the
> array, consisting of just a single element (256 is valid as a
> v1 gref in that case, but just out of bounds as a v2 one).

From how I read your example and my explanation, the key difference is
in the size of the shared_raw array. In case gref 256 is a valid v1
handle, then the shared_raw array has space for at least 256 entries, as
shared_raw was allocated for the number of requested entries. The access
to shared_raw is controlled with the macro shared_entry_v2:
 222 #define SHGNT_PER_PAGE_V2 (PAGE_SIZE / sizeof(grant_entry_v2_t))
 223 #define shared_entry_v2(t, e) \
 224     ((t)->shared_v2[(e)/SHGNT_PER_PAGE_V2][(e)%SHGNT_PER_PAGE_V2])
Since the direct access to the shared_v2 array depends on the
SHGNT_PER_PAGE_V2 value, this has to be less than the size of that
array. Hence, shared_raw will not be overrun (neither for version 1 nor
version 2). However, this division might result in accessing an element
of shared_raw that has not been initialized by version1 before. However,
right after allocation, shared_raw is zero initialized. Hence, this
might result in an access of the NULL page.

The second access in the macro allows to access only a single page, as
the value e is bound to the elements per page of the correct version
(the version 1 macro uses the corresponding value for the modulo
operation). Either, this refers to the NULL page, or it refers to a page
that has been initialized by version1 (partially). I do not see how an
out-of-bound access would be possible there.

>>>> @@ -963,9 +965,13 @@ map_grant_ref(
>>>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>>>                   op->ref, rgt->domain->domain_id);
>>>>  
>>>> +    /* Make sure the above check is not bypassed speculatively */
>>>> +    block_speculation();
>>>> +
>>>>      act = active_entry_acquire(rgt, op->ref);
>>>>      shah = shared_entry_header(rgt, op->ref);
>>>> -    status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, 
>>>> op->ref);
>>>> +    status = evaluate_nospec(rgt->gt_version == 1) ? &shah->flags
>>>> +                                                 : &status_entry(rgt, 
>>>> op->ref);
>>> Did you consider folding the two pairs of fences you emit into
>>> one? Moving up the assignment to status ought to achieve this,
>>> as then the block_speculation() could be dropped afaict.
>>>
>>> Then again you don't alter shared_entry_header(). If there's
>>> a reason for you not having done so, then a second fence
>>> here is needed in any event.
>> Instead of the block_speculation() macro, I can also protect the op->ref
>> usage before evaluate_nospec via the array_index_nospec function.
> That's an option (as before), but doesn't help with shared_entry_header()'s
> evaluation of gt_version.
That would have to be protected separately locally in that function,
similarly to the nr_grant_entries function.
>
>>> What about the version check in nr_grant_entries()? It appears
>>> to me as if at least its use in grant_map_exists() (which simply is
>>> the first one I've found) is problematic without an adjustment.
>>> Even worse, ...
>>>
>>>> @@ -1321,7 +1327,8 @@ unmap_common(
>>>>          goto unlock_out;
>>>>      }
>>>>  
>>>> -    act = active_entry_acquire(rgt, op->ref);
>>>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>>>> +                                                       
>> nr_grant_entries(rgt)));
>>> ... you add a use e.g. here to _guard_ against speculation.
>> The adjustment you propose is to exchange the switch statement in
>> nr_grant_entries with a if( evaluate_nospec( gt->gt_version == 1 ), so
>> that the returned values are not speculated?
> At this point I'm not proposing a particular solution. I'm just
> putting on the table an issue left un-addressed. I certainly
> wouldn't welcome converting the switch() to an if(), even if
> right now there's no v3 on the horizon. (It's actually quite
> the inverse: If someone came and submitted a patch to change
> the various if()-s on gt_version to switch()-es, I'd welcome this.)
I am happy to add block_speculation() macros into each case of the
switch statement.
>> Already before this
>> modification the function is called and not inlined.
> How does this matter when considering speculation?
Likely not at all.
>
>> Do you want me to
>> cache the value in functions that call this method regularly to avoid
>> the penalty of the introduced lfence for each call?
> That would go back to the question of what good it does to
> latch value into a local variable when you don't know whether
> the compiler will put that variable in a register or in memory.
> IOW I'm afraid that to be on the safe side there's no way
> around the repeated LFENCEs.
The difference here would be that in case the value is stored into a
local variable (independently of memory or register) and an lfence was
executed, this value can be trusted and does not have to be checked
again, as it's no longer guest controlled.
>
>>> And what about _set_status(), unmap_common_complete(),
>>> gnttab_grow_table(), gnttab_setup_table(),
>>> release_grant_for_copy(), the 2nd one in acquire_grant_for_copy(),
>>> several ones in gnttab_set_version(), gnttab_release_mappings(),
>>> the 3rd one in mem_sharing_gref_to_gfn(), gnttab_map_frame(),
>>> and gnttab_get_status_frame()?
>> Protecting the function itself should allow to not modify the
>> speculation guards in these functions. I would have to check each of
>> them, whether the guest actually has control, and whether it makes sense
>> to introduce a _nospec variant of the nr_grant_entries function to not
>> penalize everywhere. Do you have an opinion on this?
> As per above, yes, I think the only sufficiently reliable way is
> to modify nr_grant_entries() itself. The question is how to
> correctly do this without replacing the switch() there, the more
> that the other change of yours has deliberately forced the
> compiler into using rows of compares instead of jump tables (not
> that I'd expect the compiler to have used a jump table here, i.e.
> the remark is just wrt the general case).

As explained above, using the block_speculation() macro in each case
statement should result in similar code than converting the switch into
a chain of if statements that are protected by evaluate_nospec macros.

Best,
Norbert





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.