
Re: [Xen-devel] [PATCH] mm/page_alloc: always scrub pages given to the allocator


  • To: Sergey Dyasli <sergey.dyasli@xxxxxxxxxx>, "JBeulich@xxxxxxxx" <JBeulich@xxxxxxxx>
  • From: George Dunlap <george.dunlap@xxxxxxxxxx>
  • Date: Mon, 1 Oct 2018 14:54:05 +0100
  • Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>, "boris.ostrovsky@xxxxxxxxxx" <boris.ostrovsky@xxxxxxxxxx>, "Tim \(Xen.org\)" <tim@xxxxxxx>, "julien.grall@xxxxxxx" <julien.grall@xxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Mon, 01 Oct 2018 13:54:14 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 10/01/2018 02:44 PM, Sergey Dyasli wrote:
> On Mon, 2018-10-01 at 07:38 -0600, Jan Beulich wrote:
>>>>> On 01.10.18 at 15:12, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>
>>> On 01/10/18 12:13, Jan Beulich wrote:
>>>>>>> On 01.10.18 at 11:58, <sergey.dyasli@xxxxxxxxxx> wrote:
>>>>>
>>>>> Having the allocator return unscrubbed pages is a potential security
>>>>> concern: some domain can be given pages with memory contents of another
>>>>> domain. This may happen, for example, if a domain voluntarily releases
>>>>> its own memory (ballooning being the easiest way for doing this).
>>>>
>>>> And we've always said that in this case it's the domain's responsibility
>>>> to scrub the memory of secrets it cares about. Therefore I'm at the
>>>> very least missing some background on this change of expectations.
>>>
>>> You were on the call when this was discussed, along with the synchronous
>>> scrubbing in destroydomain.
>>
>> Quite possible, but it has been a while.
>>
>>> Put simply, the current behaviour is not good enough for a number of
>>> security sensitive usecases.
>>
>> Well, I'm looking forward for Sergey to expand on this in the commit
>> message.
>>
>>> The main reason however for doing this is the optimisations it enables,
>>> and in particular, not double scrubbing most of our pages.
>>
>> Well, wait - scrubbing != zeroing (taking into account also what you
>> say further down).
>>
>>>>> Change the allocator to always scrub the pages given to it by:
>>>>>
>>>>> 1. free_xenheap_pages()
>>>>> 2. free_domheap_pages()
>>>>> 3. online_page()
>>>>> 4. init_heap_pages()
>>>>>
>>>>> Performance testing has shown that on multi-node machines bootscrub
>>>>> vastly outperforms idle-loop scrubbing. So instead of marking all pages
>>>>> dirty initially, introduce bootscrub_done to track the completion of
>>>>> the process and eagerly scrub all allocated pages during boot.
>>>>
>>>> I'm afraid I'm somewhat lost: There still is active boot time scrubbing,
>>>> or at least I can't see how that might be skipped (other than due to
>>>> "bootscrub=0"). I was actually expecting this to change at some
>>>> point. Am I perhaps simply mis-reading this part of the description?
>>>
>>> No.  Sergey tried that, and found a massive perf difference between
>>> scrubbing in the idle loop and scrubbing at boot.  (1.2s vs 40s iirc)
>>
>> That's not something you can reasonably compare, imo: For one,
>> it is certainly expected for the background scrubbing to be slower,
>> simply because of other activity on the system. And then 1.2s
>> looks awfully small for a multi-Tb system. Yet it is mainly large
>> systems where the synchronous boot time scrubbing is a problem.
> 
> Let me throw in some numbers.
> 
> Performance of current idle loop scrubbing is just not good enough:
> on 8 nodes, 32 CPUs and 512GB RAM machine it takes ~40 seconds to scrub
> all the memory instead of ~8 seconds for current bootscrub implementation.
> 
> This was measured while synchronously waiting for CPUs to scrub all the
> memory in idle-loop. But scrubbing can happen in background, of course.

Right, the whole point of idle loop scrubbing is that you *don't*
synchronously wait for *all* the memory to finish scrubbing before you
can use part of it.  So why is this an issue for you guys -- what
concrete problem did it cause, that the full amount of memory took 40s
to finish scrubbing rather than only 8s?
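
To make the distinction concrete, here is a minimal sketch of the general
idea (not Xen's actual allocator: struct page, heap[], free_page(),
idle_scrub_step() and alloc_page() are all hypothetical names invented
for illustration).  Freed pages are only marked as needing a scrub; an
idle step cleans them one at a time, and an allocation that finds no
clean page scrubs a dirty one on demand, so nobody waits for the whole
pool:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NR_PAGES  8

/* Hypothetical page descriptor; this is NOT Xen's struct page_info. */
struct page {
    unsigned char data[PAGE_SIZE];
    bool free;        /* page is in the free pool */
    bool need_scrub;  /* page may still hold a previous owner's data */
};

static struct page heap[NR_PAGES];

/* Zero one page and mark it clean. */
static void scrub_one(struct page *pg)
{
    memset(pg->data, 0, PAGE_SIZE);
    pg->need_scrub = false;
}

/* Freeing is cheap: the page is only marked dirty, not scrubbed. */
static void free_page(struct page *pg)
{
    pg->free = true;
    pg->need_scrub = true;
}

/* One unit of idle work: scrub at most one dirty free page.
 * Returns true if a page was scrubbed, false if nothing was left. */
static bool idle_scrub_step(void)
{
    for (size_t i = 0; i < NR_PAGES; i++) {
        if (heap[i].free && heap[i].need_scrub) {
            scrub_one(&heap[i]);
            return true;
        }
    }
    return false;
}

/* Prefer a page that is already clean; otherwise scrub the first
 * dirty free page on demand.  Returns NULL if no page is free. */
static struct page *alloc_page(void)
{
    struct page *dirty = NULL;

    for (size_t i = 0; i < NR_PAGES; i++) {
        struct page *pg = &heap[i];

        if (!pg->free)
            continue;
        if (!pg->need_scrub) {
            pg->free = false;
            return pg;
        }
        if (dirty == NULL)
            dirty = pg;
    }

    if (dirty != NULL) {
        scrub_one(dirty);
        dirty->free = false;
        assert(!dirty->need_scrub); /* never hand out a dirty page */
    }
    return dirty;
}
```

In this toy model the cost of an allocation is bounded by scrubbing the
pages it actually hands out, regardless of how long the background pass
over the whole pool would take.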

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

