
Re: Intended behavior/usage of SSBD setting


  • To: Roger Pau Monne <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Date: Fri, 21 Oct 2022 21:54:36 +0000
  • Accept-language: en-GB, en-US
  • Cc: Jan Beulich <jbeulich@xxxxxxxx>
  • Delivery-date: Fri, 21 Oct 2022 21:54:50 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: Intended behavior/usage of SSBD setting

On 20/10/2022 12:01, Roger Pau Monné wrote:
> Hello,
>
> As part of some follow up improvements to my VIRT_SPEC_CTRL series we
> have been discussing what the usage of SSBD should be for the
> hypervisor itself.  There's currently a `spec-ctrl=ssbd` option [0],
> that has an out of date description, as now SSBD is always offered to
> guests on AMD hardware, either using SPEC_CTRL or VIRT_SPEC_CTRL.
>
> It has been pointed out by Andrew that toggling SSBD on AMD using
> VIRT_SPEC_CTRL or the non-architectural way (MSR_AMD64_LS_CFG) can
> have a high impact on performance, and hence switching it on every
> guest <-> hypervisor context switch is likely a very high
> performance penalty.
>
> It's been suggested that it could be more appropriate to run Xen with
> the guest SSBD selection on those systems, however that clashes with
> the current intent of the `spec-ctrl=ssbd` option.
>
> I hope I have captured the expressed opinions correctly in the text
> above.
>
> I see two ways to solve this:
>
>  * Keep the current logic for switching SSBD on guest <-> hypervisor
>    context switch, but only use it if `spec-ctrl=ssbd` is set on the
>    command line.
>
>  * Remove the logic for switching SSBD on guest <-> hypervisor context
>    switch, ignore setting of `spec-ctrl=ssbd` on those systems and run
>    hypervisor code with the guest selection of SSBD.
>
> Which raises the question of whether there's a use case
> for always running hypervisor code with SSBD enabled, or that's no
> longer relevant if we always offer guests a way for them to toggle the
> setting when required.
>
> I would like to settle on a way forward, so we can get this fixed
> before 4.17.
>
> Thanks, Roger.
>
> [0] 
> https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#spec-ctrl-x86

There are many issues at play here.  Not least that virt spec ctrl is
technically a leftover task that ought to force a re-issue of XSA-263.

Accessing MSRs (even reading) is very expensive, typically >1k cycles. 
The core CFG registers are more expensive than most, because they're
intended to be configured once after reset and then left alone.

Throughout the speculation work, we've seen crippling performance hits
from accessing MSRs in fastpaths.  The fact we're forced to use MSRs in
fastpaths even on new CPUs with built in (rather than retrofitted)
speculation support is an area of concern still being worked on with
the CPU vendors.

Case in point.  We found for XSA-398 that toggling AMD's
MSR_SPEC_CTRL.IBRS on the PV entrypath was so bad that setting it
unilaterally behind the back of PV guests was the faster option. 
(Another todo is to stop doing this on Intel eIBRS systems, and this
will recover us a decent chunk of performance.)


SSBD mitigations are (rightly or wrongly) off by default for performance
reasons.  AMD are less affected than Intel, for microarchitectural
reasons which are discussed in relevant whitepapers, and which are
expected to remain true for future CPUs.

When Xen doesn't care about protecting itself against SSBD by
default, I guarantee you that it will be faster to omit the MSR accesses
and run in the guest kernel's choice, than to clear the SSBD
protection.  We simply don't spend long enough in the hypervisor for the
hit against memory accesses to dwarf the hit for MSR accesses taken on
entry/exit.

The reason we put in spec-ctrl=ssbd was as a stopgap, because at the
time we didn't know how bad SSB really was, and it was decided that the
admin should have a big hammer to use if they really needed.

When Xen does care about protecting itself, the above reasoning bites
back hard.  Because we spend (or should be spending!) >99% of time in
the guest, the hit to memory accesses is far more likely to
dwarf the hit from the MSR accesses, but now, the dominating factor for
performance is the vmexit rate.

The problem is that if you've got a completely compute bound workload,
there are very few exits, while if you've got an IO bound workload,
there are plenty of exits.  I honestly don't know if it will be more
efficient to leave SSBD active unilaterally (whether or not we hide
this, e.g. synthesizing SSB_NO), or to let the guest run with its
kernel's choice.  I suspect the answer is different with different
workloads.
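
As a rough illustration of that trade-off, here is a back-of-envelope
cost model.  It is only a sketch: the 1000-cycle MSR cost is the
ballpark mentioned earlier, while the function names, the two-writes-
per-exit assumption and the fractional slowdown parameter are
hypothetical and not taken from Xen or from any measurement.

```c
#include <assert.h>

/* Illustrative assumption: one MSR access costs about 1000 cycles,
 * the ballpark quoted earlier in this mail.  Not a measurement. */
#define MSR_ACCESS_CYCLES 1000.0

/* Cycles per second spent toggling SSBD around every exit: one MSR
 * write on the way into Xen and one on the way back to the guest. */
static double toggle_cost(double exits_per_sec)
{
    assert(exits_per_sec >= 0.0);
    return exits_per_sec * 2.0 * MSR_ACCESS_CYCLES;
}

/* Cycles per second lost by leaving SSBD on while the guest runs.
 * ssbd_slowdown is a hypothetical fractional penalty (0.02 == 2%). */
static double always_on_cost(double cpu_hz, double guest_fraction,
                             double ssbd_slowdown)
{
    assert(guest_fraction >= 0.0 && guest_fraction <= 1.0);
    assert(ssbd_slowdown >= 0.0);
    return cpu_hz * guest_fraction * ssbd_slowdown;
}

/* Exit rate above which leaving SSBD on is cheaper than toggling. */
static double break_even_exits(double cpu_hz, double guest_fraction,
                               double ssbd_slowdown)
{
    return always_on_cost(cpu_hz, guest_fraction, ssbd_slowdown) /
           (2.0 * MSR_ACCESS_CYCLES);
}
```

With a 3 GHz core, 99% of time in the guest and an assumed 2% penalty,
break_even_exits(3.0e9, 0.99, 0.02) comes out a little under 30k exits
per second.  A compute bound guest sits well below that and an IO bound
one can sit well above it, which is why the answer depends on workload.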


But, one other factor helps us.  Given that the default is fast (rather
than secure), anyone opting in to spec-ctrl=ssbd is saying "I care more
about security than performance", at which point we can simplify what we
do because we don't need to cater to everyone.


As a slight tangent, there is a cost to having too many options, which
must not be ignored.  Xen's speculation safety is far too complicated
already and needs to get more simple; this has a material impact on how
easy it is to follow, and how easy it is to make changes.

It is the way it is because we've had 6 years of drip feeding one
problem after another, and haven't had the time to take a step back and
design something more sensible from having 6 years of
knowledge/learnings as a basis.  There are definitely things which I
would have done differently, if 6 years ago, I'd known what I know now,
and part of the reason why the recent speculation security work has
taken so much effort is because it has involved reworking the effort
which came before, to a deadline which never has enough time to plan
properly within.


So, first question, do we care about having an "SSBD active while in
Xen" mode?

Probably yes, because we a) still don't have a working solution for PV
guests on AMD and b) who knows if there's something far worse lurking in
the future.  Sod's law says that if we decide no here, it will be
critical for some future issue.

But as it's off by default and no one has made any noise about
having it on, we ought to prioritise simplicity.

Given that off is the default, but we know that kernels do offer it to
userspace, and it does get used by certain processes, we need to
prioritise performance.  And here, this is net system performance, not
"ensure it's off whenever it can be".  Having Xen run in the guest
kernel's choice of value will result in much better overall performance,
than trying to modify the setting in the VMentry/exit path.
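
One possible reading of the above, sketched in C.  Every name here is
hypothetical and none of it is Xen's actual code; in particular,
treating spec-ctrl=ssbd as "SSBD on everywhere" is just one of the
options discussed, chosen for the sketch because it is the simplest.
The point it shows is that under either policy the value live in Xen
matches the value live in the guest, so the VMentry/exit path never
needs an MSR write.

```c
#include <stdbool.h>

/* Hypothetical policy knob; not an existing Xen type. */
enum ssbd_policy {
    SSBD_FOLLOW_GUEST, /* default: Xen runs in the guest kernel's choice */
    SSBD_ALWAYS_ON,    /* spec-ctrl=ssbd: security over performance */
};

struct vcpu_ssbd {
    bool guest_wants_ssbd; /* the guest's last requested setting */
};

/* Value that should be live in hardware while the guest runs. */
static bool hw_ssbd_in_guest(enum ssbd_policy p, const struct vcpu_ssbd *v)
{
    return p == SSBD_ALWAYS_ON || v->guest_wants_ssbd;
}

/* Value live in hardware while Xen runs on behalf of this vCPU.
 * Deliberately identical to the guest value under both policies. */
static bool hw_ssbd_in_xen(enum ssbd_policy p, const struct vcpu_ssbd *v)
{
    return hw_ssbd_in_guest(p, v);
}

/* Would the entry/exit path have to touch the MSR?  Never, here. */
static bool msr_write_on_exit(enum ssbd_policy p, const struct vcpu_ssbd *v)
{
    return hw_ssbd_in_guest(p, v) != hw_ssbd_in_xen(p, v);
}

/* The setting only needs changing when switching between vCPUs whose
 * effective values differ, which is far rarer than a vmexit. */
static bool msr_write_on_vcpu_switch(enum ssbd_policy p,
                                     const struct vcpu_ssbd *prev,
                                     const struct vcpu_ssbd *next)
{
    return hw_ssbd_in_guest(p, prev) != hw_ssbd_in_guest(p, next);
}
```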


Sorry that this is a very long and somewhat open ended answer, but it is
genuinely the level of complexity I grapple with on every security issue
in this area.

~Andrew
