Xen project Mailing List

Re: [PATCH for-4.17 v4 2/2] amd: remove VIRT_SC_MSR_HVM synthetic feature

To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>

Date: Tue, 15 Nov 2022 17:44:56 +0100

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=wGhslgKI/dGUIazKz+iL2ZtC2TcXPMBaZeUDTEtHdOo=; b=fIUYDZtxVyCcqgWqzNj7xcRcxok54CnLwQov7Cxvb3aLQSXU6iVGV3Y+PeqgaqKkgtGYzjS68Isjv8YmQkSB4bE1fiHKylYMTs/nrF/oB8FosIaHw19NELSZWPcVflb4wSGUaTF6GVnDU+N4PHAfsO508P/oEfnC/IJX85mwNNQvPdFYZcTOLpTXDtdi4IvnWw9j0ehMbQq0D3gS7fErzZHCUi/Vq1ZJJMTJkguiqaNc/dGDiEnwNO+c4h/0YgBKjmYEC3Z647GlXM97TG4KQXobpubiPLNPcuEC6KYboKR4/etituhxh0ypyTMamVrtt1gErkZFdazbjG/9sONgAA==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JO/O6lZlrfuDjIAm5UZSjvKP30SSgEi2Vw8xF1B3RmJSCv/p5/ll+ZYTJ8nyzg6hClwQ6R37choEjHQlRgNZHv1SAiyIv+7KTDkoUuALOEdeH4Xys918xY96nyZEEcuWzQTEWxmfSm4JAC03jxhaLmz431G8VQZnLf7Sn4H2Ql0rKSEXxrAsU/ztDX3Px8mJWS+Wkvd9fhhroHgY+vPp3VbK44jC5A2xjXUMtCmQnIB5ymS2eAhb0xjwPGzMIajvpkcC+tpITe/KrIIWuhx7K6i/EJ17+A2GgpJsR7q9OUtUjUHPWatVd8Slq3qKP0D962BicSRYwYwsJppw9SE6BQ==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;

Cc: "Henry.Wang@xxxxxxx" <Henry.Wang@xxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 15 Nov 2022 16:45:09 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15.11.2022 17:21, Andrew Cooper wrote: > On 15/11/2022 13:26, Roger Pau Monne wrote: >> Since the VIRT_SPEC_CTRL.SSBD selection is no longer context switched >> on vm{entry,exit} there's no need to use a synthetic feature bit for >> it anymore. >> >> Remove the bit and instead use a global variable. >> >> No functional change intended. >> >> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> >> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> >> Release-acked-by: Henry Wang <Henry.Wang@xxxxxxx> > > This is definitely not appropriate for 4.17, but it's a performance > regression in general, hence my firm and repeated objection to this > style of patch. > > General synthetic bits have existed for several decades longer than > alternatives. It has never ever been a rule, or even a recommendation, > to aggressively purge the non-alternative bits, because it's a provably > bad thing to do. There we are again - you state something as bad without really saying why it is bad. My view is that synthetic bits were wrong to introduce when they don't stand a chance of being used in an alternative. I agree though that there's no strong need for this to make 4.17. It may end up making backports slightly easier, as no such bit existed in 4.16. > You are attempting a micro-optimisation, that won't produce any > improvement at all in centuries, while... > >> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c >> index a332087604..9e3b9094d3 100644 >> --- a/xen/arch/x86/cpu/amd.c >> +++ b/xen/arch/x86/cpu/amd.c >> @@ -49,6 +49,7 @@ boolean_param("allow_unsafe", opt_allow_unsafe); >> /* Signal whether the ACPI C1E quirk is required. */ >> bool __read_mostly amd_acpi_c1e_quirk; >> bool __ro_after_init amd_legacy_ssbd; >> +bool __ro_after_init amd_virt_spec_ctrl; > > ... actually expending .rodata with something 8 times less efficiently > packed, and ... ... as long as you're talking of just a single CPU. The break-even is at 8 CPUs (8 bits used either way). >> --- a/xen/arch/x86/spec_ctrl.c >> +++ b/xen/arch/x86/spec_ctrl.c >> @@ -514,12 +514,12 @@ static void __init print_details(enum ind_thunk thunk, >> uint64_t caps) >> (boot_cpu_has(X86_FEATURE_SC_MSR_HVM) || >> boot_cpu_has(X86_FEATURE_SC_RSB_HVM) || >> boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM) || >> - boot_cpu_has(X86_FEATURE_VIRT_SC_MSR_HVM) || >> + amd_virt_spec_ctrl || > > ... breaking apart a single TEST instruction, which not only adds an > extra conditional merge, but now hits an cold-ish cache line everywhere > it's used. > > Count how many synthetic feature bits it will actually take to change > the per-cpu data size, and realise that, when it will take more than 200 > years at the current rate of accumulation, any believe that this is an > improvement to be had disappears. > > Yes, it is only a micro regression, but you need a far better > justification than "there is a gain of 64 bytes per CPU which will be > non-theoretical in more than 200 years" when traded off vs the actual > 512 bytes, plus extra code bloat bloat, plus reduced locality of data > that this "improvement" genuinely costs today. I don't see Roger stating anything like this. I think we need to settle on at least halfway firm rules on when to use synthetic feature bits and when to use plain global booleans. Without that the tastes of the three of us are going to collide again every once in a while. Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.