|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 2/3] x86/spec-ctrl: Fix up the RSBA/RRSBA bits as appropriate
On 01.06.2023 16:48, Andrew Cooper wrote:
> @@ -593,15 +596,85 @@ static bool __init retpoline_calculations(void)
> return false;
>
> /*
> - * RSBA may be set by a hypervisor to indicate that we may move to a
> - * processor which isn't retpoline-safe.
> + * The meaning of the RSBA and RRSBA bits have evolved over time. The
> + * agreed upon meaning at the time of writing (May 2023) is thus:
> + *
> + * - RSBA (RSB Alternative) means that an RSB may fall back to an
> + * alternative predictor on underflow. Skylake uarch and later all
> have
> + * this property. Broadwell too, when running microcode versions prior
> + * to Jan 2018.
> + *
> + * - All eIBRS-capable processors suffer RSBA, but eIBRS also introduces
> + * tagging of predictions with the mode in which they were learned. So
> + * when eIBRS is active, RSBA becomes RRSBA (Restricted RSBA).
> + *
> + * - CPUs are not expected to enumerate both RSBA and RRSBA.
> + *
> + * Some parts (Broadwell) are not expected to ever enumerate this
> + * behaviour directly. Other parts have differing enumeration with
> + * microcode version. Fix up Xen's idea, so we can advertise them safely
> + * to guests, and so toolstacks can level a VM safety for migration.
> + *
> + * The following states exist:
> + *
> + * | | RSBA | EIBRS | RRSBA | Notes | Action |
> + * |---+------+-------+-------+--------------------+---------------|
> + * | 1 | 0 | 0 | 0 | OK (older parts) | Maybe +RSBA |
> + * | 2 | 0 | 0 | 1 | Broken | +RSBA, -RRSBA |
> + * | 3 | 0 | 1 | 0 | OK (pre-Aug ucode) | +RRSBA |
> + * | 4 | 0 | 1 | 1 | OK | |
> + * | 5 | 1 | 0 | 0 | OK | |
> + * | 6 | 1 | 0 | 1 | Broken | -RRSBA |
> + * | 7 | 1 | 1 | 0 | Broken | -RSBA, +RRSBA |
> + * | 8 | 1 | 1 | 1 | Broken | -RSBA |
> *
> + * However, we doesn't need perfect adherence to the spec. Identify the
Nit: "don't" or "it"?
> + * broken cases (so we stand a chance of spotting violated assumptions),
> + * and fix up Rows 1 and 3 so Xen can use RSBA || RRSBA to identify
> + * "alternative predictors potentially in use".
Considering that it's rows 2, 6, 7, and 8 which are broken, I find this
comment a little misleading. To me it doesn't become clear whether them
subsequently being left alone (and merely a log message issued) is
intentional.
> + */
> + if ( cpu_has_eibrs ? cpu_has_rsba /* Rows 7, 8 */
> + : cpu_has_rrsba /* Rows 2, 6 */ )
> + printk(XENLOG_ERR
> + "FIRMWARE BUG: CPU %02x-%02x-%02x, ucode 0x%08x: RSBA %u,
> EIBRS %u, RRSBA %u\n",
> + boot_cpu_data.x86, boot_cpu_data.x86_model,
> + boot_cpu_data.x86_mask, ucode_rev,
> + cpu_has_rsba, cpu_has_eibrs, cpu_has_rrsba);
> +
> + /*
> * Processors offering Enhanced IBRS are not guarenteed to be
> * repoline-safe.
> */
> - if ( cpu_has_rsba || cpu_has_eibrs )
> + if ( cpu_has_eibrs )
> + {
> + /*
> + * Prior to the August 2023 microcode, many eIBRS-capable parts did
> + * not enumerate RRSBA.
> + */
> + if ( !cpu_has_rrsba )
> + setup_force_cpu_cap(X86_FEATURE_RRSBA);
> +
> + return false;
> + }
No clearing of RSBA in this case? I fear we may end up with misbehavior if
our own records aren't kept consistent with our assumptions. (This then
extends to the "just a log message" above as well.)
Also the inner conditional continues to strike me as odd; could you add
half a sentence to the comment (or description) as to meaning to leave
is_forced_cpu_cap() function correctly (which in turn raises the question
whether - down the road - this is actually going to matter)?
> + /*
> + * RSBA is explicitly enumerated in some cases, but may also be set by a
> + * hypervisor to indicate that we may move to a processor which isn't
> + * retpoline-safe.
> + */
> + if ( cpu_has_rsba )
> return false;
>
> + /*
> + * At this point, we've filtered all the legal RSBA || RRSBA cases (or
> the
> + * known non-ideal cases). If ARCH_CAPS is visible, trust the absence of
> + * RSBA || RRSBA. There's no known microcode which advertises ARCH_CAPS
> + * without RSBA or EIBRS, and if we're virtualised we can't rely the
> model
> + * check anyway.
> + */
I think "no known" wants further qualification: When IBRSB was first
introduced, EIBRS and RSBA weren't even known about. I also didn't
think all hardware (i.e. sufficiently old one) did get newer ucode
when these started to be known. Possibly you mean "... which wrongly
advertises ..."?
> @@ -689,6 +762,15 @@ static bool __init retpoline_calculations(void)
> break;
> }
>
> + if ( !safe )
> + {
> + /*
> + * Note: the eIBRS-capable parts are filtered out earlier, so the
> + * remainder here are the ones which suffer only RSBA behaviour.
> + */
As I think I had mentioned already, I find "only" here odd, when RSBA
is more severe than RRSBA. Maybe the "only" could move earlier, e.g.
between "are" and "the"? Then again this may be another non-native-
speaker issue of mine ...
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |