|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 4/4] x86/PV: issue branch prediction barrier when switching 64-bit guest to kernel mode
On Tue, Dec 19, 2023 at 10:56:16AM +0100, Jan Beulich wrote:
> On 18.12.2023 18:24, Roger Pau Monné wrote:
> > On Tue, Feb 14, 2023 at 05:12:08PM +0100, Jan Beulich wrote:
> >> Since both kernel and user mode run in ring 3, they run in the same
> >> "predictor mode".
> >
> > That only true when IBRS is enabled, otherwise all CPU modes share the
> > same predictor mode?
>
> But here we only care about ring 3 anyway?
Hm, yes, what I wanted to say is that this is not exclusive to 64bit
PV, as with IBRS disabled ring 3 and ring 0 share the same predictor
mode. Anyway, not relevant.
>
> >> @@ -753,7 +755,9 @@ static inline void pv_inject_sw_interrup
> >> * but we can't make such requests fail all of the sudden.
> >> */
> >> #define PV64_VM_ASSIST_MASK (PV32_VM_ASSIST_MASK | \
> >> - (1UL << VMASST_TYPE_m2p_strict))
> >> + (1UL << VMASST_TYPE_m2p_strict) | \
> >> + ((opt_ibpb_mode_switch + 0UL) << \
> >> + VMASST_TYPE_mode_switch_no_ibpb))
> >
> > I'm wondering that it's kind of weird to offer the option to PV domUs
> > if opt_ibpb_entry_pv is set, as then the guest mode switch will always
> > (implicitly) do a IBPB as requiring an hypercall and thus take an
> > entry point into Xen.
> >
> > I guess it's worth having it just as a way to signal to Xen that the
> > hypervisor does perform an IBPB, even if the guest cannot disable it.
>
> I'm afraid I'm confused by your reply. Not only, but also because the
> latter sentence looks partly backwards / non-logical to me.
Sorry, I think I didn't word that very well. The remark is tied to
the one below about the vmassist 'possibly' allowing the guest to
disable IBPB on guest user -> kernel context switches, but Xen might
unconditionally do additional IBPBs that the guest cannot disable.
> >> --- a/xen/arch/x86/pv/domain.c
> >> +++ b/xen/arch/x86/pv/domain.c
> >> @@ -455,6 +455,7 @@ static void _toggle_guest_pt(struct vcpu
> >> void toggle_guest_mode(struct vcpu *v)
> >> {
> >> const struct domain *d = v->domain;
> >> + struct cpu_info *cpu_info = get_cpu_info();
> >> unsigned long gs_base;
> >>
> >> ASSERT(!is_pv_32bit_vcpu(v));
> >> @@ -467,15 +468,21 @@ void toggle_guest_mode(struct vcpu *v)
> >> if ( v->arch.flags & TF_kernel_mode )
> >> v->arch.pv.gs_base_kernel = gs_base;
> >> else
> >> + {
> >> v->arch.pv.gs_base_user = gs_base;
> >> +
> >> + if ( opt_ibpb_mode_switch &&
> >> + !(d->arch.spec_ctrl_flags & SCF_entry_ibpb) &&
> >> + !VM_ASSIST(d, mode_switch_no_ibpb) )
> >> + cpu_info->spec_ctrl_flags |= SCF_new_pred_ctxt;
> >
> > Likewise similar to the remarks I've made before, if doing an IBPB on
> > entry is enough to cover for the case here, it must also be fine to
> > issue the IBPB right here, instead of deferring to return to guest
> > context?
> >
> > The only concern would be (as you mentioned before) to avoid clearing
> > valid Xen predictions, but I would rather see some figures about what
> > effect the delaying to return to guest has vs issuing it right here.
>
> Part of the reason (aiui) to do things on the exit path was to
> consolidate the context switch induced one and the user->kernel switch
> one into the same place and mechanism.
Isn't it kind of a very specific case that we end up doing a
user->kernel switch as part of a context switch? IOW: would require
the vCPU to be scheduled out at that very specific point.
> >> + *
> >> + * By default (on affected and capable hardware) as a safety measure Xen,
> >> + * to cover for the fact that guest-kernel and guest-user modes are both
> >> + * running in ring 3 (and hence share prediction context), would issue a
> >> + * barrier for user->kernel mode switches of PV guests.
> >> + */
> >> +#define VMASST_TYPE_mode_switch_no_ibpb 33
> >
> > Would it be possible to define the assist as
> > VMASST_TYPE_mode_switch_ibpb and have it on when enabled? So that the
> > guest would disable it if unneeded? IMO negated options are in
> > general harder to understand.
>
> Negative options aren't nice, yes, but VM assists start out as all
> clear.
Are you sure? I see VMASST_TYPE_pae_extended_cr3 getting set in
dom0_construct_pv() and that makes me wonder whether other bits
couldn't start set also.
Maybe there's some restriction I'm missing, but I don't see any
wording in the description of the interface that states that all
assists are supposed to start disabled.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |