|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 4/4] x86/PV: issue branch prediction barrier when switching 64-bit guest to kernel mode
On Tue, Dec 19, 2023 at 03:06:50PM +0100, Jan Beulich wrote:
> On 19.12.2023 12:48, Roger Pau Monné wrote:
> > On Tue, Dec 19, 2023 at 10:56:16AM +0100, Jan Beulich wrote:
> >> On 18.12.2023 18:24, Roger Pau Monné wrote:
> >>> On Tue, Feb 14, 2023 at 05:12:08PM +0100, Jan Beulich wrote:
> >>>> --- a/xen/arch/x86/pv/domain.c
> >>>> +++ b/xen/arch/x86/pv/domain.c
> >>>> @@ -455,6 +455,7 @@ static void _toggle_guest_pt(struct vcpu
> >>>> void toggle_guest_mode(struct vcpu *v)
> >>>> {
> >>>> const struct domain *d = v->domain;
> >>>> + struct cpu_info *cpu_info = get_cpu_info();
> >>>> unsigned long gs_base;
> >>>>
> >>>> ASSERT(!is_pv_32bit_vcpu(v));
> >>>> @@ -467,15 +468,21 @@ void toggle_guest_mode(struct vcpu *v)
> >>>> if ( v->arch.flags & TF_kernel_mode )
> >>>> v->arch.pv.gs_base_kernel = gs_base;
> >>>> else
> >>>> + {
> >>>> v->arch.pv.gs_base_user = gs_base;
> >>>> +
> >>>> + if ( opt_ibpb_mode_switch &&
> >>>> + !(d->arch.spec_ctrl_flags & SCF_entry_ibpb) &&
> >>>> + !VM_ASSIST(d, mode_switch_no_ibpb) )
> >>>> + cpu_info->spec_ctrl_flags |= SCF_new_pred_ctxt;
> >>>
> >>> Likewise similar to the remarks I've made before, if doing an IBPB on
> >>> entry is enough to cover for the case here, it must also be fine to
> >>> issue the IBPB right here, instead of deferring to return to guest
> >>> context?
> >>>
> >>> The only concern would be (as you mentioned before) to avoid clearing
> >>> valid Xen predictions, but I would rather see some figures about what
> >>> effect the delaying to return to guest has vs issuing it right here.
> >>
> >> Part of the reason (aiui) to do things on the exit path was to
> >> consolidate the context switch induced one and the user->kernel switch
> >> one into the same place and mechanism.
> >
> > Isn't it kind of a very specific case that we end up doing a
> > user->kernel switch as part of a context switch? IOW: would require
> > the vCPU to be scheduled out at that very specific point.
>
> No, there's no user->kernel switch at the same time as context switch.
> What I was trying to explain is that with the actual IBPB being issued
> on exit to guest, both the context switch path and the user->kernel
> mode switch path set the same indicator, for the exit path to consume.
Deferring to exit to guest path could be OK, but unless strictly
needed, which I don't think it's the case, I would request for IBPB to
be executed in C context rather than assembly one.
> >>>> + *
> >>>> + * By default (on affected and capable hardware) as a safety measure
> >>>> Xen,
> >>>> + * to cover for the fact that guest-kernel and guest-user modes are both
> >>>> + * running in ring 3 (and hence share prediction context), would issue a
> >>>> + * barrier for user->kernel mode switches of PV guests.
> >>>> + */
> >>>> +#define VMASST_TYPE_mode_switch_no_ibpb 33
> >>>
> >>> Would it be possible to define the assist as
> >>> VMASST_TYPE_mode_switch_ibpb and have it on when enabled? So that the
> >>> guest would disable it if unneeded? IMO negated options are in
> >>> general harder to understand.
> >>
> >> Negative options aren't nice, yes, but VM assists start out as all
> >> clear.
> >
> > Are you sure? I see VMASST_TYPE_pae_extended_cr3 getting set in
> > dom0_construct_pv() and that makes me wonder whether other bits
> > couldn't start set also.
> >
> > Maybe there's some restriction I'm missing, but I don't see any
> > wording in the description of the interface that states that all
> > assists are supposed to start disabled.
>
> Well, that setting of pae_extended_cr3 is in response to the kernel's
> notes section having a respective indicator. So we still only set the
> bit in response to what the kernel's asking us to do, just that here
> we carry out the request ahead of launching the kernel.
>
> Also consider what would happen during migration if there was a
> default-on assist: At the destination we can't know whether the
> source simply didn't know of the bit, or whether the guest elected to
> clear it.
Hm, I see, so I was indeed missing that aspect. VM assist is passed
as a plain bitmap, and there's no signal on which assists the VM had
available on the source side if not enabled.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |