Re: [Xen-devel] [PATCH] x86/vpmu_intel: Fix hypervisor crash by catching wrmsr fault
>>> On 24.04.17 at 17:44, <mohit.gambhir@xxxxxxxxxx> wrote:
> On 04/21/2017 03:14 AM, Jan Beulich wrote:
>>>>> On 20.04.17 at 19:49, <mohit.gambhir@xxxxxxxxxx> wrote:
>>> This patch changes the wrmsrl() calls that write to the MSR_P6_EVNTSEL
>>> registers in the VPMU to wrmsr_safe(). There are known (and possibly
>>> some unknown) cases where setting certain bits in an MSR_P6_EVNTSEL
>>> register causes a General Protection fault on some machines. Unless we
>>> catch this fault when it happens, it results in a hypervisor crash.
>>>
>>> For instance, setting the Pin Control (PC) bit (19) in MSR_P6_EVNTSEL
>>> results in a General Protection fault on Broadwell machines and thus
>>> crashes the hypervisor.
>>
>> I can't seem to find indication of this in the SDM. Is this removal of a
>> valid bit from an architectural MSR documented anywhere?
>
> I see it as an observed hardware bug, so I don't expect it to be found in
> the SDM. I reached out to the folks at Intel (CC'd) but they are unaware
> of it. Ideally it should work the way the SDM says for all events and on
> all machines, but it doesn't - not for all the events and certainly not
> on all machines.

Please have the description make clear this is an (assumed) erratum, until
Intel either clarifies or adds an entry to the spec update for the
affected model(s).

>> Further, a more general question: Why do you do this only for
>> MSR_P6_EVNTSEL*, but not consistently for all MSRs which are being
>> written with (partially) guest controlled values? I'd namely expect all
>> wrmsrl() to be switched over, unless it is clear that we're dealing
>> with an erratum here and we therefore can assume that normal guarantees
>> for architectural MSRs apply to everything else.
> I think it is an erratum for one specific MSR, which is MSR_P6_EVNTSEL.

>> And finally, in the description you talk about forwarding the fault - I
>> can see this being the case for core2_vpmu_do_wrmsr(), but how does this
>> match up with __core2_vpmu_load() and its callers? Namely you may report
>> an error there after already having updated some MSRs. Leaving the
>> overall set of MSRs in an inconsistent state is not very likely to be a
>> good idea.
>
> Yes, that's something I did not think about and it needs to be fixed.
> There are two issues here:
>
> 1. What should we set the MSR values to in case of an error? We cannot
> revert them, because if the write is part of a context switch then we do
> not want to leak values from the previous guest to the new guest. So one
> possible solution is to wipe all PMU MSRs (set them to 0) before
> returning the error.
>
> 2. __core2_vpmu_load() is called in two scenarios - once from the
> XENPMU_flush hypercall via do_xenpmu_op(), and once during a context
> switch from vpmu_switch_to(). The hypercall seems to handle the error
> correctly, but vpmu_switch_to() ignores it. So to propagate the error up
> to the guest kernel, maybe we should inject the fault into the guest
> using hvm_inject_hw_exception(TRAP_gp_fault, error) /
> pv_inject_hw_exception(TRAP_gp_fault, error)?

You can't arbitrarily inject an exception out of the context switch code.
I'm not sure you can sensibly continue the guest in this case, unless - as
you say - there is a way to place the PMU into a consistent state (and you
provide _some_ kind of indication to the guest about the vPMU state no
longer matching its expectations).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel