Xen project Mailing List

Re: [Xen-devel] [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments

From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Mon, 3 Apr 2017 18:37:11 +0100

Cc: Julien Grall <julien.grall@xxxxxxx>, Paul Durrant <paul.durrant@xxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Mon, 03 Apr 2017 17:37:39 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 03/04/17 17:08, Jan Beulich wrote: >>>> On 03.04.17 at 17:42, <andrew.cooper3@xxxxxxxxxx> wrote: >> On 03/04/17 16:07, Jan Beulich wrote: >>>>>> On 03.04.17 at 16:27, <andrew.cooper3@xxxxxxxxxx> wrote: >>>> On 03/04/17 10:13, Jan Beulich wrote: >>>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@xxxxxxxxxx> wrote: >>>>>> --- a/xen/arch/x86/hvm/hvm.c >>>>>> +++ b/xen/arch/x86/hvm/hvm.c >>>>>> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t >>>> may_defer) >>>>>> return X86EMUL_OKAY; >>>>>> } >>>>>> >>>>>> +enum hvm_segmentation_mode hvm_seg_mode( >>>>>> + const struct vcpu *v, enum x86_segment seg, >>>>>> + const struct segment_register *cs) >>>>> The inputs here are at least somewhat counterintuitive (for example, >>>>> from an abstract pov it is unexpected that the result depends on seg >>>>> and cs). At the very least I think the naming should make clear that >>>>> cs is not just some code segment, but the CS v has currently in use >>>>> (e.g. active_cs). Going further the question is whether having this >>>>> helper is really useful (and not perhaps inviting mis-use), and hence >>>>> whether the inputs passed here wouldn't better be passed directly >>>>> to hvm_virtual_to_linear_addr(), the more that the "seg" argument >>>>> is required to match up between the two calls. >>>> I purposefully wanted to avoid people opencoding the logic and getting >>>> it wrong (looks like even I got it wrong). >>>> >>>> I'm not convinced that passing the parameters individually is better. >>>> >>>>>> +{ >>>>>> + if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) ) >>>>>> + return hvm_seg_mode_real; >>>>> What about VM86 mode? >>>> Very good point. >>>> >>>>>> + if ( hvm_long_mode_active(v) && >>>>>> + (is_x86_system_segment(seg) || cs->attr.fields.l) ) >>>>>> + return hvm_seg_mode_long; >>>>>> + >>>>>> + return hvm_seg_mode_prot; >>>>> Did you verify this actually matching real hardware behavior? There >>>>> being obvious anomalies when compat ring-0 code executes >>>>> LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte >>>>> operands, yet 32-bit/compat code would expect 6-byte ones, so >>>>> one of the two views is necessarily wrong, and whichever it is, it >>>>> introduces an inconsistency), I wouldn't take it for given that _all_ >>>>> descriptor table accessing insns behave like they would from a >>>>> 64-bit code segment (I nevertheless assume they do, but as I >>>>> can't see it written down anywhere, we shouldn't assume so, >>>>> considering how many other oddities there are in x86). >>>>> >>>>> This question is also being supported by the SDM using the same >>>>> standard "Same exceptions as in protected mode" in the >>>>> respective insns' "Compatibility Mode Exceptions" sections, yet >>>>> the behavior above implies that #GP(0) might also result for >>>>> compat mode descriptor table accesses if the descriptor address >>>>> ends up being non-canonical. Interestingly enough the PM >>>>> doesn't separate exception specifications for 32-bit protected, >>>>> compat, and 64-bit modes. >>>> You are mistaken. >>>> >>>> {L,S}{I,G}DT are {read,write}_segment_register() operations, using a >>>> plain memory operand. >>>> >>>> When we come to the segmentation check, it will be by default >>>> %ds-relative, with size as calculated by op_bytes in the instruction >>>> emulator. >>> I think I didn't make myself clear then: I'm not at all talking about how >>> the memory access of these insns get carried out, I solely talk about >>> the size of their operands: >> I still fail to see what the size of the operands have to do with the >> choice of segmentation mode. >> >>> In long mode to load IDTR or GDTR you'd expect a 64-bit base and a 16-bit >> limit. >> >> Why? I'd expect nothing of the sort, because 32bit compat segments are >> purposefully designed to be no functional difference from regular 32bit >> protected mode segments. That includes not changing the behaviour of >> instructions like this. > Well, one can of course take the position that ring-0 compat code > is simply a useless thing. Compatibility mode segments exist for the purpose of making user code continue to work. I don't find it surprising that compatbility supervisor segments have some rough corners. > >>> Hence if _all_ system segment >>> register related insns worked consistently in long mode, the four >>> named insns would have to have 10-byte operands. >> This isn't a valid expectation to draw. >> >>> I'm pretty sure >>> they don't though, so there is _one_ anomaly already. >> Indeed they don't. In a compatibility mode segment, they have take a >> 6-byte operand, identically to 32bit mode. >> >>> With that I don't think we can rule out there being other anomalies, with >>> this >>> not being talked about explicitly anywhere in the doc. >> I don't think any of this is relevant to the correctness of this patch. > I don't question the correctness; all I question is how far it is built > upon assumptions vs actually observed (short of documented) > behavior. AMD Vol2 14.5 "Initialising Long Mode" is fairly clear on the subject. The layout and behaviour of the 4 system structures depend on > >> Without this fix, implicit accesses to system segments in a >> compatibility mode segment will truncate the resulting linear address as >> part of performing the segmentation calculations, which is obviously not >> how real hardware behaves. > I understand this. But things are a little more complicated. Just > extend your line of thinking regarding IDTR/GDTR to LDTR and > TR: Above you suggest that the former two get loaded in a fully > 32-bit mode compatible way. LTR and LLDT (as well as LAR and > LSL), however, access a descriptor table. 32-bit code would > expect an 8-byte descriptor to be read. Is compat mode code > then not allowed to make the same assumption? Hmm - the wording of LTR/LLDT in both manuals states 64bit mode, not long mode, so there is a decent chance that the compat behaviour is identical. Let me experiment. ~Andrew > I think you've > made pretty much explicit that you'd expect a 16-byte descriptor > to be loaded in this case. So ring-0 compat code operates > identically to 32-bit mode in some respects, and identically to > 64-bit code in others. Fundamentally I'd expect consistency here, > but along the lines of the usefulness remark higher up I simply > think that no-one had spent any thought on this when designing > x86-64; otherwise it would have been easy to simply disallow > ring-0 to enter compat mode, thus avoiding any inconsistencies. > > Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.