Re: [Xen-devel] [PATCH] x86/PV: hide features dependent on XSAVE when booted with "no-xsave"
On 30/11/15 16:00, Jan Beulich wrote:
>>>> On 30.11.15 at 16:38, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 30/11/15 15:22, Jan Beulich wrote:
>>>>>> On 30.11.15 at 14:36, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>> On 30/11/15 11:30, Jan Beulich wrote:
>>>>> It's not well defined whether YMM register presence
>>>>> correlates to AVX, or is simply flagged by the respective XSTATE
>>>>> CPUID bit (or a mixture of both).
>>>> It is indeed not well defined, which is what makes this area of
>>>> functionality so hard to level safely.
>>>>
>>>>> The minimal (and imo more natural) dependency is just the XSTATE bit.
>>>> But it is wrong.
>>>>
>>>> Any VEX encoded SIMD operation unconditionally works on YMM state. In
>>>> the case that XMM registers are encoded with a VEX prefix, the upper 128
>>>> bits of the YMM register are zeroed (SDM Vol 2, 2.3.10). This is
>>>> contrary to legacy SSE instructions which preserve the upper 128 bits.
>>>>
>>>> Therefore, FMA, FMA4 and XOP do have a strict dependency on AVX.
>>> No, if you really want to express it that way, you'll need feature
>>> flags derived from the XSTATE bits.
>> What? That is absurd.
> Sorry, but no, this is not absurd, this is what you can derive from the
> SDM without much guessing. There's nowhere the SDM makes any
> connection between FMA and AVX.

Intel Vol 1 14.5.3 "Detection of FMA" states:

    Hardware support for FMA is indicated by CPUID.1:ECX.FMA[bit 12]=1.
    Application Software must identify that hardware supports AVX, after
    that it must also detect support for FMA by CPUID.1:ECX.FMA[bit 12].

> The only connections it makes are OSXSAVE and XCR0[2:1], neither of which is
> formally tied to AVX.

Actually, on further reading, Intel SDM Vol 3, 2.6, Figure 2-8 states:

    XCR0.AVX (bit 2): If 1, AVX instructions can be executed and the
    XSAVE feature set can be used to manage the upper halves of the YMM
    registers (YMM0-YMM15 in 64-bit mode; otherwise YMM0-YMM7).

This means that bit 2 has dual meaning, and is not just YMM state. This
does IMO provide a formal tie between AVX and XCR0[2].

I admit that the AMD manuals are far less prescriptive than the Intel.
However, AMD Vol 3 1.9 "Encoding using the VEX and XOP Prefixes" draws
several conclusions, including:

    VEX opcode maps 1–3 are also used to encode the FMA4 and FMA
    instructions

while the FMA/FMA4 instruction description states:

    The destination is either an XMM register or a YMM register, as
    determined by VEX.L. When the destination is an XMM register
    (L = 0), bits [255:128] of the corresponding YMM register are
    cleared.

and also states that a #UD will occur if XCR0[2:1] != '11b', which is
sufficient indication of FMA/FMA4 having a direct link to AVX.

As for XOP, AMD Vol 4, 1.2.2.1 "XMM Register Destinations" states again
that either the full YMM register is specified, or the upper 128 bits
are cleared if an XMM register is encoded, and each instruction
description specifies a #UD if XCR0[2:1] != '11b'. This logically
follows from the history, where XOP ended up being all the SSE5
instructions which didn't overlap with AVX.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
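
For illustration only, the detection order quoted above from Intel Vol 1 14.5.3 (identify AVX first, then check the FMA bit) might be sketched in C roughly as follows. This is not code from Xen or from the patch under discussion. The helper names `avx_usable` and `fma_usable` and the macro names are hypothetical. The FMA bit and the XCR0[2:1] bits are as quoted in the mail; the OSXSAVE (bit 27) and AVX (bit 28) positions in CPUID.1:ECX are taken from the Intel SDM. The functions take already-read register values as parameters, so no CPUID or XGETBV instruction is executed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.1:ECX feature bits (FMA as quoted in the mail; others per Intel SDM). */
#define CPUID1_ECX_FMA      (1u << 12)  /* CPUID.1:ECX.FMA[bit 12] */
#define CPUID1_ECX_OSXSAVE  (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX      (1u << 28)  /* hardware supports AVX */

/* XCR0 state-component bits. */
#define XCR0_SSE            (1ull << 1) /* XMM state */
#define XCR0_AVX            (1ull << 2) /* upper halves of YMM registers */
#define XCR0_SSE_AVX        (XCR0_SSE | XCR0_AVX)

/* The "XCR0[2:1] == '11b'" condition from the instruction descriptions. */
static_assert(XCR0_SSE_AVX == 0x6, "XCR0[2:1] mask");

/*
 * Hypothetical helper: may AVX instructions be executed?
 * cpuid1_ecx is the ECX output of CPUID leaf 1, xcr0 the value of XCR0.
 */
static bool avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    /* Without OSXSAVE, XCR0 cannot be read and XSAVE state is not managed. */
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return false;

    /* Both SSE and AVX state must be enabled in XCR0, else VEX ops #UD. */
    if ((xcr0 & XCR0_SSE_AVX) != XCR0_SSE_AVX)
        return false;

    return (cpuid1_ecx & CPUID1_ECX_AVX) != 0;
}

/* Hypothetical helper: FMA is checked only after AVX has been identified. */
static bool fma_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    return avx_usable(cpuid1_ecx, xcr0) &&
           (cpuid1_ecx & CPUID1_ECX_FMA) != 0;
}
```

Under these assumptions, a guest that sees the FMA bit set but OSXSAVE clear (as would happen with "no-xsave") gets `false` from `fma_usable`, which is the ordering the SDM text asks application software to follow.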