Re: [Xen-devel] VM Feature levelling improvements proposal (draft C)

On 17/02/14 16:46, Jan Beulich wrote:
>>>> On 17.02.14 at 17:22, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> How XenServer currently does levelling
>> ======================================
>> The _Heterogeneous Pool Levelling_ support in XenServer appears to predate
>> the libxc CPUID policy API, so does not currently use it.  The toolstack has
>> a table of CPU model numbers identifying whether levelling is supported.  It
>> then uses native `CPUID` instructions to look at the first four feature
>> masks, and identifies the subset of features across the pool.
>> `cpuid_mask_{,extd_}{ecx,edx}` is then set on Xen's command line for each
>> host in the pool, and all hosts rebooted.
>> This has several limitations:
>> * Xen and dom0 have a reduced feature set despite not needing to migrate
> Xen, at least for most features, doesn't, as it retrieves the feature
> flags before applying the mask. Dom0 indeed is being limited without
> need.

I should have worded this better.  In XenServer there are further
restrictions to Xen, mainly in the form of default command line options,
to work around PV guest bugs.  This is purely because of a lack of
per-VM feature levelling, and I am hoping to throw all of it away as
soon as a better implementation exists.

Logic such as "To boot the Ubuntu 12.04 installer on an AMD
Piledriver/Bulldozer system, XSAVE and FMA4 must be hidden until the
guest admin has updated to the latest kernel and glibc" can then be
moved into the toolstack, rather than having to be blindly applied to
the entire system.  (This doesn't actually matter yet, as the latest
release of XenServer is still a Xen-4.1 based system, which pre-dates
XSAVE support actually working correctly in Xen, but it is quite
important to fix before our next release.)
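
As a purely illustrative sketch (not code from Xen or XenServer), per-VM
levelling in a toolstack could look roughly like the following: intersect
the feature words of every host the VM may run on, then clear the bits a
particular guest cannot cope with.  The struct and both function names are
hypothetical; only the CPUID bit positions for XSAVE (leaf 1, ECX bit 26)
and FMA4 (leaf 0x80000001, ECX bit 16) are architectural.

```c
#include <stddef.h>
#include <stdint.h>

/* Architectural CPUID feature bits. */
#define CPUID_1_ECX_XSAVE   (1u << 26)  /* CPUID.1:ECX[26] */
#define CPUID_E1_ECX_FMA4   (1u << 16)  /* CPUID.0x80000001:ECX[16] */

/* Hypothetical container for the four feature words being levelled. */
struct vm_feature_mask {
    uint32_t leaf1_ecx, leaf1_edx;  /* CPUID leaf 1 */
    uint32_t ext1_ecx, ext1_edx;    /* CPUID leaf 0x80000001 */
};

/* Hypothetical helper: features common to every host in the pool. */
struct vm_feature_mask level_features(const struct vm_feature_mask *hosts,
                                      size_t nr_hosts)
{
    struct vm_feature_mask m = { ~0u, ~0u, ~0u, ~0u };

    for (size_t i = 0; i < nr_hosts; i++) {
        m.leaf1_ecx &= hosts[i].leaf1_ecx;
        m.leaf1_edx &= hosts[i].leaf1_edx;
        m.ext1_ecx  &= hosts[i].ext1_ecx;
        m.ext1_edx  &= hosts[i].ext1_edx;
    }

    return m;
}

/* Hypothetical per-guest quirk: hide XSAVE and FMA4 from this VM only. */
void hide_legacy_guest_features(struct vm_feature_mask *m)
{
    m->leaf1_ecx &= ~CPUID_1_ECX_XSAVE;
    m->ext1_ecx  &= ~CPUID_E1_ECX_FMA4;
}
```

The point of the split is that the quirk is applied to one VM's mask
rather than to the host, so other guests, dom0 and Xen keep the full
levelled feature set.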

>> * There is only a single level for all VMs in the pool
>> * The toolstack only understands 4 of the 5 possible masking MSRs, and there
>>   are now feature maps in further `CPUID` leaves which have no masking MSRs
>> Proposal for new implementation
>> ===============================
> Sounds reasonable, but is of course in need of some details when
> getting closer to actually implementing this. I'm in particular not
> in favor of an approach where three more MSR writes would be
> added to the (PV) context switch path (mostly) unconditionally.
> Jan

If there are no particular objections to the proposed design, I shall
work on a patch series which implements it, and documents its expected use.

I am also fairly loath to put more into the context switch codepath, but
I can see no other way of doing per-VM feature levelling for PV guests.
In the hopefully common case that no masking is needed, the MSRs will be
written once on the first switch and never again, at which point the
overhead is a few failed conditions.
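
To make the "few failed conditions" concrete, here is a minimal
userspace sketch of lazy mask switching.  It is an assumption about how
such a path might be structured, not the actual Xen implementation: the
struct, the three mask slots, `write_mask_msr()` and
`ctxt_switch_masks()` are all hypothetical names, and the MSR write is
replaced by a counter so the logic can run outside a hypervisor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slots for the three masking MSRs under discussion. */
enum { MASK_LEAF1, MASK_EXT1, MASK_OTHER, NR_MASKS };

struct cpuid_masks {
    uint64_t val[NR_MASKS];
};

/* What is currently loaded on this (single, modelled) CPU. */
static struct cpuid_masks loaded;
static bool loaded_valid[NR_MASKS];

/* Counts writes; stands in for the privileged WRMSR instruction. */
unsigned int msr_writes;

static void write_mask_msr(unsigned int idx, uint64_t val)
{
    (void)idx;
    (void)val;
    msr_writes++;
}

/* Called on context switch: only touch MSRs whose value must change. */
void ctxt_switch_masks(const struct cpuid_masks *next)
{
    for (unsigned int i = 0; i < NR_MASKS; i++) {
        if (loaded_valid[i] && loaded.val[i] == next->val[i])
            continue;  /* common case: one failed comparison, no write */

        write_mask_msr(i, next->val[i]);
        loaded.val[i] = next->val[i];
        loaded_valid[i] = true;
    }
}
```

With every domain sharing one mask, the first switch performs the three
writes and every later switch costs three comparisons.  A domain with a
differing mask costs one write per differing MSR on the way in and again
on the way out.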

It is obviously in the toolstack's best interest not to set different
feature masks for each PV domain, and having dom0, the idle domain and
all HVM domains share the same mask will reduce the switching somewhat,
but correctness in this area, to aid safe migration, is crucial.

I am open to alternate suggestions, which is why this is just a proposal
at this stage.  However, as I said, I can't see another way of doing
per-VM feature levelling.

