
Re: [Xen-devel] (v2) Design proposal for RMRR fix

On Mon, Jan 12, 2015 at 12:28 PM, Tian, Kevin <kevin.tian@xxxxxxxxx> wrote:
>> From: George Dunlap
>> Sent: Monday, January 12, 2015 8:14 PM
>> On Mon, Jan 12, 2015 at 11:22 AM, Tian, Kevin <kevin.tian@xxxxxxxxx> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> Sent: Monday, January 12, 2015 6:23 PM
>> >>
>> >> >>> On 12.01.15 at 11:12, <kevin.tian@xxxxxxxxx> wrote:
>> >> >>  From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> >> Sent: Monday, January 12, 2015 6:09 PM
>> >> >>
>> >> >> >>> On 12.01.15 at 10:56, <kevin.tian@xxxxxxxxx> wrote:
>> >> >> > the result is related to another open whether we want to block guest
>> >> >> > boot for such problem. If 'warn' in domain builder is acceptable, we
>> >> >> > don't need to change lowmem for such rare 1GB case, just throws
>> >> >> > a warning for unnecessary conflictions (doesn't hurt if user doesn't
>> >> >> > assign it).
>> >> >>
>> >> >> And how would you then deal with the one guest needing that
>> >> >> range reserved?
>> >> >
>> >> > if guest needs the range, then report-all or report-sel doesn't matter.
>> >> > domain builder throws the warning, and later device assignment will
>> >> > fail (or warn w/ override). In reality I think 1GB is rare. Making such
>> >> > assumption to simplify implementation is reasonable.
>> >>
>> >> One of my main problems with all you recent argumentation here
>> >> is the arbitrary use of the 1Gb boundary - there's nothing special
>> >> in this discussion with where the boundary is. Everything revolves
>> >> around the (undue) effect of report-all on domains not needing all
>> >> of the ranges found on the host.
>> >>
>> >
>> > I'm not sure which part of my argument is not clear here. report-all
>> > would be a problem here only if we want to fix all the conflictions
>> > (then pulling unnecessary devices increases the confliction possibility)
>> > in the domain builder. but if we only fix reasonable ones (e.g. >3GB)
>> > while warn other conflictions (e.g. <3G) in domain builder (let later
>> > assignment path to actually fail if confliction does matter), then we
>> > don't need to solve all conflictions in domain builder (if say 1G example
>> > fixing it may instead reduce lowmem greatly) and then report-all
>> > may just add more warnings than report-sel for unused devices.
>>
>> You keep saying "report-all" or "report-sel", but I'm not 100% clear
>> what you mean by those.  In any case, the naming has got to be a bit
>> misleading: the important questions at the moment, AFAICT, are:
>
> I explained them in original proposal

Yes, I read it and didn't understand it there either. :-)

>> 1. Whether we make holes at boot time for all RMRRs on the system, or
>> whether only make RMRRs for some subset (or potentially some other
>> arbitrary range, which may include RMRRs on other hosts to which we
>> may want to migrate).
>
> I use 'report-all' to stand for making holes for all RMRRs on the system,
> while 'report-sel' for specified subset.
> including other RMRRs (from admin for migration) is orthogonal to
> above open.

Right; so the "report" in this case is "report to the guest".

As I said, I think that's confusing terminology; after all, we want to
report to the guest all holes that we make, and only the holes that we
make.  The question isn't then which ones we report, but which ones we
make holes for. :-)

So for this discussion, maybe "rmrr-host" (meaning, copy RMRRs from
the host) or "rmrr-sel" (meaning, specify a selection of RMRRs, which
may be from this host, or even another host)?

Given that the ranges may be of arbitrary size, and that we may want
to specify additional ranges for migration to other hosts, I think
that at some level we need the machinery to be in place to specify
the RMRRs that will be reserved for a specific guest.

At the xl level, there should of course be a way to specify "use all
host RMRRs"; but what should happen then is that xl / libxl should
query Xen for the host RMRRs and then pass those down to the next
layer of the library.
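
To make that flow concrete, here is a rough Python sketch of what the top
layer might do. Everything in it is hypothetical: resolve_rmrrs(), the Rmrr
tuple and the policy strings are for illustration only and are not existing
xl or libxl interfaces. It also assumes the host RMRR list has already been
obtained from Xen somehow, and that "extra" holds any admin-specified ranges
(e.g. for migration targets).

```python
from typing import List, NamedTuple, Sequence


class Rmrr(NamedTuple):
    start: int  # first byte of the range
    end: int    # last byte of the range (inclusive)


def resolve_rmrrs(policy: str,
                  host_rmrrs: Sequence[Rmrr],
                  extra: Sequence[Rmrr] = ()) -> List[Rmrr]:
    """Turn a top-level policy into the flat list handed to lower layers.

    "rmrr-host": every host RMRR, plus any extra admin-specified ranges.
    "rmrr-sel":  only the explicitly specified ranges.
    """
    if policy == "rmrr-host":
        ranges = list(host_rmrrs) + list(extra)
    elif policy == "rmrr-sel":
        ranges = list(extra)
    else:
        raise ValueError(f"unknown RMRR policy: {policy!r}")

    # Sort and merge overlapping or adjacent ranges so the lower layers
    # see a minimal, ordered list.
    merged: List[Rmrr] = []
    for r in sorted(ranges):
        if merged and r.start <= merged[-1].end + 1:
            last = merged[-1]
            merged[-1] = Rmrr(last.start, max(last.end, r.end))
        else:
            merged.append(r)
    return merged
```

The point of the sketch is only that the host-vs-selection decision is
resolved once, at the top, and everything below sees a plain list of ranges.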

>> 2. Whether those holes are made by the domain builder in libxc, or by
>> hvmloader
>
> based on current discussion, whether to make holes in hvmloader
> doesn't bring fundamental difference. as long as domain builder
> still need to populate memory (even minimal for hvmloader to boot),
> it needs to check conflict and may ideally make hole too (though we
> may make assumption not doing that)

Well it will have an impact on the overall design of the code; but
you're right, if RMRRs really can (and will) be anywhere in memory,
then the domain builder will need to know what RMRRs are going to be
reserved for this VM and avoid populating those.  If, on the other
hand, we can make some fairly small assumptions about where there will
not be any RMRRs, then we can get away with handling everything in

>> 3. What happens if Xen is asked to assign a device and it finds that
>> the required RMRR is not empty:
>>  a. during guest creation
>>  b. after the guest has booted
>
> for Xen we don't need differentiate a/b. by default it's clear failure
> should be returned as it implies a security/correctness issue if
> moving forward. but based on discussion an override to 'warn' only
> is preferred, so admin can make decision (remains an open on
> whether to do global override or per-device override)

Well I think part of our confusion here is what "fail" vs "warn" means.

Fail / warn might be "Do we refuse to assign the device, or do we go
ahead and assign the device, knowing that it may act buggy?"

Or it might be, "Do we fail domain creation if at some step we
discover an RMRR conflict?  Or do we let domain creation succeed but
warn that the device has not been attached?"

I think in any case, failing to *assign* the device is the right thing
to do (except perhaps with a per-device override option).
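
As a sketch of the first meaning of fail / warn, the decision I have in mind
looks roughly like this. The function name, the return strings and the
warn_override flag are all made up for illustration; nothing like this
exists in Xen or libxl under these names.

```python
def assignment_action(rmrr_conflict: bool, warn_override: bool = False) -> str:
    """Decide what to do with one device at assignment time.

    rmrr_conflict: the device's RMRR is not free in the guest address space.
    warn_override: hypothetical per-device option set by the admin.
    """
    if not rmrr_conflict:
        return "assign"
    # Default is to refuse; only an explicit per-device override lets the
    # assignment go ahead, and then only with a warning.
    return "assign-with-warning" if warn_override else "refuse"
```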

libxl already has a policy of what happens when PCI assignment fails
during domain creation.  If I'm reading the code right, libxl will
destroy the domain if libxl__device_pci_add() fails during domain
creation; I think that's the right thing to do.  If you want to change
that policy, that's a different discussion.

But if the device assignment fails due to an unspecified RMRR, that's
a bug in the toolstack -- it should have looked at the device list,
found out what RMRRs were necessary, and reserved those ranges before
we got to that point.
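
Roughly, the check the toolstack should have done up front looks like the
Python sketch below. The names and data shapes are assumptions for
illustration: device_rmrrs is a hypothetical mapping from a device's BDF
string to the RMRRs it needs, and all ranges are (start, end) tuples with an
inclusive end.

```python
from typing import Dict, List, Sequence, Tuple

Range = Tuple[int, int]  # (start, end), end inclusive


def unreserved_rmrrs(devices: Sequence[str],
                     device_rmrrs: Dict[str, List[Range]],
                     reserved: Sequence[Range]) -> Dict[str, List[Range]]:
    """Return, per device, the RMRRs not covered by the guest's reservations.

    An RMRR counts as covered only if it lies entirely inside one reserved
    range. An empty result means assignment should not fail for lack of a
    reservation.
    """
    problems: Dict[str, List[Range]] = {}
    for bdf in devices:
        missing = [
            (s, e)
            for (s, e) in device_rmrrs.get(bdf, [])
            if not any(rs <= s and e <= re for (rs, re) in reserved)
        ]
        if missing:
            problems[bdf] = missing
    return problems
```

If this returns anything non-empty for the devices in the config file, the
toolstack has not done its job before asking Xen to assign the device.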

The only time I would expect device assignment might fail during
domain creation is if one of the devices had an RMRR shared with a
device already assigned to another VM.

>> Obviously at some point some part of the toolstack needs to identify
>> which RMRRs go with what device, so that either libxc or hvmloader can
>> make the appropriate holes in the address space; but at that point,
>> "report" is not so much the right word as "query".  (Obviously we want
>> to "report" in the e820 map all RMRRs that we've made holes for in the
>> guest.)
>
> yes, using 'report' doesn't catch all the changes we need to make. Just
> use them to simplify discussion in case all are on the same page. However
> clearly my original explanation didn't make it. :/
> and state my major intention again. I don't think the preparation (i.e.
> detect confliction and make holes) for device assignment should be a
> a blocking failure.  Throw warning should be enough (i.e. in libxc). We
> should let actual device assignment path to make final call based on
> admin's configuration (default 'fail' w/ 'warn' override). Based on that
> policy I think 'report-all' (making holes for all host RMRRs) is an
> acceptable approach, w/ small impact on possibly more warning
> messages (actually not bad to help admin understand the hotplug
> possibility on this platform) and show more reserved regions to the
> end user (but he shouldn't make any assumption here). :-)

I don't really understand what you're talking about here.

When the libxc domain builder runs, there is *no* guest memory mapped.
So if it has the RMRRs, then it can *avoid* conflict; and if it
doesn't have the RMRRs, it can't even *detect* conflict.  So there is
no reason for libxc to either give a warning, or cause a failure.
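
To illustrate "avoid" rather than "detect": given the list of RMRRs, the
builder can simply compute which ranges to populate. This is a Python sketch
with made-up names, not libxc code; ranges are (start, end) with an inclusive
end, and it deliberately ignores where the displaced RAM would go (a real
builder would presumably have to place it somewhere else).

```python
from typing import List, Sequence, Tuple

Range = Tuple[int, int]  # (start, end), end inclusive


def populate_ranges(ram_start: int, ram_end: int,
                    rmrrs: Sequence[Range]) -> List[Range]:
    """Split [ram_start, ram_end] into the pieces that avoid every RMRR."""
    out: List[Range] = []
    cursor = ram_start
    for s, e in sorted(rmrrs):
        if e < cursor:
            continue            # hole lies entirely below what is left
        if s > ram_end:
            break               # hole (and all later ones) lie above RAM
        if s > cursor:
            out.append((cursor, s - 1))
        cursor = max(cursor, e + 1)
    if cursor <= ram_end:
        out.append((cursor, ram_end))
    return out
```

Nothing in there can produce a conflict, which is why I don't see where a
warning from libxc would come from.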

So I'm not sure why you think making holes for all RMRRs would have
more warning messages.

And when you say "show more reserved regions to the end user", I take
it you mean the guest kernel (via the e820 map)?

I'm also not clear what assumptions "he" may be making: you mean, the
existence of an RMRR in the e820 map shouldn't be taken to mean that
he has a specific device assigned?  No, indeed, he should not make
such an assumption. :-)

Again -- I think that the only place the "rmrr-host" vs "rmrr-sel"
distinction matters is at the very top level -- in xl, and possibly at a
high level in libxl.  By the time things reach libxc and hvmloader, they
should simply be told, "These are the RMRRs for this domain", and they
should avoid conflicts with those ranges and report them in the e820 map.
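
For the reporting half, a minimal Python sketch of what I mean is below. The
function name and the tuple layout are assumptions for illustration; the type
values 1 (usable RAM) and 2 (reserved) are the standard e820 ones. Input
ranges are (start, end) with an inclusive end, and the populated RAM is
assumed not to overlap the RMRRs already.

```python
from typing import List, Sequence, Tuple

Range = Tuple[int, int]            # (start, end), end inclusive
E820Entry = Tuple[int, int, int]   # (base, length, type)

E820_RAM = 1
E820_RESERVED = 2


def build_e820(ram_ranges: Sequence[Range],
               rmrrs: Sequence[Range]) -> List[E820Entry]:
    """Report populated RAM as usable and every RMRR hole as reserved."""
    entries = [(s, e - s + 1, E820_RAM) for (s, e) in ram_ranges]
    entries += [(s, e - s + 1, E820_RESERVED) for (s, e) in rmrrs]
    return sorted(entries)  # ordered by base address
```

Whether the list of RMRRs came from the host or from an explicit selection
makes no difference at this level.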

