
Re: [Xen-devel] (v2) Design proposal for RMRR fix



On 01/13/2015 03:47 PM, Jan Beulich wrote:
>>>> On 13.01.15 at 14:45, <George.Dunlap@xxxxxxxxxxxxx> wrote:
>> On Tue, Jan 13, 2015 at 11:03 AM, Tian, Kevin <kevin.tian@xxxxxxxxx> wrote:
>>>> Well it will have an impact on the overall design of the code; but
>>>> you're right, if RMRRs really can (and will) be anywhere in memory,
>>>> then the domain builder will need to know what RMRRs are going to be
>>>> reserved for this VM and avoid populating those.  If, on the other
>>>> hand, we can make some fairly small assumptions about where there will
>>>> not be any RMRRs, then we can get away with handling everything in
>>>> hvmloader.
>>>
>>> I'm not sure such fairly small assumptions can be made. For example,
>>> the host RMRRs may include one or several regions in host PCI MMIO
>>> space (say >3G), in which case hvmloader has to know about them
>>> to avoid allocating those regions for guest PCI MMIO.
>>
>> Yes, I'm talking here about Jan's idea of having the domain builder in
>> libxc do the minimal amount of work to get hvmloader to run, and then
>> having hvmloader populate the rest of the address space. So the
>> comparison is:
>>
>> 1. Both libxc and hvmloader know about RMRRs.  libxc uses this
>> information to avoid placing hvmloader over an RMRR region,
>> hvmloader uses the information to populate the memory map and place
>> the MMIO ranges such that neither overlap with RMRRs.
>>
>> 2. Only hvmloader knows about RMRRs.  libxc places hvmloader in a
>> location in RAM basically guaranteed never to overlap with RMRRs;
>> hvmloader uses the information to populate memory map and place the
>> MMIO ranges such that neither overlap with RMRRs.
>>
>> #2 is only possible if we can find a region of the physical address
>> space almost guaranteed never to overlap with RMRRs.  Otherwise, we
>> may have to fall back to #1.
> 
> hvmloader loads at 0x100000, and I think we can be pretty certain
> that there aren't going to be any RMRRs in that space.

Good.
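
For concreteness, here is a minimal sketch of the overlap test that
option #2 leans on (plain C with made-up structure and function names,
not the real libxc/hvmloader interfaces): hvmloader knows the RMRR list
and checks every candidate RAM or MMIO placement against it.

#include <stdbool.h>
#include <stdint.h>

struct rmrr_region {
    uint64_t base, end;        /* inclusive range reported for the host */
};

/* Does [start, start + len) overlap any RMRR handed to this guest? */
static bool overlaps_rmrr(const struct rmrr_region *rmrr, unsigned int nr,
                          uint64_t start, uint64_t len)
{
    for (unsigned int i = 0; i < nr; i++)
        if (start <= rmrr[i].end && rmrr[i].base < start + len)
            return true;
    return false;
}

If a check like that fails for a proposed MMIO hole, hvmloader would
just move the hole (or leave the conflicting pages unpopulated) until
nothing overlaps.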

>>>> I'm also not clear on what assumptions "he" may be making: do you
>>>> mean that the existence of an RMRR in the e820 map shouldn't be taken
>>>> to mean that he has a specific device assigned?  No, indeed, he should
>>>> not make such an assumption. :-)
>>>
>>> I meant 'he' shouldn't make assumptions about how many reserved regions
>>> should exist in e820 based on the exposed devices. Jan has a concern
>>> that exposing more reserved regions in e820 than necessary is not a
>>> good thing. I'm trying to convince him it should be fine. :-)
>>
>> Right -- well there is a level of practicality here: if in fact many
>> operating systems ignore the e820 map and instead base their idea of
>> which regions are reserved on what devices are present, then we would
>> have to try to work around that.
>>
>> But since this is actually done by the OS and not the driver, in the
>> absence of any major OSes that actually behave this way, it seems to
>> me like taking the simpler option of assuming that the guest OS will
>> honor the e820 map should be OK.
> 
> Since your response doesn't seem connected to what Kevin said, I
> think there's some misunderstanding here: the concern of mine that
> Kevin mentioned is about marking more regions than necessary as
> reserved in the E820 map (needlessly reducing or splitting up lowmem).

OK, so you're concerned with reducing fragmentation / maximizing
availability of lowmem.  Yes, that's another reason to try to minimize
the number of RMRRs reported in general.

Another option I was thinking about: before assigning a device to a
guest, you have to unplug the device and assign it to pci-back (e.g.,
with xl pci-assignable-add).  In addition to something like rmrr=host,
we could add rmrr=assignable, which would add all of the RMRRs of all
devices currently listed as "assignable".  The idea would then be that
you first make all your devices assignable, then just start your guests,
and everything you've made assignable can be assigned.
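
As a toy illustration of rmrr=assignable (hypothetical: the region
addresses are made up and none of this is existing libxl/libxc code),
the toolstack would gather the RMRRs of every assignable device and
reserve their union in the guest e820, merging overlaps so that as few
reserved entries as possible are added -- which also speaks to the
lowmem-fragmentation concern above:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct rmrr_region { uint64_t base, end; };      /* inclusive */

/* Hypothetical input: RMRRs of the devices currently marked assignable
 * via xl pci-assignable-add (addresses invented for the example). */
static struct rmrr_region assignable_rmrrs[] = {
    { 0xbf800000, 0xbf9fffff },
    { 0xad000000, 0xad0fffff },
    { 0xad080000, 0xad1fffff },                  /* overlaps the previous */
};

static int cmp_base(const void *a, const void *b)
{
    const struct rmrr_region *x = a, *y = b;
    return (x->base > y->base) - (x->base < y->base);
}

int main(void)
{
    unsigned int nr = sizeof(assignable_rmrrs) / sizeof(assignable_rmrrs[0]);

    qsort(assignable_rmrrs, nr, sizeof(assignable_rmrrs[0]), cmp_base);

    /* Merge overlapping/adjacent regions, then emit one reserved e820
     * entry per merged region. */
    for (unsigned int i = 0; i < nr; ) {
        uint64_t base = assignable_rmrrs[i].base;
        uint64_t end = assignable_rmrrs[i].end;
        unsigned int j = i + 1;

        while (j < nr && assignable_rmrrs[j].base <= end + 1) {
            if (assignable_rmrrs[j].end > end)
                end = assignable_rmrrs[j].end;
            j++;
        }
        printf("e820: reserved [%#llx, %#llx]\n",
               (unsigned long long)base, (unsigned long long)end);
        i = j;
    }
    return 0;
}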

> 
>>> For [0xe0000, 0xeffff], leave it as a conflict (w/ guest BIOS)
>>
>> And we can't move the guest BIOS in any way?
> 
> No. BIOSes know the address they get put at. The only hope here
> is that conflicts would only be with the transiently loaded init-time
> portion of the BIOS: typically the BIOS has a large resident part
> in the F0000-FFFFF range, while SeaBIOS in particular has another
> init-time part living immediately below the resident one, which gets
> discarded once BIOS init is done.

I see, thanks.
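
As an aside, a tiny sketch of how such a conflict could be classified
(hypothetical names, nothing from the actual hvmloader code): F0000-FFFFF
is the resident-BIOS range you mention, while the exact extent of
SeaBIOS's transient init-time portion isn't fixed, so this only tells us
whether the resident part is hit at all.

#include <stdbool.h>
#include <stdint.h>

/* Resident BIOS range from the discussion above. */
#define BIOS_RESIDENT_START 0xf0000ULL
#define BIOS_RESIDENT_END   0xfffffULL

/*
 * True if an RMRR [base, end] (inclusive) touches the resident BIOS; a
 * conflict entirely below 0xF0000 (e.g. within E0000-EFFFF) may only
 * affect the init-time portion that is discarded after BIOS init.
 */
static bool rmrr_hits_resident_bios(uint64_t base, uint64_t end)
{
    return base <= BIOS_RESIDENT_END && end >= BIOS_RESIDENT_START;
}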

 -George



 

