Re: [Xen-devel] (v2) Design proposal for RMRR fix

To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Date: Mon, 12 Jan 2015 11:25:56 +0000
Cc: "wei.liu2@xxxxxxxxxx" <wei.liu2@xxxxxxxxxx>, "ian.campbell@xxxxxxxxxx" <ian.campbell@xxxxxxxxxx>, "stefano.stabellini@xxxxxxxxxxxxx" <stefano.stabellini@xxxxxxxxxxxxx>, "tim@xxxxxxx" <tim@xxxxxxx>, "ian.jackson@xxxxxxxxxxxxx" <ian.jackson@xxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx>, "Chen, Tiejun" <tiejun.chen@xxxxxxxxx>
Delivery-date: Mon, 12 Jan 2015 11:26:04 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Fri, Jan 9, 2015 at 2:43 AM, Tian, Kevin <kevin.tian@xxxxxxxxx> wrote:
>> From: George Dunlap
>> Sent: Thursday, January 08, 2015 8:55 PM
>>
>> On Thu, Jan 8, 2015 at 12:49 PM, George Dunlap
>> <George.Dunlap@xxxxxxxxxxxxx> wrote:
>> > If RMRRs almost always happen up above 2G, for example, then a simple
>> > solution that wouldn't require too much work would be to make sure
>> > that the PCI MMIO hole we specify to libxc and to qemu-upstream is big
>> > enough to include all RMRRs.  That would satisfy the libxc and qemu
>> > requirements.
>> >
>> > If we then store specific RMRRs we want included in xenstore,
>> > hvmloader can put them in the e820 map, and that would satisfy the
>> > hvmloader requirement.
>>
>> An alternate thing to do here would be to "properly" fix the
>> qemu-upstream problem, by making a way for hvmloader to communicate
>> changes in the gpfn layout to qemu.
>>
>> Then hvmloader could do the work of moving memory under RMRRs to
>> higher memory; and libxc wouldn't need to be involved at all.
>>
>> I think it would also fix our long-standing issues with assigning PCI
>> devices to qemu-upstream guests, which up until now have only been
>> worked around.
>>
>
> could you elaborate a bit for that long-standing issue?

So qemu-traditional didn't particularly expect to know the guest
memory layout.  qemu-upstream does; it expects to know what areas of
memory are guest memory and what areas of memory are unmapped.  If a
read or write happens to a gpfn which *xen* knows is valid, but which
*qemu-upstream* thinks is unmapped, then qemu-upstream will crash.

The problem though is that the guest's memory map is not actually
communicated to qemu-upstream in any way.  Originally, qemu-upstream
was only told how much memory the guest had, and it just "happens" to
choose the same guest memory layout as the libxc domain builder does.
This works, but it is bad design, because if libxc were to change for
some reason, people would have to simply remember to also change the
qemu-upstream layout.

Where this really bites us is in PCI pass-through.  The default <4G
MMIO hole is very small; and hvmloader naturally expects to be able to
make this area larger by relocating memory from below 4G to above 4G.
It moves the memory in Xen's p2m, but it has no way of communicating
this to qemu-upstream.  So when the guest does an MMIO instruction that
causes qemu-upstream to access that memory, the guest crashes.
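
To make the mismatch concrete, here is a minimal sketch of the two
views drifting apart.  It is not hvmloader, libxc or qemu code: the
function names (build_layout, enlarge_hole, is_ram) and the addresses
are made up for illustration, and RAM is modelled as a plain list of
(start, end) byte ranges.

```python
GIB = 1 << 30


def build_layout(total_ram, hole_start):
    """RAM ranges for a guest whose MMIO hole runs from hole_start to 4G."""
    low = min(total_ram, hole_start)
    ranges = [(0, low)]
    if total_ram > low:
        # Whatever does not fit below the hole is placed above 4G.
        ranges.append((4 * GIB, 4 * GIB + (total_ram - low)))
    return ranges


def is_ram(ranges, addr):
    """True if addr falls inside one of the RAM ranges."""
    return any(start <= addr < end for start, end in ranges)


def enlarge_hole(ranges, new_hole_start):
    """Grow the hole downwards by relocating low RAM to the top of high RAM."""
    low_start, low_end = ranges[0]
    if new_hole_start >= low_end:
        return list(ranges)  # hole is already at least this big
    moved = low_end - new_hole_start
    high_start, high_end = ranges[1] if len(ranges) > 1 else (4 * GIB, 4 * GIB)
    return [(low_start, new_hole_start), (high_start, high_end + moved)]


# Both sides compute the same layout independently (illustrative numbers).
qemu_view = build_layout(4 * GIB, 0xF0000000)
# The relocation updates only the hypervisor-side view.
xen_view = enlarge_hole(qemu_view, 0xE0000000)

probe = 4 * GIB + 0x10001000   # lands in the RAM that was moved up
stale = 0xE0001000             # lands in the enlarged hole
print(is_ram(xen_view, probe), is_ram(qemu_view, probe))
print(is_ram(xen_view, stale), is_ram(qemu_view, stale))
```

The probe address is RAM in the updated view but unmapped in the view
that was never told about the move, which is the situation described
above.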

There are two work-arounds at the moment:
1. A flag which tells hvmloader not to relocate memory.
2. The option to tell qemu-upstream to make the memory hole larger.

Both are just work-arounds though; a "proper fix" would be to allow
hvmloader some way of telling qemu that the memory has moved, so it
can update its memory map.

This will (I'm pretty sure) have an effect on RMRR regions as well,
for the reasons I've mentioned above: whether we make the "holes" for the
RMRRs in libxc or in hvmloader, if we *move* that memory up to the top
of the address space (rather than, say, just not giving that RAM to
the guest), then qemu-upstream's idea of the guest memory map will be
wrong, and it will probably crash at some point.
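
As a rough illustration of the "move" case, the sketch below punches
an RMRR out of guest RAM and re-adds the displaced amount above all
existing RAM.  Everything here is hypothetical: the helper names, the
RMRR address, and the assumptions that RMRRs do not overlap each other
and that none sits above the new top of memory.

```python
GIB = 1 << 30


def subtract(ram, hole):
    """Parts of one RAM range (start, end) that are not covered by hole."""
    (start, end), (h_start, h_end) = ram, hole
    if h_end <= start or h_start >= end:
        return [ram]  # no overlap, range survives intact
    pieces = []
    if start < h_start:
        pieces.append((start, h_start))
    if h_end < end:
        pieces.append((h_end, end))
    return pieces


def size(ranges):
    """Total number of bytes covered by the ranges."""
    return sum(end - start for start, end in ranges)


def move_ram_out_of_rmrrs(ranges, rmrrs):
    """Remove every RMRR from RAM, then append the displaced bytes on top."""
    result = list(ranges)
    for rmrr in rmrrs:
        result = [piece for ram in result for piece in subtract(ram, rmrr)]
    displaced = size(ranges) - size(result)
    if displaced:
        # Never place relocated RAM below 4G, where the MMIO hole lives.
        top = max([4 * GIB] + [end for _, end in result])
        result.append((top, top + displaced))
    return result


before = [(0, 0xF0000000), (4 * GIB, 4 * GIB + 0x10000000)]
rmrrs = [(0xAB000000, 0xAB100000)]  # made-up reserved region
after = move_ram_out_of_rmrrs(before, rmrrs)
for start, end in after:
    print(f"{start:#x}-{end:#x}")
```

The guest keeps the same amount of RAM, but a device model still
holding the "before" list is now wrong in two places: it treats the
RMRR as RAM, and it does not know about the new range at the top.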

Having the ability for hvmloader to populate and/or move the memory
around, and then tell qemu-upstream what the resulting map looks
like, would fix both the MMIO-resize issue and the RMRR problem, wrt
qemu-upstream.
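
A minimal sketch of what "telling qemu the resulting map" could look
like, purely as an assumption: no such interface is described in this
thread, and the wire format (comma-separated hex "start-end" pairs),
the DeviceModelView class and the addresses are all invented for
illustration.

```python
GIB = 1 << 30


def encode_map(ranges):
    """Serialize RAM ranges as comma-separated hex 'start-end' pairs."""
    return ",".join(f"{start:x}-{end:x}" for start, end in ranges)


def decode_map(text):
    """Parse a string produced by encode_map back into (start, end) tuples."""
    ranges = []
    for item in text.split(","):
        start, end = item.split("-")
        ranges.append((int(start, 16), int(end, 16)))
    return ranges


class DeviceModelView:
    """Stand-in for the device model's idea of where guest RAM lives."""

    def __init__(self, ranges):
        self.ranges = list(ranges)

    def update(self, message):
        # Replace the whole view with the map the firmware reports.
        self.ranges = decode_map(message)

    def is_ram(self, addr):
        return any(start <= addr < end for start, end in self.ranges)


# Layout both sides start from, and the layout after memory was moved.
initial = [(0, 0xF0000000), (4 * GIB, 4 * GIB + 0x10000000)]
relocated = [(0, 0xE0000000), (4 * GIB, 4 * GIB + 0x20000000)]

qemu = DeviceModelView(initial)
probe = 4 * GIB + 0x10001000            # inside the relocated RAM
before = qemu.is_ram(probe)             # view is stale at this point
qemu.update(encode_map(relocated))      # firmware reports the final map
after = qemu.is_ram(probe)
print(before, after)
```

Sending the full resulting map rather than a list of individual moves
keeps the receiving side simple: it does not need to replay the
relocations, only to replace its view.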

 -George
