[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset tweaks

On 01/29/2016 10:54 AM, Alex Williamson wrote:
> On Fri, 2016-01-29 at 02:22 +0000, Kay, Allen M wrote:
>>> -----Original Message-----
>>> From: iGVT-g [mailto:igvt-g-bounces@xxxxxxxxxxxx] On Behalf Of Alex
>>> Williamson
>>> Sent: Thursday, January 28, 2016 11:36 AM
>>> To: Gerd Hoffmann; qemu-devel@xxxxxxxxxx
>>> Cc: igvt-g@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx; Eduardo Habkost;
>>> Stefano Stabellini; Cao jin; vfio-users@xxxxxxxxxx
>>> Subject: Re: [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset
>>> tweaks
>>> 1) The OpRegion MemoryRegion is mapped into system_memory through
>>> programming of the 0xFC config space register.
>>>  a) vfio-pci could pick an address to do this as it is realized.
>>>  b) SeaBIOS/OVMF could program this.
>>> Discussion: 1.a) Avoids any BIOS dependency, but vfio-pci would need to pick
>>> an address and mark it as e820 reserved.  I'm not sure how to pick that
>>> address.  We'd probably want to make the 0xFC config register read-
>>> only.  1.b) has the issue you mentioned where in most cases the OpRegion
>>> will be 8k, but the BIOS won't know how much address space it's mapping
>>> into system memory when it writes the 0xFC register.  I don't know how
>>> much of a problem this is since the BIOS can easily determine the size once
>>> mapped and re-map it somewhere there's sufficient space.
>>> Practically, it seems like it's always going to be 8K.  This of course 
>>> requires
>>> modification to every BIOS.  It also leaves the 0xFC register as a mapping
>>> control rather than a pointer to the OpRegion in RAM, which doesn't really
>>> match real hardware.  The BIOS would need to pick an address in this case.
>>> 2) Read-only mappings version of 1)
>>> Discussion: Really nothing changes from the issues above, just prevents any
>>> possibility of the guest modifying anything in the host.  Xen apparently 
>>> allows
>>> write access to the host page already.
>>> 3) Copy OpRegion contents into buffer and do either 1) or 2) above.
>>> Discussion: No benefit that I can see over above other than maybe allowing
>>> write access that doesn't affect the host.
>>> 4) Copy contents into a guest RAM location, mark it reserved, point to it 
>>> via
>>> 0xFC config as scratch register.
>>>  a) Done by QEMU (vfio-pci)
>>>  b) Done by SeaBIOS/OVMF
>>> Discussion: This is the most like real hardware.  4.a) has the usual issue 
>>> of
>>> how to pick an address, but the benefit of not requiring BIOS changes 
>>> (simply
>>> mark the RAM reserved via existing methods).  4.b) would require passing a
>>> buffer containing the contents of the OpRegion via fw_cfg and letting the
>>> BIOS do the setup.  The latter of course requires modifying each BIOS for 
>>> this
>>> support.
>>> Of course none of these support hotplug nor really can they since reserved
>>> memory regions are not dynamic in the architecture.
>>> In all cases, some piece of software needs to know where it can place the
>>> OpRegion in guest memory.  It seems like there are advantages or
>>> disadvantages whether that's done by QEMU or the BIOS, but we only need
>>> to do it once if it's QEMU.  Suggestions, comments, preferences?
>> Hi Alex, another thing to consider is how to communicate to the guest driver 
>> the address at 0xFC contains a valid GPA address that can be accessed by the 
>> driver without causing a EPT fault - since
>> the same driver will be used on other hypervisors and they may not EPT map 
>> OpRegion memory.  On idea proposed by display driver team is to set bit0 of 
>> the address to 1 for indicating OpRegion memory
>> can be safely accessed by the guest driver.
> Hi Allen,
> Why is that any different than a guest accessing any other memory area
> that it shouldn't?  The OpRegion starts with a 16-byte ID string, so if
> the guest finds that it should feel fairly confident the OpRegion data
> is valid.  The published spec also seems to define all bits of 0xfc as
> valid, not implying any sort of alignment requirements, and the i915
> driver does a memremap directly on the value read from 0xfc.  So I'm not
> sure whether there's really a need to or ability to define any of those
> bits in an adhoc way to indicate mapping.  If we do things right,
> shouldn't the guest driver not even know it's running in a VM, at least
> for the KVMGT-d case, so we need to be compatible with physical
> hardware.  Thanks,

I agree. EPT page fault is allowed on guest OpRegion accessing, as long as
during the page fault handling, KVM will find a proper PFN for that GPA.
It's exactly what is expected for 'normal' memory.

> Alex


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.