Re: [Xen-devel] GPU passthrough performance regression in >4GB vms due to XSA-60 changes

>>> On 19.05.14 at 13:32, <tomasz.wroblewski@xxxxxxxxx> wrote:

> On 05/19/2014 01:07 PM, Jan Beulich wrote:
>>>>> On 19.05.14 at 12:47, <tomasz.wroblewski@xxxxxxxxx> wrote:
>>> On 05/19/2014 12:38 PM, Jan Beulich wrote:
>>>> So perhaps time for sending complete logs, plus suitable information
>>>> from inside the guest of how things (RAM, MMIO, MTRRs) end up being
>>>> set up?
>>> Could be, though please read the explanation I came up with in the other
>>> post and see whether it's enough; I think it makes sense... 64-bit guest
>>> BARs are indeed not in use (confirmed from guest). MTRR is set up such that only
>>> the low region is UC, which is correct.
>> Yes, that's a very sensible theory, which - as just said in the other
>> reply - can be easily verified.
>>> But the RAM relocation code causes the caching on the relocated region to be
>>> UC instead of WB due to the timing (very early, MTRR disabled) at which
>>> it runs, which is incorrect. I am thinking enabling MTRR during that
>>> relocation would probably fix it on 4.3
>> Except that this is a chicken and egg problem then: In order to
>> populate the variable range MTRRs, the BAR assignment (and hence
>> the prerequisite RAM relocation) need to be done already.
> I am not sure; looking at hvmloader code, wouldn't it be possible to 
> calculate the BAR locations first, then update the MTRR var ranges and 
> enable it, and only then actually write the BAR registers (from 
> precalculated info)? Presumably it's only the write part which needs to 
> be done after relocation as it causes qemu to set up MMIO etc.

Leaving aside that this would require splitting pci_setup(), and
hence communicating state from its main part (RAM relocation and
resource allocation) to the final one (BAR writing), which by itself is
already not as simple a change as one would like for something that
is intended to go _only_ into the stable trees, you also already
imply with the above that we'd add a pre-enabling step for the
MTRRs. I.e. we'd end up with

- enable fixed-range MTRRs and set default to WB (no var ranges)
- pci_setup_early()
- set variable range MTRRs
- pci_setup_late()
- set MTRRs in one go on APs

Yes, that ought to work. But do we want this much divergence from
-unstable on 4.3 and 4.4? Are we certain that the two-stage MTRR
setup in particular won't have any unintended side effects?
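
The five-step ordering above can be sketched as a standalone C model.
This is not hvmloader code: pci_setup_early() and pci_setup_late() are
the hypothetical halves of pci_setup() named in the list, while the
mtrr_* helpers, record(), boot_log and run_two_stage_setup() are names
invented purely for illustration. Every stage is a stub that only logs
itself, so the sketch shows the sequencing and nothing about the real
MSR or BAR programming.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model: each stage appends its name to a log so that
 * the resulting order can be inspected. Nothing here touches hardware. */
#define MAX_STEPS 8
const char *boot_log[MAX_STEPS];
int boot_steps;

static void record(const char *step)
{
    assert(boot_steps < MAX_STEPS);
    boot_log[boot_steps++] = step;
}

/* Step 1 (assumed helper): fixed-range MTRRs on, default type WB,
 * no variable ranges yet, so relocated RAM is not treated as UC. */
static void mtrr_pre_enable(void) { record("mtrr_pre_enable"); }

/* Step 2: RAM relocation and resource allocation; BAR values are
 * only computed here, not written. */
static void pci_setup_early(void) { record("pci_setup_early"); }

/* Step 3 (assumed helper): variable ranges derived from the computed
 * MMIO layout. */
static void mtrr_set_variable_ranges(void) { record("mtrr_set_variable_ranges"); }

/* Step 4: write the precalculated BAR values. */
static void pci_setup_late(void) { record("pci_setup_late"); }

/* Step 5 (assumed helper): replicate the final MTRR state on the APs. */
static void mtrr_sync_aps(void) { record("mtrr_sync_aps"); }

/* Runs the stages in the proposed order; returns the number of steps. */
int run_two_stage_setup(void)
{
    boot_steps = 0;
    mtrr_pre_enable();
    pci_setup_early();
    mtrr_set_variable_ranges();
    pci_setup_late();
    mtrr_sync_aps();
    return boot_steps;
}

/* Prints the recorded order, one stage per line. */
void print_boot_log(void)
{
    for (int i = 0; i < boot_steps; i++)
        printf("%d: %s\n", i + 1, boot_log[i]);
}
```

Calling run_two_stage_setup() followed by print_boot_log() lists the
stages in the order given above. The point of the split is that the
state computed in pci_setup_early() has to survive until
pci_setup_late(), which is the extra plumbing the stable trees would
have to carry.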

> Yeah, I spent about a day trying to port us onto unstable and test
> there, but it sadly looks to be a bigger job, so I'm leaving that as a
> last resort (though I'm planning to spend a couple more days on it soon).

Then, as an alternative, did you try pulling over the EPT changes
from -unstable?


