Re: [Xen-devel] GPU passthrough performance regression in >4GB vms due to XSA-60 changes

On 05/19/2014 02:06 PM, Jan Beulich wrote:
On 19.05.14 at 13:32, <tomasz.wroblewski@xxxxxxxxx> wrote:
On 05/19/2014 01:07 PM, Jan Beulich wrote:
On 19.05.14 at 12:47, <tomasz.wroblewski@xxxxxxxxx> wrote:
On 05/19/2014 12:38 PM, Jan Beulich wrote:
So perhaps time for sending complete logs, plus suitable information
from inside the guest of how things (RAM, MMIO, MTRRs) end up being
set up?
Could be, though please read the explanation I came up with in the other
post to see whether it's enough; I think it makes sense... 64-bit guest
BARs are indeed not in use (confirmed from the guest). The MTRRs are set
up such that only the low region is UC, which is correct.
Yes, that's a very sensible theory, which - as just said in the other
reply - can be easily verified.

But the RAM relocation code causes the caching on the relocated region to
be UC instead of WB due to the timing (very early, MTRRs disabled) at
which it runs, which is incorrect. I am thinking that enabling the MTRRs
during that relocation would probably fix it on 4.3.
Except that this is a chicken and egg problem then: In order to
populate the variable range MTRRs, the BAR assignment (and hence
the prerequisite RAM relocation) need to be done already.
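
To illustrate the dependency: a variable-range MTRR is programmed from a
base and a size, so the UC range covering the MMIO hole cannot be encoded
until the hole's position is known. A minimal sketch of that encoding,
assuming the architectural base/mask register layout; encode_var_mtrr(),
struct var_mtrr and the phys_bits parameter are hypothetical names for
illustration, not hvmloader code:

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MTRR memory type encodings. */
#define MTRR_TYPE_UC        0x00u          /* uncacheable */
#define MTRR_TYPE_WB        0x06u          /* write-back */
#define MTRR_PHYSMASK_VALID (1ull << 11)   /* "valid" bit of the mask MSR */

/* Hypothetical container for one variable-range base/mask MSR pair. */
struct var_mtrr {
    uint64_t physbase;
    uint64_t physmask;
};

/*
 * Encode one variable range. 'size' must be a power of two (at least 4K)
 * and 'base' must be aligned to it. 'phys_bits' is the physical address
 * width, which real code would read from CPUID.
 */
struct var_mtrr encode_var_mtrr(uint64_t base, uint64_t size,
                                unsigned int type, unsigned int phys_bits)
{
    struct var_mtrr m;
    uint64_t addr_mask;

    assert(phys_bits >= 32 && phys_bits <= 52);
    assert(size >= 0x1000 && (size & (size - 1)) == 0);
    assert((base & (size - 1)) == 0);

    /* Address bits 12..phys_bits-1 are significant in both registers. */
    addr_mask = ((1ull << phys_bits) - 1) & ~0xfffull;

    m.physbase = (base & addr_mask) | type;
    m.physmask = (~(size - 1) & addr_mask) | MTRR_PHYSMASK_VALID;
    return m;
}
```

For example, a 512MB MMIO hole starting at 0xE0000000 would be passed as
base 0xE0000000, size 0x20000000 and type MTRR_TYPE_UC. Both values are
outputs of the BAR assignment, hence the ordering problem.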
I am not sure; looking at hvmloader code, wouldn't it be possible to
calculate the BAR locations first, then update the MTRR var ranges and
enable it, and only then actually write the BAR registers (from
precalculated info)? Presumably it's only the write part which needs to
be done after relocation, as it causes qemu to set up MMIO etc.
Leaving aside that this would require splitting pci_setup(), and
hence communicating state from its main part (RAM relocation and
resource allocation) to the final one (BAR writing), which by itself is
already not as simple a change as one would like for something that
is intended to go _only_ into the stable trees, you also already
imply with the above that we'd add a pre-enabling step for the
MTRRs. I.e. we'd end up with

- enable fixed-range MTRRs and set default to WB (no var ranges)
- pci_setup_early()
- set variable range MTRRs
- pci_setup_late()
- set MTRRs in one go on APs

Yes, that ought to work. But do we want this much diverging from
-unstable on 4.3 and 4.4? Are we certain that namely the two-stage
MTRR setup won't have any unintended side effects?
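
The five-step ordering listed above can be sketched as follows. This is
only an outline of the call order under discussion, not existing
hvmloader code: pci_setup_early() and pci_setup_late() are the names of
the proposed split, the other function names are hypothetical, and every
body is a stub that merely records that it ran.

```c
#include <stddef.h>

/* One entry per step of the proposed sequence. */
enum step {
    STEP_MTRR_FIXED_DEFAULT_WB, /* fixed-range MTRRs on, default WB */
    STEP_PCI_SETUP_EARLY,       /* RAM relocation + resource allocation */
    STEP_MTRR_SET_VAR_RANGES,   /* program variable ranges from the layout */
    STEP_PCI_SETUP_LATE,        /* write the precalculated BARs */
    STEP_MTRR_SYNC_APS          /* set MTRRs in one go on the APs */
};

enum step trace[8];
size_t trace_len;

static void record(enum step s)
{
    if (trace_len < sizeof(trace) / sizeof(trace[0]))
        trace[trace_len++] = s;
}

/* Stubs: a real implementation would touch MSRs and PCI config space. */
void mtrr_enable_fixed_default_wb(void) { record(STEP_MTRR_FIXED_DEFAULT_WB); }
void pci_setup_early(void)              { record(STEP_PCI_SETUP_EARLY); }
void mtrr_set_var_ranges(void)          { record(STEP_MTRR_SET_VAR_RANGES); }
void pci_setup_late(void)               { record(STEP_PCI_SETUP_LATE); }
void mtrr_sync_aps(void)                { record(STEP_MTRR_SYNC_APS); }

/* The proposed order: relocation runs with WB already in effect. */
void boot_sequence(void)
{
    mtrr_enable_fixed_default_wb();
    pci_setup_early();
    mtrr_set_var_ranges();
    pci_setup_late();
    mtrr_sync_aps();
}
```

The point of the split is that state computed in pci_setup_early() has
to be carried over to both mtrr_set_var_ranges() and pci_setup_late(),
which is the extra plumbing objected to above.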

Yeah, I gave about a day of effort to porting us onto unstable and testing
there, but it sadly looks to be a bigger job, so I am leaving that as a
last resort (though planning to spend a couple more days on it soon).
Then as an alternative did you try pulling over the EPT changes
from -unstable?
That would indeed be preferable. I've looked over them but couldn't
figure out which particular change would fix the EPT update after MTRR
enable. Do you remember which one that was? I could test it and try to
narrow down any other commits it would require (there seem to have been
a lot of EPT-related changes).


