Re: [Xen-devel] GPU passthrough performance regression in >4GB vms due to XSA-60 changes

>>> On 19.05.14 at 12:29, <tomasz.wroblewski@xxxxxxxxx> wrote:

> On 05/16/2014 04:36 PM, Jan Beulich wrote:
>>>>> On 16.05.14 at 13:38, <JBeulich@xxxxxxxx> wrote:
>>>>>> On 16.05.14 at 13:18, <tomasz.wroblewski@xxxxxxxxx> wrote:
>>>>> If I coded up a patch to deal with this on -unstable, would you be
>>>>> able to test that?
>>>> Willing to give it a go (xen major version updates are often problematic
>>>> to do though so can't promise success). What would your patch be doing?
>>>> Adding entries to MTRR for the relocated regions?
>>> This and properly declare the region in ACPI's _CRS. For starters I'll
>>> probably try keeping the WB default overlaid with UC variable ranges,
>>> as that's going to be the less intrusive change.
>> Okay here are two patches - the first to deal with the above mentioned
>> items, and the second to further increase correctness and at once
>> shrink the number of MTRR regions needed.
>> Afaict they apply equally well to stable-4.3, master, and staging.
>> But to be honest I don't expect any performance improvement, all
>> I'd expect is that BARs relocated above 4Gb would now get treated
>> equally to such below 4Gb - UC in all cases.
> Thanks Jan. I've tried the patches and you're correct, putting UC in 
> MTRR for the relocated region didn't help the issue. However, I had to 
> hack that manually - the codepaths to do that in your hvmloader patch 
> were not activating. The hvmloader is not programming guest pci bars to 
> 64bit regions at all, rather still programming them with 32 bit 
> regions... upon a look this seems to be because the using_64bar condition, as well 
> as bar64_relocate in hvmloader/pci.c, is always false.

I'm confused - iirc this started out because you saw the graphics
card BARs being put above 4Gb. And now you say they aren't being
put there. But ...

> So bar relocation to 64bit is not happening, but ram relocation as per 
> the code tagged as /* Relocate RAM that overlaps PCI space (in 64k-page 
> chunks). */ is happening. This maybe is correct (?), although I think 
> the fact that RAM is relocated but not the BAR causes the tools (i.e. 
> qemu) to lose sight of what memory is used for mmio and as you mentioned 
> in one of the previous posts, the calls which would set it to 
> mmio_direct in p2m table are not happening. Our qemu is pretty ancient 
> and doesn't support 64bit bars so it's not super trivial to verify 
> whether relocating bars to 64bit would help. Trying to make sense out of 
> this..

... indeed I was apparently mis-interpreting what you said - all that
really was to be concluded from the log messages you quoted was
that RAM pages got relocated. But according to

(XEN) HVM3: Relocating guest memory for lowmem MMIO space enabled
(XEN) HVM3: Relocating 0xffff pages from 0e0001000 to 14dc00000 for lowmem MMIO 
(XEN) HVM3: Relocating 0x1 pages from 0e0000000 to 15dbff000 for lowmem MMIO 

and assuming that these were all related messages, this really isn't
a sign of using 64-bit BARs yet. All it tells us is that the PCI region
gets extended from 0xf0000000-0xfc000000 to 0xe0000000-0xfc000000.
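
To make the arithmetic behind that reading explicit, here is a small sketch
that parses the two relocation messages quoted above and works out which
guest-physical range they vacate. It assumes 4 KiB pages and that the two
messages belong together; the function names are made up for illustration
and are not taken from hvmloader.

```python
import re

PAGE_SIZE = 0x1000  # assumption: 4 KiB x86 pages

# The two relocation messages quoted above (trailing whitespace dropped).
LOG = """\
(XEN) HVM3: Relocating 0xffff pages from 0e0001000 to 14dc00000 for lowmem MMIO
(XEN) HVM3: Relocating 0x1 pages from 0e0000000 to 15dbff000 for lowmem MMIO
"""

PATTERN = re.compile(
    r"Relocating (0x[0-9a-f]+) pages from ([0-9a-f]+) to ([0-9a-f]+)"
)


def parse_relocations(log):
    """Return sorted (src_start, src_end, dst_start) tuples, in bytes."""
    relocs = []
    for match in PATTERN.finditer(log):
        pages = int(match.group(1), 16)
        src = int(match.group(2), 16)
        dst = int(match.group(3), 16)
        relocs.append((src, src + pages * PAGE_SIZE, dst))
    return sorted(relocs)


def merged_source_range(relocs):
    """Merge the source ranges, insisting that they are contiguous."""
    start, end = relocs[0][0], relocs[0][1]
    for src_start, src_end, _ in relocs[1:]:
        if src_start != end:
            raise ValueError("source ranges are not contiguous")
        end = src_end
    return start, end


if __name__ == "__main__":
    start, end = merged_source_range(parse_relocations(LOG))
    print(f"RAM vacated: {start:#x}-{end:#x} ({(end - start) >> 20} MiB)")
```

Under those assumptions the two messages together vacate exactly
0xe0000000-0xf0000000, i.e. 256 MiB of RAM moved out of the way so that the
lower bound of the PCI hole can drop from 0xf0000000 to 0xe0000000. Nothing
in them says anything about where a BAR was placed.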

So perhaps it is time to send complete logs, plus suitable information
from inside the guest on how things (RAM, MMIO, MTRRs) end up being
set up?


