
Re: [Xen-devel] A question about PCI passthrough device BAR memory size



>>> On 29.06.12 at 01:12, Rolu <rolu@xxxxxxxx> wrote:
> I am passing through to a domU (among other things) two USB
> controllers. Here is the lspci -v output on the dom0:
> 
> 00:1d.0 USB controller: Intel Corporation Panther Point USB Enhanced
> Host Controller #1 (rev 04) (prog-if 20 [EHCI])
>       Subsystem: ASRock Incorporation Device 1e26
>       Flags: bus master, medium devsel, latency 0, IRQ 23
>       Memory at f7d17000 (32-bit, non-prefetchable) [size=1K]
>       Capabilities: [50] Power Management version 2
>       Capabilities: [58] Debug port: BAR=1 offset=00a0
>       Capabilities: [98] PCI Advanced Features
>       Kernel driver in use: pciback
> 
> And here is the same device's output in the domU:
> 
> 00:07.0 USB controller: Intel Corporation Panther Point USB Enhanced
> Host Controller #1 (rev 04) (prog-if 20 [EHCI])
>       Subsystem: ASRock Incorporation Device 1e26
>       Flags: bus master, medium devsel, latency 64, IRQ 44
>       Memory at f3056000 (32-bit, non-prefetchable) [size=4K]
>       Capabilities: [50] Power Management version 2
>       Kernel driver in use: ehci_hcd
> 
> The output for the other controller is essentially the same.
> 
> The peculiar thing here is that the domU thinks it has a 4K memory
> area while the dom0 says it's just 1K. The controllers work, and I
> don't know enough about the PCI subsystems to say if this could cause
> issues, but it seems things could go wrong if the domU ever decides to
> use the other 3K of memory.
> 
> I had a look at how this value was calculated. I found that the guest
> will write all ones to the BAR and then reads it, and the size of the
> memory area is determined by how many bits come back as zero (per the
> PCI specs). In qemu, in hw/pass-through.c, pt_bar_reg_write and
> pt_bar_reg_read are responsible for emulating the writing and reading.
> In pt_bar_reg_read, there is:
> 
> /* align resource size (memory type only) */
> PT_GET_EMUL_SIZE(base->bar_flag, r_size);
> 
> For memory type BAR this macro changes r_size to:
> 
> (((r_size) + XC_PAGE_SIZE - 1) & ~(XC_PAGE_SIZE - 1));
> 
> This looks like it rounds r_size up to the next multiple of
> XC_PAGE_SIZE, and logging confirms this is changing r_size from 0x400
> to 0x1000. This ends up giving the guest the rounded up size, instead
> of the real size.
> 
> So,
> * is this an actual potential problem, or will something else ensure
> that the guest isn't going to try to use the extra memory?

I think it is wrong for qemu-dm to not honor the original size. A
driver handling different device versions/implementations could
look at this and adapt its behavior accordingly (and would likely
fail then).
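
To make the difference concrete, here is a minimal sketch in C of the
sizing handshake described above and of the rounding the macro performs.
The function names and PAGE_SIZE_ASSUMED are made up for illustration
(they are not qemu identifiers), a 4K page is assumed, and sizes are
assumed to be powers of two of at least 16 bytes:

```c
#include <stdint.h>

/* Assumption: stands in for XC_PAGE_SIZE, taken to be 4K here. */
#define PAGE_SIZE_ASSUMED 0x1000u

/* Value a 32-bit memory BAR of the given size returns after the guest
 * has written all ones to it: address bits below the size read back as
 * zero. The low four flag bits are left clear for simplicity. */
static uint32_t bar_readback_mask(uint32_t size)
{
    return ~(size - 1u) & 0xfffffff0u;
}

/* Size the guest derives from that read-back value. */
static uint32_t bar_size_from_readback(uint32_t readback)
{
    uint32_t addr_bits = readback & 0xfffffff0u;
    return ~addr_bits + 1u;
}

/* Same arithmetic as the rounding expression quoted above. */
static uint32_t round_up_to_page(uint32_t size)
{
    return (size + PAGE_SIZE_ASSUMED - 1u) & ~(PAGE_SIZE_ASSUMED - 1u);
}
```

With the real size, bar_readback_mask(0x400) gives 0xfffffc00 and the
guest decodes 1K; with round_up_to_page(0x400), i.e. 0x1000, the mask
becomes 0xfffff000 and the guest decodes 4K, which matches the lspci
output in the domU.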

The second aspect to this, making sure the guest doesn't access
some other guest's (or the host's) MMIO space, is something to be
taken care of in the host, actually. The host has to re-assign
(or assign in the first place, should the firmware not have done
so) resources such that no two devices to be passed through to
a guest share the same PAGE_SIZE region for their MMIO blocks.
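
As a rough illustration of the condition the host has to rule out, the
following C sketch checks whether two MMIO blocks touch a common page.
HOST_PAGE_SIZE (assumed 4K) and both function names are hypothetical,
not taken from Xen or the kernel:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4K host pages. */
#define HOST_PAGE_SIZE 0x1000ull

/* Address of the page containing addr. */
static uint64_t page_floor(uint64_t addr)
{
    return addr & ~(HOST_PAGE_SIZE - 1);
}

/* True if the two MMIO blocks (base, size in bytes, size > 0) have at
 * least one page in common, even if the byte ranges do not overlap. */
static bool mmio_regions_share_page(uint64_t base_a, uint64_t size_a,
                                    uint64_t base_b, uint64_t size_b)
{
    uint64_t first_a = page_floor(base_a);
    uint64_t last_a  = page_floor(base_a + size_a - 1);
    uint64_t first_b = page_floor(base_b);
    uint64_t last_b  = page_floor(base_b + size_b - 1);

    return first_a <= last_b && first_b <= last_a;
}
```

For example, the 1K BAR at f7d17000 would share a page with a
(hypothetical) second 1K block at f7d17400, but not with one placed at
f7d18000.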

In the non-pvops kernel we have special code and command line
options for this, but I believe this has become redundant with
other code and options in the upstream kernels by now (I just
never got around to checking how much redundancy there is and
could hence be eliminated).

In any case, these are things that - afaict - need manual admin
action to get right _before_ passing through any device to a
guest.

> * if it needs fixing, how can it be done? I've looked through the code
> but I'm not sure how to fix it without breaking other things.

Since qemu ought to be able to find out the real device's BAR
sizes, it shouldn't be that difficult to make it use that value in
the config space access emulation rather than the rounded-up
one; in the worst case it would have to track two values
instead of one.
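
A sketch of what tracking two values could look like, in C. This is
not the qemu code: the struct, its fields, EMUL_PAGE_SIZE (assumed 4K)
and the function names are all invented for illustration.

```c
#include <stdint.h>

/* Assumption: 4K pages, standing in for XC_PAGE_SIZE. */
#define EMUL_PAGE_SIZE 0x1000u

/* Hypothetical per-BAR state holding both sizes. */
struct bar_emul_sketch {
    uint32_t real_size; /* size reported by the physical device */
    uint32_t map_size;  /* page-aligned size, for mapping purposes only */
};

static struct bar_emul_sketch bar_emul_init(uint32_t real_size)
{
    struct bar_emul_sketch b;

    b.real_size = real_size;
    b.map_size = (real_size + EMUL_PAGE_SIZE - 1u) & ~(EMUL_PAGE_SIZE - 1u);
    return b;
}

/* Value returned to the guest when it reads the BAR back after writing
 * all ones: derived from the real size, so the guest sees what the
 * hardware would report. flags carries the low four BAR type bits. */
static uint32_t bar_emul_sizing_read(const struct bar_emul_sketch *b,
                                     uint32_t flags)
{
    return (~(b->real_size - 1u) & 0xfffffff0u) | (flags & 0xfu);
}
```

The config space emulation would then consult real_size, while
anything that needs page granularity would keep using map_size.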

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

