
Re: [Xen-devel] [PATCH v2 00/21] xen/arm: Add support for non-pci passthrough



On Wed, Sep 10, 2014 at 12:51 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> On Wed, 2014-09-10 at 11:22 +0200, Christoffer Dall wrote:
>> On Tue, Sep 9, 2014 at 9:34 PM, Julien Grall <julien.grall@xxxxxxxxxx> wrote:
>> > (Adding Christoffer)
>> >
>> > Hi Ian,
>> >
>> > On 09/09/14 07:34, Ian Campbell wrote:
>> >>
>> >> On Thu, 2014-07-31 at 16:00 +0100, Julien Grall wrote:
>> >>>
>> >>>      - Only common device properties (interrupts, regs) are written to
>> >>>      the guest device tree. Devices that need other properties may not
>> >>>      work.
>> >>
>> >>
>> >> So I've glanced through the later (more toolstack oriented) bits from
>> >> towards the end but I think there's a question of the target users which
>> >> needs thinking about before I can have a sensible opinion on those.
>> >>
>> >> As I see it the main purpose of this series is to get the underlying
>> >> plumbing in place (wiring up iommus, routing IRQs etc) to support guests
>> >> with passthrough devices, for embedded folks to use and to provide a
>> >> basis for eventual PCI passthrough functionality. I really want to see
>> >> this stuff in 4.5
>> >>
>> >> What I'm concerned about is the toolstack side. TBH I'm not very keen on
>> >> the thing with exposing very DT specific stuff like compatible strings
>> >> down from the hypervisor via domctls.
>> >> It's not really clear how best to expose this functionality, I have a
>> >> feeling that this series either goes too far or not far enough and ends
>> >> up not really satisfying anyone.
>> >
>> >
>> > I don't see many other solutions to get the compatible strings. There is no
>> > easy way to get the property from DOM0, unless we introduce a new driver in
>> > Linux.
>> >
>>
>> The toolstack you are using to create your guest must necessarily know
>> which guest it is creating, including device properties of a device a
>> user wishes to assign.  It can know this because it's hardcoded,
>> included in some config files, or supplied directly by the user.  I
>> think you really want to decouple the hardware description method for
>> Dom0 from retrieving resource description about your device. Can't you
>> simply reference the device to Linux through its sysfs handle and use
>> your Xen-passthrough-layer-hypercall-magic (which I know nothing
>> about) to have Dom0 tell Xen to map/route the relevant resources?
>
> By Xen-passthrough-layer-hypercall-magic do you mean the thing which
> lets the userspace toolstack make hypercalls (which is called "privcmd"
> FWIW) or are you talking about some specific passthrough related kernel
> driver (like VFIO, which has no Xen equivalent right now)?

I was talking about the latter since I assumed the Dom0 address space
and irq number space was completely decoupled from the hardware, but I
understand now that it is not, which is good, because otherwise I'm
thinking we would be pretty screwed wrt. addresses described in the
DSDT in ACPI.

>
> If you mean the former then I think Julien's code already does this --
> it makes hypercalls telling Xen to map certain MMIO regions to guests.

I meant the latter

> What's in question is where the inputs to those hypercalls came from.
>

and what the inputs look like, but we address this later in the mail.

>> >> My suspicion is that regular folks won't really be using passthrough
>> >> until it is via PCI and that in the meantime this functionality is only
>> >> going to be used by e.g. people building embedded systems and superkeen
>> >> early adopters both of whom know what they are doing and can tolerate
>> >> some hacks etc to get things working (and I think that's fine, it's
>> >> still a worthwhile set of things to get into 4.5 and those folks are
>> >> worth supporting).
>> >>
>> >> I'm also worried that we may be committing ourselves to a libxl API
>> >> already without really working through all the issues (e.g. other
>> >> properties).
>> >>
>> >> Given that I wonder if we wouldn't be better off for 4.5 supporting
>> >> something much simpler at the toolstack level, namely allowing users to
>> >> use iomem= and irq= in their domain config to map platform devices
>> >> through (already works with your series today?)
>> >
>> >
>> > This would need a bit of plumbing for the irq part to allow the user to
>> > choose the VIRQ (like Arianna did for the MMIO range).
>> >
>>
>> My Xen knowledge is limited here.  Are the iomem= and irq= options
>> given to your Dom0 toolstack, Dom0, or Xen itself?
>
> They are things that the user can write into the guest cfg file,
> containing lists of the relevant resources. e.g. to pass IRQ 42 to the
> guest: irqs = [ 42 ]
>
> The toolstack parses that and makes hypercalls while building a guest to
> tell Xen to map those regions through.
>

ok, so the irq numbers etc. are passed as-is to Xen and better
represent the same hardware resources in Dom0 as in Xen, then.
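
To make the cfg-file side concrete, here is a minimal sketch of how a
physical MMIO range and an IRQ could be turned into iomem=/irqs= lines. It
assumes 4 KiB pages and the "START,NUM_PAGES[@GFN]" hex page-frame form for
iomem= entries. The helper names and the 0x1C090000 address are made up for
illustration; only IRQ 42 comes from the example above.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def iomem_entry(phys_addr, size, guest_addr=None):
    """Format one iomem= entry as hex page frames: START,NUM_PAGES[@GFN]."""
    if phys_addr % (1 << PAGE_SHIFT):
        raise ValueError("MMIO base must be page aligned")
    start_pfn = phys_addr >> PAGE_SHIFT
    # Round the size up to a whole number of pages.
    num_pages = (size + (1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT
    entry = f"{start_pfn:x},{num_pages:x}"
    if guest_addr is not None:
        # Optional guest frame number, if the region should not be mapped 1:1.
        entry += f"@{guest_addr >> PAGE_SHIFT:x}"
    return entry


def cfg_lines(regions, irqs):
    """Build the two guest cfg lines from (phys_addr, size[, guest_addr]) tuples and IRQ numbers."""
    iomem = ", ".join(f'"{iomem_entry(*region)}"' for region in regions)
    irq_list = ", ".join(str(irq) for irq in irqs)
    return [f"iomem = [ {iomem} ]", f"irqs = [ {irq_list} ]"]


if __name__ == "__main__":
    # Hypothetical device: one 4 KiB MMIO page plus IRQ 42.
    for line in cfg_lines([(0x1C090000, 0x1000)], [42]):
        print(line)
```

The numbers are passed through unchanged, so this only makes sense if Dom0
and Xen agree on what they refer to.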

>>   How does a user
>> know which physical address in Xen's physical address space needs to
>> be remapped based on the hardware description language for Dom0?
>
> This is an interesting question. The short answer for platform device
> type things is "they just know" (they presumably have datasheets etc).
> But I think the underlying question here is whether they are given in PA
> space or dom0 IPA space, right?
>

re. the underlying question, yes.

For platform devices, I think the "they just know" doesn't really
work.  There must be a way for user space to determine the IO
addresses and IRQ numbers for a platform device, I just think it
should be completely decoupled from ACPI and DT.

> The way that this works on x86 is that dom0 sees the real underlying
> MMIO addresses in e.g. PCI BARs and /proc/iomem etc. So the hypercalls
> to map MMIO regions to guests all take real physical addresses and
> nothing has to worry about IPA vs PA issues.
>
> On ARM things are potentially a bit more complex because dom0 is running
> with second stage paging. However we always map the MMIO regions 1:1 for
> dom0, and I think we will always have to do that (the "1:1 workaround"
> refers to RAM regions only).

yeah, see my ACPI comment above.
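
If MMIO really is always 1:1 for Dom0, then what Dom0 reports in /proc/iomem
could be used directly as input. A rough sketch of pulling a range out of
that file by name follows. It assumes the usual "start-end : label" line
format, and the sample text and the "uart" label are invented for the
example rather than taken from a real board.

```python
def find_iomem(text, name):
    """Return (start, size) for every /proc/iomem line whose label contains name."""
    matches = []
    for line in text.splitlines():
        # Lines look like "  1c090000-1c090fff : some-label" (nesting shown by indent).
        span, sep, label = line.strip().partition(" : ")
        if not sep or name not in label:
            continue
        start_text, _, end_text = span.partition("-")
        start = int(start_text, 16)
        end = int(end_text, 16)
        matches.append((start, end - start + 1))  # end address is inclusive
    return matches


# Invented sample in the /proc/iomem format, not output from real hardware.
SAMPLE = """\
1c090000-1c090fff : example-uart
80000000-bfffffff : System RAM
  80008000-808fffff : Kernel code
"""

if __name__ == "__main__":
    for start, size in find_iomem(SAMPLE, "uart"):
        print(f"start={start:#x} size={size:#x}")
```

On a real system the text would come from reading /proc/iomem as root, since
unprivileged readers may see the addresses zeroed out.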

> So I think we can continue to treat these
> things as proper physical addresses and don't need to introduce variants
> of the hypercalls which work in terms of IPAs (or you could argue that
> they already do and the translation is currently a nop).

fair enough, I guess for passthrough it's valid to always specify
things in Dom0's address/number space, because Xen already knows about
the translations to do the right thing....

>
>> It still seems to me that you need to abstract the simple concept of a
>> device passthrough vector (device handle, associated MMIO regions,
>> associated IRQs).  That would extend more nicely to PCI/ACPI as well.
>> Am I missing something?
>
> I don't think so.
>
> For PCI the toolstack would read sysfs to get at the BAR info which
> tells us the real physical addresses and it would then program those
> using the existing mechanisms.
>
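
For the PCI case, a small sketch of the sysfs step, assuming the per-device
"resource" file with one "start end flags" hex triple per line. The function
name and the sample contents are hypothetical; the flag value 0x200 is
IORESOURCE_MEM in Linux.

```python
IORESOURCE_MEM = 0x200  # Linux resource flag for memory-mapped regions


def parse_pci_resource(text):
    """Parse a sysfs PCI 'resource' file into (index, start, size, flags) tuples."""
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused resource slot
        regions.append((index, start, end - start + 1, flags))
    return regions


def mmio_regions(text):
    """Keep only memory resources as (start, size) pairs, dropping I/O port resources."""
    return [(start, size)
            for _, start, size, flags in parse_pci_resource(text)
            if flags & IORESOURCE_MEM]


# Invented sample in the sysfs format, not read from a real device.
SAMPLE = """\
0x00000000fe000000 0x00000000fe0fffff 0x0000000000040200
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000000e000 0x000000000000e01f 0x0000000000040101
"""

if __name__ == "__main__":
    for start, size in mmio_regions(SAMPLE):
        print(f"start={start:#x} size={size:#x}")
```

In practice the text would be read from something like
/sys/bus/pci/devices/<BDF>/resource, and the resulting ranges would be the
real physical addresses to hand to the existing mapping hypercalls.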
>> >> and perhaps a back door
>> >>
>> >> to allow the injection of a blob of DT into the guest's DT to describe
>> >> them. i.e. enough to actually get stuff done but not pretending to be
>> >> too finely integrated.
>> >
>>
>> can we not treat the problem of how to describe hardware to the guest
>> independently?
>
> Yes, that's essentially what I am proposing, get the basics working via
> iomem= and irq= now and leave the tricky problem of describing platform
> hardware until later.
>

sounds reasonable.

>>   For this matter, I still think you need some way to
>> retrieve the canonical information specific to your running instance
>> (mmio regions, irqs) and then you need to be able to create a DT/ACPI
>> tables as you want.
>
> I think it might be a future nice-to-have but I'm specifically advocating
> that we not do it for 4.5. For 4.5 I think it will be acceptable to give
> the user the parts which they need to put something together, which is:
> iomem=/irq= (existing Xen interfaces), the dtc/iasl compiler and some
> way to add a predefined blob to the guests generated firmware tables.
>
> Trying to invent an interface like you suggest simply isn't going to
> happen in time for 4.5 (the freeze of which was going to be today but
> has been slipped by a couple of weeks). And regardless of that at least
> iomem= and irq= are things which we should support anyway, since they are
> existing Xen interfaces.
>

I don't keep track of the Xen development cycle (sorry, maybe I
should), but what you're saying sounds completely reasonable to me,
given that iomem= and irq= need to be supported anyhow.

>> > I would be fine with this solution for Xen 4.5. I will give a look to see
>> > what could be done.
>> >
>> >> Then we can revisit the "proper" toolstack side for 4.6. Otherwise I
>> >> fear that by the time we get the toolstack side sorted out to our
>> >> satisfaction the basic functionality (which seems to be largely done)
>> >> will have missed 4.5.
>> >
>> >
>> > Depending on the use cases, your solution based on "iomem", "irq" and DT
>> > blob might be enough for platform device passthrough.
>> >
>>
>> Hmmm, I don't really know what the Xen development policy is, but I am
>> generally not very much for creating a system solution on an OS level
>> that addresses a very limited use case without any sort of generic
>> abstractions...
>
> In the case of passthrough we provide both low level ways of configuring
> things and higher level abstractions built on top of them. e.g.
> iomem=/irq= vs pci= in a guest cfg file. iomem=/irq= let you assign
> arbitrary platform resources (subject to permissions etc) to a guest and
> in theory you could assign a PCI device manually using them. pci= lets
> you just say "assign this device please" and the toolstack sorts out the
> rest, in reality this is the one people use for PCI passthrough (for the
> obvious reason). On x86 iomem=/irq= get used for things like passing
> through serial ports to guests.
>
> The problem we are having here is that we have no existing higher level
> abstraction for platform devices (because on x86 everyone just knows the
> address of the std serial ports ;-)). We can consider adding one in the
> future, but we should still be providing iomem=/irq= anyway and that is
> something which can realistically be done for 4.5.
>
> So I think we should leave the question of what a higher level
> abstraction for passing through platform devices aside for now since it
> is a distraction from getting something useful in for 4.5, which is on a
> very tight timescale right now.
>
Understood, thanks.

-Christoffer

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel