Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.



Hi Manish,

On 12/10/17 22:03, Manish Jaggi wrote:
> ACPI/IORT Support in Xen.
> --------------------------------------
> 
> I had sent out a patch series [0] to hide the SMMU nodes from the Dom0
> IORT. I am now extending the scope to cover all that is required to
> support ACPI/IORT in Xen, and presenting for review the first _draft_ of
> the design. It is not complete yet.
> 
> Discussed is the parsing and generation of IORT table for Dom0 and DomUs.
> It is proposed that the IORT be parsed once and the information saved
> into a Xen data structure, say host_iort_struct, which is then reused by
> all Xen subsystems such as the ITS and SMMU drivers.
> 
> Since this is a first draft, it is open to technical comments,
> modifications and suggestions. Please feel free to add any missing
> points.
> 
> 1. What is IORT. What are its components ?
> 2. Current Support in Xen
> 3. IORT for Dom0
> 4. IORT for DomU
> 5. Parsing of IORT in Xen
> 6. Generation of IORT
> 7. Future Work and TODOs
> 
> 1. What is IORT. What are its components ?
> --------------------------------------------
> IORT refers to the IO Remapping Table. It is essentially used to describe
> the IO topology (PCIRC-SMMU-ITS) and the relationships between devices.
> 
> The IORT is structured as a set of nodes describing PCI RCs, SMMUs, ITSes
> and platform devices. Using the IORT, the RID -> StreamID -> DeviceID
> relationship can be obtained; more specifically, the table describes which
> device sits behind which SMMU and which interrupt controller.
> 
> RID is a requester ID in PCI context,
> StreamID is the ID of the device in SMMU context,
> DeviceID is the ID programmed in ITS.
> 
> For a non-PCI device, the RID could simply be an ID.
> 
> Each IORT node contains an ID map array to translate from one ID space
> into another:
> IDmap Entry {input_range, output_range, output_node_ref, id_count}
> This array is present in PCI RC, SMMU and Named Component nodes, and its
> entries can reference an SMMU or ITS node.
> 
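
For reference, the raw ID mapping entry in the table (field names as in
the ACPICA headers, quoted from memory, so please double-check) is just
two base values plus a count, so a translation boils down to an offset
calculation:

struct acpi_iort_id_mapping {
    uint32_t input_base;        /* lowest input ID covered by the entry */
    uint32_t id_count;          /* number of IDs in the range minus one,
                                   if I recall the spec correctly */
    uint32_t output_base;       /* corresponding base in the output space */
    uint32_t output_reference;  /* offset of the target node (SMMU/ITS) */
    uint32_t flags;
};

/* Translate one input ID (e.g. a RID) through a single mapping entry. */
static bool iort_map_id(const struct acpi_iort_id_mapping *map,
                        uint32_t input_id, uint32_t *output_id)
{
    if ( input_id < map->input_base ||
         input_id - map->input_base > map->id_count )
        return false;

    *output_id = map->output_base + (input_id - map->input_base);
    return true;
}
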
> 2. Current Support of IORT
> ---------------------------
> Currently Xen passes the host IORT table to Dom0 without any modifications.
> No IORT table is passed to a DomU.
> 
> 3. IORT for Dom0
> -----------------
> The IORT for Dom0 is prepared by Xen and is fairly similar to the host
> IORT. However, a few nodes may be removed or modified. For instance:
> - host SMMU nodes should not be present
> - ITS group nodes are the same as in the host IORT, but no stage2 mapping
> is done for them.

What do you mean by stage2 mapping?

> - platform nodes (named components) may be selectively present, depending
> on whether Xen itself is using some of them. This could be controlled by
> the Xen command line.

Mmh, I am not so sure platform devices described in the IORT (those
which use MSIs!) are so much different from PCI devices here. My
understanding is those platform devices are network adapters, for
instance, for which Xen has no use.
So I would translate "Named Components" or "platform devices" as devices
just not using the PCIe bus (so no config space and no (S)BDF), but
being otherwise the same from an ITS or SMMU point of view.

> - More items : TODO

I think we agreed upon rewriting the IORT table instead of patching it?
So to some degree your statements are true, but when we rewrite the IORT
table without SMMUs (and possibly without other components like the
PMUs), it would be kind of a stretch to call it "fairly similar to the
host IORT". I think "based on the host IORT" would be more precise.

> 4. IORT for DomU
> -----------------
> The IORT for a DomU is generated by the toolstack. The IORT topology is
> different when the DomU supports device passthrough.

Can you elaborate on that? Different compared to what? My understanding
is that without device passthrough there would be no IORT in the first
place?

> At a minimum the DomU IORT should include a single PCIRC and ITS group.
> A matching PCIRC can be added in the DSDT.
> An additional node can be added if a platform device is assigned to the
> DomU. No extra node should be required for PCI device pass-through.

Again I don't fully understand this last sentence.

> It is proposed that the ID range of the PCIRC and ITS group be constant
> for DomUs.

"constant" is a bit confusing here. Maybe "arbitrary", "from scratch" or
"independent from the actual h/w"?

> In the case of PCI passthrough, the toolstack can communicate the
> physical RID : virtual RID and physical deviceID : virtual deviceID
> mappings to Xen using a domctl.
> 
> It is assumed that DomU PCI config accesses would be trapped by Xen. The
> RID at which an assigned device is enumerated would be the one provided
> by the domctl, domctl_set_deviceid_mapping.
> 
> TODO: device assign domctl i/f.
> Note: This should suffice for the virtual deviceID support pointed out
> by Andre. [4]

Well, there's more to it. First thing: while I tried to allow virtual
ITS deviceIDs to be different from physical ones, at the moment they
are fixed to a 1:1 mapping in the code.

So the first step would be to go over the ITS code and identify where
"devid" refers to a virtual deviceID and where to a physical one
(probably renaming them accordingly). Then we would need a function to
translate between the two. At the moment this would be a dummy function
(just returning the input value). Later we would plug in the actual
table lookup.
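
Just to illustrate what I mean (function and parameter names are purely
made up here):

/*
 * Translate a guest (virtual) ITS deviceID into the host (physical) one.
 * Dummy 1:1 implementation for now; later this would consult the
 * per-domain mapping set up at device assignment time.
 */
static uint32_t vits_get_host_devid(struct domain *d, uint32_t guest_devid)
{
    return guest_devid;
}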

> We might not need this domctl if the assign_device hypercall is extended
> to provide this information.

Do we actually need a new interface or even extend the existing one?
If I got Julien correctly, the existing interface is just fine?
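
Should a dedicated domctl turn out to be needed after all, the payload
could be as small as this (purely hypothetical layout, field names
invented):

/* One RID/deviceID mapping per call. */
struct xen_domctl_set_deviceid_mapping {
    uint32_t phys_rid;    /* host requester ID (BDF)  */
    uint32_t virt_rid;    /* guest requester ID (BDF) */
    uint32_t phys_devid;  /* host ITS deviceID        */
    uint32_t virt_devid;  /* guest ITS deviceID       */
};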

> 
> 5. Parsing of IORT in Xen
> --------------------------
> IORT nodes can be saved in structures so that IORT table parsing is done
> only once and the result is reused by all Xen subsystems (ITS, SMMU,
> domain creation, etc.).
> Proposed below are the structures to hold the IORT information, very
> similar to the ACPI structures.
> 
> struct iort_id_map {
>     range_t input_range;
>     range_t output_range;
>     void *output_reference;
>     ...
> };

I guess you would need a "struct list_head list" here to chain the ranges?

> => output_reference points to an object of type struct iort_node.
> 
> struct iort_node {
>     struct list_head id_map;
>     void *context;
>     struct list_head list;
> };
> => context could be a reference to acpi_iort_node.
> 
> struct iort_table_struct {
>     struct list_head pci_rc_nodes;
>     struct list_head smmu_nodes;
>     struct list_head plat_devices;
>     struct list_head its_group;
> };

So quickly brainstorming with Julien I was wondering if we could
actually simplify this significantly:
From Xen's point of view all we need to know is the mapping from PCI
requestor IDs (or some platform device IDs) to the physical ITS deviceID,
and from requestor IDs to the SMMU stream ID.
That would be just *two* lookup tables, not connected to each other
aside from possibly having the same input ranges. At this point we could
also have *one* table, containing both the ITS deviceID and the SMMU
stream ID:

struct iort_id_map {
        range_t input_range;
        uint32_t its_devid_base;
        uint32_t smmu_streamid_base;
        struct list_head list;
};

So parsing the IORT would create and fill a list of those structures.
For a lookup we would just iterate over that list, find a matching entry
and:
return (input_id - match->input_range.base) + match->its_devid_base;

Ideally we abstract this via some functions, so that we can later swap
this for more efficient data structures should the need arise.
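
Something along these lines (just a sketch; I am assuming range_t carries
a base and a size, and the error code is arbitrary):

/* Global list filled once at IORT parse time. */
static LIST_HEAD(iort_id_maps);

static const struct iort_id_map *iort_find_map(uint32_t input_id)
{
    const struct iort_id_map *map;

    list_for_each_entry ( map, &iort_id_maps, list )
        if ( input_id >= map->input_range.base &&
             input_id - map->input_range.base < map->input_range.size )
            return map;

    return NULL;
}

static int iort_input_to_its_devid(uint32_t input_id, uint32_t *devid)
{
    const struct iort_id_map *map = iort_find_map(input_id);

    if ( !map )
        return -ENODEV;

    *devid = (input_id - map->input_range.base) + map->its_devid_base;
    return 0;
}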

> This structure is created at the point the IORT table is parsed, say from
> acpi_iort_init.
> It is proposed to use this structure's information in
> iort_init_platform_devices.
> [2] [RFC v2 4/7] ACPI: arm: Support for IORT
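
For completeness, a sketch of the parse step (ACPICA structure and field
names quoted from memory; iort_add_id_map() is an invented helper that
fills the list of mappings discussed above):

static void __init iort_parse(struct acpi_table_iort *iort)
{
    struct acpi_iort_node *node = (void *)iort + iort->node_offset;
    unsigned int i, j;

    for ( i = 0; i < iort->node_count; i++ )
    {
        if ( node->type == ACPI_IORT_NODE_PCI_ROOT_COMPLEX )
        {
            struct acpi_iort_id_mapping *map =
                (void *)node + node->mapping_offset;

            /*
             * Record each RID range; resolving the output reference to
             * an ITS group or SMMU node is left out of this sketch.
             */
            for ( j = 0; j < node->mapping_count; j++, map++ )
                iort_add_id_map(node, map);
        }

        node = (void *)node + node->length;
    }
}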
> 
> 6. IORT Generation
> -------------------
> There would be common code to generate the IORT table from
> iort_table_struct.

That sounds useful, but we would need to be careful with sharing code
between Xen and the tool stack. Has this actually been done before?

> a. For Dom0
>     the structure (iort_table_struct) is modified to remove SMMU nodes
>     and update the id_mappings:
>     PCIRC idmap -> output reference to ITS group
>     (RID -> DeviceID).
> 
>     TODO: Describe the algorithm in the update_id_mapping function used
>     to map RID -> DeviceID in my earlier patch [3].

If the above approach works, this would become a simple list iteration,
creating PCI rc nodes with the appropriate pointer to the ITS nodes.
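
Roughly like this (again just a sketch: iort_write_id_mapping() and the
writer context are invented helpers, and I hope I remember the spec's
"number of IDs minus one" convention correctly):

/*
 * Emit one ID mapping per recorded entry, all pointing at the single
 * ITS group node in the generated table.
 */
static void dom0_iort_write_pci_rc_maps(struct iort_writer *w,
                                        uint32_t its_group_offset)
{
    const struct iort_id_map *map;

    list_for_each_entry ( map, &iort_id_maps, list )
    {
        struct acpi_iort_id_mapping out = {
            .input_base = map->input_range.base,
            .id_count = map->input_range.size - 1,
            .output_base = map->its_devid_base,
            .output_reference = its_group_offset,
        };

        iort_write_id_mapping(w, &out);
    }
}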

> b. For DomU
>     - iort_table_struct would have a minimum of 2 nodes (1 PCIRC and
>     1 ITS group)
>     - populate a basic IORT in a buffer passed by the toolstack (using
>     a domctl: domctl_prepare_dom_iort)

I think we should reduce this to iterating the same data structure as
for Dom0. Each passed-through PCI device would possibly create one
struct instance, and later on we do the same iteration as we do for
Dom0. If that proves to be simple enough, we might even live with the
code duplication between Xen and the toolstack.

Cheers,
Andre.

>     - The DSDT for the DomU is updated by the toolstack to include a PCIRC.
>     - If a named component is added to the DomU, that information is
>     passed in the same/an additional domctl.
>         - <TODO: domctl_prepare_dom_iort i/f>
>     Note: Julien, I have tried to incorporate your suggestion for code
>     reuse.
> 
> 7. References:
> -------------
> [0] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg121667.html
> [1] ARM DEN0049C:
> http://infocenter.arm.com/help/topic/com.arm.doc.den0049c/DEN0049C_IO_Remapping_Table.pdf
> 
> [2] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg123082.html
> [3] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg121669.html:
> update_id_mapping function.
> [4] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg123434.html
> 
> 
