[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v12 00/34] arm64: Dom0 ITS emulation



On Wed, 14 Jun 2017, Andre Przywara wrote:
> Hi,
> 
> hopefully the final version, with only nits from v11 addressed.
> The same restriction as for the previous versions  still apply: the locking
> is considered somewhat insufficient and will be fixed by an upcoming rework.
> 
> Patches 01/34 and 02/34 should be applied for 4.9 still, since they fix
> existing bugs.
> 
> The minor comments on v11 have been addressed and the respective tags
> have been added. For a changelog see below (which omits typo fixes).
> 
> I dropped Julien's Acked-by from patch 25/34 (MAPD), since I changed
> it slightly after Stefano's comment.

I committed the series, thanks and congratulations!


> ----------------------------------
> This series adds support for emulation of an ARM GICv3 ITS interrupt
> controller. For hardware which relies on the ITS to provide interrupts for
> its peripherals this code is needed to get a machine booted into Dom0 at
> all. ITS emulation for DomUs is only really useful with PCI passthrough,
> which is not yet available for ARM. It is expected that this feature
> will be co-developed with the ITS DomU code. However this code drop here
> considered DomU emulation already, to keep later architectural changes
> to a minimum.
> 
> This is technical preview version to allow early testing of the feature.
> Things not (properly) addressed in this release:
> - There is only support for Dom0 at the moment. DomU support is only really
> useful with PCI passthrough, which is not there yet for ARM.
> - The MOVALL command is not emulated. In our case there is really nothing
> to do here. We might need to revisit this in the future for DomU support.
> - The INVALL command might need some rework to be more efficient. Currently
> we iterate over all mapped LPIs, which might take a bit longer.
> - Indirect tables are not supported. This affects both the host and the
> virtual side.
> - The ITS tables inside (Dom0) guest memory cannot easily be protected
> at the moment (without restricting access to Xen as well). So for now
> we trust Dom0 not to touch this memory (which the spec forbids as well).
> - With malicious guests (DomUs) there is a possibility of an interrupt
> storm triggered by a device. We would need to investigate what that means
> for Xen and if there is a nice way to prevent this. Disabling the LPI on
> the host side would require command queuing, which has its downsides to
> be issued during runtime.
> - Dom0 should make sure that the ITS resources (number of LPIs, devices,
> events) later handed to a DomU are really limited, as a large number of
> them could mean much time spend in Xen to initialize, free or handle those.
> It is expected that the toolstack sets up a tailored ITS with just enough
> resources to accommodate the needs of the actual passthrough-ed device(s).
> - The command queue locking is currently suboptimal and should be made more
> fine-grained in the future, if possible.
> - Provide support for running with an IOMMU, to map the doorbell page
> to all devices.
> 
> 
> Some generic design principles:
> 
> * The current GIC code statically allocates structures for each supported
> IRQ (both for the host and the guest), which due to the potentially
> millions of LPI interrupts is not feasible to copy for the ITS.
> So we refrain from introducing the ITS as a first class Xen interrupt
> controller, also we don't hold struct irq_desc's or struct pending_irq's
> for each possible LPI.
> Fortunately LPIs are only interesting to guests, so we get away with
> storing only the virtual IRQ number and the guest VCPU for each allocated
> host LPI, which can be stashed into one uint64_t. This data is stored in
> a two-level table, which is both memory efficient and quick to access.
> We hook into the existing IRQ handling and VGIC code to avoid accessing
> the normal structures, providing alternative methods for getting the
> needed information (priority, is enabled?) for LPIs.
> Whenever a guest maps a device, we allocate the maximum required number
> of struct pending_irq's, so that any triggering LPI can find its data
> structure. Upon the guest actually mapping the LPI, this pointer to the
> corresponding pending_irq gets entered into a radix tree, so that it can
> be quickly looked up.
> 
> * On the guest side we (later will) have to deal with malicious guests
> trying to hog Xen with mapping requests for a lot of LPIs, for instance.
> As the ITS actually uses system memory for storing status information,
> we use this memory (which the guest has to provide) to naturally limit
> a guest. Whenever we need information from any of the ITS tables, we
> temporarily map them (which is cheap on arm64) and copy the required data.
> 
> * An obvious approach to handling some guest ITS commands would be to
> propagate them to the host, for instance to map devices and LPIs and
> to enable or disable LPIs.
> However this (later with DomU support) will create an attack vector, as
> a malicious guest could try to fill the host command queue with
> propagated commands.
> So we try to avoid this situation: Dom0 sending a device mapping (MAPD)
> command is the only time we allow queuing commands to the host ITS command
> queue, as this seems to be the only reliable way of getting the
> required information at the moment. However at the same time we map all
> events to LPIs already, also enable them. This avoids sending commands
> later at runtime, as we can deal with mappings and LPI enabling/disabling
> internally.
> 
> To accomodate the tech preview nature of this feature at the moment, there
> is a Kconfig option to enable it. Also it is supported on arm64 only, which
> will most likely not change in the future.
> This leads to some hideous constructs like an #ifdef'ed header file with
> empty function stubs to accomodate arm32 and non-ITS builds, which share
> some generic code paths with the ITS emulation.
> The number of supported LPIs can be limited on the command line, in case
> the number reported by the hardware is too high. As Xen cannot foresee how
> many interrupts the guests will need, we cater for as many as possible.
> The command line parameter is called max-lpi-bits and expresses the number
> of bits required to hold an interrupt ID. It defaults to 20, if that is
> lower than the number supported by the hardware.
> 
> This code boots Dom0 on an ARM Fast Model with ITS support. I tried to
> address the issues seen by people running the previous versions on real
> hardware, though couldn't verify this here for myself.
> So any testing, bug reports (and possibly even fixes) are very welcome.
> 
> The code can also be found on the its/v12 branch here:
> git://linux-arm.org/xen-ap.git
> http://www.linux-arm.org/git?p=xen-ap.git;a=shortlog;h=refs/heads/its/v12
> 
> Cheers,
> Andre
> 
> Changelog v11 ... v12:
> - [01/34]: avoid too long line by using temporary variable
> - [05/34]: move lock directly before the function call
> - [06/34]: move lock back to cover irq_to_pending()
> - [07/34]: rename function to gic_remove_irq_from_queues()
> - [25/34]: use new gic_remove_irq_from_queues() function
> 
> Changelog v10 ... v11:
> - [01/34]: use proper ACCESS_ONCE wrappers for all users
> - [02/34]: split of v9-12/32 for backporting
> - [03/34]: remainder of splitting v9-12/32
> - [04/34]: extend comment to justify 10 INITD bits
> - [05/34]: swapped place with former 04/32 to fix temporary deadlock
> - [06/34]: swapped place with former 03/32 to fix temporary deadlock
> - [08/34]: update commit message
> - [10/34]: remove unneeded p->lr initialisation
> - [11/34]: replace read_atomic with ACCESS_ONCE
> - [13/34]: split of former 11/32: remove no longer needed vCPU ID in host LPI
> - [14/34]: export inject_lpi, fix lock-less access to vCPU ID
> - [18/34]: fix commit message, add one more memory barrier and comment
> - [19/34]: fix its_cmd_mask_field(), dump ITS command on error
> - [20/34]: drop lock-taking versions of {read,write}_itte, drop optional vCPU
>            parameter to write_itte(), remove sanity checks (dony by callers)
> - [21/34]: fix "veventid" parameter name
> - [22/34]: explicitly take lock around read_itte, use vgic_vcpu_inject_lpi()
> - [25/34]: fix "veventid" parameter name
> - [26/34]: add unlikely() hints
> - [27/34]: remove unneeded read_atomic(), explicitly sanitize collection
> - [28/34]: extend IRQ migration comment
> - [30/34]: protects against not valid property table
> - [31/34]: protects against not valid property table, fix error propagation
> 
> Changelog v9 ... v10:
> no changes to 06, 07, 12, 15, 20, 29, 30, 32
> - [01/32]: fix for rank lock deadlock as posted before
> - [02/32]: replace arbitrary 16 bits for DomU interrupt IDs with 10 bits
> - [03/32]: handle functions not dealing with LPIs as well
> - [04/32]: new patch to rename gic_remove_from_queues and remove lock
> - [05/32]: new patch to introduce helper for removing IRQs from the VGIC
> - [06/32]: adapt to previous changes
> - [08/32]: use memset to clear the whole structure
> - [09/32]: split off from former 05/28
> - [10/32]: moved up front, initialize VCPU ID
> - [11/32]: remove VCPU ID from host entry, rework LPI injection, moved
>            out part dealing with LPI priority caching
> - [13/32]: add a hint about boolean variables
> - [14/32]: fix bool_t type usage
> - [16/32]: replace magic value, always use intid_bits for TYPER generation,
>            add memory barrier
> - [17/32]: add comments about ITS table layout, remove le64_to_cpu(),
>            simplify CTLR read and remove lock, use atomic read and add comment
>            in CREADR read, add TODO and ASSERT about crafting ITS tables,
>            add empty lines in switch/case, move code block
> - [18/32]: consistent use of data types, comments moved out to earlier patch
> - [19/32]: fold in get_host_lpi(), rename veventid identifier
> - [21/32]: move variable assignment
> - [22/32]: add TODO locking comment, use new gic_remove_irq() function
> - [23/32]: remove no longer needed VCPU ID from host LPI functions, add
>            locking TODO, use goto out
> - [24/32]: explain reason for LR check, make LRs unsigned, move PRISTINE
>            check into one place
> - [25/32]: mention MAPI, remove VCPU ID from host LPI updates, use atomic
>            write for priority update, remove outdated comment, explain
>            error handling path, check for valid property table
> - [26/32]: remove update of VCPU ID in the host_lpi structure, add locking 
> TODO
> - [27/32]: fix error path
> - [28/32]: add comment about physical LPI, use generic function to remove IRQ,
>            remove redundant clear_bit
> - [31/32]: make vgic_v3_its_init_virtual() static (and move it), move comment,
>            remove unneeded call to vgic_v3_its_free_domain()
> 
> Changelog v8 ... v9:
> - [01/28]: initialize number of interrupt IDs for DomUs also
> - [02/28]: move priority reading back up front
> - [03/28]: enumerate all call sites in commit message, add ASSERTs,
>            add "unlikely" hints, avoid skipping ASSERTs, add comment to
>            irq_to_pending() definition
> - [04/28]: explain expectation of device state while destroying domain
> - [05/28]: document case of invalid LPI, change dummy priority to 0xff
> - [08/28]: check cross page boundary condition early in function
> - [10/28]: initialize status and lr member as well
> - [11/28]: check lpi_vcpu_id to cover all virtual CPUs
> - [12/28]: add spin lock ASSERT
> - [13/28]: introduce types for our ITS table entries, fix error messages
> - [14/28]: use new ITS table entry types
> - [15/28]: new patch to introduce pending_irq lookup function
> - [17/28]: verify size of collection table entry
> - [18/28]: use new pending_irq lookup function
> - [19/28]: use new pending_irq lookup function, collection table type and
>            vgic_init_pending_irq, add Dom0 ASSERT and unmap devices for DomUs
> - [20/28]: document PRISTINE_LPI flag, fix typo, avoid double insertion of
>            the same LPI into different LRs
> - [21/28]: use new pending_irq lookup function, avoid explict LPI number
>            parameter
> - [22/28]: add physical affinity TODO, use new table type and pending_irq
>            lookup function, fix error message
> - [24/28]: use pending_irq lookup function, drop explicit LPI number parameter
> - [25/28]: drop explicit LPI number parameter
> - [27/28]: use new ITS table entry type
> 
> Changelog v7 ... v8:
> - drop list parameter and rename to gicv3_its_make_hwdwom_dt_nodes()
> - remove rebase artifacts
> - add irq_enter/irq_exit() calls
> - propagates number of host LPIs and number of event IDs to Dom0
> - add proper coverage of all addresses in ITS MMIO handler
> - avoid vcmd_lock for CBASER writes
> - fix missing irqsave/irqrestore on VGIC VCPU lock
> - move struct pending_irq use under the VGIC VCPU lock
> - protect gic_raise_guest_irq() against NULL pending_irq
> - improve device and collection table entry size documentation
> - count number of ITSes to increase mmio_count
> - rework MAPD, DISCARD, MAPTI and MOVI to take proper locks
> - properly rollback failing MAPD and MAPTI calls
> - rework functions to update property table
> - return error on vgic_access_guest_memory crossing page boundary
> - make sure CREADR access is atomic
> 
> Changelog v5 ... v6:
> - reordered patches to allow splitting the series
> - introduced functions later to avoid warnings on intermediate builds
> - refactored common code changes into separate patches
> - dropped GENMASK_ULL and BIT_ULL (both patches and their usage later)
> - rework locking in MMIO register reads and writes
> - protect new code from being executed without an ITS being configured
> - fix vgic_access_guest_memory (now a separate patch)
> - some more comments and TODOs
> 
> Changelog v4 ... v5:
> - adding many comments
> - spinlock asserts
> - rename r_host_lpis to max_host_lpi_ids
> - remove max_its_device_bits command line
> - add warning on high number of LPIs
> - avoid potential leak on host MAPD
> - properly handle nr_events rounding
> - remove unmap_all_devices(), replace with ASSERT
> - add barriers for (lockless) host LPI lookups
> - add proper locking in ITS and redist MMIO register handling
> - rollback failing device mapping
> - fix various printks
> - add vgic_access_guest_memory() and use it
> - (getting rid of page mapping functions and helpers)
> - drop table mapping / unmapping on redist/ITS enable/disable
> - minor reworks in functions as per review comments
> - fix ITS enablement check
> - move lpi_to_pending() and lpi_get_priority() to vgic_ops
> - move do_LPI() to gic_hw_ops
> - whitespace and hard tabs fixes
> - introduce ITS domain init function (and use it for the rbtree)
> - enable IRQs around do_LPI
> - implement TODOs for later optimizations
> - add "v" prefix to variables holding virtual properties
> - provide locked and normal versions of read/write_itte
> - only CLEAR LPI if not already guest visible (plus comment)
> - update LPI property on MAPTI
> - store vcpu_id in pending_irq for LPIs (helps INVALL)
> - improve INVALL implementation to only cover LPIs on this VCPU
> - improve virtual BASE register initialization
> - limit number of virtual LPIs to 24 bits (Linux bug at 32??)
> - only inject LPIs if redistributor is actually enabled
> 
> Changelog v3 .. v4:
> - make HAS_ITS depend on EXPERT
> - introduce new patch 02 to initialize host ITS early
> - fix cmd_lock init position
> - introduce warning on high number of LPI allocations
> - various int -> unsigned fixes
> - adding and improving comments
> - rate limit ITS command queue full msg
> - drop unneeded checks
> - validate against allowed number of device IDs
> - avoid memory leaks when removing devices
> - improve algorithm for finding free host LPI
> - convert unmap_all_devices from goto to while loop
> - add message on remapping ITS device
> - name virtual device / event IDs properly
> - use atomic read when reading ITT entry
> 
> Changelog v2 .. v3:
> - preallocate struct pending_irq's
> - map ITS and redistributor tables only on demand
> - store property, enable and pending bit in struct pending_irq
> - improve error checking and handling
> - add comments
> 
> Changelog v1 .. v2:
> - clean up header file inclusion
> - rework host ITS table allocation: observe attributes, many fixes
> - remove patch 1 to export __flush_dcache_area, use existing function instead
> - use number of LPIs internally instead of number of bits
> - keep host_its_list as private as possible
> - keep struct its_devices private
> - rework gicv3_its_map_guest_devices
> - fix rbtree issues
> - more error handling and propagation
> - cope with GICv4 implementations (but no virtual LPI features!)
> - abstract host and guest ITSes by using doorbell addresses
> - join per-redistributor variables into one per-CPU structure
> - fix data types (unsigned int)
> - many minor bug fixes
> 
> (Rough) changelog RFC-v2 .. v1:
> - split host ITS driver into gic-v3-lpi.c and gic-v3-its.c part
> - rename virtual ITS driver file to vgic-v3-its.c
> - use macros and named constants for all magic numbers
> - use atomic accessors for accessing the host LPI data
> - remove leftovers from connecting virtual and host ITSes
> - bail out if host ITS is disabled in the DT
> - rework map/unmap_guest_pages():
>     - split off p2m part as get/put_guest_pages (to be done on allocation)
>     - get rid of vmap, using map_domain_page() instead
> - delay allocation of virtual tables until actual LPI/ITS enablement
> - properly size both virtual and physical tables upon allocation
> - fix put_domain() locking issues in physdev_op and LPI handling code
> - add and extend comments in various areas
> - fix lotsa coding style and white space issues, including comment style
> - add locking to data structures not yet covered
> - fix various locking issues
> - use an rbtree to deal with ITS devices (instead of a list)
> - properly handle memory attributes for ITS tables
> - handle cacheable/non-cacheable ITS table mappings
> - sanitize guest provided ITS/LPI table attributes
> - fix breakage on non-GICv2 compatible host GICv3 controllers
> - add command line parameters on top of Kconfig options
> - properly wait for an ITS to become quiescient before enabling it
> - handle host ITS command queue errors
> - actually wait for host ITS command completion (READR==WRITER)
> - fix ARM32 compilation
> - various patch splits and reorderings
> 
> Andre Przywara (33):
>   ARM: vGIC: avoid rank lock when reading priority
>   ARM: GICv3: enable ITS on the host
>   ARM: GICv3: enable LPIs on the host
>   ARM: GICv3: setup number of LPI bits for a GICv3 guest
>   ARM: vGIC: rework gic_remove_from_queues()
>   ARM: vGIC: move irq_to_pending() calls under the VGIC VCPU lock
>   ARM: vGIC: introduce gic_remove_irq_from_queues()
>   ARM: GIC: Add checks for NULL pointer pending_irq's
>   ARM: GICv3: introduce separate pending_irq structs for LPIs
>   ARM: GIC: export and extend vgic_init_pending_irq()
>   ARM: vGIC: cache virtual LPI priority in struct pending_irq
>   ARM: vGIC: add LPI VCPU ID to struct pending_irq
>   ARM: GIC: ITS: remove no longer needed VCPU ID in host LPI entry
>   ARM: GICv3: forward pending LPIs to guests
>   ARM: vGICv3: handle virtual LPI pending and property tables
>   ARM: vGICv3: re-use vgic_reg64_check_access
>   ARM: vGIC: advertise LPI support
>   ARM: vITS: add command handling stub and MMIO emulation
>   ARM: vITS: introduce translation table walks
>   ARM: vITS: provide access to struct pending_irq
>   ARM: vITS: handle INT command
>   ARM: vITS: handle MAPC command
>   ARM: vITS: handle CLEAR command
>   ARM: vITS: handle MAPD command
>   ARM: GICv3: handle unmapped LPIs
>   ARM: vITS: handle MAPTI/MAPI command
>   ARM: vITS: handle MOVI command
>   ARM: vITS: handle DISCARD command
>   ARM: vITS: handle INV command
>   ARM: vITS: handle INVALL command
>   ARM: vITS: increase mmio_count for each ITS
>   ARM: vITS: create and initialize virtual ITSes for Dom0
>   ARM: vITS: create ITS subnodes for Dom0 DT
> 
> Vijaya Kumar K (1):
>   ARM: introduce vgic_access_guest_memory()
> 
>  xen/arch/arm/gic-v2.c            |    7 +
>  xen/arch/arm/gic-v3-its.c        |  180 +++++
>  xen/arch/arm/gic-v3-lpi.c        |   99 ++-
>  xen/arch/arm/gic-v3.c            |   29 +-
>  xen/arch/arm/gic.c               |  102 ++-
>  xen/arch/arm/vgic-v2.c           |   28 +-
>  xen/arch/arm/vgic-v3-its.c       | 1481 
> +++++++++++++++++++++++++++++++++++++-
>  xen/arch/arm/vgic-v3.c           |  327 ++++++++-
>  xen/arch/arm/vgic.c              |  145 +++-
>  xen/include/asm-arm/domain.h     |   16 +-
>  xen/include/asm-arm/event.h      |    3 +
>  xen/include/asm-arm/gic.h        |    5 +-
>  xen/include/asm-arm/gic_v3_its.h |   44 ++
>  xen/include/asm-arm/vgic-emul.h  |    9 +
>  xen/include/asm-arm/vgic.h       |   23 +-
>  15 files changed, 2416 insertions(+), 82 deletions(-)
> 
> -- 
> 2.9.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.