
Re: [Xen-devel] Xen on ARM vITS Handling Draft B (Was Re: Xen/arm: Virtual ITS command queue handling)



Hi Vijay,

On 22/05/2015 13:16, Vijay Kilari wrote:
On Tue, May 19, 2015 at 7:21 PM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
On Tue, 2015-05-19 at 14:37 +0100, Julien Grall wrote:
Hi Ian,

On 19/05/15 13:10, Ian Campbell wrote:
On Fri, 2015-05-15 at 15:55 +0100, Julien Grall wrote:
[...]
Translation of certain commands can be expensive (XXX citation
needed).

The term "expensive" is subjective. I think we can end up with cheap
translation if we properly pre-allocate information (such as devices,
LPIs...). We can have all the information before the guest boots or
during hotplug. It wouldn't take more memory than it should use.

During command translation, we would just need to enable the device/LPIs.

The remaining expensive part would be the validation. I think we can
improve most of the checks to O(1) (such as collection checking) or
O(log(n)) (such as device checking).
[...]
XXX need a solution for this.

Command translation can be improved. It may be good to add a section
explaining how translation of command foo can be done.

I think that is covered by the spec, however if there are operations
which form part of this which are potentially expensive we should
outline in our design how this will be dealt with.

Perhaps you or Vijay could propose some additional text covering:
       * What the potentially expensive operations during a translation
         are.
       * How we are going to deal with those operations, including:
               * What data structure is used
               * What start of day setup is required to enable this
               * What operations are therefore required at translation
                 time

I don't have much time to work on a proposal. I would be happy if Vijay
did it.

OK, Vijay could you make a proposal here please.

__text__

I took a second look at your proposal.

1) Command translation:
-----------------------------------

  - ITS commands contain Device ID, Event ID (vID), Collection ID (vCID)
    and Target Address (vTA) parameters
  - All these parameters should be validated
  - These parameters should be translated from virtual to physical

Of the existing GICv3 ITS commands, MAPC, MAPD and MAPVI/MAPI are the
time-consuming commands, as these commands create entries in the Xen ITS
structures, which are used to validate other ITS commands.

1.1 MAPC command translation
-----------------------------------------------
    Format: MAPC vCID, vTA

    -  vTA is validated against the Redistributor address by searching the
       Redistributor region / CPU number based on GITS_TYPER.PAtype, and the
       Physical Collection ID & Physical Target Address are retrieved
    -  Each vITS will have a cid_map (struct cid_mapping) which holds the
       mapping of Virtual Collection ID, Virtual Target Address and Physical
       Collection ID.
    -  The physical ITS command MAPC pCID, pTA is generated

    Here there is no overhead: the cid_map entries (approx. 32 entries) are
    preallocated when the vITS is created.

How did you decide on 32 entries? The ITS must provide at least N + 1 collections, where N is the number of processors.

Also, how do you handle collection re-mapping?



1.2 MAPD Command translation:
-----------------------------------------------
    Format: MAPD device, ITT IPA, ITT Size

    MAPD is sent with the Validation bit set if the device needs to be
    added, and with the bit cleared when the device is removed

If Validation bit is set:
    - Allocate memory for its_device struct
    - Validate ITT IPA & ITT size and update its_device struct
    - Find the number of vectors (nrvecs) for this device by querying a PCI
      helper function
    - Allocate nrvecs LPIs
    - Allocate memory for struct vlpi_map for this device. This vlpi_map
      holds the mapping of Virtual LPI to Physical LPI and ID.
    - Find the physical ITS node to which this device is assigned

    - Call p2m_lookup on ITT IPA addr and get physical ITT address
    - Validate ITT Size
    - Generate/format physical ITS command: MAPD, ITT PA, ITT Size

    Here the overhead is the memory allocation for its_device and vlpi_map

What about device remapping?

If Validation bit is not set:
     - Validate that the device exists by checking the vITS device list
     - Clear all vlpis assigned for this device
     - Remove this device from vITS list
     - Free memory
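
For illustration only, the two MAPD paths above could be sketched as follows. This is a minimal sketch, not the proposed implementation: the structure layouts, the `vits_mapd` and `find_slot` names and the linked list are assumptions made for the example, the number of vectors is passed in rather than queried from a PCI helper, and the p2m lookup, the ITT size check and the generation of the physical command are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structures loosely following the names in the proposal. */
struct vlpi_map {
    uint32_t vlpi;                /* virtual LPI, 0 while unassigned */
    uint32_t plpi;                /* physical LPI backing it */
};

struct its_device {
    uint32_t devid;
    uint64_t itt_ipa;             /* guest address of the ITT */
    uint32_t nrvecs;              /* number of vectors the device may use */
    struct vlpi_map *map;
    struct its_device *next;
};

struct vits {
    struct its_device *devices;   /* simple singly linked list */
};

/* Returns the slot holding the matching device, or the tail slot. */
static struct its_device **find_slot(struct vits *v, uint32_t devid)
{
    struct its_device **pp = &v->devices;

    while (*pp != NULL && (*pp)->devid != devid)
        pp = &(*pp)->next;
    return pp;
}

/* Returns 0 on success, -1 if the command must be rejected. */
static int vits_mapd(struct vits *v, uint32_t devid, uint64_t itt_ipa,
                     uint32_t nrvecs, bool valid)
{
    struct its_device **pp, *dev;

    assert(v != NULL);
    pp = find_slot(v, devid);
    dev = *pp;

    if (!valid) {
        if (dev == NULL)
            return -1;            /* unmapping an unknown device */
        *pp = dev->next;          /* unlink, then release everything */
        free(dev->map);
        free(dev);
        return 0;
    }

    if (dev != NULL || nrvecs == 0)
        return -1;                /* remapping is not handled in this sketch */
    dev = calloc(1, sizeof(*dev));
    if (dev == NULL)
        return -1;
    dev->map = calloc(nrvecs, sizeof(*dev->map));
    if (dev->map == NULL) {
        free(dev);
        return -1;
    }
    dev->devid = devid;
    dev->itt_ipa = itt_ipa;
    dev->nrvecs = nrvecs;
    *pp = dev;                    /* pp is the tail slot here: append */
    return 0;
}
```

The two `calloc` calls on the valid path are the allocation overhead mentioned above; a MAPD for a device that is already mapped is simply rejected here, which leaves the remapping case open.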

1.3 MAPVI/MAPI Command translation:
-----------------------------------------------
    Format: MAPVI device, ID, vID, vCID

- Validate that the device exists by checking the vITS device list
- Validate vCID and get pCID by searching cid_map
- Check whether vID has an entry in vlpi_entries of this device.
  If not, allot a pID from the vlpi_map of this device and update
  vlpi_entries with the new pID
- Allocate irq descriptor and add to RB tree
- Call route_irq_to_guest() for this pID
- Generate/format physical ITS command: MAPVI device ID, pID, pCID

Here the overhead is allotting the physical ID, allocating memory for the
irq descriptor and routing the interrupt

All other ITS commands, like MOVI, DISCARD, INV, INVALL, INT, CLEAR and
SYNC, just validate the parameters and generate the physical command

Interrupt remapping?

__text__

We can discuss and add how to reduce translation time.

I wrote down my thoughts on the validation parts (see below) and added some definitions useful for people who don't have the spec.

Emulation of ITS commands
=========================

# Introduction

This document is based on section 5.13 of the GICv3 specification
(PRD03-GENC-010745 24.0). The goal is to provide insight into the cost
of emulating ITS commands in Xen.

The ITS provides 12 commands to manage interrupt collections,
devices and interrupts.

# Definitions

## Device identifier

Each device using the ITS is associated with a unique identifier. It is
discoverable via the firmware and a specific algorithm (not described here).

The number of identifiers is variable and can be discovered via
GITS_TYPER.Devbits. This field allows an ITS to have up to 2^32 devices.
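
To make the 2^32 limit concrete, here is a small sketch of how the identifier range could be derived from GITS_TYPER. The bit position (Devbits in bits [17:13], encoded as the number of DeviceID bits minus one) is an assumption taken from the published GICv3 architecture specification and may not match the draft referenced above; the helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: Devbits is bits [17:13] and holds (DeviceID bits - 1). */
#define GITS_TYPER_DEVBITS_SHIFT 13
#define GITS_TYPER_DEVBITS_MASK  0x1fu

/* Number of DeviceID bits implemented by the ITS, in the range 1..32. */
static unsigned int its_devid_bits(uint64_t typer)
{
    unsigned int bits = (unsigned int)((typer >> GITS_TYPER_DEVBITS_SHIFT) &
                                       GITS_TYPER_DEVBITS_MASK) + 1;

    assert(bits >= 1 && bits <= 32);
    return bits;
}

/* Number of distinct device identifiers; 64-bit so that 2^32 fits. */
static uint64_t its_nr_devids(uint64_t typer)
{
    return UINT64_C(1) << its_devid_bits(typer);
}

/* A device ID is out of range if it needs more bits than implemented. */
static int its_devid_in_range(uint64_t typer, uint32_t devid)
{
    return (uint64_t)devid < its_nr_devids(typer);
}
```

With a 5-bit field the largest encodable value is 31, i.e. 32 DeviceID bits, which is where the 2^32 upper bound comes from.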

## Collection

Each interrupt is a member of an Interrupt Collection. This allows software to manage large numbers of physical interrupts with a small number of commands rather than issuing a command per interrupt.

On a system with N processors, the ITS must provide at least N+1 collections.

## Target Addresses

The Target Address corresponds to a specific re-distributor. The format of this field depends on the value of the bit GITS_TYPER.PTA:
    - 1: the base address of the re-distributor target is used
    - 0: a unique processor number is used. The mapping between the
    processor affinity value (MPIDR) and the processor number can be
    discovered via GICR_TYPER.ProcessorNumber.

# Validation of the parameters

Each command contains parameters that need to be validated before they are used in Xen or passed to the hardware.

This section will describe the validation of the main parameters.

## Device ID

This parameter is used in commands which manage the device and the interrupts associated with this device. Checking if a device is present and retrieving the data structure must be fast.

The device identifiers may not be assigned contiguously and the maximum number is very high (2^32). The possible efficient data structures would be:
    1) List: lookup/deletion is O(n) and the cost of insertion depends on whether the devices are kept sorted by identifier. The memory overhead is 18 bytes per element.
    2) Red-black tree: all the operations are O(log(n)). The memory overhead is 24 bytes per element.

Solution 2) seems the most suitable for fast device ID validation, even though the memory overhead is a bit higher compared to the list.
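
A minimal user-space sketch of a tree keyed by device ID is given below. It uses the POSIX tsearch()/tfind()/tdelete() functions, which glibc happens to implement as a red-black tree, purely as a stand-in; Xen would presumably use its own red-black tree code instead. The `its_device` layout and the `vits_*` helper names are assumptions made for the example.

```c
#include <assert.h>
#include <search.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-device state; only the lookup key matters here. */
struct its_device {
    uint32_t devid;               /* device identifier used as the tree key */
    uint64_t itt_addr;            /* placeholder payload */
};

/* Order devices by identifier. */
static int devid_cmp(const void *a, const void *b)
{
    const struct its_device *da = a, *db = b;

    return (da->devid > db->devid) - (da->devid < db->devid);
}

/* Insert a device; returns 0 on success, -1 if the ID exists or on OOM. */
static int vits_add_device(void **root, struct its_device *dev)
{
    void *node;

    assert(dev != NULL);
    node = tsearch(dev, root, devid_cmp);
    if (node == NULL)
        return -1;                /* allocation failure */
    /* tsearch returns the existing node when the key is already present. */
    return (*(struct its_device **)node == dev) ? 0 : -1;
}

/* Look up a device by ID; NULL means the ID is not valid for this vITS. */
static struct its_device *vits_find_device(void *const *root, uint32_t devid)
{
    struct its_device key = { .devid = devid };
    void *node = tfind(&key, root, devid_cmp);

    return node ? *(struct its_device **)node : NULL;
}

/* Remove a device from the tree; the caller still owns the structure. */
static void vits_del_device(void **root, struct its_device *dev)
{
    tdelete(dev, root, devid_cmp);
}
```

Validation of a device ID in a command then amounts to a single `vits_find_device()` call, which both checks that the device exists and returns the structure needed for the rest of the translation.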

## Collection

This parameter is used in commands which manage collections and interrupts in order to move them from one CPU to another. The ITS is only required to implement N + 1 collections, where N is the number of processors on the platform. Furthermore, the identifiers are always contiguous.

If we decide to implement the strict minimum (i.e N + 1), an array is
enough and will allow operations in O(1).
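
A sketch of such an array, assuming exactly N + 1 entries allocated when the vITS is created, could look like this. The structure and function names are hypothetical, and the choice to let a second MAPC simply overwrite the previous target is an assumption of the example rather than a decided policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-vITS collection table with exactly N + 1 entries. */
struct vits_collection {
    bool     mapped;              /* set once a MAPC has been handled */
    uint32_t vcpu_id;             /* vCPU currently targeted */
};

struct vits {
    unsigned int nr_collections;  /* N + 1 */
    struct vits_collection *collections;
};

/* Preallocate the table at vITS creation; returns 0 or -1 on OOM. */
static int vits_init_collections(struct vits *v, unsigned int nr_vcpus)
{
    assert(nr_vcpus > 0);
    v->nr_collections = nr_vcpus + 1;
    v->collections = calloc(v->nr_collections, sizeof(*v->collections));
    return v->collections ? 0 : -1;
}

/* O(1) validation: a bounds check followed by an array index. */
static struct vits_collection *vits_get_collection(struct vits *v,
                                                   uint32_t vcid)
{
    if (vcid >= v->nr_collections)
        return NULL;
    return &v->collections[vcid];
}

/* Handle a MAPC-like update; mapping again overwrites the old target. */
static int vits_map_collection(struct vits *v, uint32_t vcid,
                               uint32_t vcpu_id)
{
    struct vits_collection *c = vits_get_collection(v, vcid);

    if (c == NULL || vcpu_id >= v->nr_collections - 1)
        return -1;
    c->vcpu_id = vcpu_id;
    c->mapped = true;
    return 0;
}
```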

## Target Address

This parameter is used in commands to manage collections. It's a unique
identifier per processor. The format differs depending on the value
of the bit GITS_TYPER.PTA (see definition). The value of the field is pre-defined by the ITS and the software has to handle both cases.

The solution with GITS_TYPER.PTA set to one will require some computation
in order to find the VCPU associated with the redistributor address. It will be similar to get_vcpu_from_rdist in the vGICv3 emulation (xen/arch/arm/vgic-v3.c).

On the other hand, setting GITS_TYPER.PTA to zero will give us control to
decide the linear processor number, which could simply be the vcpu_id (always
linear).
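
The difference between the two cases can be sketched as below. This is only an illustration under simplifying assumptions: the guest redistributors are taken to live in one contiguous region with a fixed stride per vCPU, the target address is assumed to be already expanded to a full guest physical address, and `struct vgic_layout` and `vta_to_vcpu` are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest layout: one contiguous redistributor region with
 * rdist_stride bytes per vCPU. */
struct vgic_layout {
    uint64_t rdist_base;
    uint64_t rdist_stride;
    unsigned int nr_vcpus;
};

/* Returns the vCPU ID targeted by vta, or -1 if vta is not valid. */
static int vta_to_vcpu(const struct vgic_layout *l, bool pta, uint64_t vta)
{
    uint64_t offset;

    /* PTA == 0: the target is the linear processor number (vcpu_id). */
    if (!pta)
        return vta < l->nr_vcpus ? (int)vta : -1;

    /* PTA == 1: the target is a redistributor base address. */
    assert(l->rdist_stride != 0);
    if (vta < l->rdist_base)
        return -1;
    offset = vta - l->rdist_base;
    if (offset % l->rdist_stride != 0)
        return -1;                /* not the start of a redistributor */
    if (offset / l->rdist_stride >= l->nr_vcpus)
        return -1;                /* beyond the last vCPU */
    return (int)(offset / l->rdist_stride);
}
```

With PTA set to zero the check is a single comparison, whereas the address form needs the range, alignment and division steps shown in the second half of the function.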

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

