
Re: [Xen-devel] [RFC] ARM PCI Passthrough design document




> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of
> Stefano Stabellini
> Sent: Friday, July 7, 2017 4:50 PM
> To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Cc: edgar.iglesias@xxxxxxxxxx; 'Stefano Stabellini' <sstabellini@xxxxxxxxxx>;
> Vikram Sethi <vikrams@xxxxxxxxxxxxxx>; 'Wei Chen' <Wei.Chen@xxxxxxx>;
> 'Steve Capper' <Steve.Capper@xxxxxxx>; 'Andre Przywara'
> <andre.przywara@xxxxxxx>; manish.jaggi@xxxxxxxxxxxxxxxxxx; 'Julien
> Grall' <julien.grall@xxxxxxxxxx>; 'Vikram Sethi' <vikrams@xxxxxxxxxxxxxxxx>;
> punit.agrawal@xxxxxxx; 'Sameer Goel' <sgoel@xxxxxxxxxxxxxxxx>; 'xen-
> devel' <xen-devel@xxxxxxxxxxxxxxxxxxxx>; 'Sinan Kaya'
> <okaya@xxxxxxxxxxxxxxxx>; 'Dave P Martin' <Dave.Martin@xxxxxxx>;
> 'Vijaya Kumar K' <Vijaya.Kumar@xxxxxxxxxxxxxxxxxx>
> Subject: Re: [Xen-devel] [RFC] ARM PCI Passthrough design document
>
> On Fri, 7 Jul 2017, Roger Pau Monné wrote:
> > On Thu, Jul 06, 2017 at 03:55:28PM -0500, Vikram Sethi wrote:
> > > > > > > AER: Will PCIe non-fatal and fatal errors (secondary bus reset
> > > > > > > for fatal) be recoverable in Xen?
> > > > > > > Will drivers in doms be notified about fatal errors so they
> > > > > > > can be quiesced before doing secondary bus reset in Xen?
> > > > > > > Will Xen support Firmware First Error handling for AER? i.e.
> > > > > > > when the platform does Firmware first error handling for AER
> > > > > > > and/or filtering of AER, and sends the associated ACPI HEST
> > > > > > > logs to Xen.
> > > > > > > How will AER notification and logs be propagated to the doms:
> > > > > > > injected ACPI HEST?
> > > >
> > > > Hm, I'm not sure I follow here, I don't see AER tied to ACPI. AER
> > > > is a PCIe capability, and according to the spec can be setup
> > > > completely independent to ACPI.
> > > >
> > > True, it can be independent if not using firmware first AER handling
> > > (FFH). But Firmware tells the OS whether firmware first is in use.
> > > If FFH is in use, the AER interrupt goes to firmware and then
> > > firmware processes
> >
> > I'm sorry, but how is the firmware supposed to know which interrupt is
> > AER using? That's AFAIK setup in the PCI AER capabilities, and depends
> > on whether the OS configures the device to use MSI or MSI-X.
> >
> > Is there some kind of side-band mechanism that delivers the AER
> > interrupt using a different method?
> >
The AER interrupt is not generated by the device that sends the "AER message"
to the root port; it comes from the root port (aka "event collector") itself,
i.e. the endpoint/adapter sends an AER message to the root port and the root
port sends an interrupt to the CPU.
Firmware should just KNOW what the IRQ number for the root port's AER interrupt
is when it is doing firmware first error handling (assuming the root port
generates a wired interrupt for AER).

The other part to this is how firmware and the OS agree on the
event/interrupt number used when FW sends the AER HEST log to the OS. This
comes from ACPI GHES.
See
http://elixir.free-electrons.com/linux/latest/source/drivers/acpi/apei/ghes.c#L954
There are many possibilities, such as SCI, IRQ/GSIV, GPIO event, etc.

> > > the AER logs, filters errors, and sends a ACPI HEST log with the
> > > filtered AER regs to OS along with an ACPI event/interrupt. Kernel
> > > is not supposed to touch the AER registers directly in this case,
> > > but act on the register values in the HEST log.
> > > http://elixir.free-electrons.com/linux/latest/source/drivers/pci/pcie/aer/aerdrv_acpi.c#L94
> >
> > That's not a problem IMHO, Xen could even mask the AER capability from
> > the Dom0/guest completely if needed.
> >
> > > If Firmware is using FFH, Xen will get a HEST log with AER
> > > registers, and must parse those registers instead of reading AER config
> > > space.
> >
> > Xen will not get an event, it's going to be delivered to Dom0 because
> > when using ACPI Dom0 is the OSPM (not Xen). I assume this event is
> > going to be notified by triggering an interrupt from the ACPI SCI?
>

See above. It is obtained from GHES and can be SCI, GSIV, GPIO signal etc.

> It is still possible to get the event in Xen, either by having Dom0 tell Xen
> about it, or by moving ACPI SCI handling into Xen. If we move ACPI SCI
> handling into Xen, we could still forward a virtual SCI interrupt to Dom0 in
> cases where Xen decides that Dom0 should be the one handling the event. In
> other cases, where Xen knows how to handle the event, then nothing would be
> sent to Dom0. Would that work?
>

It could work for GSIV/IRQ or SCI. But one of the possibilities is an ACPI 6.1
GED interrupt (GED = generic event device; yes, there are way too many acronyms
in ACPI :) ), and this requires ASL to be run, so it would need dom0.
See https://patchwork.kernel.org/patch/8115901/

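
The routing idea discussed here (Xen takes the event and forwards a virtual
SCI to dom0 only when needed, except that GED-style events always need dom0)
can be written down as a small decision function. This is only a sketch: the
enum names and the function are hypothetical and are not taken from Xen or
from the ACPI tables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical signal kinds, named after the options mentioned in this thread. */
enum ffh_signal { FFH_SIG_SCI, FFH_SIG_GSIV, FFH_SIG_GED };

/* Hypothetical routing outcome: who ends up processing the error event. */
enum ffh_route { FFH_ROUTE_XEN, FFH_ROUTE_DOM0 };

/*
 * Decide where a firmware-first error event is handled.
 * Assumption from the thread: a GED event requires ASL to be run, so it
 * must go to dom0. For SCI/GSIV, Xen keeps the event only when it knows
 * how to handle it; otherwise it is forwarded to dom0.
 */
enum ffh_route ffh_route_event(enum ffh_signal sig, bool xen_can_handle)
{
    assert(sig == FFH_SIG_SCI || sig == FFH_SIG_GSIV || sig == FFH_SIG_GED);

    if (sig == FFH_SIG_GED)
        return FFH_ROUTE_DOM0;

    return xen_can_handle ? FFH_ROUTE_XEN : FFH_ROUTE_DOM0;
}
```
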
> > > After the AER registers have been parsed (either from HEST log or
> > > native Xen AER interrupt handler), at least for fatal errors, Xen
> > > needs to send notification to the DOM with the device passthrough so
> > > that its driver(s) can be quiesced (via callbacks to
> > > dev->driver->err_handler->error_detected for linux) before hot
> > > reset/secondary bus reset.
> >
> > I don't think this is relevant/true given the statement above (Dom0
> > being OSPM and receiving the event).
> >

Sure, if dom0 gets the AER interrupt or ACPI "event" for FFH, then there is no 
need to forward anything.

> > > Whether FFH is in use or not, Xen has 2 choices in how to present
> > > the error to doms for quiescing before secondary bus reset:
> >
> > How is this secondary bus reset performed?
>
> It is based on writing to PCI config space registers
> (drivers/pci/pci.c:pci_reset_secondary_bus). If Xen is in charge of ECAM, it
> shouldn't be an issue for Xen to do it.
>
>
> > Is it something specific to each bridge or it's a standard interface?
> >
> > Can it be done directly by Dom0, or should it be done by Xen?
> >

Triggering the secondary bus reset is straightforward: it is a PCI-defined bit
(SBR) in the root port's Bridge Control register.
It could be done in either Xen or dom0, but it probably makes sense to do it
where the config cycles are being "controlled" and by whoever is doing the PCI
probe.
I had misunderstood Julien's design to mean PCI probing was being done by Xen,
but on a 2nd read he's saying dom0/hw domain does the PCI probe and notifies
Xen of the config.

BTW this does raise the question of who reads the root port's Access Control
Services (ACS) config space capability to decide what the safest "unit" of
assignment to doms is: Xen or dom0?
Clearly Xen should be the one deciding whether the root port (and any switches,
if present) supports ACS upstream forwarding etc., and whether it is safe to
assign just a function/VF or whether the entire PCI tree under the root port is
the "minimum" assignable entity.
For background see
http://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html
So is Xen issuing ECAM based config cycles to root port config space without
serialization with dom0, which can also issue root port and downstream config
accesses? I'm not sure if this can be an issue or not.
The other thing that I haven't fully processed yet is: when dom0 sends
information piecemeal to Xen ("here's a root port with SBDF1, here's some
device with SBDF2"), can Xen accurately reconstruct the entire PCI tree?
This is important because there could be PCIe switches under the root port
which may or may not support ACS, so Xen has to know where in the tree the
"min safe assignment" unit is.
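
To make the "min safe assignment unit" idea concrete, here is a simplified
sketch of the tree walk, assuming Xen already has the reconstructed tree. The
struct and function names are hypothetical. The rule used (the unit is the
subtree under the topmost bridge on the path that lacks ACS isolation) is a
first-order approximation of the IOMMU group logic described in the blog post
above; it ignores multifunction ACS and DMA aliases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node in the PCI tree that Xen has reconstructed. */
struct pci_node {
    struct pci_node *parent; /* upstream bridge; NULL for the root port */
    bool acs_isolates;       /* assumed: ACS upstream forwarding etc. enabled
                                (only meaningful for bridges) */
};

/*
 * Return the root of the smallest subtree that can be assigned safely.
 * Walk from the device towards the root port; every bridge on the path
 * that lacks ACS isolation pulls everything below it into one unit, so
 * the topmost such bridge wins. If all bridges isolate, the device itself
 * is the unit.
 */
const struct pci_node *min_assignable_unit(const struct pci_node *dev)
{
    const struct pci_node *unit = dev;
    const struct pci_node *bridge;

    assert(dev != NULL);

    for (bridge = dev->parent; bridge != NULL; bridge = bridge->parent)
        if (!bridge->acs_isolates)
            unit = bridge;

    return unit;
}
```

For example, with an endpoint behind a switch whose upstream port lacks ACS,
the function returns the switch upstream port, meaning everything under that
switch would have to be assigned together.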

> > > a. Send a HEST log and ACPI interrupt/event to dom if it booted ACPI
> > > dom and linux dom calls aer_recover_queue from ACPI ghes path
> > > http://elixir.free-electrons.com/linux/latest/source/drivers/pci/pcie/aer/aerdrv_core.c#L592
> > > b. Present a Root port wired interrupt source in dom ACPI/DT, and
> > > inject that irq in the GIC LR registers. When dom kernel processes
> > > the interrupt and queries
> >
> > You lost me here, I have no knowledge of ARM, and I don't know what
> > GIC LR is at all.
>
> GIC LRs are registers specific to the ARM Generic Interrupt Controller that
> allow a hypervisor to inject interrupts into a guest.  Vikram is saying that
> the irq could be injected into the guest.
>
>
> > > config space AER, Xen emulates the AER values it wants the dom to
> > > see (in FFH case based on register values in HEST), and if FFH was
> > > in use, not actually allow the dom to clear out the AER registers.
> > >
> > > Option b is probably better/easier since it works for ACPI/DT dom.
> >
> > So as I understand it, the flow is the following:
> >
> > 1. Hardware generates an error.
> > 2. This error triggers an interrupt that's delivered to Dom0 (either
> >    using an ACPI SCI or a specific AER MSI vector)
> > 3. *Someone* has to do a secondary bus reset.
> >
> > My question would be, who (either Xen or Dom0) should perform the bus
> > reset? (and why).
>
> I am interested in Vikram's reply, he knows more than me about this.
> However, my gut feeling is that it's best to do it in Xen because otherwise
> Xen might end up having to wait for Dom0 for the completion of the reset. The
> operation is not short and it includes a couple of
> sleeps: each sleep is an opportunity to trap into Xen again and risk
> descheduling the Dom0 vcpu.
>

Linux dom0 will attempt a SecBusReset config access to the root port anyway for
fatal errors. I earlier misunderstood that Xen would be trapping and issuing
all config cycles, but since dom0 is controlling the config space access, we
might as well let the dom-issued SBR go through.
Yes, there is a write to assert the SBR bit in bridge control, a wait of a
msec, and another write to deassert it.
Is your concern that if dom0 gets descheduled between the assert and deassert,
the recovery is delayed? Yes, that is true, but it should be tolerable I
think. Since devices are getting reset and drivers reinitialized, there will
always be a hiccup/gap/some temporary disruption.
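
For reference, the assert/wait/deassert sequence can be sketched against a
simulated config space. The Bridge Control offset (0x3E) and the SBR bit
(0x40) are the standard type 1 header values; everything else (struct
fake_bridge, the accessors, the 1 ms hold time, the missing settle delay
after deassert) is a hypothetical stand-in and not how Xen or Linux actually
implement it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_BRIDGE_CONTROL       0x3E /* Bridge Control register (type 1 header) */
#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary Bus Reset (SBR) bit */

/* Hypothetical stand-in for a root port: config space plus bookkeeping. */
struct fake_bridge {
    uint8_t cfg[256];       /* simulated config space, little-endian */
    unsigned int resets;    /* completed assert -> deassert cycles */
    unsigned int waited_ms; /* total delay requested during the reset */
};

static uint16_t cfg_read16(const struct fake_bridge *b, unsigned int off)
{
    return (uint16_t)(b->cfg[off] | (b->cfg[off + 1] << 8));
}

static void cfg_write16(struct fake_bridge *b, unsigned int off, uint16_t val)
{
    uint16_t old = cfg_read16(b, off);

    /* Count one reset each time SBR goes from set to clear. */
    if (off == PCI_BRIDGE_CONTROL &&
        (old & PCI_BRIDGE_CTL_BUS_RESET) && !(val & PCI_BRIDGE_CTL_BUS_RESET))
        b->resets++;

    b->cfg[off] = (uint8_t)(val & 0xFF);
    b->cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Placeholder: a real implementation would sleep or busy-wait here. */
static void delay_ms(struct fake_bridge *b, unsigned int ms)
{
    b->waited_ms += ms;
}

void secondary_bus_reset(struct fake_bridge *b)
{
    uint16_t ctrl;

    assert(b != NULL);
    ctrl = cfg_read16(b, PCI_BRIDGE_CONTROL);

    /* Assert SBR, preserving the other bridge control bits. */
    cfg_write16(b, PCI_BRIDGE_CONTROL,
                (uint16_t)(ctrl | PCI_BRIDGE_CTL_BUS_RESET));
    delay_ms(b, 1); /* "wait a msec"; real code may hold the reset longer */

    /* Deassert SBR so the secondary bus comes out of reset. */
    cfg_write16(b, PCI_BRIDGE_CONTROL,
                (uint16_t)(ctrl & ~PCI_BRIDGE_CTL_BUS_RESET));
}
```

The window in which dom0 could be descheduled is between the two
cfg_write16() calls, i.e. during delay_ms().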

>
> > > In my view this is the basic AER error handling leaving the devices
> > > inaccessible.
> > > To recover/resume the devices, the owning dom would need to signal
> > > Xen once all its driver(s) have quiesced, letting Xen know it is ok
> > > to do the secondary bus reset (for AER fatal errors). The best way
> > > to signal this would be to let the dom try to hit SBR in the Root
> > > port bridge control register in config space, and Xen traps that and
> > > actually does the BCR.SBR write.
> > >
> > > Since Xen controls the ECAM config space access in Julien's proposed
> > > design, I don't see any fundamental issues with the above flow fitting
> > > into the design.
> >
> > I think it's very hard for me (or Julien) to know exactly how all the
> > PCI capabilities behave and interact with other components (like
> > ACPI).
> >
> > You seem to have a good amount of knowledge about this stuff, would
> > you mind writing your proposal as a diff to Julien's original
> > proposal, so that it can be properly reviewed and merged into the
> > design document?

Thanks,
Vikram
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux
Foundation Collaborative Project.




 

