|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v8 02/13] xen/arm: gic-v2: Implement GIC suspend/resume functions
On Fri, May 8, 2026 at 1:57 PM Luca Fancellu <Luca.Fancellu@xxxxxxx> wrote:
>
> Hi Mykola,
>
> >
> >>
> >>> + }
> >>> +
> >>> + off = i * sizeof(irqs->icfgr);
> >>> + for ( j = 0; j < ARRAY_SIZE(irqs->icfgr); j++ )
> >>> + writel_gicd(irqs->icfgr[j], GICD_ICFGR + off + j * 4);
> >>> + }
> >>> +
> >>> + /* Make sure all registers are restored and enable distributor */
> >>> + writel_gicd(gic_ctx.dist.ctlr, GICD_CTLR);
> >>> +
> >>> + /* Restore GIC CPU interface configuration */
> >>> + writel_gicc(gic_ctx.cpu.pmr, GICC_PMR);
> >>> + writel_gicc(gic_ctx.cpu.bpr, GICC_BPR);
> >>> +
> >>> + /* Enable GIC CPU interface */
> >>> + writel_gicc(gic_ctx.cpu.ctlr, GICC_CTLR);
> >>> +}
> >>> +
> >>
> >> I also see that we don’t save pending SGIs state (by
> >> GICD_CPENDSGIRn/GICD_SPENDSGIRn) or Active Priorities registers
> >> state (GICC_APRn/GICC_NSAPRn [latter if security extension are there]) as
> >> written in [1] “4.5 Preserving and restoring GIC state”,
> >> was it intentional?
> >
> > Yes, this was intentional.
> >
> > The GICv2 suspend callback is called at a quiescent point in the
> > SYSTEM_SUSPEND path: all domains are already shut down for suspend, guest
> > execution is quiesced, the scheduler is disabled, non-boot CPUs have been
> > offlined, and CPU0 enters gic_suspend() with local interrupts disabled.
> >
> > For SGIs, I don't consider GICD_CPENDSGIRn/GICD_SPENDSGIRn part of the saved
> > host GIC context. Xen uses physical SGIs as IPIs, and IPI delivery is an
> > internal synchronization mechanism, not architectural state that should be
> > replayed after SYSTEM_SUSPEND. Guest SGI state is virtual GIC state and is
> > not
> > represented by these physical GICD SGI pending registers.
>
> ack, I would maybe mention in the commit message that we exclude transient
> IPI/active-priority
> state at the suspend quiescent point.
Ack.
>
> >
> > For GICC_APRn/GICC_NSAPRn, those registers describe active priority state
> > for
> > interrupts already acknowledged by the CPU interface. The final suspend
> > path is
> > not expected to run with an active physical interrupt context. If those
> > registers were non-zero there, restoring only APR/NSAPR would not make the
> > corresponding interrupt handling context valid after resume, and could
> > instead
> > leave the CPU interface with stale active priority state.
>
> Ok I understand now, but if we are expecting here GICD_ISACTIVERn zeroed, why
> are
> we saving/restoring it? Shouldn’t we instead have a runtime check that it’s
> zero and in case
> it’s not bail out? And in the resume path we would only zero it.
>
> Am I missing something?
Good questions.
Yes, the distinction I should have made clearer is between CPU-interface
active-priority state and distributor active state.
For GICC_APRn/GICC_NSAPRn, I expect the state to be quiesced at this point.
Those registers track active priorities in the CPU interface. Xen reaches
gic_suspend() with local interrupts disabled, and for the guest-routed
interrupt case that can leave a distributor active bit behind, Xen has
already performed the physical EOI, so the CPU-interface priority has been
dropped.
There is no CPU-interface active-priority context that we can meaningfully
replay after resume.
That is different from GICD_ISACTIVERn. In EOImode==1, EOIR only drops the
priority. The interrupt remains active in the distributor until the separate
deactivation step. For a guest-routed interrupt Xen's GICv2 guest end path does
only the physical EOI; deactivation is completed later by the virtual GIC/GICV
path when the guest completes the interrupt.
This is why APR/NSAPR and ISACTIVERn are treated differently. For example:
1. A physical IRQ routed to a guest is acknowledged by Xen.
2. The GIC marks the interrupt active in the distributor.
3. Xen EOIs it, which drops the physical priority.
4. Xen queues/injects the interrupt to the vGIC.
5. The guest has not yet run, or the virtual interrupt is not yet deliverable
because of guest PMR/priority/local IRQ masking/vGIC state.
6. Therefore the guest-side deactivate has not happened yet, and the physical
distributor active bit remains set.
There is also a late suspend window in the current Xen path: domains are
suspended and the scheduler is disabled before local IRQs are disabled.
A guest-routed IRQ can therefore be taken by Xen after the guest is already
suspended, but before gic_suspend(). Xen can EOI/priority-drop it and queue
it for the guest, while the guest cannot run and deactivate it before the
GIC state is saved.
This is the same class of issue handled by Linux for GIC EOImode==1. Linux
saves/restores the active state because forwarded interrupts can remain active
while passed to a VM [1].
So I don't think GICD_ISACTIVERn should be treated as "must be zero" unless we
also add an explicit suspend-abort/quiesce policy for in-flight guest
interrupts. That would be a different design: detect non-zero active/in-flight
state, unwind suspend, thaw domains, let the guest drain/deactivate the
interrupts, and retry later. This series does not implement that policy. Given
the current flow, preserving GICD_ISACTIVERn avoids losing architectural
interrupt-controller state across suspend/resume.
I am not opposed to such a policy as a follow-up if we want stricter suspend
quiescence rules, but I think it should be designed explicitly rather than
inferred from the GIC save/restore code.
Best regards,
Mykola
[1]
https://patchwork.kernel.org/project/linux-arm-kernel/patch/1447701208-18150-5-git-send-email-marc.zyngier@xxxxxxx/
>
> >
> > So I did not add save/restore for GICD_CPENDSGIRn/GICD_SPENDSGIRn or
> > GICC_APRn/GICC_NSAPRn in this patch. I can add a short comment in v9 to make
> > this scope explicit.
> >
> > Please let me know if you think there is a suspend/resume path where this
> > state still needs to be preserved.
> >
> > Best regards,
> > Mykola
>
> Cheers,
> Luca
>
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |