|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [XEN PATCH v5 7/7] xen/arm: ffa: support notification
Hi Julien,
On Mon, Jun 3, 2024 at 11:12 AM Julien Grall <julien@xxxxxxx> wrote:
>
> Hi Jens,
>
> On 03/06/2024 10:01, Jens Wiklander wrote:
> > On Fri, May 31, 2024 at 4:28 PM Bertrand Marquis
> > <Bertrand.Marquis@xxxxxxx> wrote:
> >>
> >> Hi Jens,
> >>
> >>> On 29 May 2024, at 09:25, Jens Wiklander <jens.wiklander@xxxxxxxxxx>
> >>> wrote:
> >>>
> >>> Add support for FF-A notifications, currently limited to an SP (Secure
> >>> Partition) sending an asynchronous notification to a guest.
> >>>
> >>> Guests and Xen itself are made aware of pending notifications with an
> >>> interrupt. The interrupt handler triggers a tasklet to retrieve the
> >>> notifications using the FF-A ABI and deliver them to their destinations.
> >>>
> >>> Update ffa_partinfo_domain_init() to return error code like
> >>> ffa_notif_domain_init().
> >>>
> >>> Signed-off-by: Jens Wiklander <jens.wiklander@xxxxxxxxxx>
> >>> ---
> >>> v4->v5:
> >>> - Move the freeing of d->arch.tee to the new TEE mediator free_domain_ctx
> >>> callback to make the context accessible during rcu_lock_domain_by_id()
> >>> from a tasklet
> >>> - Initialize interrupt handlers for secondary CPUs from the new TEE
> >>> mediator
> >>> init_interrupt() callback
> >>> - Restore the ffa_probe() from v3, that is, remove the
> >>> presmp_initcall(ffa_init) approach and use ffa_probe() as usual now
> >>> that we
> >>> have the init_interrupt callback.
> >>> - A tasklet is added to handle the Schedule Receiver interrupt. The
> >>> tasklet
> >>> finds each relevant domain with rcu_lock_domain_by_id() which now is
> >>> enough
> >>> to guarantee that the FF-A context can be accessed.
> >>> - The notification interrupt handler only schedules the notification
> >>> tasklet mentioned above
> >>>
> >>> v3->v4:
> >>> - Add another note on FF-A limitations
> >>> - Clear secure_pending in ffa_handle_notification_get() if both SP and SPM
> >>> bitmaps are retrieved
> >>> - ASSERT that ffa_rcu_lock_domain_by_vm_id() isn't passed the vm_id FF-A
> >>> uses for Xen itself
> >>> - Replace the get_domain_by_id() call done via ffa_get_domain_by_vm_id()
> >>> in
> >>> notif_irq_handler() with a call to rcu_lock_live_remote_domain_by_id()
> >>> via
> >>> ffa_rcu_lock_domain_by_vm_id()
> >>> - Remove spinlock in struct ffa_ctx_notif and use atomic functions as
> >>> needed
> >>> to access and update the secure_pending field
> >>> - In notif_irq_handler(), look for the first online CPU instead of
> >>> assuming
> >>> that the first CPU is online
> >>> - Initialize FF-A via presmp_initcall() before the other CPUs are online,
> >>> use register_cpu_notifier() to install the interrupt handler
> >>> notif_irq_handler()
> >>> - Update commit message to reflect recent updates
> >>>
> >>> v2->v3:
> >>> - Add a GUEST_ prefix and move FFA_NOTIF_PEND_INTR_ID and
> >>> FFA_SCHEDULE_RECV_INTR_ID to public/arch-arm.h
> >>> - Register the Xen SRI handler on each CPU using on_selected_cpus() and
> >>> setup_irq()
> >>> - Check that the SGI ID retrieved with FFA_FEATURE_SCHEDULE_RECV_INTR
> >>> doesn't conflict with static SGI handlers
> >>>
> >>> v1->v2:
> >>> - Addressing review comments
> >>> - Change ffa_handle_notification_{bind,unbind,set}() to take struct
> >>> cpu_user_regs *regs as argument.
> >>> - Update ffa_partinfo_domain_init() and ffa_notif_domain_init() to return
> >>> an error code.
> >>> - Fixing a bug in handle_features() for FFA_FEATURE_SCHEDULE_RECV_INTR.
> >>> ---
> >>> xen/arch/arm/tee/Makefile | 1 +
> >>> xen/arch/arm/tee/ffa.c | 72 +++++-
> >>> xen/arch/arm/tee/ffa_notif.c | 409 ++++++++++++++++++++++++++++++++
> >>> xen/arch/arm/tee/ffa_partinfo.c | 9 +-
> >>> xen/arch/arm/tee/ffa_private.h | 56 ++++-
> >>> xen/arch/arm/tee/tee.c | 2 +-
> >>> xen/include/public/arch-arm.h | 14 ++
> >>> 7 files changed, 548 insertions(+), 15 deletions(-)
> >>> create mode 100644 xen/arch/arm/tee/ffa_notif.c
> >>>
> > [...]
> >>>
> >>> @@ -517,8 +567,10 @@ err_rxtx_destroy:
> >>> static const struct tee_mediator_ops ffa_ops =
> >>> {
> >>> .probe = ffa_probe,
> >>> + .init_interrupt = ffa_notif_init_interrupt,
> >>
> >> With the previous change on init secondary i would suggest to
> >> have a ffa_init_secondary here actually calling the
> >> ffa_notif_init_interrupt.
> >
> > Yes, that makes sense. I'll update.
> >
> >>
> >>> .domain_init = ffa_domain_init,
> >>> .domain_teardown = ffa_domain_teardown,
> >>> + .free_domain_ctx = ffa_free_domain_ctx,
> >>> .relinquish_resources = ffa_relinquish_resources,
> >>> .handle_call = ffa_handle_call,
> >>> };
> >>> diff --git a/xen/arch/arm/tee/ffa_notif.c b/xen/arch/arm/tee/ffa_notif.c
> >>> new file mode 100644
> >>> index 000000000000..e8e8b62590b3
> >>> --- /dev/null
> >>> +++ b/xen/arch/arm/tee/ffa_notif.c
> > [...]
> >>> +static void notif_vm_pend_intr(uint16_t vm_id)
> >>> +{
> >>> + struct ffa_ctx *ctx;
> >>> + struct domain *d;
> >>> + struct vcpu *v;
> >>> +
> >>> + /*
> >>> + * vm_id == 0 means a notifications pending for Xen itself, but
> >>> + * we don't support that yet.
> >>> + */
> >>> + if ( !vm_id )
> >>> + return;
> >>> +
> >>> + d = ffa_rcu_lock_domain_by_vm_id(vm_id);
> >>> + if ( !d )
> >>> + return;
> >>> +
> >>> + ctx = d->arch.tee;
> >>> + if ( !ctx )
> >>> + goto out_unlock;
> >>
> >> In both previous cases you are silently ignoring an interrupt
> >> due to an internal error.
> >> Is this something that we should trace ? maybe just debug ?
> >>
> >> Could you add a comment to explain why this could happen
> >> (when possible) or not and the possible side effects ?
> >>
> >> The second one would be a notification for a domain without
> >> FF-A enabled which is ok but i am a bit more wondering on
> >> the first one.
> >
> > The SPMC must be out of sync in both cases. I've been looking for a
> > window where that can happen, but I can't find any. SPMC is called
> > with FFA_NOTIFICATION_BITMAP_DESTROY during domain teardown so the
> > SPMC shouldn't try to deliver any notifications after that.
>
> I don't think I agree with the conclusion. I believe, this can also
> happen in normal operation.
>
> For example, the SPMC could have trigger the interrupt before
> FFA_NOTIFICATION_BITMAP_DESTROY but Xen didn't handle the interrupt (or
> run the tasklet) until later.
You're right, there is a window. Delayed handling is OK since
FFA_NOTIFICATION_INFO_GET_64 is invoked from the tasklet, but there is
a window if the tasklet is suspended or another core destroys the
domain before the tasklet has called ffa_rcu_lock_domain_by_vm_id().
So far it's harmless and I guess we can afford a print.
>
> This could be at the time where the domain has been fully destroyed or
> even when...
>
> > In the second case, the domain ID might have been reused for a domain
> > without FF-A enabled, but the SPMC should have known that already.
>
> ... a new domain has been created. Although, the latter is rather unlikely.
>
> So what if the new domain has FFA enabled? Is there any potential
> security issue?
In this case, we'll inject an NPI in the guest, but when it invokes
FFA_NOTIFICATION_GET it will get accurate information from the SPMC.
The worst case is a spurious NPI. This shouldn't be a security issue.
Thanks,
Jens
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |