
Re: [XEN PATCH v5 7/7] xen/arm: ffa: support notification



Hi Bertrand,

On Fri, May 31, 2024 at 4:28 PM Bertrand Marquis
<Bertrand.Marquis@xxxxxxx> wrote:
>
> Hi Jens,
>
> > On 29 May 2024, at 09:25, Jens Wiklander <jens.wiklander@xxxxxxxxxx> wrote:
> >
> > Add support for FF-A notifications, currently limited to an SP (Secure
> > Partition) sending an asynchronous notification to a guest.
> >
> > Guests and Xen itself are made aware of pending notifications with an
> > interrupt. The interrupt handler triggers a tasklet to retrieve the
> > notifications using the FF-A ABI and deliver them to their destinations.
> >
> > Update ffa_partinfo_domain_init() to return error code like
> > ffa_notif_domain_init().
> >
> > Signed-off-by: Jens Wiklander <jens.wiklander@xxxxxxxxxx>
> > ---
> > v4->v5:
> > - Move the freeing of d->arch.tee to the new TEE mediator free_domain_ctx
> >  callback to make the context accessible during rcu_lock_domain_by_id()
> >  from a tasklet
> > - Initialize interrupt handlers for secondary CPUs from the new TEE mediator
> >  init_interrupt() callback
> > - Restore the ffa_probe() from v3, that is, remove the
> >  presmp_initcall(ffa_init) approach and use ffa_probe() as usual now that we
> >  have the init_interrupt callback.
> > - A tasklet is added to handle the Schedule Receiver interrupt. The tasklet
> >  finds each relevant domain with rcu_lock_domain_by_id() which now is enough
> >  to guarantee that the FF-A context can be accessed.
> > - The notification interrupt handler only schedules the notification
> >  tasklet mentioned above
> >
> > v3->v4:
> > - Add another note on FF-A limitations
> > - Clear secure_pending in ffa_handle_notification_get() if both SP and SPM
> >  bitmaps are retrieved
> > - ASSERT that ffa_rcu_lock_domain_by_vm_id() isn't passed the vm_id FF-A
> >  uses for Xen itself
> > - Replace the get_domain_by_id() call done via ffa_get_domain_by_vm_id() in
> >  notif_irq_handler() with a call to rcu_lock_live_remote_domain_by_id() via
> >  ffa_rcu_lock_domain_by_vm_id()
> > - Remove spinlock in struct ffa_ctx_notif and use atomic functions as needed
> >  to access and update the secure_pending field
> > - In notif_irq_handler(), look for the first online CPU instead of assuming
> >  that the first CPU is online
> > - Initialize FF-A via presmp_initcall() before the other CPUs are online,
> >  use register_cpu_notifier() to install the interrupt handler
> >  notif_irq_handler()
> > - Update commit message to reflect recent updates
> >
> > v2->v3:
> > - Add a GUEST_ prefix and move FFA_NOTIF_PEND_INTR_ID and
> >  FFA_SCHEDULE_RECV_INTR_ID to public/arch-arm.h
> > - Register the Xen SRI handler on each CPU using on_selected_cpus() and
> >  setup_irq()
> > - Check that the SGI ID retrieved with FFA_FEATURE_SCHEDULE_RECV_INTR
> >  doesn't conflict with static SGI handlers
> >
> > v1->v2:
> > - Addressing review comments
> > - Change ffa_handle_notification_{bind,unbind,set}() to take struct
> >  cpu_user_regs *regs as argument.
> > - Update ffa_partinfo_domain_init() and ffa_notif_domain_init() to return
> >  an error code.
> > - Fixing a bug in handle_features() for FFA_FEATURE_SCHEDULE_RECV_INTR.
> > ---
> > xen/arch/arm/tee/Makefile       |   1 +
> > xen/arch/arm/tee/ffa.c          |  72 +++++-
> > xen/arch/arm/tee/ffa_notif.c    | 409 ++++++++++++++++++++++++++++++++
> > xen/arch/arm/tee/ffa_partinfo.c |   9 +-
> > xen/arch/arm/tee/ffa_private.h  |  56 ++++-
> > xen/arch/arm/tee/tee.c          |   2 +-
> > xen/include/public/arch-arm.h   |  14 ++
> > 7 files changed, 548 insertions(+), 15 deletions(-)
> > create mode 100644 xen/arch/arm/tee/ffa_notif.c
> >
[...]
> >
> > @@ -517,8 +567,10 @@ err_rxtx_destroy:
> > static const struct tee_mediator_ops ffa_ops =
> > {
> >     .probe = ffa_probe,
> > +    .init_interrupt = ffa_notif_init_interrupt,
>
> With the previous change on init secondary, I would suggest having an
> ffa_init_secondary here that actually calls ffa_notif_init_interrupt.

Yes, that makes sense. I'll update.
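
Roughly like this (just a sketch of what I have in mind, the wrapper
name follows your suggestion):

    static void ffa_init_secondary(void)
    {
        /* Set up the notification SGI handler on the calling CPU. */
        ffa_notif_init_interrupt();
    }

and in ffa_ops:

    .init_interrupt = ffa_init_secondary,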

>
> >     .domain_init = ffa_domain_init,
> >     .domain_teardown = ffa_domain_teardown,
> > +    .free_domain_ctx = ffa_free_domain_ctx,
> >     .relinquish_resources = ffa_relinquish_resources,
> >     .handle_call = ffa_handle_call,
> > };
> > diff --git a/xen/arch/arm/tee/ffa_notif.c b/xen/arch/arm/tee/ffa_notif.c
> > new file mode 100644
> > index 000000000000..e8e8b62590b3
> > --- /dev/null
> > +++ b/xen/arch/arm/tee/ffa_notif.c
[...]
> > +static void notif_vm_pend_intr(uint16_t vm_id)
> > +{
> > +    struct ffa_ctx *ctx;
> > +    struct domain *d;
> > +    struct vcpu *v;
> > +
> > +    /*
> > +     * vm_id == 0 means a notification pending for Xen itself, but
> > +     * we don't support that yet.
> > +     */
> > +    if ( !vm_id )
> > +        return;
> > +
> > +    d = ffa_rcu_lock_domain_by_vm_id(vm_id);
> > +    if ( !d )
> > +        return;
> > +
> > +    ctx = d->arch.tee;
> > +    if ( !ctx )
> > +        goto out_unlock;
>
> In both of the previous cases you are silently ignoring an interrupt
> due to an internal error.
> Is this something that we should trace? Maybe just debug?
>
> Could you add a comment to explain why this could happen (when
> possible) or not, and what the possible side effects are?
>
> The second case would be a notification for a domain without FF-A
> enabled, which is fine, but I am more concerned about the first one.

The SPMC must be out of sync in both cases. I've been looking for a
window where that can happen, but I can't find one. The SPMC is called
with FFA_NOTIFICATION_BITMAP_DESTROY during domain teardown, so it
shouldn't try to deliver any notifications for the domain after that.
In the second case, the domain ID might have been reused for a domain
without FF-A enabled, but the SPMC should already know that. I'll add
comments and prints since these errors indicate a bug somewhere.
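
Something along these lines (only a sketch, the exact wording of the
comments and messages may change in the next version):

    d = ffa_rcu_lock_domain_by_vm_id(vm_id);
    if ( !d )
    {
        /*
         * The SPMC reports a pending notification for a domain we
         * can't find. This shouldn't happen since we call
         * FFA_NOTIFICATION_BITMAP_DESTROY during teardown, so it
         * indicates a bug somewhere.
         */
        printk(XENLOG_ERR "ffa: notification for unknown vm_id %u\n",
               vm_id);
        return;
    }

    ctx = d->arch.tee;
    if ( !ctx )
    {
        /*
         * The domain ID may have been reused for a domain without
         * FF-A enabled, but then the SPMC shouldn't have a pending
         * notification for it. This also indicates a bug somewhere.
         */
        printk(XENLOG_ERR "ffa: notification for vm_id %u without FF-A\n",
               vm_id);
        goto out_unlock;
    }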

>
> > +
> > +    /*
> > +     * arch.tee is freed from complete_domain_destroy() so the RCU lock
> > +     * guarantees that the data structure isn't freed while we're accessing
> > +     * it.
> > +     */
> > +    ACCESS_ONCE(ctx->notif.secure_pending) = true;
> > +
> > +    /*
> > +     * Since we're only delivering global notifications, always
> > +     * deliver to the first online vCPU. It doesn't matter
> > +     * which one we choose, as long as it's available.
> > +     */
> > +    for_each_vcpu(d, v)
> > +    {
> > +        if ( is_vcpu_online(v) )
> > +        {
> > +            vgic_inject_irq(d, v, GUEST_FFA_NOTIF_PEND_INTR_ID,
> > +                            true);
> > +            break;
> > +        }
> > +    }
> > +    if ( !v )
> > +        printk(XENLOG_ERR "ffa: can't inject NPI, all vCPUs offline\n");
> > +
> > +out_unlock:
> > +    rcu_unlock_domain(d);
> > +}

Thanks,
Jens