Re: [RFC] xen/arm64: livepatch: enable attaching callbacks
Hi Andrew and Roger,
(I'm still looking into the feedback from
Roger but will reply shortly, thanks!)
On Mon, 29 Jun 2026 17:33:40 +0100, Andrew Cooper wrote:
>On 29/06/2026 3:01 am, Ryo Takakura wrote:
>> Linux ftrace allows registering callbacks, which is useful
>> for debugging and tracing events. On Linux, it is done by
>> reserving function entry points at compile time which can
>> later be patched to branch to a trampoline.
>>
>> This patch implements a similar callback feature, but with a
>> different approach that uses the existing livepatch infrastructure.
>> Instead of reserving function entry points at compile time,
>> the traced function is livepatched so that it branches
>> to the trampoline.
>>
>> The role of the trampoline (illustrated below) is to preserve
>> the context while jumping to the tracer function, and to return
>> to the traced function with its context restored.
>>
>> trampoline:
>> Save regs
>> Call tracer function
>> Restore regs
>> old_addr
>> return old_addr + 4
>>
>> One can request the feature by setting @trampoline_buf to 1
>> which will allocate a buffer for trampoline.
>>
>> Signed-off-by: Ryo Takakura <takakura@xxxxxxxxxxxxx>
>
>Having something a bit more like Linux tracing would be nice. But, this
>is very different to the other livepatching functionality and a few bits
>don't match nicely.
>
>First, you write a lot of the trampoline manually. You can do most of
>this in the target function with
>__attribute__((no_caller_saved_registers)), avoiding the need to do it
>by hand. This would require a minimum GCC of 7 (where our baseline is
>5) but it's acceptable for new features to require a newer compiler.
I wasn't aware of the attribute. I agree that using it would avoid
the manual save/restore. I'll look into it.
>Secondly, what happens if the instruction at old_addr is an ADRP, or a
>branch? Right now, there's no case where we move an instruction; we
>only produce new code, and branch from old to new.
>
>When you're moving the instruction at old_addr, you must compensate for
>any IP-relative component. Also you can in principle have a conditional
>branch as the first instruction, which gives you two branches to fix up
>at the end of the trampoline, rather than one.
>
>On x86, you've got an additional problem that it's generally more than
>one instruction, and rarely an exact number of instructions overwritten
>at old_addr.
>
>Some high level comments, leaving aside the details until the above
>questions are better understood.
Thanks for the suggestion here!
I can come up with two solutions based on my understanding:
1. Fix up the instruction when copying it to the trampoline
2. Reserve an empty function preamble (just like Linux)
I think the 1st approach can be implemented fairly easily on Arm
given its fixed instruction length. (Or maybe we can simply check
whether the instruction is safe to copy and reject it otherwise?)
But I believe x86, with its varying instruction size, wouldn't be
as easy as Arm. So, as suggested by Roger in the following email,
maybe it is better to add an empty function preamble?
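To make the "check and reject" option a bit more concrete, here is a
minimal sketch of what such a check could look like on arm64. This is
not existing Xen code: insn_is_pc_relative() is a hypothetical helper,
and the masks are my reading of the A64 encoding groups (ADR/ADRP,
B/BL, B.cond, CBZ/CBNZ, TBZ/TBNZ and the literal loads), so they would
need checking against the Arm ARM before use.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing Xen function): return true if the
 * A64 instruction word depends on its own address, so that copying it
 * verbatim into a trampoline would change its meaning.
 */
static bool insn_is_pc_relative(uint32_t insn)
{
    /* ADR / ADRP: bits [28:24] == 0b10000 */
    if ( (insn & 0x1F000000u) == 0x10000000u )
        return true;

    /* B / BL (immediate): bits [30:26] == 0b00101 */
    if ( (insn & 0x7C000000u) == 0x14000000u )
        return true;

    /* B.cond: bits [31:24] == 0b01010100 */
    if ( (insn & 0xFF000000u) == 0x54000000u )
        return true;

    /* CBZ / CBNZ: bits [30:25] == 0b011010 */
    if ( (insn & 0x7E000000u) == 0x34000000u )
        return true;

    /* TBZ / TBNZ: bits [30:25] == 0b011011 */
    if ( (insn & 0x7E000000u) == 0x36000000u )
        return true;

    /* LDR / LDRSW / PRFM (literal): bits [29:27] == 0b011, [25:24] == 0b00 */
    if ( (insn & 0x3B000000u) == 0x18000000u )
        return true;

    return false;
}
```

With something like this, applying the patch could refuse to trace a
function whose first instruction is, say, an ADRP (0x90000000), while
accepting a typical prologue such as "stp x29, x30, [sp, #-16]!"
(0xA9BF7BFD).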
>> diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c
>> index e135bd5bf9..b7c9aba94e 100644
>> --- a/xen/arch/arm/arm64/livepatch.c
>> +++ b/xen/arch/arm/arm64/livepatch.c
>> @@ -34,12 +57,87 @@ void arch_livepatch_apply(const struct livepatch_func *func,
>>      /* Save old ones. */
>>      memcpy(state->insn_buffer, func->old_addr, len);
>>
>> -    if ( func->new_addr )
>> +    if ( !func->new_addr )
>> +    {
>> +        insn = aarch64_insn_gen_nop();
>> +    }
>> +    else if ( func->trampoline_buf )
>> +    {
>> +        int rc;
>> +        uint32_t *trampoline = func->trampoline_buf;
>> +        uint32_t *tp = trampoline;
>> +        void *orig_cont_addr = (void *)func->old_addr + len;
>> +        unsigned int trampoline_code_size = len + 12 * ARCH_PATCH_INSN_SIZE;
>> +        unsigned long trampoline_start = (unsigned long)trampoline & PAGE_MASK;
>> +        unsigned long trampoline_end =
>> +            PAGE_ALIGN((unsigned long)trampoline + trampoline_code_size);
>> +
>> +        /*
>> +         * Make the payload text area writeable while generating
>> +         * the trampoline instructions.
>> +         */
>> +        rc = modify_xen_mappings(trampoline_start, trampoline_end,
>> +                                 PAGE_HYPERVISOR);
>> +        if ( rc )
>> +        {
>> +            printk(XENLOG_ERR LIVEPATCH
>> +                   "Failed to make trampoline writable: %d\n", rc);
>> +            return;
>> +        }
>
>This ought not to be necessary.
>
>The trampoline is executable code, so should have space reserved for it
>in .text of the livepatch.
>
>Then, you can identify it simply by references in a new section, without
>having to have a pointer with a sentinel value (void *)1 in (which MISRA
>will have a fit at).
I like this idea as well! I'll try this together with the earlier
suggestion using __attribute__((no_caller_saved_registers)).
>> +
>> +        /* Save state before calling the tracer. */
>> +        *tp++ = aarch64_insn_gen_stp_pre(0, 1);
>> +        *tp++ = aarch64_insn_gen_stp_pre(2, 3);
>> +        *tp++ = aarch64_insn_gen_stp_pre(4, 5);
>> +        *tp++ = aarch64_insn_gen_stp_pre(6, 7);
>> +        *tp++ = aarch64_insn_gen_stp_pre(29, 30);
>> +
>> +        /* Call user's tracing function. */
>> +        insn = aarch64_insn_gen_branch_imm(
>> +            (unsigned long)tp,
>> +            (unsigned long)func->new_addr,
>> +            AARCH64_INSN_BRANCH_LINK);
>> +        *tp++ = insn;
>> +
>> +        /* Restore state before continuing original function. */
>> +        *tp++ = aarch64_insn_gen_ldp_post(29, 30);
>> +        *tp++ = aarch64_insn_gen_ldp_post(6, 7);
>> +        *tp++ = aarch64_insn_gen_ldp_post(4, 5);
>> +        *tp++ = aarch64_insn_gen_ldp_post(2, 3);
>> +        *tp++ = aarch64_insn_gen_ldp_post(0, 1);
>> +
>> +        /* Original instruction. */
>> +        memcpy(tp, state->insn_buffer, len);
>> +        tp += len / ARCH_PATCH_INSN_SIZE;
>> +
>> +        /* Branch back to original function. */
>> +        insn = aarch64_insn_gen_branch_imm(
>> +            (unsigned long)tp,
>> +            (unsigned long)orig_cont_addr,
>> +            AARCH64_INSN_BRANCH_NOLINK);
>> +        *tp++ = insn;
>> +
>> +        clean_and_invalidate_dcache_va_range(trampoline, trampoline_code_size);
>> +
>> +        rc = modify_xen_mappings(trampoline_start, trampoline_end,
>> +                                 PAGE_HYPERVISOR_RX);
>> +        if ( rc )
>> +        {
>> +            printk(XENLOG_ERR LIVEPATCH
>> +                   "Failed to restore trampoline RX mapping: %d\n", rc);
>> +            return;
>> +        }
>> +
>> +        /* Branch from original function to trampoline. */
>> +        insn = aarch64_insn_gen_branch_imm(
>> +            (unsigned long)func->old_addr,
>> +            (unsigned long)func->trampoline_buf,
>> +            AARCH64_INSN_BRANCH_NOLINK);
>
>This entire block wants breaking out into a function for writing the
>trampoline. It does not want to live inline in arch_livepatch_apply().
I'll fix this.
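As a rough idea of the shape I have in mind, below is a standalone
sketch of the trampoline writer as its own function. It is only an
illustration under assumptions: write_trampoline(), enc_stp_pre(),
enc_ldp_post() and enc_branch() are hypothetical names that stand in
for the aarch64_insn_gen_*() helpers used in the patch, the encodings
are written out by hand, and buf_va is the address the buffer will
execute at. It still copies the original instruction verbatim, so it
is only valid when that instruction is not PC-relative.

```c
#include <stddef.h>
#include <stdint.h>

#define INSN_SIZE 4u

/* STP Xa, Xb, [SP, #-16]! (64-bit, pre-index). */
static uint32_t enc_stp_pre(unsigned int a, unsigned int b)
{
    return 0xA9BF0000u | (b << 10) | (31u << 5) | a;
}

/* LDP Xa, Xb, [SP], #16 (64-bit, post-index). */
static uint32_t enc_ldp_post(unsigned int a, unsigned int b)
{
    return 0xA8C10000u | (b << 10) | (31u << 5) | a;
}

/* B (link == 0) or BL (link != 0) from pc to target; 0 if out of range. */
static uint32_t enc_branch(uint64_t pc, uint64_t target, int link)
{
    int64_t off = (int64_t)(target - pc);

    /* imm26 is a word offset, giving a range of +/-128MB. */
    if ( (off & 3) || off < -(INT64_C(1) << 27) || off >= (INT64_C(1) << 27) )
        return 0;

    return (link ? 0x94000000u : 0x14000000u) |
           (uint32_t)(((uint64_t)off >> 2) & 0x03FFFFFFu);
}

/*
 * Fill buf with: save regs, BL tracer, restore regs, original
 * instruction, B back to cont. Returns the number of instructions
 * written, or 0 on failure (buffer too small or branch out of range).
 */
static size_t write_trampoline(uint32_t *buf, size_t slots, uint64_t buf_va,
                               uint32_t orig_insn, uint64_t tracer,
                               uint64_t cont)
{
    static const unsigned int pairs[][2] = {
        { 0, 1 }, { 2, 3 }, { 4, 5 }, { 6, 7 }, { 29, 30 },
    };
    const size_t nr_pairs = sizeof(pairs) / sizeof(pairs[0]);
    size_t i, n = 0;
    uint32_t insn;

    if ( slots < 2 * nr_pairs + 3 )
        return 0;

    /* Save state before calling the tracer. */
    for ( i = 0; i < nr_pairs; i++ )
        buf[n++] = enc_stp_pre(pairs[i][0], pairs[i][1]);

    /* Call the tracer. */
    insn = enc_branch(buf_va + n * INSN_SIZE, tracer, 1);
    if ( !insn )
        return 0;
    buf[n++] = insn;

    /* Restore state in reverse order. */
    for ( i = nr_pairs; i-- > 0; )
        buf[n++] = enc_ldp_post(pairs[i][0], pairs[i][1]);

    /* Original instruction, copied verbatim. */
    buf[n++] = orig_insn;

    /* Branch back to the instruction after the patched one. */
    insn = enc_branch(buf_va + n * INSN_SIZE, cont, 0);
    if ( !insn )
        return 0;
    buf[n++] = insn;

    return n;
}
```

arch_livepatch_apply() would then only need to call the writer and, on
success, generate the single branch from old_addr to the trampoline.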
>> diff --git a/xen/include/xen/livepatch.h b/xen/include/xen/livepatch.h
>> index 45c8924f34..7a81763cf2 100644
>> --- a/xen/include/xen/livepatch.h
>> +++ b/xen/include/xen/livepatch.h
>> @@ -48,6 +48,8 @@ struct xen_sysctl_livepatch_op;
>> #define ELF_LIVEPATCH_POSTREVERT_HOOK ".livepatch.hooks.postrevert"
>> /* Arbitrary limit for payload size and .bss section size. */
>> #define LIVEPATCH_MAX_SIZE MB(2)
>> +/* Size of a trampoline used for function tracing */
>> +#define LIVEPATCH_TRAMPOLINE_SIZE 128
>
>This is a common header. How have you calculated 128?
>
>At best, it's an Aarch64 specific number, but if you reserve space
>properly in .text then it won't even matter, I don't think.
The value was arbitrary; I thought it would be enough for the
buffer... This should be taken care of by the suggested approach
of reserving space in .text.
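For reference, the size the patch actually writes can be worked out
from trampoline_code_size in the diff. The sketch below just redoes
that arithmetic, assuming ARCH_PATCH_INSN_SIZE is 4 on arm64 and that
len covers a single instruction; trampoline_bytes() is a hypothetical
name used only here.

```c
#include <stddef.h>

#define ARCH_PATCH_INSN_SIZE      4u    /* assumed arm64 value */
#define LIVEPATCH_TRAMPOLINE_SIZE 128u  /* value from the RFC */

/*
 * 5 STP + 1 BL + 5 LDP + 1 B = 12 generated instructions, plus the
 * len bytes copied from old_addr.
 */
static size_t trampoline_bytes(size_t len)
{
    return len + 12 * ARCH_PATCH_INSN_SIZE;
}
```

With len = 4 this gives 52 bytes, so 128 leaves plenty of slack but is
not derived from anything.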
Sincerely,
Ryo Takakura
>~Andrew
________________________________________
From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Sent: 30 June 2026 1:33
To: Ryo Takakura; xen-devel@xxxxxxxxxxxxxxxxxxxx
CC: Andrew Cooper; roger.pau@xxxxxxxxxx; ross.lagerwall@xxxxxxxxxx;
sstabellini@xxxxxxxxxx; julien@xxxxxxx; bertrand.marquis@xxxxxxx;
michal.orzel@xxxxxxx; Volodymyr_Babchuk@xxxxxxxx; anthony.perard@xxxxxxxxxx;
jbeulich@xxxxxxxx; Hirokazu Takahashi; Koichiro Den
Subject: Re: [RFC] xen/arm64: livepatch: enable attaching callbacks
On 29/06/2026 3:01 am, Ryo Takakura wrote:
> Linux ftrace allows registering callbacks, which is useful
> for debugging and tracing events. On Linux, it is done by
> reserving function entry points at compile time which can
> later be patched to branch to a trampoline.
>
> This patch implements a similar callback feature, but with a
> different approach that uses the existing livepatch infrastructure.
> Instead of reserving function entry points at compile time,
> the traced function is livepatched so that it branches
> to the trampoline.
>
> The role of the trampoline (illustrated below) is to preserve
> the context while jumping to the tracer function, and to return
> to the traced function with its context restored.
>
> trampoline:
> Save regs
> Call tracer function
> Restore regs
> old_addr
> return old_addr + 4
>
> One can request the feature by setting @trampoline_buf to 1
> which will allocate a buffer for trampoline.
>
> Signed-off-by: Ryo Takakura <takakura@xxxxxxxxxxxxx>
Having something a bit more like Linux tracing would be nice. But, this
is very different to the other livepatching functionality and a few bits
don't match nicely.
First, you write a lot of the trampoline manually. You can do most of
this in the target function with
__attribute__((no_caller_saved_registers)), avoiding the need to do it
by hand. This would require a minimum GCC of 7 (where our baseline is
5) but it's acceptable for new features to require a newer compiler.
Secondly, what happens if the instruction at old_addr is an ADRP, or a
branch? Right now, there's no case where we move an instruction; we
only produce new code, and branch from old to new.
When you're moving the instruction at old_addr, you must compensate for
any IP-relative component. Also you can in principle have a conditional
branch as the first instruction, which gives you two branches to fix up
at the end of the trampoline, rather than one.
On x86, you've got an additional problem that it's generally more than
one instruction, and rarely an exact number of instructions overwritten
at old_addr.
Some high level comments, leaving aside the details until the above
questions are better understood.
> diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c
> index e135bd5bf9..b7c9aba94e 100644
> --- a/xen/arch/arm/arm64/livepatch.c
> +++ b/xen/arch/arm/arm64/livepatch.c
> @@ -34,12 +57,87 @@ void arch_livepatch_apply(const struct livepatch_func *func,
>      /* Save old ones. */
>      memcpy(state->insn_buffer, func->old_addr, len);
>
> -    if ( func->new_addr )
> +    if ( !func->new_addr )
> +    {
> +        insn = aarch64_insn_gen_nop();
> +    }
> +    else if ( func->trampoline_buf )
> +    {
> +        int rc;
> +        uint32_t *trampoline = func->trampoline_buf;
> +        uint32_t *tp = trampoline;
> +        void *orig_cont_addr = (void *)func->old_addr + len;
> +        unsigned int trampoline_code_size = len + 12 * ARCH_PATCH_INSN_SIZE;
> +        unsigned long trampoline_start = (unsigned long)trampoline & PAGE_MASK;
> +        unsigned long trampoline_end =
> +            PAGE_ALIGN((unsigned long)trampoline + trampoline_code_size);
> +
> +        /*
> +         * Make the payload text area writeable while generating
> +         * the trampoline instructions.
> +         */
> +        rc = modify_xen_mappings(trampoline_start, trampoline_end,
> +                                 PAGE_HYPERVISOR);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR LIVEPATCH
> +                   "Failed to make trampoline writable: %d\n", rc);
> +            return;
> +        }
This ought not to be necessary.
The trampoline is executable code, so should have space reserved for it
in .text of the livepatch.
Then, you can identify it simply by references in a new section, without
having to have a pointer with a sentinel value (void *)1 in (which MISRA
will have a fit at).
> +
> +        /* Save state before calling the tracer. */
> +        *tp++ = aarch64_insn_gen_stp_pre(0, 1);
> +        *tp++ = aarch64_insn_gen_stp_pre(2, 3);
> +        *tp++ = aarch64_insn_gen_stp_pre(4, 5);
> +        *tp++ = aarch64_insn_gen_stp_pre(6, 7);
> +        *tp++ = aarch64_insn_gen_stp_pre(29, 30);
> +
> +        /* Call user's tracing function. */
> +        insn = aarch64_insn_gen_branch_imm(
> +            (unsigned long)tp,
> +            (unsigned long)func->new_addr,
> +            AARCH64_INSN_BRANCH_LINK);
> +        *tp++ = insn;
> +
> +        /* Restore state before continuing original function. */
> +        *tp++ = aarch64_insn_gen_ldp_post(29, 30);
> +        *tp++ = aarch64_insn_gen_ldp_post(6, 7);
> +        *tp++ = aarch64_insn_gen_ldp_post(4, 5);
> +        *tp++ = aarch64_insn_gen_ldp_post(2, 3);
> +        *tp++ = aarch64_insn_gen_ldp_post(0, 1);
> +
> +        /* Original instruction. */
> +        memcpy(tp, state->insn_buffer, len);
> +        tp += len / ARCH_PATCH_INSN_SIZE;
> +
> +        /* Branch back to original function. */
> +        insn = aarch64_insn_gen_branch_imm(
> +            (unsigned long)tp,
> +            (unsigned long)orig_cont_addr,
> +            AARCH64_INSN_BRANCH_NOLINK);
> +        *tp++ = insn;
> +
> +        clean_and_invalidate_dcache_va_range(trampoline, trampoline_code_size);
> +
> +        rc = modify_xen_mappings(trampoline_start, trampoline_end,
> +                                 PAGE_HYPERVISOR_RX);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR LIVEPATCH
> +                   "Failed to restore trampoline RX mapping: %d\n", rc);
> +            return;
> +        }
> +
> +        /* Branch from original function to trampoline. */
> +        insn = aarch64_insn_gen_branch_imm(
> +            (unsigned long)func->old_addr,
> +            (unsigned long)func->trampoline_buf,
> +            AARCH64_INSN_BRANCH_NOLINK);
This entire block wants breaking out into a function for writing the
trampoline. It does not want to live inline in arch_livepatch_apply().
> diff --git a/xen/include/xen/livepatch.h b/xen/include/xen/livepatch.h
> index 45c8924f34..7a81763cf2 100644
> --- a/xen/include/xen/livepatch.h
> +++ b/xen/include/xen/livepatch.h
> @@ -48,6 +48,8 @@ struct xen_sysctl_livepatch_op;
> #define ELF_LIVEPATCH_POSTREVERT_HOOK ".livepatch.hooks.postrevert"
> /* Arbitrary limit for payload size and .bss section size. */
> #define LIVEPATCH_MAX_SIZE MB(2)
> +/* Size of a trampoline used for function tracing */
> +#define LIVEPATCH_TRAMPOLINE_SIZE 128
This is a common header. How have you calculated 128?
At best, it's an Aarch64 specific number, but if you reserve space
properly in .text then it won't even matter, I don't think.
~Andrew