[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 3/3] x86/alternatives: relax apply BUGs during runtime
On 23/09/2024 2:06 pm, Roger Pau Monné wrote: > On Mon, Sep 23, 2024 at 12:17:51PM +0100, Andrew Cooper wrote: >> On 20/09/2024 10:36 am, Roger Pau Monne wrote: >>> alternatives is used both at boot time, and when loading livepatch payloads. >>> While for the former it makes sense to panic, it's not useful for the >>> later, as >>> for livepatches it's possible to fail to load the livepatch if alternatives >>> cannot be resolved and continue operating normally. >>> >>> Relax the BUGs in _apply_alternatives() to instead return an error code if >>> alternatives are applied after boot. >>> >>> Add a dummy livepatch test so the new logic can be assessed, with the change >>> applied the loading now fails with: >>> >>> alt table ffff82d040602024 -> ffff82d040602032 >>> livepatch: xen_alternatives_fail applying alternatives failed: -22 >>> >>> Signed-off-by: Roger Pau Monné <roge.rpau@xxxxxxxxxx> >>> --- >>> xen/arch/x86/alternative.c | 29 ++++++++++++++++------ >>> xen/arch/x86/include/asm/alternative.h | 3 ++- >>> xen/common/livepatch.c | 10 +++++++- >>> xen/test/livepatch/Makefile | 5 ++++ >>> xen/test/livepatch/xen_alternatives_fail.c | 29 ++++++++++++++++++++++ >>> 5 files changed, 66 insertions(+), 10 deletions(-) >>> create mode 100644 xen/test/livepatch/xen_alternatives_fail.c >>> >>> diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c >>> index 7824053c9d33..c0912cb12ea5 100644 >>> --- a/xen/arch/x86/alternative.c >>> +++ b/xen/arch/x86/alternative.c >>> @@ -174,10 +174,13 @@ extern void *const __initdata_cf_clobber_end[]; >>> * The caller will set the "force" argument to true for the final >>> * invocation, such that no CALLs/JMPs to NULL pointers will be left >>> * around. See also the further comment below. >>> + * >>> + * Note the function cannot fail if system_state < SYS_STATE_active, it >>> would >>> + * panic instead. The return value is only meaningful for runtime usage. >>> */ >>> -static void init_or_livepatch _apply_alternatives(struct alt_instr *start, >>> - struct alt_instr *end, >>> - bool force) >>> +static int init_or_livepatch _apply_alternatives(struct alt_instr *start, >>> + struct alt_instr *end, >>> + bool force) >>> { >>> struct alt_instr *a, *base; >>> >>> @@ -198,9 +201,17 @@ static void init_or_livepatch >>> _apply_alternatives(struct alt_instr *start, >>> uint8_t buf[MAX_PATCH_LEN]; >>> unsigned int total_len = a->orig_len + a->pad_len; >>> >>> - BUG_ON(a->repl_len > total_len); >>> - BUG_ON(total_len > sizeof(buf)); >>> - BUG_ON(a->cpuid >= NCAPINTS * 32); >>> +#define BUG_ON_BOOT(cond) \ >>> + if ( system_state < SYS_STATE_active ) \ >>> + BUG_ON(cond); \ >>> + else if ( cond ) \ >>> + return -EINVAL; >>> + >>> + BUG_ON_BOOT(a->repl_len > total_len); >>> + BUG_ON_BOOT(total_len > sizeof(buf)); >>> + BUG_ON_BOOT(a->cpuid >= NCAPINTS * 32); >>> + >>> +#undef BUG_ON_BOOT >> The "error handling" before was horrible and this isn't really any better. >> >> This function should always return int, printing more helpful info than >> that (a printk() says a thousand things better than a BUG()), and >> nmi_apply_alternatives can panic() rather than leaving these BUG()s here. > OK, will rework the logic here so it's the caller that panics (or not) > as necessary, and _apply_alternatives() always prints some error > message. Yes. With alternatives now behind sensible checks in the livepatch case, any failure here is relevant and wants printing. > >> As a tangent, the __must_check doesn't seem to have been applied to >> nmi_apply_alternatives(), but I'd suggest dropping the attribute; I >> don't think it adds much. > I didn't see the value in adding the attribute to > nmi_apply_alternatives(), as in that context _apply_alternatives() > would unconditionally panic instead of returning an error code. Ah, it was only apply_alternatives() you made __must_check, not _apply_alternatives(), which is why there isn't a compile error in nmi_apply_alternatives(). Still, I don't think the __must_check is much use. > >>> diff --git a/xen/test/livepatch/xen_alternatives_fail.c >>> b/xen/test/livepatch/xen_alternatives_fail.c >>> new file mode 100644 >>> index 000000000000..01d289095a31 >>> --- /dev/null >>> +++ b/xen/test/livepatch/xen_alternatives_fail.c >>> @@ -0,0 +1,29 @@ >>> +/* >>> + * Copyright (c) 2024 Cloud Software Group. >>> + * >>> + */ >>> + >>> +#include "config.h" >>> +#include <xen/lib.h> >>> +#include <xen/livepatch_payload.h> >>> + >>> +#include <asm/alternative.h> >>> +#include <asm/cpuid.h> >>> + >>> +void test_alternatives(void) >>> +{ >>> + alternative("", "", NCAPINTS * 32); >>> +} >>> + >>> +/* Set a hook so the loading logic in Xen don't consider the payload >>> empty. */ >>> +LIVEPATCH_LOAD_HOOK(test_alternatives); >>> + >>> +/* >>> + * Local variables: >>> + * mode: C >>> + * c-file-style: "BSD" >>> + * c-basic-offset: 4 >>> + * tab-width: 4 >>> + * indent-tabs-mode: nil >>> + * End: >>> + */ >> The second half of the patch (new testcase) is all fine and good, but >> should pass with patch 2 in place? I'd suggest splitting it out. > No, not really. The Xen buildid for this patch will be correctly set > to match the running one, but the alternatives feature CPUID is > explicitly set to an out of range value (NCAPINTS * 32) to trigger the > BUG_ON condition. Ah yes. Good point. In which case it probably ought to stay in this patch. > Further thinking about it, I think we should add a build time assert > that the feature parameters in the alternative calls are smaller than > NCAPINTS * 32. A build check where? It's quite hard to do in alternatives themselves. ~Andrew
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |