Re: [PATCH v7 01/12] xen/arm: enable SVE extension for Xen
Hi Luca,
> On 23 May 2023, at 09:43, Luca Fancellu <Luca.Fancellu@xxxxxxx> wrote:
>
> Enable Xen to handle the SVE extension: add code in the cpufeature
> module to handle the ZCR SVE register, and disable trapping of the
> SVE feature on system boot only while SVE resources are accessed.
> While there, correct the coding style of the comment on coprocessor
> trapping.
>
> cptr_el2 is now part of the domain context and is restored on
> context switch; this prepares for saving the SVE context, which will
> be part of the VFP operations, so restore it before the call that
> saves the VFP registers.
> To save an additional isb barrier, restore cptr_el2 before an
> existing isb barrier and move the call that saves the VFP context
> after that barrier. To keep ctxt_switch_to() and ctxt_switch_from()
> (mostly) mirror images of each other, move vfp_save_state() up in
> the function.
>
> Change the Kconfig entry to make the ARM64_SVE symbol selectable;
> by default it is not selected.
>
> Create the sve module and sve-asm.S, which contains assembly
> routines for the SVE feature; this code is inspired by Linux and
> uses instruction encodings to stay compatible with compilers that
> do not support SVE. The imported instructions are documented in
> README.LinuxPrimitives.
>
> Signed-off-by: Luca Fancellu <luca.fancellu@xxxxxxx>
Reviewed-by: Bertrand Marquis <bertrand.marquis@xxxxxxx>
with one minor NIT that could be fixed on commit...
> ---
> Changes from v6:
> - modified licence, add emacs block, move vfp_save_state up in the
> function, add comments to CPTR_EL2 and vfp_restore_state, don't
> use variable in init_traps(), code style fixes,
> add entries to README.LinuxPrimitives (Julien)
> - vl_to_zcr is moved into sve.c module as changes to the series led
> to its usage only inside it, remove stub for compute_max_zcr and
> rely on compiler DCE.
> Changes from v5:
> - Add R-by Bertrand
> Changes from v4:
> - don't use fixed types in vl_to_zcr, forgot to address that in
> v3, by mistake I changed that in patch 2, fixing now (Jan)
> Changes from v3:
> - no changes
> Changes from v2:
> - renamed sve_asm.S in sve-asm.S, new files should not contain
> underscore in the name (Jan)
> Changes from v1:
> - Add assert to vl_to_zcr, it is never called with vl==0, but just
> to be sure it won't in the future.
> Changes from RFC:
> - Moved restoring of cptr before an existing barrier (Julien)
> - Marked the feature as unsupported for now (Julien)
> - Trap and un-trap only when using SVE resources in
> compute_max_zcr() (Julien)
> ---
> xen/arch/arm/Kconfig | 10 ++--
> xen/arch/arm/README.LinuxPrimitives | 9 ++++
> xen/arch/arm/arm64/Makefile | 1 +
> xen/arch/arm/arm64/cpufeature.c | 7 ++-
> xen/arch/arm/arm64/sve-asm.S | 48 +++++++++++++++++++
> xen/arch/arm/arm64/sve.c | 59 ++++++++++++++++++++++++
> xen/arch/arm/cpufeature.c | 6 ++-
> xen/arch/arm/domain.c | 20 +++++---
> xen/arch/arm/include/asm/arm64/sve.h | 27 +++++++++++
> xen/arch/arm/include/asm/arm64/sysregs.h | 1 +
> xen/arch/arm/include/asm/cpufeature.h | 14 ++++++
> xen/arch/arm/include/asm/domain.h | 1 +
> xen/arch/arm/include/asm/processor.h | 2 +
> xen/arch/arm/setup.c | 5 +-
> xen/arch/arm/traps.c | 27 ++++++-----
> 15 files changed, 210 insertions(+), 27 deletions(-)
> create mode 100644 xen/arch/arm/arm64/sve-asm.S
> create mode 100644 xen/arch/arm/arm64/sve.c
> create mode 100644 xen/arch/arm/include/asm/arm64/sve.h
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 239d3aed3c7f..41f45d8d1203 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -112,11 +112,15 @@ config ARM64_PTR_AUTH
> This feature is not supported in Xen.
>
> config ARM64_SVE
> - def_bool n
> + bool "Enable Scalar Vector Extension support (UNSUPPORTED)" if UNSUPPORTED
> depends on ARM_64
> help
> - Scalar Vector Extension support.
> - This feature is not supported in Xen.
> + Scalar Vector Extension (SVE/SVE2) support for guests.
> +
> + Please be aware that currently, enabling this feature will add latency on
> + VM context switch between SVE enabled guests, between not-enabled SVE
> + guests and SVE enabled guests and viceversa, compared to the time
> + required to switch between not-enabled SVE guests.
>
> config ARM64_MTE
> def_bool n
> diff --git a/xen/arch/arm/README.LinuxPrimitives b/xen/arch/arm/README.LinuxPrimitives
> index 1d53e6a898da..76c8df29e416 100644
> --- a/xen/arch/arm/README.LinuxPrimitives
> +++ b/xen/arch/arm/README.LinuxPrimitives
> @@ -62,6 +62,15 @@ done
> linux/arch/arm64/lib/clear_page.S xen/arch/arm/arm64/lib/clear_page.S
> linux/arch/arm64/lib/copy_page.S unused in Xen
>
> +---------------------------------------------------------------------
> +
> +SVE assembly macro: last sync @ v6.3.0 (last commit: 457391b03803)
> +
> +linux/arch/arm64/include/asm/fpsimdmacros.h    xen/arch/arm/include/asm/arm64/sve-asm.S
> +
> +The following macros were taken from Linux:
> + _check_general_reg, _check_num, _sve_rdvl
> +
> =====================================================================
> arm32
> =====================================================================
> diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
> index 28481393e98f..54ad55c75cda 100644
> --- a/xen/arch/arm/arm64/Makefile
> +++ b/xen/arch/arm/arm64/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_LIVEPATCH) += livepatch.o
> obj-y += mm.o
> obj-y += smc.o
> obj-y += smpboot.o
> +obj-$(CONFIG_ARM64_SVE) += sve.o sve-asm.o
> obj-y += traps.o
> obj-y += vfp.o
> obj-y += vsysreg.o
> diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
> index d9039d37b2d1..b4656ff4d80f 100644
> --- a/xen/arch/arm/arm64/cpufeature.c
> +++ b/xen/arch/arm/arm64/cpufeature.c
> @@ -455,15 +455,11 @@ static const struct arm64_ftr_bits ftr_id_dfr1[] = {
> ARM64_FTR_END,
> };
>
> -#if 0
> -/* TODO: use this to sanitize SVE once we support it */
> -
> static const struct arm64_ftr_bits ftr_zcr[] = {
> ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
> ZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0), /* LEN */
> ARM64_FTR_END,
> };
> -#endif
>
> /*
> * Common ftr bits for a 32bit register with all hidden, strict
> @@ -603,6 +599,9 @@ void update_system_features(const struct cpuinfo_arm *new)
>
> SANITIZE_ID_REG(zfr64, 0, aa64zfr0);
>
> + if ( cpu_has_sve )
> + SANITIZE_REG(zcr64, 0, zcr);
> +
> /*
> * Comment from Linux:
> * Userspace may perform DC ZVA instructions. Mismatched block sizes
> diff --git a/xen/arch/arm/arm64/sve-asm.S b/xen/arch/arm/arm64/sve-asm.S
> new file mode 100644
> index 000000000000..4d1549344733
> --- /dev/null
> +++ b/xen/arch/arm/arm64/sve-asm.S
> @@ -0,0 +1,48 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Arm SVE assembly routines
> + *
> + * Copyright (C) 2022 ARM Ltd.
> + *
> + * Some macros and instruction encoding in this file are taken from linux 6.1.1,
> + * file arch/arm64/include/asm/fpsimdmacros.h, some of them are a modified
> + * version.
> + */
> +
> +/* Sanity-check macros to help avoid encoding garbage instructions */
> +
> +.macro _check_general_reg nr
> + .if (\nr) < 0 || (\nr) > 30
> + .error "Bad register number \nr."
> + .endif
> +.endm
> +
> +.macro _check_num n, min, max
> + .if (\n) < (\min) || (\n) > (\max)
> + .error "Number \n out of range [\min,\max]"
> + .endif
> +.endm
> +
> +/* SVE instruction encodings for non-SVE-capable assemblers */
> +/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
> +
> +/* RDVL X\nx, #\imm */
> +.macro _sve_rdvl nx, imm
> + _check_general_reg \nx
> + _check_num (\imm), -0x20, 0x1f
> + .inst 0x04bf5000 \
> + | (\nx) \
> + | (((\imm) & 0x3f) << 5)
> +.endm
> +
> +/* Gets the current vector register size in bytes */
> +GLOBAL(sve_get_hw_vl)
> + _sve_rdvl 0, 1
> + ret
> +
> +/*
> + * Local variables:
> + * mode: ASM
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
> new file mode 100644
> index 000000000000..e05ccc38a896
> --- /dev/null
> +++ b/xen/arch/arm/arm64/sve.c
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Arm SVE feature code
> + *
> + * Copyright (C) 2022 ARM Ltd.
> + */
> +
> +#include <xen/types.h>
> +#include <asm/arm64/sve.h>
> +#include <asm/arm64/sysregs.h>
> +#include <asm/processor.h>
> +#include <asm/system.h>
> +
> +extern unsigned int sve_get_hw_vl(void);
> +
> +/* Takes a vector length in bits and returns the ZCR_ELx encoding */
> +static inline register_t vl_to_zcr(unsigned int vl)
> +{
> + ASSERT(vl > 0);
> + return ((vl / SVE_VL_MULTIPLE_VAL) - 1U) & ZCR_ELx_LEN_MASK;
> +}
> +
> +register_t compute_max_zcr(void)
> +{
> + register_t cptr_bits = get_default_cptr_flags();
> + register_t zcr = vl_to_zcr(SVE_VL_MAX_BITS);
> + unsigned int hw_vl;
> +
> + /* Remove trap for SVE resources */
> + WRITE_SYSREG(cptr_bits & ~HCPTR_CP(8), CPTR_EL2);
> + isb();
> +
> + /*
> + * Set the maximum SVE vector length, doing that we will know the VL
> + * supported by the platform, calling sve_get_hw_vl()
> + */
> + WRITE_SYSREG(zcr, ZCR_EL2);
> +
> + /*
> + * Read the maximum VL, which could be lower than what we imposed before,
> + * hw_vl contains VL in bytes, multiply it by 8 to use vl_to_zcr() later
> + */
> + hw_vl = sve_get_hw_vl() * 8U;
> +
> + /* Restore CPTR_EL2 */
> + WRITE_SYSREG(cptr_bits, CPTR_EL2);
> + isb();
> +
> + return vl_to_zcr(hw_vl);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
> index c4ec38bb2554..83b84368f6d5 100644
> --- a/xen/arch/arm/cpufeature.c
> +++ b/xen/arch/arm/cpufeature.c
> @@ -9,6 +9,7 @@
> #include <xen/init.h>
> #include <xen/smp.h>
> #include <xen/stop_machine.h>
> +#include <asm/arm64/sve.h>
> #include <asm/cpufeature.h>
>
> DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
> @@ -143,6 +144,9 @@ void identify_cpu(struct cpuinfo_arm *c)
>
> c->zfr64.bits[0] = READ_SYSREG(ID_AA64ZFR0_EL1);
>
> + if ( cpu_has_sve )
> + c->zcr64.bits[0] = compute_max_zcr();
> +
> c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
>
> c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
> @@ -199,7 +203,7 @@ static int __init create_guest_cpuinfo(void)
> guest_cpuinfo.pfr64.mpam = 0;
> guest_cpuinfo.pfr64.mpam_frac = 0;
>
> - /* Hide SVE as Xen does not support it */
> + /* Hide SVE by default to the guests */
Everything is for guests and, as Jan mentioned in another comment,
this could be wrongly interpreted.
Here I would suggest just sticking to:
/* Hide SVE by default */
Cheers
Bertrand