Xen project Mailing List

Re: [PATCH for 4-21 v4] xen/riscv: identify specific ISA supported by cpu

To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>

Date: Tue, 4 Feb 2025 12:47:17 +0100

Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL

Cc: Alistair Francis <alistair.francis@xxxxxxx>, Bob Eshleman <bobbyeshleman@xxxxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Tue, 04 Feb 2025 11:47:26 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.02.2025 08:20, Oleksii Kurochko wrote: > +/* > + * The canonical order of ISA extension names in the ISA string is defined in > + * chapter 27 of the unprivileged specification. > + * > + * The specification uses vague wording, such as should, when it comes to > + * ordering, so for our purposes the following rules apply: > + * > + * 1. All multi-letter extensions must be separated from other extensions by > an > + * underscore. > + * > + * 2. Additional standard extensions (starting with 'Z') must be sorted after > + * single-letter extensions and before any higher-privileged extensions. > + * > + * 3. The first letter following the 'Z' conventionally indicates the most > + * closely related alphabetical extension category, IMAFDQLCBKJTPVH. > + * If multiple 'Z' extensions are named, they must be ordered first by > + * category, then alphabetically within a category. > + * > + * 4. Standard supervisor-level extensions (starting with 'S') must be listed > + * after standard unprivileged extensions. If multiple supervisor-level > + * extensions are listed, they must be ordered alphabetically. > + * > + * 5. Standard machine-level extensions (starting with 'Zxm') must be listed > + * after any lower-privileged, standard extensions. If multiple > + * machine-level extensions are listed, they must be ordered > + * alphabetically. > + * > + * 6. Non-standard extensions (starting with 'X') must be listed after all > + * standard extensions. If multiple non-standard extensions are listed, > they > + * must be ordered alphabetically. > + * > + * An example string following the order is: > + * rv64imadc_zifoo_zigoo_zafoo_sbar_scar_zxmbaz_xqux_xrux > + * > + * New entries to this struct should follow the ordering rules described > above. > + * > + * Extension name must be all lowercase (according to device-tree binding) > + * and strncmp() is used in match_isa_ext() to compare extension names > instead > + * of strncasecmp(). > + */ > +const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = { > + RISCV_ISA_EXT_DATA(i, RISCV_ISA_EXT_i), > + RISCV_ISA_EXT_DATA(m, RISCV_ISA_EXT_m), > + RISCV_ISA_EXT_DATA(a, RISCV_ISA_EXT_a), > + RISCV_ISA_EXT_DATA(f, RISCV_ISA_EXT_f), > + RISCV_ISA_EXT_DATA(d, RISCV_ISA_EXT_d), > + RISCV_ISA_EXT_DATA(q, RISCV_ISA_EXT_q), > + RISCV_ISA_EXT_DATA(h, RISCV_ISA_EXT_h), > + RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR), > + RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR), > + RISCV_ISA_EXT_DATA(zifencei, RISCV_ISA_EXT_ZIFENCEI), > + RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE), > + RISCV_ISA_EXT_DATA(zihpm, RISCV_ISA_EXT_ZIHPM), > + RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB), > + RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA), > + RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA), > +}; > + > +static const struct riscv_isa_ext_data __initconst required_extensions[] = { > + RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR), > + RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE), > + RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB), > +}; Coming back to my earlier question regarding the B (pseudo-)extension: Since riscv_isa_ext[] only contains Zbb, is it precluded anywhere in the spec that DT may mention just B when all of its constituents are supported? Which gets me on to G, which is somewhat similar in nature to B. We require G when RISCV_ISA_RV64G=y, yet required_extensions[] doesn't name it or its constituents. Much like we require C when RISCV_ISA_C=y, yet it's not in the table. > +static bool is_lowercase_extension_name(const char *str) > +{ > + /* > + * `str` could contain full riscv,isa string from device tree so one > + * of the stop condionitions is checking for '_' as extensions are Nit: conditions > + * separated by '_'. > + */ > + for ( unsigned int i = 0; (str[i] != '\0') && (str[i] != '_'); i++ ) > + if ( !isdigit(str[i]) && !islower(str[i]) ) > + return false; > + > + return true; > +} > + > +static void __init match_isa_ext(const char *name, const char *name_end, > + unsigned long *bitmap) > +{ > + const size_t riscv_isa_ext_count = ARRAY_SIZE(riscv_isa_ext); > + > + for ( unsigned int i = 0; i < riscv_isa_ext_count; i++ ) > + { > + const struct riscv_isa_ext_data *ext = &riscv_isa_ext[i]; > + > + /* > + * `ext->name` (according to initialization of riscv_isa_ext[] > + * elements) must be all in lowercase. > + */ > + ASSERT(is_lowercase_extension_name(ext->name)); > + > + if ( (name_end - name == strlen(ext->name)) && > + !strncmp(name, ext->name, name_end - name) ) memcmp() is generally more efficient, as it doesn't need to check for a nul terminator. > + { > + __set_bit(ext->id, bitmap); > + break; > + } > + } > +} > + > +static int __init riscv_isa_parse_string(const char *isa, > + unsigned long *out_bitmap) > +{ > + if ( (isa[0] != 'r') && (isa[1] != 'v') ) > + return -EINVAL; > + > +#if defined(CONFIG_RISCV_32) > + if ( isa[2] != '3' && isa[3] != '2' ) > + return -EINVAL; > +#elif defined(CONFIG_RISCV_64) > + if ( isa[2] != '6' && isa[3] != '4' ) > + return -EINVAL; > +#else > +# error "unsupported RISC-V bitness" > +#endif > + > + isa += 4; > + > + while ( *isa ) > + { > + const char *ext = isa++; > + const char *ext_end = isa; > + bool ext_err = false; > + > + switch ( *ext ) > + { > + case 'x': > + printk_once("Vendor extensions are ignored in riscv,isa\n"); > + /* > + * To skip an extension, we find its end. > + * As multi-letter extensions must be split from other > multi-letter > + * extensions with an "_", the end of a multi-letter extension > will > + * either be the null character or the "_" at the start of the > next > + * multi-letter extension. > + */ > + for ( ; *isa && *isa != '_'; ++isa ) > + ; > + ext_err = true; > + break; I was wondering why ext_end isn't updated here. Looks like that's because ext_err is set to true, and the checking below the switch looks at ext_err first. That's technically fine, but - to me at least - a little confusing. Could setting ext_end to NULL take the role of ext_err? For the code here this would then immediately clarify why ext_end isn't otherwise maintained. (The "err" in the name is also somewhat odd: The log message above says "ignored", and that's what the code below also does. There's nothing error-ish in here; in fact there may be nothing wrong about the specific vendor extension we're ignoring.) > + case 's': > + /* > + * Workaround for invalid single-letter 's' & 'u' (QEMU): > + * Before QEMU 7.1 it was an issue with misa to ISA string > + * conversion: > + * > https://patchwork.kernel.org/project/qemu-devel/patch/dee09d708405075420b29115c1e9e87910b8da55.1648270894.git.research_trasio@xxxxxxxxxxxx/#24792587 > + * Additional details of the workaround on Linux kernel side: > + * > https://lore.kernel.org/linux-riscv/ae93358e-e117-b43d-faad-772c529f846c@xxxxxxxxxxxx/#t > + * > + * No need to set the bit in riscv_isa as 's' & 'u' are > + * not valid ISA extensions. It works unless the first > + * multi-letter extension in the ISA string begins with > + * "Su" and is not prefixed with an underscore. > + */ > + if ( ext[-1] != '_' && ext[1] == 'u' ) > + { > + ++isa; > + ext_err = true; > + break; > + } > + fallthrough; > + case 'z': > + /* > + * Before attempting to parse the extension itself, we find its > end. > + * As multi-letter extensions must be split from other > multi-letter > + * extensions with an "_", the end of a multi-letter extension > will > + * either be the null character or the "_" at the start of the > next > + * multi-letter extension. > + * > + * Next, as the extensions version is currently ignored, we > + * eliminate that portion. This is done by parsing backwards from > + * the end of the extension, removing any numbers. This may be a > + * major or minor number however, so the process is repeated if a > + * minor number was found. > + * > + * ext_end is intended to represent the first character *after* > the > + * name portion of an extension, but will be decremented to the > last > + * character itself while eliminating the extensions version > number. > + * A simple re-increment solves this problem. > + */ > + for ( ; *isa && *isa != '_'; ++isa ) > + if ( unlikely(!isalnum(*isa)) ) > + ext_err = true; > + > + ext_end = isa; > + if ( unlikely(ext_err) ) > + break; This, otoh, is an error. Which probably shouldn't go silently. Considering the earlier remark, ext_end would then perhaps also be more logical to update after the above if(). > + if ( !isdigit(ext_end[-1]) ) > + break; > + > + while ( isdigit(*--ext_end) ) > + ; > + > + if ( tolower(ext_end[0]) != 'p' || !isdigit(ext_end[-1]) ) Leftover tolower()? > + { > + ++ext_end; > + break; > + } > + > + while ( isdigit(*--ext_end) ) > + ; > + > + ++ext_end; > + break; > + > + default: > + /* > + * Things are a little easier for single-letter extensions, as > they > + * are parsed forwards. > + * > + * After checking that our starting position is valid, we need to > + * ensure that, when isa was incremented at the start of the > loop, > + * that it arrived at the start of the next extension. > + * > + * If we are already on a non-digit, there is nothing to do. > Either > + * we have a multi-letter extension's _, or the start of an > + * extension. > + * > + * Otherwise we have found the current extension's major version > + * number. Parse past it, and a subsequent p/minor version number > + * if present. The `p` extension must not appear immediately > after > + * a number, so there is no fear of missing it. > + */ > + if ( unlikely(!isalpha(*ext)) ) > + { > + ext_err = true; > + break; > + } Like above this also looks to be a situation that maybe better would be left go entirely silently. I even wonder whether you can safely continue the parsing in that case: How do you know whether what follows is indeed the start of an extension name? > +static void __init riscv_fill_hwcap_from_isa_string(void) > +{ > + const struct dt_device_node *cpus = dt_find_node_by_path("/cpus"); > + const struct dt_device_node *cpu; > + > + if ( !cpus ) > + { > + printk("Missing /cpus node in the device tree?\n"); > + return; > + } > + > + dt_for_each_child_node(cpus, cpu) > + { > + DECLARE_BITMAP(this_isa, RISCV_ISA_EXT_MAX); > + const char *isa; > + unsigned long cpuid; > + > + if ( !dt_device_type_is_equal(cpu, "cpu") ) > + continue; > + > + if ( dt_get_cpuid_from_node(cpu, &cpuid) < 0 ) > + continue; > + > + if ( dt_property_read_string(cpu, "riscv,isa", &isa) ) > + { > + printk("Unable to find \"riscv,isa\" devicetree entry " > + "for DT's cpu%ld node\n", cpuid); > + continue; > + } > + > + for ( unsigned int i = 0; (isa[i] != '\0'); i++ ) > + if ( !isdigit(isa[i]) && (isa[i] != '_') && !islower(isa[i]) ) > + panic("According to DT binding riscv,isa must be > lowercase\n"); > + > + riscv_isa_parse_string(isa, this_isa); > + > + /* > + * In the unpriv. spec is mentioned: > + * A RISC-V ISA is defined as a base integer ISA, which must be > + * present in any implementation, plus optional extensions to > + * the base ISA. > + * What means that isa should contain, at least, I or E thereby > + * this_isa can't be empty too. > + * > + * Ensure that this_isa is not empty, so riscv_isa won't be empty > + * during initialization. This prevents ending up with extensions > + * not supported by one of the CPUs. > + */ > + ASSERT(!bitmap_empty(this_isa, RISCV_ISA_EXT_MAX)); This is again an assertion on input we consume. Afaict it would actually trigger not only on all kinds of invalid inputs, but on something valid like "rv32e". > +void __init riscv_fill_hwcap(void) > +{ > + unsigned int i; > + size_t req_extns_amount = ARRAY_SIZE(required_extensions); Imo if you really want such a local "variable", it would better be const. Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.