Re: [PATCH v7 01/14] xen/common: add cache coloring common code
On 15.03.2024 11:58, Carlo Nonato wrote:
> +Background
> +**********
> +
> +Cache hierarchy of a modern multi-core CPU typically has first levels
> +dedicated to each core (hence using multiple cache units), while the last
> +level is shared among all of them. Such a configuration implies that memory
> +operations on one core (e.g. running a DomU) are able to generate
> +interference on another core (e.g. hosting another DomU). Cache coloring
> +realizes per-set cache-partitioning in software and mitigates this,
> +guaranteeing higher and more predictable performance for memory accesses.
Are you sure about "higher"? On an otherwise idle system, a single domain (or
vCPU) may perform better when not partitioned, as more cache would be available
to it overall.
> +How to compute the number of colors
> +###################################
> +
> +Taking the linear mapping from physical memory to cache lines for granted,
> +the number of available colors for a specific platform is computed using
> +three parameters:
> +
> +- the size of the LLC.
> +- the number of the LLC ways.
> +- the page size used by Xen.
> +
> +The first two parameters can be found in the processor manual, while the
> +third one is the minimum mapping granularity. Dividing the cache size by
> +the number of its ways we obtain the size of a way. Dividing this number
> +by the page size, the number of total cache colors is found. So for
> +example an Arm Cortex-A53 with a 16-way associative 1 MiB LLC can isolate
> +up to 16 colors when pages are 4 KiB in size.
> +
> +LLC size and number of ways are probed automatically by default so there
> +should be no need to compute the number of colors by yourself.
Is this a leftover from the earlier (single) command line option?
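(For concreteness, the arithmetic the quoted section describes could be sketched as below; `nr_llc_colors` is an illustrative helper name, not something from the patch.)

```c
#include <assert.h>

/* Illustrative sketch: derive the number of LLC colors from the cache
 * geometry, as described in the quoted documentation.  A way's size is
 * the total LLC size divided by the number of ways; the number of
 * colors is how many pages fit in one way. */
static unsigned int nr_llc_colors(unsigned int llc_size,
                                  unsigned int llc_nr_ways,
                                  unsigned int page_size)
{
    unsigned int way_size = llc_size / llc_nr_ways;

    return way_size / page_size;
}
```

With the Cortex-A53 numbers from the text (1 MiB, 16 ways, 4 KiB pages) this yields 16 colors.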
> +Effective colors assignment
> +###########################
> +
> +When assigning colors:
> +
> +1. If one wants to avoid cache interference between two domains, different
> +   colors need to be used for their memory.
> +
> +2. To improve spatial locality, color assignment should privilege continuity in
s/privilege/prefer/ ?
> + the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
> + domain J is better than assigning colors (0,2) to I and (1,3) to J.
While I consider 1 obvious without further explanation, the same isn't
the case for 2: What's the benefit of spatial locality? If there was
support for allocating higher order pages, I could certainly see the
point, but iirc that isn't supported (yet).
> +Command line parameters
> +***********************
> +
> +Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
> +
> ++----------------------+-------------------------------+
> +| **Parameter** | **Description** |
> ++----------------------+-------------------------------+
> +| ``llc-coloring`` | enable coloring at runtime |
> ++----------------------+-------------------------------+
> +| ``llc-size`` | set the LLC size |
> ++----------------------+-------------------------------+
> +| ``llc-nr-ways`` | set the LLC number of ways |
> ++----------------------+-------------------------------+
> +
> +Auto-probing of LLC specs
> +#########################
> +
> +LLC size and number of ways are probed automatically by default.
> +
> +LLC specs can be manually set via the above command line parameters. This
> +bypasses any auto-probing and can be used to work around probing failures
> +or for debugging/testing purposes.
As well as perhaps for cases where the auto-probing logic is flawed?
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.
> Enable MSR_DEBUGCTL.LBR
> in hypervisor context to be able to dump the Last Interrupt/Exception To/From
> record with other registers.
>
> +### llc-coloring
> +> `= <boolean>`
> +
> +> Default: `false`
> +
> +Flag to enable or disable LLC coloring support at runtime. This option is
> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> +cache coloring documentation for more info.
> +
> +### llc-nr-ways
> +> `= <integer>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the number of ways of the Last Level Cache. This option is
> +available only when `CONFIG_LLC_COLORING` is enabled. LLC size and number
> +of ways are used to find the number of supported cache colors. By default
> +the value is automatically computed by probing the hardware, but in case
> +of specific needs, it can be manually set. Those include failing probing
> +and debugging/testing purposes so that it's possible to emulate platforms
> +with a different number of supported colors. If set, also "llc-size" must
> +be set, otherwise the default will be used.
> +
> +### llc-size
> +> `= <size>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the size of the Last Level Cache. This option is available only
> +when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are
> +used to find the number of supported cache colors. By default the value is
> +automatically computed by probing the hardware, but in case of specific
> +needs, it can be manually set. Those include failing probing and
> +debugging/testing purposes so that it's possible to emulate platforms with
> +a different number of supported colors. If set, also "llc-nr-ways" must be
> +set, otherwise the default will be used.
Wouldn't it make sense to infer "llc-coloring" when both of the latter options
were supplied?
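(The suggested inference could be as simple as the sketch below; the helper name and parameters are illustrative, not from the patch.)

```c
#include <stdbool.h>
#include <assert.h>

/* Sketch of the suggestion: an explicit "llc-coloring=" setting stays
 * authoritative; absent that, enable coloring whenever both LLC
 * geometry parameters were supplied on the command line. */
static bool infer_llc_coloring(bool explicit_opt, bool enabled,
                               unsigned int llc_size,
                               unsigned int llc_nr_ways)
{
    if ( explicit_opt )
        return enabled;

    return llc_size && llc_nr_ways;
}
```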
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -31,3 +31,23 @@ config NR_NUMA_NODES
> associated with multiple-nodes management. It is the upper bound of
> the number of NUMA nodes that the scheduler, memory allocation and
> other NUMA-aware components can handle.
> +
> +config LLC_COLORING
> + bool "Last Level Cache (LLC) coloring" if EXPERT
> + depends on HAS_LLC_COLORING
> + depends on !NUMA
> +
> +config NR_LLC_COLORS
> + int "Maximum number of LLC colors"
> + range 2 1024
> + default 128
> + depends on LLC_COLORING
> + help
> + Controls the build-time size of various arrays associated with LLC
> + coloring. Refer to cache coloring documentation for how to compute the
> + number of colors supported by the platform. This is only an upper
> + bound. The runtime value is autocomputed or manually set via cmdline.
> + The default value corresponds to an 8 MiB 16-way LLC, which should be
> + more than what's needed in the general case. Use only power of 2 values.
I think I said so before: Rather than telling people to pick only power-of-2
values (and it remaining unclear what happens if they don't), why don't you
simply keep them from specifying anything bogus, by having them pass in the
value to use as a power of 2? I.e. "range 1 10" and "default 7" for what
you're currently putting in place.
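(Expressed as Kconfig, that suggestion might look like the fragment below; the option name `NR_LLC_COLORS_ORDER` is illustrative, not from the patch.)

```kconfig
config NR_LLC_COLORS_ORDER
	int "Maximum number of LLC colors (base-2 exponent)"
	range 1 10
	default 7
	depends on LLC_COLORING
	help
	  The maximum number of colors is 2^N, so every value in the
	  range yields a valid power of 2. The default (7, i.e. 128
	  colors) corresponds to an 8 MiB 16-way LLC with 4 KiB pages.
```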
> + 1024 is the number of colors that fit in a 4 KiB page when integers are
> + 4 bytes long.
How's this relevant here? As a justification it would make sense to have in
the description.
I'm btw also not convinced this is a good place to put these options. Imo ...
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -71,6 +71,9 @@ config HAS_IOPORTS
> config HAS_KEXEC
> bool
>
> +config HAS_LLC_COLORING
> + bool
> +
> config HAS_PMAP
> bool
... they'd better live further down from here.
> --- /dev/null
> +++ b/xen/common/llc-coloring.c
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Last Level Cache (LLC) coloring common code
> + *
> + * Copyright (C) 2022 Xilinx Inc.
> + */
> +#include <xen/keyhandler.h>
> +#include <xen/llc-coloring.h>
> +#include <xen/param.h>
> +
> +static bool __ro_after_init llc_coloring_enabled;
> +boolean_param("llc-coloring", llc_coloring_enabled);
> +
> +static unsigned int __initdata llc_size;
> +size_param("llc-size", llc_size);
> +static unsigned int __initdata llc_nr_ways;
> +integer_param("llc-nr-ways", llc_nr_ways);
> +/* Number of colors available in the LLC */
> +static unsigned int __ro_after_init max_nr_colors;
> +
> +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> +{
> + unsigned int i;
> +
> + printk("{ ");
> + for ( i = 0; i < num_colors; i++ )
> + {
> + unsigned int start = colors[i], end = start;
> +
> + printk("%u", start);
> +
> + for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
> + ;
> +
> + if ( start != end )
> + printk("-%u", end);
> +
> + if ( i < num_colors - 1 )
> + printk(", ");
> + }
> + printk(" }\n");
> +}
> +
> +void __init llc_coloring_init(void)
> +{
> + unsigned int way_size;
> +
> + if ( !llc_coloring_enabled )
> + return;
> +
> + if ( llc_size && llc_nr_ways )
> + way_size = llc_size / llc_nr_ways;
> + else
> + {
> + way_size = get_llc_way_size();
> + if ( !way_size )
> + panic("LLC probing failed and 'llc-size' or 'llc-nr-ways'
> missing\n");
> + }
> +
> + /*
> + * The maximum number of colors must be a power of 2 in order to correctly
> + * map them to bits of an address.
> + */
> + max_nr_colors = way_size >> PAGE_SHIFT;
> +
> + if ( max_nr_colors & (max_nr_colors - 1) )
> + panic("Number of LLC colors (%u) isn't a power of 2\n",
> max_nr_colors);
> +
> + if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> + panic("Number of LLC colors (%u) not in range [2, %u]\n",
> + max_nr_colors, CONFIG_NR_LLC_COLORS);
Rather than crashing when max_nr_colors is too large, couldn't you simply
halve it a number of times? That would still satisfy the requirement on
isolation, wouldn't it?
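(The suggested clamping could look like the sketch below; halving a power of 2 keeps it a power of 2, so the existing power-of-2 check still holds afterwards. The helper name is illustrative.)

```c
#include <assert.h>

#define CONFIG_NR_LLC_COLORS 128

/* Sketch of the suggestion: instead of panicking when the probed color
 * count exceeds the build-time limit, halve it until it fits.  Fewer,
 * coarser colors still partition the cache, so isolation between
 * domains with disjoint colors is preserved. */
static unsigned int clamp_nr_colors(unsigned int nr)
{
    while ( nr > CONFIG_NR_LLC_COLORS )
        nr /= 2;

    return nr;
}
```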
> + arch_llc_coloring_init();
> +}
> +
> +void cf_check dump_llc_coloring_info(void)
I don't think cf_check is needed here nor ...
> +void cf_check domain_dump_llc_colors(const struct domain *d)
... here anymore. You're using direct calls now.
Jan