
Re: [PATCH v7 10/14] xen: add cache coloring allocator for domains



Hi Jan,

On Tue, Mar 19, 2024 at 5:43 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 15.03.2024 11:58, Carlo Nonato wrote:
> > Add a new memory page allocator that implements the cache coloring mechanism.
> > The allocation algorithm enforces equal frequency distribution of cache
> > partitions, following the coloring configuration of a domain. This allows
> > for an even utilization of cache sets for every domain.
> >
> > Pages are stored in a color-indexed array of lists. Those lists are filled
> > by a simple init function which computes the color of each page.
> > When a domain requests a page, the allocator extracts the page from the list
> > with the maximum number of free pages between those that the domain can
> > access, given its coloring configuration.
>
> Minor remark: I'm not a native speaker, but "between" here reads odd to
> me. I'd have expected perhaps "among".

Yes, I'm gonna change it.
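
For reference, the selection rule from the commit message can be modelled outside of Xen as below. This is only a sketch: `toy_pick_color` and its parameters are names made up for illustration, and plain arrays stand in for the per-color lists and the domain's color configuration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model, not the patch code. free_pages[c] is the number of free
 * pages of color c; colors[] lists the colors a domain may use.
 * Returns the index into colors[] of the color with the most free
 * pages, or -1 if none of the domain's colors has a free page.
 */
static int toy_pick_color(const unsigned long *free_pages,
                          const unsigned int *colors, size_t nr_colors)
{
    unsigned long max = 0;
    int best = -1;
    size_t i;

    assert(free_pages != NULL);

    for ( i = 0; i < nr_colors; i++ )
    {
        unsigned long free = free_pages[colors[i]];

        /* Strict '>' keeps the first of several equally full lists. */
        if ( free > max )
        {
            max = free;
            best = (int)i;
        }
    }

    return best;
}
```

Always drawing from the fullest list is what keeps the per-color usage of a domain balanced over time.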

> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -270,6 +270,20 @@ and not running softirqs. Reduce this if softirqs are not being run frequently
> >  enough. Setting this to a high value may cause boot failure, particularly if
> >  the NMI watchdog is also enabled.
> >
> > +### buddy-alloc-size (arm64)
> > +> `= <size>`
> > +
> > +> Default: `64M`
> > +
> > +Amount of memory reserved for the buddy allocator when the colored allocator is
> > +active. This option is parsed only when LLC coloring support is enabled.
>
> Nit: s/parsed/used/ - the option is always parsed as long as LLC_COLORING=y.
>
> > @@ -1945,6 +1949,164 @@ static unsigned long avail_heap_pages(
> >      return free_pages;
> >  }
> >
> > +/*************************
> > + * COLORED SIDE-ALLOCATOR
> > + *
> > + * Pages are grouped by LLC color in lists which are globally referred to as the
> > + * color heap. Lists are populated in end_boot_allocator().
> > + * After initialization there will be N lists where N is the number of
> > + * available colors on the platform.
> > + */
> > +static struct page_list_head *__ro_after_init _color_heap;
> > +#define color_heap(color) (&_color_heap[color])
> > +
> > +static unsigned long *__ro_after_init free_colored_pages;
> > +
> > +/* Memory required for buddy allocator to work with colored one */
> > +#ifdef CONFIG_LLC_COLORING
> > +static unsigned long __initdata buddy_alloc_size =
> > +    MB(CONFIG_BUDDY_ALLOCATOR_SIZE);
> > +size_param("buddy-alloc-size", buddy_alloc_size);
> > +
> > +#define domain_num_llc_colors(d) (d)->num_llc_colors
> > +#define domain_llc_color(d, i)   (d)->llc_colors[i]
> > +#else
> > +static unsigned long __initdata buddy_alloc_size;
> > +
> > +#define domain_num_llc_colors(d) 0
> > +#define domain_llc_color(d, i)   0
> > +#endif
> > +
> > +static void free_color_heap_page(struct page_info *pg, bool need_scrub)
> > +{
> > +    unsigned int color = page_to_llc_color(pg);
> > +    struct page_list_head *head = color_heap(color);
> > +
> > +    spin_lock(&heap_lock);
> > +
> > +    mark_page_free(pg, page_to_mfn(pg));
> > +
> > +    if ( need_scrub )
> > +    {
> > +        pg->count_info |= PGC_need_scrub;
> > +        poison_one_page(pg);
> > +    }
> > +
> > +    free_colored_pages[color]++;
> > +    page_list_add(pg, head);
>
> May I please ask for a comment (or at least some wording in the description)
> as to the choice made here between head or tail insertion? When assuming
> that across a system there's no sharing of colors, preferably re-using
> cache-hot pages is certainly good. Whereas when colors can reasonably be
> expected to be shared, not re-using a freed page too quickly can also
> have benefits.

I'll add it.
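
To spell out what the two choices mean for reuse order, here is a standalone sketch. It is not Xen code: `struct toy_page`, `struct toy_list` and the `toy_list_*` helpers are made-up stand-ins for `struct page_info`, `struct page_list_head` and the page_list helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types, for illustration only. */
struct toy_page {
    unsigned int id;
    struct toy_page *prev, *next;
};

struct toy_list {
    struct toy_page *head, *tail;
};

/* Head insertion: the most recently freed page is handed out first (LIFO). */
static void toy_list_add(struct toy_list *l, struct toy_page *pg)
{
    assert(pg != NULL);
    pg->prev = NULL;
    pg->next = l->head;
    if ( l->head )
        l->head->prev = pg;
    else
        l->tail = pg;
    l->head = pg;
}

/* Tail insertion: a freed page is reused only after all older ones (FIFO). */
static void toy_list_add_tail(struct toy_list *l, struct toy_page *pg)
{
    assert(pg != NULL);
    pg->next = NULL;
    pg->prev = l->tail;
    if ( l->tail )
        l->tail->next = pg;
    else
        l->head = pg;
    l->tail = pg;
}

/* Allocation always takes the first element of the list. */
static struct toy_page *toy_list_pop_first(struct toy_list *l)
{
    struct toy_page *pg = l->head;

    if ( !pg )
        return NULL;
    l->head = pg->next;
    if ( l->head )
        l->head->prev = NULL;
    else
        l->tail = NULL;
    pg->prev = pg->next = NULL;
    return pg;
}
```

With head insertion a page freed just now is the next one allocated, so it is likely still cache-hot; with tail insertion it waits behind every other free page of that color.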

> > +static struct page_info *alloc_color_heap_page(unsigned int memflags,
> > +                                               const struct domain *d)
> > +{
> > +    struct page_info *pg = NULL;
> > +    unsigned int i, color = 0;
> > +    unsigned long max = 0;
> > +    bool need_tlbflush = false;
> > +    uint32_t tlbflush_timestamp = 0;
> > +    bool need_scrub;
> > +
> > +    if ( memflags >> _MEMF_bits )
> > +        return NULL;
>
> By mentioning MEMF_bits earlier on I meant to give an example. What
> about MEMF_node and in particular MEMF_exact_node? Certain other flags
> also aren't obvious as to being okay to silently ignore.

You're right.
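
One possible shape for such a check is an allow-list, so that any flag the colored path does not explicitly handle is rejected instead of being silently ignored. The sketch below is standalone and purely illustrative: the `TOY_MEMF_*` names and bit positions are invented and do not match Xen's real `MEMF_*` layout, and which flags belong in the allow-list is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flag layout, NOT Xen's real MEMF_* values. */
#define TOY_MEMF_no_scrub    (1U << 0)
#define TOY_MEMF_no_tlbflush (1U << 1)
#define TOY_MEMF_exact_node  (1U << 2)
#define TOY_MEMF_node_shift  8
#define TOY_MEMF_node_mask   (0xffU << TOY_MEMF_node_shift)
#define TOY_MEMF_bits_shift  16

/* Flags the colored path would know how to honour (an assumption). */
#define TOY_COLORED_OK (TOY_MEMF_no_scrub | TOY_MEMF_no_tlbflush)

/* Node selection must never end up in the allow-list by accident. */
static_assert((TOY_COLORED_OK & TOY_MEMF_node_mask) == 0,
              "node bits must not be allow-listed");

static bool toy_colored_flags_ok(unsigned int memflags)
{
    /* Anything outside the allow-list is rejected, not ignored. */
    return !(memflags & ~TOY_COLORED_OK);
}
```

A single mask test like this also covers the address-width bits, so no separate shift-based check is needed in the toy model.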

> > +    spin_lock(&heap_lock);
> > +
> > +    for ( i = 0; i < domain_num_llc_colors(d); i++ )
> > +    {
> > +        unsigned long free = free_colored_pages[domain_llc_color(d, i)];
> > +
> > +        if ( free > max )
> > +        {
> > +            color = domain_llc_color(d, i);
> > +            pg = page_list_first(color_heap(color));
> > +            max = free;
> > +        }
> > +    }
> > +
> > +    if ( !pg )
> > +    {
> > +        spin_unlock(&heap_lock);
> > +        return NULL;
> > +    }
> > +
> > +    need_scrub = pg->count_info & (PGC_need_scrub);
> > +    pg->count_info = PGC_state_inuse | (pg->count_info & PGC_colored);
>
> Better PGC_preserved?

Yeah.

> > +static void __init init_color_heap_pages(struct page_info *pg,
> > +                                         unsigned long nr_pages)
> > +{
> > +    unsigned int i;
> > +    bool need_scrub = opt_bootscrub == BOOTSCRUB_IDLE;
> > +
> > +    if ( buddy_alloc_size )
> > +    {
> > +        unsigned long buddy_pages = min(PFN_DOWN(buddy_alloc_size), nr_pages);
> > +
> > +        init_heap_pages(pg, buddy_pages);
>
> There's a corner case where init_heap_pages() would break when passed 0
> as 2nd argument.

I don't see it. There's just a for-loop that would be skipped in that case...

> I think you want to alter the enclosing if() to
> "if ( buddy_alloc_size >= PAGE_SIZE )" to be entirely certain to avoid
> that case.

... anyway, ok.
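
One way 0 can reach the second argument is a size below one page, which PFN_DOWN() rounds down to 0. A standalone sketch of the guarded computation follows; the `TOY_*` macros and the 4 KiB page size are assumptions for illustration, not Xen's definitions, and `toy_buddy_pages` is a made-up helper.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT  12
#define TOY_PAGE_SIZE   (1UL << TOY_PAGE_SHIFT)
#define TOY_PFN_DOWN(x) ((x) >> TOY_PAGE_SHIFT)

/*
 * Number of pages to hand to the buddy allocator. Returns 0 when the
 * caller should skip the buddy initialization altogether.
 */
static unsigned long toy_buddy_pages(unsigned long buddy_alloc_size,
                                     unsigned long nr_pages)
{
    unsigned long pages;

    /* Sizes below one page would round down to zero pages. */
    if ( buddy_alloc_size < TOY_PAGE_SIZE )
        return 0;

    pages = TOY_PFN_DOWN(buddy_alloc_size);
    assert(pages > 0); /* guaranteed by the guard above */

    return pages < nr_pages ? pages : nr_pages;
}
```

Note that the guard only covers the size side: if nr_pages is itself 0, the result is still 0.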

> > +static void dump_color_heap(void)
> > +{
> > +    unsigned int color;
> > +
> > +    printk("Dumping color heap info\n");
> > +    for ( color = 0; color < get_max_nr_llc_colors(); color++ )
> > +        if ( free_colored_pages[color] > 0 )
> > +            printk("Color heap[%u]: %lu pages\n",
> > +                   color, free_colored_pages[color]);
> > +}
>
> While having all of the code above from here outside of any #ifdef is
> helpful to prevent unintended breakage when changes are made and tested
> only on non-Arm64 targets, I'd still like to ask: Do halfway recent
> compilers manage to eliminate everything? I'd like to avoid e.g. x86
> being left with traces of coloring despite not being able at all to use
> it.

I don't know the answer to this, sorry.

> > @@ -2485,7 +2660,10 @@ struct page_info *alloc_domheap_pages(
> >          }
> >          if ( assign_page(pg, order, d, memflags) )
> >          {
> > -            free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> > +            if ( pg->count_info & PGC_colored )
> > +                free_color_heap_page(pg, memflags & MEMF_no_scrub);
> > +            else
> > +                free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> >              return NULL;
> >          }
> >      }
> > @@ -2568,7 +2746,10 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
> >              scrub = 1;
> >          }
> >
> > -        free_heap_pages(pg, order, scrub);
> > +        if ( pg->count_info & PGC_colored )
> > +            free_color_heap_page(pg, scrub);
> > +        else
> > +            free_heap_pages(pg, order, scrub);
> >      }
>
> Instead of this, did you consider altering free_heap_pages() to forward
> to free_color_heap_page()? That would then also allow having a single,
> central comment and/or assertion that the "order" value here isn't lost.

Yes, this can easily be done.
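
Roughly, the forwarding could look like the standalone model below. It is a sketch only: the `toy_*` names, the counters and the `TOY_PGC_colored` bit are invented for illustration, and it assumes colored pages are only ever allocated with order 0.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PGC_colored (1UL << 0) /* made-up bit, not Xen's PGC_colored */

struct toy_page {
    unsigned long count_info;
};

/* Counters standing in for the real free paths. */
static unsigned int toy_colored_frees, toy_buddy_frees;

static void toy_free_color_heap_page(struct toy_page *pg, bool need_scrub)
{
    (void)pg;
    (void)need_scrub;
    toy_colored_frees++;
}

static void toy_free_heap_pages(struct toy_page *pg, unsigned int order,
                                bool need_scrub)
{
    if ( pg->count_info & TOY_PGC_colored )
    {
        /* Single central place to check that no order is lost. */
        assert(order == 0);
        toy_free_color_heap_page(pg, need_scrub);
        return;
    }

    (void)order;
    toy_buddy_frees++;
}
```

Callers then keep calling one function and need no colored/non-colored branch of their own.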

> Jan

Thanks.



 

