
Re: [Xen-devel] [PATCH v2] Xen: Spread boot time page scrubbing across all available CPU's



On Thu, Apr 03, 2014 at 11:00:07AM +0200, Tim Deegan wrote:
> At 21:19 -0400 on 02 Apr (1396469958), Konrad Rzeszutek Wilk wrote:
> > On Tue, Apr 01, 2014 at 03:29:05PM -0400, Konrad Rzeszutek Wilk wrote:
> > > Is there an updated version of this patch?
> > > 
> > > My recollection is that Jan wanted this patch to spread the work
> > > only on the cores and not the threads. Were there any other concerns?
> > 
> > And for fun I tried it on a large box and it does make a difference.
> > 
> > With the patch (attached) where we only use the cores and skip
> > the SMT threads the boot time scrubbing for 1.5TB is 66 seconds
> > while for all of the CPUs (including threads) is 127 seconds.
> > 
> > Sorry about the quality of the patch - was poking at it while
> > on conference calls so please treat it as HACK/RFC.
> > 
> > Jan/Malcolm, were there any other issues that needed to be addressed?
> 
> Yes: http://lists.xen.org/archives/html/xen-devel/2013-10/msg00131.html

Pardon me for pasting the comments back in here; I can't find the original
email in my mailbox.

Regarding your comments:

>> +            region[i].chunk_size = chunk_size;
>> +            region[i].offset = mfn_off;

>Not sure these need to be per-node values since they're always the same.
>Also, since they're always the same, this doesn't need to be in a
>for_each_online_node() loop: you could just set the shared offset
>variable and use total_node_cpus_mask to trigger all the workers at once.

While I agree that 'region[i].offset = mfn_off' is always the same value,
and that making it visible to the workers is a good idea, I think the other
suggestion has a potential problem. Let me explain after this comment:


>> +            cpumask_clear_cpu(cpu, &node_cpus);
>> .. snip..
>You could also call it with @wait == 1 and then you don't need to
>maintain your own bootscrub_count.

The issue with using @wait together with total_node_cpus_mask is that the
workers have no idea which node they belong to. The parameter we currently
pass in, &region[i], won't work anymore: 'i' is a NUMA node index rather
than a CPU index, and a single call over the whole mask can only carry one
parameter for all workers.
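
For reference, a very rough sketch of the per-node dispatch I have in mind
(names taken from the HACK/RFC discussion, not a final patch), just to show
where the per-node parameter comes from:

    for_each_online_node ( i )
    {
        /* Workers of node i, minus the dispatching CPU itself. */
        cpumask_and(&node_cpus, &node_to_cpumask(i), &cpu_online_map);
        cpumask_clear_cpu(smp_processor_id(), &node_cpus);
        on_selected_cpus(&node_cpus, smp_scrub_heap_pages, &region[i], 0);
    }

With one on_selected_cpus() call over total_node_cpus_mask there is only a
single 'info' pointer for every worker, which is exactly why &region[i] can
no longer be passed in.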

An option would be for the worker to do:

void __init smp_scrub_heap_pages(void *data)
{
        /* 'data' encodes which pass this is: 0 = first loop, 1 = second. */
        bool_t scrub_bsp = (bool_t)(unsigned long)data;
        unsigned long node_id = cpu_to_node(smp_processor_id());
        struct scrub_region *r; /* renamed so it doesn't shadow 'data' */

        if (node_id == NUMA_NO_NODE) {
                /* Node-less CPUs only help out on the second pass. */
                if (scrub_bsp)
                        r = &region[0];
                else
                        return;
        } else
                r = &region[node_id];

and use that. Then the parameter passed in is just whether this
is the first loop (false) or the second loop (true).
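
The caller side could then look something like this (hedged sketch only;
total_node_cpus_mask is the mask Tim mentioned, and @wait == 1 replaces
the home-grown bootscrub_count):

    /* First loop: CPUs that belong to a node scrub their own region. */
    on_selected_cpus(&total_node_cpus_mask, smp_scrub_heap_pages,
                     (void *)0UL /* first loop */, 1);

    /* Second loop: node-less CPUs pitch in on region[0]. */
    on_selected_cpus(&total_node_cpus_mask, smp_scrub_heap_pages,
                     (void *)1UL /* second loop */, 1);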


Or perhaps:

>You're struggling a little to force this information into the SMP
>protocol; I think it would be cleaner if you carved the central loop of
>smp_scrub_heap_pages() out into its own function and called that here
>instead.

Just do that, and then the workers simply quit if they detect
(node_id == NUMA_NO_NODE), with no parameter parsing needed.
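
I.e. something along these lines (rough sketch; scrub_region() is a
hypothetical name and the body of the existing loop is elided):

static void __init scrub_region(struct scrub_region *r)
{
        /* The existing central loop of smp_scrub_heap_pages(), walking
         * r->offset in r->chunk_size steps, moves here unchanged. */
}

void __init smp_scrub_heap_pages(void *unused)
{
        unsigned long node_id = cpu_to_node(smp_processor_id());

        /* Workers that do not belong to any node simply quit. */
        if (node_id == NUMA_NO_NODE)
                return;

        scrub_region(&region[node_id]);
}

The boot path can then call scrub_region() directly for whatever memory
(e.g. the BSP's share) no node worker covers, instead of forcing that
information through the SMP callback's parameter.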

Let me try that.                

> 
> Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

