Re: [Xen-devel] [PATCH v2] Xen: Spread boot time page scrubbing across all available CPU's
Hi,

Again, I'm very much in favour of this in principle, but I have some
comments on implementation -- some of which I made last time as well!

At 13:35 +0100 on 30 Sep (1380548117), Malcolm Crossley wrote:
> +static atomic_t __initdata bootscrub_count = ATOMIC_INIT(0);
> +
> +struct scrub_region {
> +    u64 offset;
> +    u64 start;
> +    u64 chunk_size;
> +    u64 cpu_block_size;

Unsigned long, please: your actual scrubbing code uses unsigned long
local variables to work with these numbers anyway.

> +    local_node = cpu_to_node(cpu);
> +    /* Determine if we are scrubbing using the boot CPU */
> +    if ( region->cpu_block_size != ~0ULL )

Please define a correctly typed macro for this constant.  (Actually,
having read the calling side I think this constant won't be needed at
all.)

> @@ -1283,15 +1315,124 @@ void __init scrub_heap_pages(void)
>          if ( (mfn % ((100*1024*1024)/PAGE_SIZE)) == 0 )
>              printk(".");

Since the scrub order is pretty much arbitrary, this printk isn't much
of an indicator of progress.  Maybe replace it with a printk in the
dispatch loop instead?

> +        /* Do the scrub if possible */
> +        if ( page_state_is(pg, free) )

You already checked this a couple of lines above, and since you're not
taking the lock there's no need to re-check.

> +            scrub_one_page(pg);
> +    }

There should be a wmb() here, to make sure the main scrub dispatcher
can't exit while the last worker is still issuing writes.

> +    /* Increment count to indicate scrubbing complete on this CPU */
> +    atomic_dec(&bootscrub_count);
> +}
> +
> +/*
> + * Scrub all unallocated pages in all heap zones. This function uses all
> + * online cpu's to scrub the memory in parallel.
> + */
> +void __init scrub_heap_pages(void)
> +{
> +    cpumask_t node_cpus, total_node_cpus_mask = {{ 0 }};
> +    unsigned int i, boot_cpu_node, total_node_cpus, cpu = smp_processor_id();
> +    unsigned long mfn, mfn_off, chunk_size, max_cpu_blk_size = 0;
> +    unsigned long mem_start, mem_end;
> +
> +    if ( !opt_bootscrub )
> +        return;
> +
> +    boot_cpu_node = cpu_to_node(cpu);
> +
> +    printk("Scrubbing Free RAM: ");
> +
> +    /* Scrub block size */
> +    chunk_size = opt_bootscrub_blocksize >> PAGE_SHIFT;

The naming is a little distracting here; in general you're using
'block' to mean the amount that a CPU will scrub altogether, and
'chunk' for the amount scrubbed at once -- maybe this command-line
option should be 'chunk' too?

> +    if ( chunk_size == 0 )
> +        chunk_size = 1;
> +
> +    /* Determine the amount of memory to scrub, per CPU on each Node */
> +    for_each_online_node ( i )
> +    {
> +        /* Calculate Node memory start and end address */
> +        mem_start = max(node_start_pfn(i), first_valid_mfn);
> +        mem_end = min(mem_start + node_spanned_pages(i), max_page);

DYM min(node_start_pfn(i) + node_spanned_pages(i), max_page)?

> +        /* Divide by number of CPU's for this node */
> +        node_cpus = node_to_cpumask(i);

Shouldn't this be masked with cpu_online_map?

> +        /* It's possible a node has no CPU's */
> +        if ( cpumask_empty(&node_cpus) )
> +            continue;
> +        cpumask_or(&total_node_cpus_mask, &total_node_cpus_mask, &node_cpus);
> +
> +        region[i].cpu_block_size = (mem_end - mem_start) /
> +                                   cpumask_weight(&node_cpus);

What if there's a remainder in this division?  I don't see anything
that will scrub the leftover frames.
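For illustration, here's a very rough sketch (untested, and the helper
names are invented, not from your patch) of one way to avoid losing the
remainder: round the per-CPU block up so the blocks cover the whole
node, and let each worker clamp its slice to the node end.

/* Rough sketch only -- node_cpu_block_size() and scrub_cpu_slice() are
 * invented names.  Rounding up means n_cpus * block >= node frames, so
 * the remainder of (mem_end - mem_start) / n_cpus is not dropped. */
static unsigned long __init node_cpu_block_size(unsigned long mem_start,
                                                unsigned long mem_end,
                                                unsigned int n_cpus)
{
    unsigned long node_frames = mem_end - mem_start;

    /* Round up rather than down. */
    return (node_frames + n_cpus - 1) / n_cpus;
}

static void __init scrub_cpu_slice(unsigned long mem_start,
                                   unsigned long mem_end,
                                   unsigned long block,
                                   unsigned int cpu_idx)
{
    unsigned long start = mem_start + cpu_idx * block;
    unsigned long end = start + block;

    if ( end > mem_end )
        end = mem_end;      /* the last CPU just takes the short slice */

    /* ... scrub free pages in [start, end) as the patch does ... */
}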
> +        region[i].start = mem_start;
> +
> +        if ( region[i].cpu_block_size > max_cpu_blk_size )
> +            max_cpu_blk_size = region[i].cpu_block_size;
> +    }
> +
> +    /* Round default chunk size down if required */
> +    if ( max_cpu_blk_size && chunk_size > max_cpu_blk_size )
> +        chunk_size = max_cpu_blk_size;

Is this necessary?  The worker will confine itself to the block size
anyway.

> +    total_node_cpus = cpumask_weight(&total_node_cpus_mask);
> +    /* Start all CPU's scrubbing memory, chunk_size at a time */
> +    for ( mfn_off = 0; mfn_off < max_cpu_blk_size; mfn_off += chunk_size )
> +    {
> +        process_pending_softirqs();
> +
> +        atomic_set(&bootscrub_count, total_node_cpus);
> +
>          spin_lock(&heap_lock);
>
> -        /* Re-check page status with lock held. */
> -        if ( page_state_is(pg, free) )
> -            scrub_one_page(pg);
> +        /* Start all other CPU's on all nodes */
> +        for_each_online_node ( i )
> +        {
> +            region[i].chunk_size = chunk_size;
> +            region[i].offset = mfn_off;

Not sure these need to be per-node values since they're always the
same.  Also, since they're always the same, this doesn't need to be in
a for_each_online_node() loop: you could just set the shared offset
variable and use total_node_cpus_mask to trigger all the workers at
once.

> +            node_cpus = node_to_cpumask(i);

Again, masked with cpu_online_map?

> +            /* Clear local cpu ID */
> +            cpumask_clear_cpu(cpu, &node_cpus);

AIUI, on_selected_cpus will DTRT here even if you're calling yourself.
You could also call it with @wait == 1 and then you don't need to
maintain your own bootscrub_count.

> +            /* Start page scrubbing on all other CPU's */
> +            on_selected_cpus(&node_cpus, smp_scrub_heap_pages,
> +                             &region[i], 0);
> +        }
> +
> +        /* Start scrub on local CPU if CPU linked to a memory node */
> +        if ( boot_cpu_node != NUMA_NO_NODE )
> +            smp_scrub_heap_pages(&region[boot_cpu_node]);
> +
> +        /* Wait for page scrubbing to complete on all other CPU's */
> +        while ( atomic_read(&bootscrub_count) > 0 )
> +            cpu_relax();
>
>          spin_unlock(&heap_lock);
>      }
>
> +    /* Use the boot CPU to scrub any nodes which have no CPU's linked
> +       to them */
> +    for_each_online_node ( i )
> +    {
> +        node_cpus = node_to_cpumask(i);
> +
> +        if ( !cpumask_empty(&node_cpus) )
> +            continue;
> +
> +        mem_start = max(node_start_pfn(i), first_valid_mfn);
> +        mem_end = min(mem_start + node_spanned_pages(i), max_page);

Again, ITYM node_start_pfn(i) + node_spanned_pages(i).

> +        region[0].offset = 0;
> +        region[0].cpu_block_size = ~0ULL;
> +
> +        for ( mfn = mem_start; mfn < mem_end; mfn += chunk_size )
> +        {
> +            spin_lock(&heap_lock);
> +            if ( mfn + chunk_size > mem_end )
> +                region[0].chunk_size = mem_end - mfn;
> +            else
> +                region[0].chunk_size = chunk_size;
> +
> +            region[0].start = mfn;
> +

You're struggling a little to force this information into the SMP
protocol; I think it would be cleaner if you carved the central loop
of smp_scrub_heap_pages() out into its own function and called that
here instead.

> +            smp_scrub_heap_pages(&region[0]);
> +            spin_unlock(&heap_lock);
> +            process_pending_softirqs();

This belongs at the top of the loop, I think, since the last softirq
check was before the last iteration of the main loop.

Cheers,

Tim.
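P.S.  To make that last suggestion concrete, here's a very rough,
untested sketch of what pulling the central loop out might look like.
scrub_free_range() is an invented name, and the start/end arithmetic
in the callback is only illustrative -- however each CPU works out its
slice of the node stays as in your patch.

/* Invented helper: scrub all free frames in [start_mfn, end_mfn).
 * This is the "central loop" extracted from smp_scrub_heap_pages(). */
static void __init scrub_free_range(unsigned long start_mfn,
                                    unsigned long end_mfn)
{
    unsigned long mfn;

    for ( mfn = start_mfn; mfn < end_mfn; mfn++ )
    {
        struct page_info *pg = mfn_to_page(mfn);

        if ( !mfn_valid(mfn) || !page_state_is(pg, free) )
            continue;

        scrub_one_page(pg);
    }
}

static void __init smp_scrub_heap_pages(void *data)
{
    const struct scrub_region *r = data;
    unsigned long start, end;

    /* Illustrative bounds only; the real per-CPU offset calculation is
     * unchanged -- only the scrubbing itself moves into the helper. */
    start = r->start + r->offset;
    end = min(start + r->chunk_size, r->start + r->cpu_block_size);

    scrub_free_range(start, end);
}

With @wait == 1 on the on_selected_cpus() call, the dispatcher no
longer needs bootscrub_count or the cpu_relax() polling loop, and the
boot-CPU pass over CPU-less nodes becomes a plain call to
scrub_free_range() with a clamped range, with no need to press
region[0] into service.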