Re: [PATCH v3 3/3] xen/sched: fix cpu hotplug
On 01/09/2022 07:11, Juergen Gross wrote:
> On 01.09.22 00:52, Andrew Cooper wrote:
>> On 16/08/2022 11:13, Juergen Gross wrote:
>>> Cpu cpu unplugging is calling schedule_cpu_rm() via stop_machine_run()
>>
>> Cpu cpu.
>>
>>> with interrupts disabled, thus any memory allocation or freeing must
>>> be avoided.
>>>
>>> Since commit 5047cd1d5dea ("xen/common: Use enhanced
>>> ASSERT_ALLOC_CONTEXT in xmalloc()") this restriction is being enforced
>>> via an assertion, which will now fail.
>>>
>>> Before that commit cpu unplugging in normal configurations was working
>>> just by chance as only the cpu performing schedule_cpu_rm() was doing
>>> active work.  With core scheduling enabled, however, failures could
>>> result from memory allocations not being properly propagated to other
>>> cpus' TLBs.
>>
>> This isn't accurate, is it?  The problem with initiating a TLB flush
>> with IRQs disabled is that you can deadlock against a remote CPU which
>> is waiting for you to enable IRQs first to take a TLB flush IPI.
>
> As long as only one cpu is trying to allocate/free memory during the
> stop_machine_run() action the deadlock won't happen.
>
>> How does a memory allocation out of the xenheap result in a TLB flush?
>> Even with split heaps, you're only potentially allocating into a new
>> slot which was unused...
>
> Yeah, you are right.  The main problem would occur only when a virtual
> address is changed to point at another physical address, which should
> be quite unlikely.
>
> I can drop that paragraph, as it doesn't really help.

Yeah, I think that would be best.

>>> diff --git a/xen/common/sched/cpupool.c b/xen/common/sched/cpupool.c
>>> index 58e082eb4c..2506861e4f 100644
>>> --- a/xen/common/sched/cpupool.c
>>> +++ b/xen/common/sched/cpupool.c
>>> @@ -411,22 +411,28 @@ int cpupool_move_domain(struct domain *d, struct cpupool *c)
>>>   }
>>>   
>>>   /* Update affinities of all domains in a cpupool. */
>>> -static void cpupool_update_node_affinity(const struct cpupool *c)
>>> +static void cpupool_update_node_affinity(const struct cpupool *c,
>>> +                                         struct affinity_masks *masks)
>>>   {
>>> -    struct affinity_masks masks;
>>> +    struct affinity_masks local_masks;
>>>       struct domain *d;
>>>   
>>> -    if ( !update_node_aff_alloc(&masks) )
>>> -        return;
>>> +    if ( !masks )
>>> +    {
>>> +        if ( !update_node_aff_alloc(&local_masks) )
>>> +            return;
>>> +        masks = &local_masks;
>>> +    }
>>>   
>>>       rcu_read_lock(&domlist_read_lock);
>>>   
>>>       for_each_domain_in_cpupool(d, c)
>>> -        domain_update_node_aff(d, &masks);
>>> +        domain_update_node_aff(d, masks);
>>>   
>>>       rcu_read_unlock(&domlist_read_lock);
>>>   
>>> -    update_node_aff_free(&masks);
>>> +    if ( masks == &local_masks )
>>> +        update_node_aff_free(masks);
>>>   }
>>>   
>>>   /*
>>
>> Why do we need this at all?  domain_update_node_aff() already knows
>> what to do when passed NULL, so this seems like an awfully complicated
>> no-op.
>
> You do realize that update_node_aff_free() will do something in case
> masks was initially NULL?

By "this", I meant the entire hunk, not just the final if().

What is wrong with passing the (possibly NULL) masks pointer straight
down into domain_update_node_aff() and removing all the memory
allocation in this function?

But I've also answered that by looking more closely.  It's about not
repeating the memory allocations/freeing for each domain in the pool.
Which does make sense as this is a slow path already, but the complexity
here of having conditionally allocated masks is far from simple.

>>> @@ -1008,10 +1016,21 @@ static int cf_check cpu_callback(
>>>   {
>>>       unsigned int cpu = (unsigned long)hcpu;
>>>       int rc = 0;
>>> +    static struct cpu_rm_data *mem;
>>>   
>>>       switch ( action )
>>>       {
>>>       case CPU_DOWN_FAILED:
>>> +        if ( system_state <= SYS_STATE_active )
>>> +        {
>>> +            if ( mem )
>>> +            {
>>
>> So, this does compile (and indeed I've tested the result), but I can't
>> see how it should.
>>
>> mem is guaranteed to be uninitialised at this point, and ...
>
> ... it is defined as "static", so it is clearly NULL initially.

Oh, so it is.  That is hiding particularly well in plain sight.

Can it please be this:

@@ -1014,9 +1014,10 @@ void cf_check dump_runq(unsigned char key)
 static int cf_check cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
+    static struct cpu_rm_data *mem; /* Protected by cpu_add_remove_lock */
+
     unsigned int cpu = (unsigned long)hcpu;
     int rc = 0;
-    static struct cpu_rm_data *mem;

     switch ( action )
     {

We already split static and non-static variables like this elsewhere for
clarity, and identifying the lock which protects the data is useful for
anyone coming to this in the future.

~Andrew

P.S. As an observation, this isn't safe for parallel AP booting, but I
guarantee that this isn't the only example which would want fixing if we
wanted to get parallel booting working.