[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code
On Wed, 2015-03-18 at 17:05 +0000, George Dunlap wrote: > On 03/18/2015 04:49 PM, Dario Faggioli wrote: > >> If this would work, I think it would be a lot better. It would remove > >> the credit2-specific callback, and it would mean we wouldn't need an > >> additional test to filter out > >> > > I see. Ok, I guess I can give this a try. If it does not explode, > > because some weird dependency we can't see right now, we can either try > > harder to track it, or use the other solution outlined above as a > > fallback. > > If you could look into this, that would be awesome. :-) > So, I did look into this. In summary, I think it could work, but some rework is required. Here's my findings in some more details. So, with the quick-&-dirty(TM) patch attached to this email, Credit2 works-a-charm: root@Zhaman:~# xl dmesg |grep runqueue (XEN) Adding cpu 0 to runqueue 0 (XEN) First cpu on runqueue, activating (XEN) Adding cpu 1 to runqueue 1 (XEN) First cpu on runqueue, activating (XEN) Adding cpu 2 to runqueue 1 (XEN) Adding cpu 3 to runqueue 1 (XEN) Adding cpu 4 to runqueue 1 (XEN) Adding cpu 5 to runqueue 1 (XEN) Adding cpu 6 to runqueue 1 (XEN) Adding cpu 7 to runqueue 1 (XEN) Adding cpu 8 to runqueue 0 (XEN) Adding cpu 9 to runqueue 0 (XEN) Adding cpu 10 to runqueue 0 (XEN) Adding cpu 11 to runqueue 0 (XEN) Adding cpu 12 to runqueue 0 (XEN) Adding cpu 13 to runqueue 0 (XEN) Adding cpu 14 to runqueue 0 (XEN) Adding cpu 15 to runqueue 0 (Yes, cpu0 is assigned to the wrong runqueue, but that is because I'm ignoring the system_state <==> boot_cpu_data thing for now, so let's also ignore this, it'll be fixed). So, things are working, less complexity (i.e., no ad-hoc callback for credit2), less code (i.e., lots of '-' in the patch), outside is warm and sunny, spring is coming, birds are singing, etc... :-D :-D Unfortunately, the patch also makes Credit1 go splat like this: (XEN) Xen call trace: (XEN) [<ffff82d08012d14c>] check_lock+0x37/0x3b (XEN) [<ffff82d08012d17b>] _spin_lock+0x11/0x54 (XEN) [<ffff82d08013498f>] xmem_pool_alloc+0x11c/0x46c (XEN) [<ffff82d0801350a6>] _xmalloc+0x106/0x316 (XEN) [<ffff82d0801352c7>] _xzalloc+0x11/0x32 (XEN) [<ffff82d08011f5ae>] csched_alloc_pdata+0x47/0x1c6 (XEN) [<ffff82d0801293d6>] cpu_schedule_callback+0x5a/0xcc (XEN) [<ffff82d08011ab8a>] notifier_call_chain+0x67/0x87 (XEN) [<ffff82d08010169a>] notify_cpu_starting+0x1c/0x24 (XEN) [<ffff82d080189f5b>] start_secondary+0x203/0x256 (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 1: (XEN) Xen BUG at spinlock.c:48 (XEN) **************************************** (...and, suddenly, birds shut up :-( ) That is because pool->lock in xmem_pool_alloc() wants to find IRQs enabled when acquired, as it's been acquired with IRQs enabled before, and we don't mix IRQ enabled and disabled spinlocks (that's what the BUG at spinlock:48 is for). The difference between before and after the patch is that, bebore, csched_alloc_pdata() is called during CPU_UP_PREPARE, after, during CPU_STARTING. More specifically, CPU_UP_PREPARE callbacks are invoked as a consequence of cpu_up() being called in a loop in __start_xen(), and it itself calls notifier_call_chain(..CPU_UP_PREPARE..). This means they run: - on the boot cpu; - with interrupt enabled (see how the call to local_irq_enable() in __start_xen() is *before* the for_each_present_cpu() { cpu_up(i); } loop). OTOH, CPU_STARTING callbacks run: - on the cpu being brought up; - with interrupt disabled (see how the call to local_irq_enable(), in start_secondary(), is *after* the invocation of notify_cpu_starting()). Here we are. And the reason why things works ok in Credit2, is that csched2_alloc_pdata() doesn't really allocate anything! In fact, in general, handling alloc_pdata() during CPU_STARTING would mean that we can't allocate any memory which, given the name of the function, would look rather odd. :-) Nevertheless I see the value of doing so, and hence I think what we could do would be to introduce a new hook in the scheduler interface, called .init_pdata or .init_pcpu, and, in sched_*.c, split the allocation and the initialization parts. The former will be handled during CPU_UP_PREPARE, when allocation is possible, the latter during CPU_STARTING, when we have more info available to perform actual initializations. For Credit2, alloc_pdata() wouldn't even need to exist, and all the work will be done in init_pcpu(), so less complexity than now, and no need for the ad-hoc starting callback inside sched_credit2.c. For Credit1, both will be required, but the added complexity in sched_credit.c doesn't look too bad at all to me. Of couse, I still have to code it up, and see whether this plays well with cpupools, cpu on/offlining, etc., but I'd like to hear whether you like the idea, before getting to do that. :-) Let me know what you think. Regards, Dario --- sched_credit2.c | 63 ++------------------------------------------------------ schedule.c | 10 +++++--- 2 files changed, 9 insertions(+), 64 deletions(-) --- diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index ad0a5d4..ac7a7f2 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -1918,7 +1918,8 @@ static void deactivate_runqueue(struct csched2_private *prv, int rqi) cpumask_clear_cpu(rqi, &prv->active_queues); } -static void init_pcpu(const struct scheduler *ops, int cpu) +static void * +csched2_alloc_pdata(const struct scheduler *ops, int cpu) { int rqi; unsigned long flags; @@ -1932,7 +1933,7 @@ static void init_pcpu(const struct scheduler *ops, int cpu) { printk("%s: Strange, cpu %d already initialized!\n", __func__, cpu); spin_unlock_irqrestore(&prv->lock, flags); - return; + return NULL; } /* Figure out which runqueue to put it in */ @@ -1980,20 +1981,6 @@ static void init_pcpu(const struct scheduler *ops, int cpu) spin_unlock_irqrestore(&prv->lock, flags); - return; -} - -static void * -csched2_alloc_pdata(const struct scheduler *ops, int cpu) -{ - /* Check to see if the cpu is online yet */ - /* Note: cpu 0 doesn't get a STARTING callback */ - if ( cpu == 0 || cpu_to_socket(cpu) >= 0 ) - init_pcpu(ops, cpu); - else - printk("%s: cpu %d not online yet, deferring initializatgion\n", - __func__, cpu); - return (void *)1; } @@ -2046,49 +2033,6 @@ csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu) } static int -csched2_cpu_starting(int cpu) -{ - struct scheduler *ops; - - /* Hope this is safe from cpupools switching things around. :-) */ - ops = per_cpu(scheduler, cpu); - - if ( ops->alloc_pdata == csched2_alloc_pdata ) - init_pcpu(ops, cpu); - - return NOTIFY_DONE; -} - -static int cpu_credit2_callback( - struct notifier_block *nfb, unsigned long action, void *hcpu) -{ - unsigned int cpu = (unsigned long)hcpu; - int rc = 0; - - switch ( action ) - { - case CPU_STARTING: - csched2_cpu_starting(cpu); - break; - default: - break; - } - - return !rc ? NOTIFY_DONE : notifier_from_errno(rc); -} - -static struct notifier_block cpu_credit2_nfb = { - .notifier_call = cpu_credit2_callback -}; - -static int -csched2_global_init(void) -{ - register_cpu_notifier(&cpu_credit2_nfb); - return 0; -} - -static int csched2_init(struct scheduler *ops) { int i; @@ -2168,7 +2112,6 @@ const struct scheduler sched_credit2_def = { .dump_cpu_state = csched2_dump_pcpu, .dump_settings = csched2_dump, - .global_init = csched2_global_init, .init = csched2_init, .deinit = csched2_deinit, .alloc_vdata = csched2_alloc_vdata, diff --git a/xen/common/schedule.c b/xen/common/schedule.c index ef79847..5e7cdc9 100644 --- a/xen/common/schedule.c +++ b/xen/common/schedule.c @@ -1327,10 +1327,6 @@ static int cpu_schedule_up(unsigned int cpu) if ( idle_vcpu[cpu] == NULL ) return -ENOMEM; - if ( (ops.alloc_pdata != NULL) && - ((sd->sched_priv = ops.alloc_pdata(&ops, cpu)) == NULL) ) - return -ENOMEM; - return 0; } @@ -1348,10 +1344,16 @@ static int cpu_schedule_callback( struct notifier_block *nfb, unsigned long action, void *hcpu) { unsigned int cpu = (unsigned long)hcpu; + struct schedule_data *sd = &per_cpu(schedule_data, cpu); int rc = 0; switch ( action ) { + case CPU_STARTING: + if ( (ops.alloc_pdata != NULL) && + ((sd->sched_priv = ops.alloc_pdata(&ops, cpu)) == NULL) ) + return -ENOMEM; + break; case CPU_UP_PREPARE: rc = cpu_schedule_up(cpu); break; Attachment:
xen-sched-cpustarting.patch Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |