|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2 2/2] sched_credit2.c: runqueue_per_core code
On Wed, 2015-03-18 at 17:05 +0000, George Dunlap wrote:
> On 03/18/2015 04:49 PM, Dario Faggioli wrote:
> >> If this would work, I think it would be a lot better. It would remove
> >> the credit2-specific callback, and it would mean we wouldn't need an
> >> additional test to filter out
> >>
> > I see. Ok, I guess I can give this a try. If it does not explode,
> > because some weird dependency we can't see right now, we can either try
> > harder to track it, or use the other solution outlined above as a
> > fallback.
>
> If you could look into this, that would be awesome. :-)
>
So, I did look into this. In summary, I think it could work, but some
rework is required. Here's my findings in some more details.
So, with the quick-&-dirty(TM) patch attached to this email, Credit2
works-a-charm:
root@Zhaman:~# xl dmesg |grep runqueue
(XEN) Adding cpu 0 to runqueue 0
(XEN) First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 1
(XEN) First cpu on runqueue, activating
(XEN) Adding cpu 2 to runqueue 1
(XEN) Adding cpu 3 to runqueue 1
(XEN) Adding cpu 4 to runqueue 1
(XEN) Adding cpu 5 to runqueue 1
(XEN) Adding cpu 6 to runqueue 1
(XEN) Adding cpu 7 to runqueue 1
(XEN) Adding cpu 8 to runqueue 0
(XEN) Adding cpu 9 to runqueue 0
(XEN) Adding cpu 10 to runqueue 0
(XEN) Adding cpu 11 to runqueue 0
(XEN) Adding cpu 12 to runqueue 0
(XEN) Adding cpu 13 to runqueue 0
(XEN) Adding cpu 14 to runqueue 0
(XEN) Adding cpu 15 to runqueue 0
(Yes, cpu0 is assigned to the wrong runqueue, but that is because I'm
ignoring the system_state <==> boot_cpu_data thing for now, so let's
also ignore this, it'll be fixed).
So, things are working, less complexity (i.e., no ad-hoc callback for
credit2), less code (i.e., lots of '-' in the patch), outside is warm
and sunny, spring is coming, birds are singing, etc... :-D :-D
Unfortunately, the patch also makes Credit1 go splat like this:
(XEN) Xen call trace:
(XEN) [<ffff82d08012d14c>] check_lock+0x37/0x3b
(XEN) [<ffff82d08012d17b>] _spin_lock+0x11/0x54
(XEN) [<ffff82d08013498f>] xmem_pool_alloc+0x11c/0x46c
(XEN) [<ffff82d0801350a6>] _xmalloc+0x106/0x316
(XEN) [<ffff82d0801352c7>] _xzalloc+0x11/0x32
(XEN) [<ffff82d08011f5ae>] csched_alloc_pdata+0x47/0x1c6
(XEN) [<ffff82d0801293d6>] cpu_schedule_callback+0x5a/0xcc
(XEN) [<ffff82d08011ab8a>] notifier_call_chain+0x67/0x87
(XEN) [<ffff82d08010169a>] notify_cpu_starting+0x1c/0x24
(XEN) [<ffff82d080189f5b>] start_secondary+0x203/0x256
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Xen BUG at spinlock.c:48
(XEN) ****************************************
(...and, suddenly, birds shut up :-( )
That is because pool->lock in xmem_pool_alloc() wants to find IRQs
enabled when acquired, as it's been acquired with IRQs enabled before,
and we don't mix IRQ enabled and disabled spinlocks (that's what the BUG
at spinlock:48 is for).
The difference between before and after the patch is that, bebore,
csched_alloc_pdata() is called during CPU_UP_PREPARE, after, during
CPU_STARTING. More specifically, CPU_UP_PREPARE callbacks are invoked as
a consequence of cpu_up() being called in a loop in __start_xen(), and
it itself calls notifier_call_chain(..CPU_UP_PREPARE..). This means they
run:
- on the boot cpu;
- with interrupt enabled (see how the call to local_irq_enable() in
__start_xen() is *before* the for_each_present_cpu() { cpu_up(i); }
loop).
OTOH, CPU_STARTING callbacks run:
- on the cpu being brought up;
- with interrupt disabled (see how the call to local_irq_enable(), in
start_secondary(), is *after* the invocation of
notify_cpu_starting()).
Here we are. And the reason why things works ok in Credit2, is that
csched2_alloc_pdata() doesn't really allocate anything! In fact, in
general, handling alloc_pdata() during CPU_STARTING would mean that we
can't allocate any memory which, given the name of the function, would
look rather odd. :-)
Nevertheless I see the value of doing so, and hence I think what we
could do would be to introduce a new hook in the scheduler interface,
called .init_pdata or .init_pcpu, and, in sched_*.c, split the
allocation and the initialization parts. The former will be handled
during CPU_UP_PREPARE, when allocation is possible, the latter during
CPU_STARTING, when we have more info available to perform actual
initializations.
For Credit2, alloc_pdata() wouldn't even need to exist, and all the work
will be done in init_pcpu(), so less complexity than now, and no need
for the ad-hoc starting callback inside sched_credit2.c. For Credit1,
both will be required, but the added complexity in sched_credit.c
doesn't look too bad at all to me.
Of couse, I still have to code it up, and see whether this plays well
with cpupools, cpu on/offlining, etc., but I'd like to hear whether you
like the idea, before getting to do that. :-)
Let me know what you think.
Regards,
Dario
---
sched_credit2.c | 63 ++------------------------------------------------------
schedule.c | 10 +++++---
2 files changed, 9 insertions(+), 64 deletions(-)
---
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index ad0a5d4..ac7a7f2 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1918,7 +1918,8 @@ static void deactivate_runqueue(struct csched2_private
*prv, int rqi)
cpumask_clear_cpu(rqi, &prv->active_queues);
}
-static void init_pcpu(const struct scheduler *ops, int cpu)
+static void *
+csched2_alloc_pdata(const struct scheduler *ops, int cpu)
{
int rqi;
unsigned long flags;
@@ -1932,7 +1933,7 @@ static void init_pcpu(const struct scheduler *ops, int
cpu)
{
printk("%s: Strange, cpu %d already initialized!\n", __func__, cpu);
spin_unlock_irqrestore(&prv->lock, flags);
- return;
+ return NULL;
}
/* Figure out which runqueue to put it in */
@@ -1980,20 +1981,6 @@ static void init_pcpu(const struct scheduler *ops, int
cpu)
spin_unlock_irqrestore(&prv->lock, flags);
- return;
-}
-
-static void *
-csched2_alloc_pdata(const struct scheduler *ops, int cpu)
-{
- /* Check to see if the cpu is online yet */
- /* Note: cpu 0 doesn't get a STARTING callback */
- if ( cpu == 0 || cpu_to_socket(cpu) >= 0 )
- init_pcpu(ops, cpu);
- else
- printk("%s: cpu %d not online yet, deferring initializatgion\n",
- __func__, cpu);
-
return (void *)1;
}
@@ -2046,49 +2033,6 @@ csched2_free_pdata(const struct scheduler *ops, void
*pcpu, int cpu)
}
static int
-csched2_cpu_starting(int cpu)
-{
- struct scheduler *ops;
-
- /* Hope this is safe from cpupools switching things around. :-) */
- ops = per_cpu(scheduler, cpu);
-
- if ( ops->alloc_pdata == csched2_alloc_pdata )
- init_pcpu(ops, cpu);
-
- return NOTIFY_DONE;
-}
-
-static int cpu_credit2_callback(
- struct notifier_block *nfb, unsigned long action, void *hcpu)
-{
- unsigned int cpu = (unsigned long)hcpu;
- int rc = 0;
-
- switch ( action )
- {
- case CPU_STARTING:
- csched2_cpu_starting(cpu);
- break;
- default:
- break;
- }
-
- return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
-}
-
-static struct notifier_block cpu_credit2_nfb = {
- .notifier_call = cpu_credit2_callback
-};
-
-static int
-csched2_global_init(void)
-{
- register_cpu_notifier(&cpu_credit2_nfb);
- return 0;
-}
-
-static int
csched2_init(struct scheduler *ops)
{
int i;
@@ -2168,7 +2112,6 @@ const struct scheduler sched_credit2_def = {
.dump_cpu_state = csched2_dump_pcpu,
.dump_settings = csched2_dump,
- .global_init = csched2_global_init,
.init = csched2_init,
.deinit = csched2_deinit,
.alloc_vdata = csched2_alloc_vdata,
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index ef79847..5e7cdc9 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1327,10 +1327,6 @@ static int cpu_schedule_up(unsigned int cpu)
if ( idle_vcpu[cpu] == NULL )
return -ENOMEM;
- if ( (ops.alloc_pdata != NULL) &&
- ((sd->sched_priv = ops.alloc_pdata(&ops, cpu)) == NULL) )
- return -ENOMEM;
-
return 0;
}
@@ -1348,10 +1344,16 @@ static int cpu_schedule_callback(
struct notifier_block *nfb, unsigned long action, void *hcpu)
{
unsigned int cpu = (unsigned long)hcpu;
+ struct schedule_data *sd = &per_cpu(schedule_data, cpu);
int rc = 0;
switch ( action )
{
+ case CPU_STARTING:
+ if ( (ops.alloc_pdata != NULL) &&
+ ((sd->sched_priv = ops.alloc_pdata(&ops, cpu)) == NULL) )
+ return -ENOMEM;
+ break;
case CPU_UP_PREPARE:
rc = cpu_schedule_up(cpu);
break;
Attachment:
xen-sched-cpustarting.patch Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |