Re: [Xen-devel] [RFC PATCH v1] Replace tasklets with per-cpu implementation.
>>> On 08.09.14 at 21:01, <konrad.wilk@xxxxxxxxxx> wrote:
> On Wed, Sep 03, 2014 at 09:03:45AM +0100, Jan Beulich wrote:
>> >>> On 02.09.14 at 22:28, <konrad.wilk@xxxxxxxxxx> wrote:
> I typed this prototype up and asked folks with the right hardware to
> test it. It _ought_ to work, though I think that the tasklet code
> still could use an overhaul.
>>
>> Apart from needing general cleaning up, the main question I have
>> is whether the dpci_kill() part couldn't be replaced by obtaining a
>> proper reference on the domain when scheduling a softirq, and
>> dropping that reference after having handled it.
>
>
> The one fatal corner case is when we call 'hvm_dirq_assist' with
> d->arch.hvm_domain.irq.dpci set to NULL. There is even an ASSERT in
> 'hvm_dirq_assist' for that.
>
> The one edge condition I am troubled by is when we receive an
> interrupt and process it - while on another CPU we get a hypercall
> for 'domain_kill'.
>
> That is:
> a) 'do_IRQ' -> .. -> 'schedule_dpci_for(d)' had been called (and so
> d->arch.hvm_domain.irq.dpci is still valid).
>
> b) on another CPU 'domain_kill' calls domain_relinquish_resources
> ->pci_release_devices->pci_clean_dpci_irqs, which then makes
> d->arch.hvm_domain.irq.dpci NULL.
>
> c) the 'schedule_tail' (vmx_do_resume) right after a) is called,
> which means we call do_softirq. The 'dpci_softirq' is called,
> which calls 'hvm_dirq_assist' and blows up.
>
> d) the 'domain_relinquish_resources' on the other CPU continues
> trying to tear down the domain.
I have to admit that I don't immediately see why this would be a
problem with the new softirq approach, but not with the previous
tasklet one. In any event, couldn't domain_relinquish_resources()
wait for the list of scheduled entities to become empty, while not
inserting new entries onto the list when ->is_dying?
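Roughly such a scheme (a sketch only: the is_dying check, the STATE_RUN bit
marking an in-flight handler, and the dpci_drain() helper name are all
illustrative, not part of the posted patch; STATE_SCHED and d->state are
taken from the fragment quoted below):

/* Sketch: refuse to queue new work once the domain is dying ... */
static void schedule_dpci_for(struct domain *d)
{
    if ( d->is_dying )
        return;
    /* ... otherwise queue d on this CPU's dpci_list as in the patch. */
}

/* ... and have the relinquish path wait for queued/running work to drain.
 * STATE_RUN is a hypothetical bit that dpci_softirq() would set around
 * its call to hvm_dirq_assist(). */
static void dpci_drain(struct domain *d)
{
    while ( test_bit(STATE_SCHED, &d->state) ||
            test_bit(STATE_RUN, &d->state) )
        process_pending_softirqs();

    /* Only now is it safe for pci_clean_dpci_irqs() to clear ->dpci. */
}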
> +static void schedule_dpci_for(struct domain *d)
> +{
> + if ( !test_and_set_bit(STATE_SCHED, &d->state) )
> + {
> + unsigned long flags;
> + struct list_head *list;
> +
> + local_irq_save(flags);
> + INIT_LIST_HEAD(&d->list);
> + get_domain(d); /* To be put by 'dpci_softirq' */
get_knownalive_domain()?
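(I.e. the scheduling side would take the reference with
get_knownalive_domain(), since at that point the domain is known not to be
dead yet, and the softirq handler would drop it once the work is done. A
sketch of the consumer side, assuming the per-cpu dpci_list plus the
d->list and d->state fields added by the patch, and with hvm_dirq_assist()
invoked per-domain as in this RFC:)

static void dpci_softirq(void)
{
    struct list_head our_list;
    struct domain *d;

    /* Detach this CPU's queue with interrupts off, then work on it. */
    local_irq_disable();
    INIT_LIST_HEAD(&our_list);
    list_splice_init(&this_cpu(dpci_list), &our_list);
    local_irq_enable();

    while ( !list_empty(&our_list) )
    {
        d = list_entry(our_list.next, struct domain, list);
        list_del(&d->list);
        clear_bit(STATE_SCHED, &d->state);
        hvm_dirq_assist(d);
        put_domain(d);   /* pairs with the reference taken when scheduling */
    }
}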
> +static void migrate_tasklets_from_cpu(unsigned int cpu, struct list_head *list)
tasklets?
> +{
> + struct domain *d;
> +
> + while ( !list_empty(list) )
> + {
> + d = list_entry(list->next, struct domain, list);
> + list_del(&d->list);
> + schedule_dpci_for(d);
> + }
> +}
> +
> +static int cpu_callback(
> + struct notifier_block *nfb, unsigned long action, void *hcpu)
> +{
> + unsigned int cpu = (unsigned long)hcpu;
> +
> + switch ( action )
> + {
> + case CPU_UP_PREPARE:
> + INIT_LIST_HEAD(&per_cpu(dpci_list, cpu));
> + break;
> + case CPU_UP_CANCELED:
> + case CPU_DEAD:
> + migrate_tasklets_from_cpu(cpu, (&per_cpu(dpci_list, cpu)));
Can CPUs go down while softirqs are pending on them?
> + break;
> + default:
> + break;
> + }
> +
> + return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block cpu_nfb = {
> + .notifier_call = cpu_callback,
> + .priority = 99
If the whole notification is needed, this seemingly arbitrary number
would need a comment.
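(For context, a sketch of how the notifier and the new softirq might get
wired up at boot; the initcall itself is not in the quoted fragments, and
the names simply follow the patch:)

static int __init setup_dpci_softirq(void)
{
    unsigned int cpu;

    /* CPU_UP_PREPARE only covers CPUs brought up later, so initialize
     * the lists of the already-online CPUs here. */
    for_each_online_cpu(cpu)
        INIT_LIST_HEAD(&per_cpu(dpci_list, cpu));

    open_softirq(HVM_DPCI_SOFTIRQ, dpci_softirq);
    register_cpu_notifier(&cpu_nfb);
    return 0;
}
__initcall(setup_dpci_softirq);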
> --- a/xen/include/xen/softirq.h
> +++ b/xen/include/xen/softirq.h
> @@ -8,6 +8,7 @@ enum {
> NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ,
> RCU_SOFTIRQ,
> TASKLET_SOFTIRQ,
> + HVM_DPCI_SOFTIRQ,
> NR_COMMON_SOFTIRQS
> };
This should of course also be conditional upon HAS_PASSTHROUGH.
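I.e. something along these lines (a sketch, assuming the HAS_PASSTHROUGH
define is visible to this header):

 enum {
     NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ,
     RCU_SOFTIRQ,
     TASKLET_SOFTIRQ,
+#ifdef HAS_PASSTHROUGH
+    HVM_DPCI_SOFTIRQ,
+#endif
     NR_COMMON_SOFTIRQS
 };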
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel