Re: [Xen-devel] Ideas Re: [PATCH v14 1/2] vmx: VT-d posted-interrupt core logic handling



On 07/03/16 15:53, Konrad Rzeszutek Wilk wrote:
> On Mon, Mar 07, 2016 at 11:21:33AM +0000, George Dunlap wrote:
>> On Fri, Mar 4, 2016 at 10:00 PM, Konrad Rzeszutek Wilk
>> <konrad.wilk@xxxxxxxxxx> wrote:
>>>> +/* Handle VT-d posted-interrupt when VCPU is blocked. */
>>>> +static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
>>>> +{
>>>> +    struct arch_vmx_struct *vmx, *tmp;
>>>> +    spinlock_t *lock = &per_cpu(vmx_pi_blocking, smp_processor_id()).lock;
>>>> +    struct list_head *blocked_vcpus =
>>>> +             &per_cpu(vmx_pi_blocking, smp_processor_id()).list;
>>>> +
>>>> +    ack_APIC_irq();
>>>> +    this_cpu(irq_count)++;
>>>> +
>>>> +    spin_lock(lock);
>>>> +
>>>> +    /*
>>>> +     * XXX: The length of the list depends on how many vCPUs are
>>>> +     * currently blocked on this specific pCPU. This may hurt interrupt
>>>> +     * latency if the list grows to too many entries.
>>>> +     */
>>>> +    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
>>>> +    {
>>>
>>>
>>> My recollection of the worst case of this being really bad is when
>>> the scheduler puts vCPU0 and vCPU1 of the same guest on the same pCPU
>>> (as an example) and they round-robin all the time.
>>>
>>> <handwaving>
>>> Would it perhaps be possible to have an anti-affinity flag to deter the
>>> scheduler from this? That is, whichever 'struct vcpu' has the
>>> anti-affinity flag set, the scheduler will try as much as it can _not_
>>> to schedule that 'struct vcpu' on this pCPU if the previous
>>> 'struct vcpu' had the flag set as well?
>>
>> Well, having vcpus from the same guest on the same pcpu is problematic
>> for a number of reasons -- spinlocks first and foremost.  So in
>> general trying to avoid that would be useful for most guests.
> 
> PV ticketlocks in HVM and PV guests make this "manageable".
> 
>>
>> The thing with scheduling is that it's a bit like economics: it seems
>> simple but it's actually not at all obvious what the emergent behavior
>> will be from adding a simple rule. :-)
> 
> <nods>
>>
>> On the whole it seems unlikely that having two vcpus on a single pcpu
>> is a "stable" situation -- it's likely to be pretty transient, and
>> thus not have a major impact on performance.
> 
> Except that we are concerned with it - in fact, we are disabling this
> feature because it may happen. How do we make sure it does not happen
> all the time? Or at least back off somehow if we do get into this
> situation?

So it's disabled by default based on a theoretical fear that it *may*
cause performance problems, but without any actual performance problems
having been observed?

It seems like there are a couple of ways we could approach this:

1. Try to optimize the reverse look-up code so that it's not a linear
linked-list scan (getting rid of the theoretical fear)

2. Try to test engineered situations where we expect this to be a
problem, to see how big of a problem it is (proving the theory to be
accurate or inaccurate in this case)

3. Turn the feature on by default as soon as the 4.8 window opens up,
perhaps with some sort of check that runs in debug mode, looks for the
condition we're afraid of, and BUG()s.  If we run a full development
cycle without anyone hitting the bug in testing, then we just leave the
feature on.

Then we'll only look at adding complexity to the scheduler if there's
actually a problem to solve.

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel