[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

On 08/07/2017 04:18 AM, Jan Beulich wrote:
>>>> Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> 07/31/17 10:03 PM >>>
>> We have limited number (slightly under NR_DYNAMIC_VECTORS=192) of IRQ
>> vectors that are available to each processor. Currently, when x2apic
>> cluster mode is used (which is default), each vector is shared among
>> all processors in the cluster. With many IRQs (as is the case on systems
>> with multiple SR-IOV cards) and few clusters (e.g. single socket)
>> there is a good chance that we will run out of vectors.
>> This patch tries to decrease vector sharing between processors by
>> assigning vector to a single processor if the assignment request (via
>> __assign_irq_vector()) comes without explicitly specifying which
>> processors are expected to share the interrupt. This typically happens
>> during boot time (or possibly PCI hotplug) when create_irq(NUMA_NO_NODE)
>> is called. When __assign_irq_vector() is called from
>> set_desc_affinity() which provides sharing mask, vector sharing will
>> continue to be performed, as before.
> Wouldn't it be sufficient for people running into vector shortage due to
> sharing to specify "x2apic_phys" on the command line?

Yes, it would.

The downside is that an installer (e.g. anaconda) would then have to
figure out which systems need to be installed with this specific option.
Even worse, this problem would show up on systems that have some of the
processors taken offline (in BIOS) so during initial installation there
would be no reason to switch to physical. Or a card might be added after
the installation.

In other words, I think x2apic_phys is more of a workaround in this case.

>> @@ -72,9 +73,13 @@ static void __init clustered_apic_check_x2apic(void)
>> {
>> }
>  >
>> -static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu)
>> +static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu,
>> +    const cpumask_t *cpumask)
>> {
>> -    return per_cpu(cluster_cpus, cpu);
>> +    if ( cpumask != TARGET_CPUS )
> Is a pointer comparison (rather than a content one) here really correct in
> the general case?

When the caller is using TARGET_CPUS this is an indication that it
explicitly didn't care about sharing (in assign_irq_vector()). A caller
might want to have sharing on and provide a mask that has the same
content as TARGET_CPUS but is stored in a different location. This will
allow vector_allocation_cpumask_x2apic_cluster() to distinguish it from
a "don't care" case.

>> +        return per_cpu(cluster_cpus, cpu);
>> +    else
>> +        return cpumask_of(cpu);
> Please avoid the "else" in cases like this.
>> --- a/xen/include/asm-x86/mach-generic/mach_apic.h
>> +++ b/xen/include/asm-x86/mach-generic/mach_apic.h
>> @@ -13,10 +13,11 @@
>> #define INT_DELIVERY_MODE (genapic->int_delivery_mode)
>> #define INT_DEST_MODE (genapic->int_dest_mode)
>> #define TARGET_CPUS      (genapic->target_cpus())
>> -#define init_apic_ldr (genapic->init_apic_ldr)
>> -#define clustered_apic_check (genapic->clustered_apic_check) 
>> -#define cpu_mask_to_apicid (genapic->cpu_mask_to_apicid)
>> -#define vector_allocation_cpumask(cpu) 
>> (genapic->vector_allocation_cpumask(cpu))
>> +#define INIT_APIC_LDR (genapic->init_apic_ldr)
>> +#define CLUSTERED_APIC_CHECK (genapic->clustered_apic_check) 
>> +#define CPU_MASK_TO_APICID (genapic->cpu_mask_to_apicid)
>> +#define VECTOR_ALLOCATION_CPUMASK(cpu, mask) \
>> +    (genapic->vector_allocation_cpumask(cpu, mask))
> I don't see the connection of the change in case of all of these to the 
> purpose
> of this patch.

I need to include asm-x86/mach-generic/mach_apic.h in x2apic.c and
keeping those 4 as lower-case would mess up static initializers of
apic_x2apic_phys and apic_x2apic_cluster. Besides, switching them to
upper case would make them consistent with (what I think are) similar
definitions of the 3 top macros above.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.