
Re: [PATCH v1 07/15] xen/riscv: introduce tracking of pending vCPU interrupts, part 1


  • To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 13 Jan 2026 14:54:38 +0100
  • Cc: Alistair Francis <alistair.francis@xxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 13 Jan 2026 13:54:47 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 13.01.2026 13:51, Oleksii Kurochko wrote:
> On 1/7/26 5:28 PM, Jan Beulich wrote:
>> On 24.12.2025 18:03, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/include/asm/domain.h
>>> +++ b/xen/arch/riscv/include/asm/domain.h
>>> @@ -85,6 +85,22 @@ struct arch_vcpu
>>>       register_t vstval;
>>>       register_t vsatp;
>>>       register_t vsepc;
>>> +
>>> +    /*
>>> +     * VCPU interrupts
>>> +     *
>>> +     * We have a lockless approach for tracking pending VCPU interrupts
>>> +     * implemented using atomic bitops. The irqs_pending bitmap represents
>>> +     * pending interrupts whereas irqs_pending_mask represents bits changed
>>> +     * in irqs_pending.
>> And hence a set immediately followed by an unset is then indistinguishable
>> from just an unset (or the other way around).
> 
> I think it is distinguishable with the combination of irqs_pending_mask.

No. The set mask bit tells you that there was a change. But irqs_pending[]
records only the most recent set / clear.
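
To illustrate with a sketch (bitmap names as in the patch; set_bit() /
clear_bit() are Xen's atomic bitops), consider a set immediately followed
by an unset before the consumer looks:

    set_bit(irq, v->arch.irqs_pending);        /* 0 -> 1 */
    set_bit(irq, v->arch.irqs_pending_mask);   /* record "changed" */

    clear_bit(irq, v->arch.irqs_pending);      /* 1 -> 0 */
    set_bit(irq, v->arch.irqs_pending_mask);   /* already set: no new info */

The consumer then finds the mask bit set but the pending bit clear, which
is exactly what a plain unset would also produce.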

>>   This may not be a problem, but
>> if it isn't, I think this needs explaining. Much like it is unclear why the
>> "changed" state needs tracking in the first place.
> 
> It is needed to track which bits have changed; irqs_pending only represents
> the current state of pending interrupts. A CPU might want to react to
> changes rather than to the absolute state.
> 
> Example:
>   - If CPU 0 sets an interrupt, CPU 1 needs to notice “something changed”
>     to inject it into the VCPU.
>   - If CPU 0 sets and then clears the bit before CPU 1 reads it,
>     irqs_pending alone shows 0, the transition is lost.

The fact there was any number of transitions is recorded in _mask[], yes,
but "the transition" was still lost if we consider the "set" in your
example in isolation. And it's not quite clear to me what's interesting
about a 0 -> 0 transition. (On x86, such a lost 0 -> 1 transition, i.e.
one followed directly by a 1 -> 0 one, would result in a "spurious
interrupt": There would be an indication that there was a lost interrupt
without there being a way to know which one it was.)

> By maintaining irqs_pending_mask, you can detect “this bit changed
> recently,” even if the final state is 0.
> 
> Also, having irqs_pending_mask allows flushing interrupts without a lock:
>
>     if ( ACCESS_ONCE(v->arch.irqs_pending_mask[0]) )
>     {
>         mask = xchg(&v->arch.irqs_pending_mask[0], 0UL);
>         val = ACCESS_ONCE(v->arch.irqs_pending[0]) & mask;
>
>         *hvip &= ~mask;
>         *hvip |= val;
>     }
>
> Without it, I assume we would need a spinlock around accesses to
> irqs_pending.

Ah yes, this would indeed be a benefit. Just that it's not quite clear to
me:

    *hvip |= xchg(&v->arch.irqs_pending[0], 0UL);

wouldn't require a lock either. What may be confusing me is that you put
things as if it was normal to see 1 -> 0 transitions from (virtual)
hardware, when I (with my x86 background) would expect 1 -> 0 transitions
to only occur due to software actions (End Of Interrupt), unless - see
above - something malfunctioned and an interrupt was lost. That (the 1 ->
0 transitions) could be (guest) writes to SVIP, for example.

Talking of which - do you really mean HVIP in the code you provided, not
VSVIP? So far my understanding was that HVIP would be recording the
interrupts the hypervisor itself has pending (and needs to service).
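
For completeness, the producer side pairing with the quoted flush might
look roughly like this (a sketch inferred from the discussion, not taken
from the patch; vcpu_kick() is Xen's existing helper for prodding a vCPU):

    int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
    {
        if ( irq >= RISCV_VCPU_NR_IRQS )
            return -EINVAL;

        set_bit(irq, v->arch.irqs_pending);
        /* Make the new pending state visible before publishing the change. */
        smp_wmb();
        set_bit(irq, v->arch.irqs_pending_mask);

        vcpu_kick(v);

        return 0;
    }

The ordering matters: irqs_pending must be updated before the mask bit
becomes visible, so the consumer's xchg() never harvests a mask bit whose
pending state it cannot yet see.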

>>> Our approach is modeled around the multiple producer
>>> +     * and single consumer problem where the consumer is the VCPU itself.
>>> +     *
>>> +     * DECLARE_BITMAP() is needed here to support 64 vCPU local interrupts
>>> +     * on RV32 host.
>>> +     */
>>> +#define RISCV_VCPU_NR_IRQS 64
>>> +    DECLARE_BITMAP(irqs_pending, RISCV_VCPU_NR_IRQS);
>>> +    DECLARE_BITMAP(irqs_pending_mask, RISCV_VCPU_NR_IRQS);
>>>   }  __cacheline_aligned;
>>>   
>>>   struct paging_domain {
>>> @@ -123,6 +139,9 @@ static inline void update_guest_memory_policy(struct vcpu *v,
>>>   
>>>   static inline void arch_vcpu_block(struct vcpu *v) {}
>>>   
>>> +int vcpu_set_interrupt(struct vcpu *v, const unsigned int irq);
>>> +int vcpu_unset_interrupt(struct vcpu *v, const unsigned int irq);
>> Why the const-s?
> 
> As the irq number isn't going to be changed inside these functions.

You realize though that we don't normally use const like this? This
use of qualifiers is meaningless to callers, and of limited meaning to
the function definition itself. There can be exceptions of course, when
it is important to clarify that a parameter must not change throughout
the function.
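
To make this concrete (a generic C example, not from the patch): the two
declarations below are the same prototype, since a top-level qualifier on
a by-value parameter is not part of the function's type.

    int vcpu_set_interrupt(struct vcpu *v, unsigned int irq);
    int vcpu_set_interrupt(struct vcpu *v, const unsigned int irq);

Only in the definition does the const do anything, and there it merely
prevents the local copy of irq from being modified.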

>>> --- a/xen/arch/riscv/include/asm/riscv_encoding.h
>>> +++ b/xen/arch/riscv/include/asm/riscv_encoding.h
>>> @@ -91,6 +91,7 @@
>>>   #define IRQ_M_EXT                 11
>>>   #define IRQ_S_GEXT                12
>>>   #define IRQ_PMU_OVF               13
>>> +#define IRQ_LOCAL_MAX             (IRQ_PMU_OVF + 1)
>> MAX together with "+ 1" looks wrong. What is 14 (which, when MAX is 14,
>> must be a valid interrupt)? Or if 14 isn't a valid interrupt, please use
>> NR or NUM.
> 
> I didn’t fully understand your idea. Are you suggesting having IRQ_LOCAL_NR?
> That sounds unclear, as it’s not obvious what it would represent.
> Using MAX_HART seems better, since it represents the maximum number allowed
> for a local interrupt. Any IRQ below that value is considered local, while
> values above it are implementation-specific interrupts.

Not quite. If you say "max", anything below _or equal_ that value is
valid / covered. When you say "num", anything below that value is
valid / covered. That is, "max" is inclusive for the upper bound of
the range, while "num" is exclusive. Hence my question whether 14 is
a valid local interrupt.
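
In loop terms (an illustrative sketch; process_irq() is a placeholder):

    /* "nr"/"num" is an exclusive bound: 14 need not be a valid IRQ. */
    for ( irq = 0; irq < IRQ_LOCAL_NR; irq++ )
        process_irq(irq);

    /* "max" is an inclusive bound: IRQ_LOCAL_MAX itself - here 14, i.e.
     * IRQ_PMU_OVF + 1 - would have to be a valid IRQ. */
    for ( irq = 0; irq <= IRQ_LOCAL_MAX; irq++ )
        process_irq(irq);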

Jan



 

