|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [PATCH v3 05/16] xen/riscv: introduce tracking of pending vCPU interrupts, part 1
Based on Linux kernel v6.16.0.
Note that smp_wmb() is used instead of smp_mb__before_atomic() as what
we want to guarantee that if a bit in irqs_pending_mask is obversable
that the correspondent bit in irqs_pending is observable too.
Add lockless tracking of pending vCPU interrupts using atomic bitops.
Two bitmaps are introduced:
- irqs_pending — interrupts currently pending for the vCPU
- irqs_pending_mask — bits that have changed in irqs_pending
The design follows a multi-producer, single-consumer model, where the
consumer is the vCPU itself. Producers may set bits in
irqs_pending_mask without a lock. Clearing bits in irqs_pending_mask is
performed only by the consumer via xchg_acquire(). The consumer must not
write to irqs_pending and must not act on bits that are not set in the
mask. Otherwise, extra synchronization should be provided.
On RISC-V interrupts are not injected via guest registers, so pending
interrupts must be recorded in irqs_pending (using the new
vcpu_{un}set_interrupt() helpers) and flushed to the guest by updating
HVIP before returning control to the guest. The consumer side is
implemented in a follow-up patch.
A barrier between updating irqs_pending and setting the corresponding
mask bit in vcpu_set_interrupt()/vcpu_unset_interrupt() guarantees
that if the consumer observes a mask bit set, the corresponding pending
bit is also visible. This prevents missed interrupts during the flush.
It is possible a guest could have pending bit not result in the hardware
register without to be marked pending in irq_pending bitmap as:
According to the RISC-V ISA specification:
Bits hip.VSSIP and hie.VSSIE are the interrupt-pending and
interrupt-enable bits for VS-level software interrupts. VSSIP in hip
is an alias (writable) of the same bit in hvip.
Additionally:
When bit 2 of hideleg is zero, vsip.SSIP and vsie.SSIE are read-only
zeros. Else, vsip.SSIP and vsie.SSIE are aliases of hip.VSSIP and
hie.VSSIE.
This means the guest may modify vsip.SSIP, which implicitly updates
hip.VSSIP and the bit being writable with 1 would also trigger an interrupt
as according to the RISC-V spec:
These conditions for an interrupt trap to occur must be evaluated in a
bounded amount of time from when an interrupt becomes, or ceases to be,
pending in sip, and must also be evaluated immediately following the
execution of an SRET instruction or an explicit write to a CSR on which
these interrupt trap conditions expressly depend (including sip, sie and
sstatus).
What means that IRQ_VS_SOFT must be synchronized separately, what is done
in vcpu_sync_interrupts(). Note, also, that IRQ_PMU_OVF would want to be
synced for the similar reason as IRQ_VS_SOFT, but isn't sync-ed now as
PMU isn't supported now.
For the remaining VS-level interrupt types (IRQ_VS_TIMER and
IRQ_VS_EXT), the specification states they cannot be modified by the guest
and are read-only:
Bits hip.VSEIP and hie.VSEIE are the interrupt-pending and interrupt-enable
bits for VS-level external interrupts. VSEIP is read-only in hip, and is
the logical-OR of these interrupt sources:
• bit VSEIP of hvip;
• the bit of hgeip selected by hstatus.VGEIN; and
• any other platform-specific external interrupt signal directed to
VS-level.
Bits hip.VSTIP and hie.VSTIE are the interrupt-pending and interrupt-enable
bits for VS-level timer interrupts. VSTIP is read-only in hip, and is the
logical-OR of hvip.VSTIP and any other platform-specific timer interrupt
signal directed to VS-level.
Thus, for these interrupt types, it is sufficient to use vcpu_set_interrupt()
and vcpu_unset_interrupt(), and flush them during the call of
vcpu_flush_interrupts() (which is introduced in follow up patch).
vcpu_sync_interrupts(), which is called just before entering the VM,
slightly bends the rule that the irqs_pending bit must be written
first, followed by updating the corresponding bit in irqs_pending_mask.
However, it still respects the core guarantee that the producer never
clears the mask and only writes to irqs_pending if it is the one that
flipped the corresponding mask bit from 0 to 1.
Moreover, since the consumer won't run concurrently because
vcpu_sync_interrupts() and the consumer path are going to be invoked
equentially immediately before VM entry, it is safe to slightly relax
this ordering rule in vcpu_sync_interrupts().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
---
Changes in v3:
- Use smp_wb() instead of smp_mb__before_atomic().
- Add explanation of the change above in the commit message.
- Move vcpu_sync_interrupts() here to producers side.
- Introduce check_for_pcpu_work() to be clear from where vcpu_sync_interrupts()
is called.
---
Changes in V2:
- Move the patch before an introduction of vtimer.
- Drop bitmap_zero() of irqs_pending and irqs_pending_mask bitmaps as
vcpu structure starts out all zeros.
- Drop const for irq argument of vcpu_{un}set_interrupt().
- Drop check "irq < IRQ_LOCAL_MAX" in vcpu_{un}set_interrupt() as it
could lead to overrun of irqs_pending and irqs_pending_mask bitmaps.
- Drop IRQ_LOCAL_MAX as there is no usage for it now.
---
xen/arch/riscv/domain.c | 70 +++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/domain.h | 24 ++++++++++
xen/arch/riscv/traps.c | 8 ++++
3 files changed, 102 insertions(+)
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index af9586a4eb0d..4513f778cdc4 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -5,6 +5,7 @@
#include <xen/sched.h>
#include <xen/vmap.h>
+#include <asm/bitops.h>
#include <asm/cpufeature.h>
#include <asm/csr.h>
#include <asm/riscv_encoding.h>
@@ -124,3 +125,72 @@ void arch_vcpu_destroy(struct vcpu *v)
{
vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE);
}
+
+int vcpu_set_interrupt(struct vcpu *v, unsigned int irq)
+{
+ /*
+ * We only allow VS-mode software, timer, and external
+ * interrupts when irq is one of the local interrupts
+ * defined by RISC-V privilege specification.
+ */
+ if ( irq != IRQ_VS_SOFT &&
+ irq != IRQ_VS_TIMER &&
+ irq != IRQ_VS_EXT )
+ return -EINVAL;
+
+ set_bit(irq, v->arch.irqs_pending);
+ smp_wmb();
+ set_bit(irq, v->arch.irqs_pending_mask);
+
+ vcpu_kick(v);
+
+ return 0;
+}
+
+int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq)
+{
+ /*
+ * We only allow VS-mode software, timer, external
+ * interrupts when irq is one of the local interrupts
+ * defined by RISC-V privilege specification.
+ */
+ if ( irq != IRQ_VS_SOFT &&
+ irq != IRQ_VS_TIMER &&
+ irq != IRQ_VS_EXT )
+ return -EINVAL;
+
+ clear_bit(irq, v->arch.irqs_pending);
+ smp_wmb();
+ set_bit(irq, v->arch.irqs_pending_mask);
+
+ return 0;
+}
+
+void vcpu_sync_interrupts(struct vcpu *v)
+{
+ unsigned long hvip;
+
+ /* Read current HVIP and VSIE CSRs */
+ v->arch.vsie = csr_read(CSR_VSIE);
+
+ /* Sync-up HVIP.VSSIP bit changes done by Guest */
+ hvip = csr_read(CSR_HVIP);
+ if ( (v->arch.hvip ^ hvip) & BIT(IRQ_VS_SOFT, UL) &&
+ !test_and_set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending_mask) )
+ {
+ if ( hvip & BIT(IRQ_VS_SOFT, UL) )
+ set_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
+ else
+ clear_bit(IRQ_VS_SOFT, &v->arch.irqs_pending);
+ }
+
+ /*
+ * Sync-up AIA high interrupts.
+ *
+ * It is necessary to do only for CONFIG_RISCV_32 which isn't supported
+ * now.
+ */
+#ifdef CONFIG_RISCV_32
+# error "Update v->arch.vsieh"
+#endif
+}
diff --git a/xen/arch/riscv/include/asm/domain.h
b/xen/arch/riscv/include/asm/domain.h
index 6bb06a50c6ab..8d9432ec5a8b 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -54,6 +54,25 @@ struct arch_vcpu {
register_t henvcfg;
register_t hideleg;
register_t hstateen0;
+ register_t hvip;
+
+ register_t vsie;
+
+ /*
+ * VCPU interrupts
+ *
+ * We have a lockless approach for tracking pending VCPU interrupts
+ * implemented using atomic bitops. The irqs_pending bitmap represent
+ * pending interrupts whereas irqs_pending_mask represent bits changed
+ * in irqs_pending. Our approach is modeled around multiple producer
+ * and single consumer problem where the consumer is the VCPU itself.
+ *
+ * DECLARE_BITMAP() is needed here to support 64 vCPU local interrupts
+ * on RV32 host.
+ */
+#define RISCV_VCPU_NR_IRQS 64
+ DECLARE_BITMAP(irqs_pending, RISCV_VCPU_NR_IRQS);
+ DECLARE_BITMAP(irqs_pending_mask, RISCV_VCPU_NR_IRQS);
};
struct paging_domain {
@@ -92,6 +111,11 @@ static inline void update_guest_memory_policy(struct vcpu
*v,
static inline void arch_vcpu_block(struct vcpu *v) {}
+int vcpu_set_interrupt(struct vcpu *v, unsigned int irq);
+int vcpu_unset_interrupt(struct vcpu *v, unsigned int irq);
+
+void vcpu_sync_interrupts(struct vcpu *v);
+
#endif /* ASM__RISCV__DOMAIN_H */
/*
diff --git a/xen/arch/riscv/traps.c b/xen/arch/riscv/traps.c
index c81a4f79a0d2..82e1dc59cdea 100644
--- a/xen/arch/riscv/traps.c
+++ b/xen/arch/riscv/traps.c
@@ -169,6 +169,11 @@ static void do_unexpected_trap(const struct cpu_user_regs
*regs)
die();
}
+static void check_for_pcpu_work(void)
+{
+ vcpu_sync_interrupts(current);
+}
+
void do_trap(struct cpu_user_regs *cpu_regs)
{
register_t pc = cpu_regs->sepc;
@@ -222,6 +227,9 @@ void do_trap(struct cpu_user_regs *cpu_regs)
do_unexpected_trap(cpu_regs);
break;
}
+
+ if ( cpu_regs->hstatus & HSTATUS_SPV )
+ check_for_pcpu_work();
}
void vcpu_show_execution_state(struct vcpu *v)
--
2.52.0
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |