Re: Event delivery and "domain blocking" on PVHv2
On 2020-06-15 17:03, Roger Pau Monné wrote:

> This way of event channel injection is slightly hackish, and I would
> recommend using HVMOP_set_evtchn_upcall_vector; that way vectors will
> be properly routed using the lapic. Using
> HVM_PARAM_CALLBACK_TYPE_VECTOR, vectors are injected without setting
> the IRR/ISR bit in the lapic registers.

I picked HVM_PARAM_CALLBACK_TYPE_VECTOR since that seemed like the
easiest option for a uniprocessor, PVH-only guest.

What does <vector> mean in the context of
HVMOP_set_evtchn_upcall_vector? If it's an APIC vector number (sorry,
not too familiar with the post-legacy i8259 world), does that imply
that if I use HVMOP_set_evtchn_upcall_vector I need to do some APIC
initialisation/programming? All I need for Solo5/Mirage purposes is to
have the upcall land on IDT vector #32 or above.

>>>> Questions:
>>>>
>>>> 1. Being based on the Solo5 virtio code, the low-level setup code
>>>>    is doing the "usual" i8259 PIC setup, to remap the PIC IRQs to
>>>>    vectors 32 and above. Should I be doing this initialisation for
>>>>    Xen PVH at all?
>>>
>>> Hm, there are no IO-APICs (or legacy PICs) on a PVH domU, so
>>> there's not much to route. If Solo5 thinks it's somehow configuring
>>> them, it's likely writing to some hole in memory, or to some RAM.
>>
>> Solo5 only has a very primitive understanding of hardware
>> interrupts, so it's only configuring the legacy PICs via port I/O.
>
> IO-APIC presence is signaled in the ACPI MADT table on PVH domU.

Hmm, it'd be very unfortunate if I had to deal with ACPI, so here's
hoping that I don't actually need to touch the APIC. I'm not
interested in using the PIC for anything, and all interrupts will be
delivered via Xen event channels.

>>>> 2. Related to the above, the IRQ handler code is ACKing the
>>>>    interrupt after the handler runs. Should I be doing that? Does
>>>>    ACKing "IRQ" 0 on the PIC have any interactions with Xen's view
>>>>    of event channels/pending upcalls?
>>>
>>> Which kind of ACKing is it doing? Is it writing to the lapic EOI
>>> register?
>>> If so, that would be OK when using HVMOP_set_evtchn_upcall_vector.
>>> If using HVM_PARAM_CALLBACK_TYPE_VECTOR there's nothing to ACK, I
>>> think.
>>
>> Legacy PIC EOI via port I/O.
>>
>>>> Next, for a PVHv2, uniprocessor-only guest, is the following flow
>>>> sufficient to unmask an event channel?
>>>>
>>>>     struct shared_info *s = SHARED_INFO();
>>>>     int pending = 0;
>>>>
>>>>     atomic_sync_btc(port, &s->evtchn_mask[0]);
>>>>     pending = sync_bt(port, &s->evtchn_mask[0]);
>>>
>>> You should check for pending interrupts on evtchn_pending, not
>>> evtchn_mask.
>>
>> Ah, thanks for spotting that! Fixed, but just that change by itself
>> does not seem to change the observed behaviour in any way.
>>
>>>>     if (pending) {
>>>>         /*
>>>>          * Slow path:
>>>>          *
>>>>          * If pending is set here, then there was a race, and we
>>>>          * lost the upcall. Mask the port again and force an
>>>>          * upcall via a call to hyperspace.
>>>>          *
>>>>          * This should be sufficient for HVM/PVHv2 based on my
>>>>          * understanding of Linux
>>>>          * drivers/xen/events/events_2l.c.
>>>>          */
>>>>         atomic_sync_bts(port, &s->evtchn_mask[0]);
>>>>         hypercall_evtchn_unmask(port);
>>>>     }
>>>
>>> FWIW, I use the hypercall unconditionally on FreeBSD because I
>>> didn't see a performance difference when compared to this method.
>>
>>>> Lastly, the old PV-only Mini-OS based stack would do delays
>>>> ("block the domain") by doing a HYPERVISOR_set_timer_op(deadline)
>>>> followed by a HYPERVISOR_sched_op(SCHEDOP_block, 0). In the new
>>>> code, I'm doing the following (based on what Mini-OS seems to be
>>>> doing for HVM):
>>>>
>>>>     solo5_time_t deadline = Int64_val(v_deadline);
>>>>
>>>>     if (solo5_clock_monotonic() < deadline) {
>>>>         hypercall_set_timer_op(deadline);
>>>>         __asm__ __volatile__ ("hlt" : : : "memory");
>>>>         /* XXX: cancel timer_op here if woken up early? */
>>>>     }
>>>>
>>>> Again, is this the right thing to do for PVH?
>>>
>>> hlt will trap into the hypervisor, so it's fine to use.
>>
>> Thanks for confirming.
>>
>>>> As the comment says, do I need to cancel the timer_op?
>>>
>>> I'm not sure. Keep in mind that a new call to
>>> hypercall_set_timer_op will overwrite the previous timer, and hence
>>> should be fine, I think, as long as you are using the one-shot
>>> timer.
>> Is there something other than a one-shot timer?
>> hypercall_set_timer_op is just a single-argument hypercall with a
>> uint64_t deadline, right? (And not documented in xen.h either ...)
>> I understood the semantics to be "fire once at/after the time
>> deadline is reached"; if that is indeed the case, then with my
>> current VIRQ_TIMER handler, which does nothing in the interrupt
>> context and has no side effects, I should be fine.
>
> I have no idea how Solo5 works; maybe you should re-set the timer to
> the next deadline in the handler? Or that's fine because the timer
> is always set before blocking.

Yes, it's always set before blocking, and we always unblock after the
first interrupt (i.e. some event) is received, so that should be fine.

>> I can also post the code that does the actual demuxing of events
>> from Xen (i.e. the upcall handler), but I've read through that
>> several times now and I don't think the problem is there; adding
>> diagnostic prints to both the low-level C event channel code and
>> the higher-level OCaml Activations code confirms that received
>> events are being mapped to their ports correctly.
>
> I can take a look at the event channel handler if you want. The only
> weird thing I've noticed is the error in the unmask, where you seem
> to use evtchn_mask instead of evtchn_pending.

Thanks for the offer, this stuff is fairly subtle.

In the Mirage/Xen scenario, there are two parts to the upcall handler.
The top half is executed in interrupt context and basically does
nothing except acknowledge the upcall:

    int solo5__xen_evtchn_vector_handler(void *arg __attribute__((unused)))
    {
        struct vcpu_info *vi = VCPU0_INFO();
        /*
         * All work is done outside of interrupt context by
         * evtchn_demux_pending(), so all we need to do here is
         * acknowledge the upcall from Xen.
         */
        vi->evtchn_upcall_pending = 0;
        return 1;
    }

The bottom half is then called periodically (and always before
blocking) by the OCaml code:

    static bool evtchn_demux_pending(void)
    {
        struct shared_info *s = SHARED_INFO();
        struct vcpu_info *vi = VCPU0_INFO();
        bool some_pending = false;

        vi->evtchn_upcall_pending = 0;

        /*
         * Demux events received from Xen.
         *
         * pending_l1 is the "outer" per-VCPU selector
         * (evtchn_pending_sel). pending_l2 is the "inner" system-wide
         * word (evtchn_pending[l1i]).
         */
        xen_ulong_t pending_l1, pending_l2;
        atomic_sync_xchg(&vi->evtchn_pending_sel, 0, &pending_l1);
        while (pending_l1 != 0) {
            xen_ulong_t l1i = ffs(pending_l1);

            /*
             * Masking pending_l2 with ~evtchn_mask[l1i] is necessary
             * to get the *currently unmasked* events; otherwise races
             * like the following can occur:
             *
             *   1. X is generated, upcall scheduled by Xen.
             *   2. X is masked.
             *   3. Upcall is delivered.
             *   4. X fires despite now being masked.
             */
            while ((pending_l2 =
                    (s->evtchn_pending[l1i] & ~s->evtchn_mask[l1i])) != 0) {
                xen_ulong_t l2i = ffs(pending_l2);
                evtchn_port_t port =
                    (l1i * (sizeof(xen_ulong_t) * 8)) + l2i;

                /*
                 * Mark as pending on the OCaml side.
                 */
                evtchn_callback_ml[port] = 1;
                some_pending = true;
                atomic_sync_btc(l2i, &s->evtchn_pending[l1i]);
            }
            pending_l1 &= ~(1UL << l1i);
        }
        return some_pending;
    }

> I would also recommend using HVMOP_set_evtchn_upcall_vector instead
> of HVM_PARAM_CALLBACK_TYPE_VECTOR. In order to make some tools
> happy, just set HVM_PARAM_CALLBACK_TYPE_VECTOR to 1. You can take a
> look at how Xen does it when running as a guest:
>
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/guest/xen/xen.c;h=2ff63d370a8a12fef166677e2ded7ed82a628ce8;hb=HEAD#l205

Thanks for the pointer. As I write above, if I can use
HVMOP_set_evtchn_upcall_vector w/o doing too much "extra work" on the
guest side then I will go with that.

> Roger.

-mato