Xen project Mailing List

Re: [PATCH] xen: Fix event channel callback via INTX/GSI

On 12/19/20 3:05 AM, David Woodhouse wrote: > On Fri, 2020-12-18 at 17:20 -0500, boris.ostrovsky@xxxxxxxxxx wrote: >> Are there other cases where this would be useful? If it's just to >> test a hypervisor under development I would think that people who are >> doing that kind of work are capable of building their own kernel. My >> concern is mostly about having yet another boot option that is of >> interest to very few people who can easily work around not having it. > For hypervisor testing we can just set the Xen major version number in > CPUID to 3, and that stops xs_reset_watches() from doing anything. > > cf. https://lkml.org/lkml/2017/4/10/266 > > Karim ripped out all this INTX code in 2017 because it was broken, and > subsequently put it back because it *was* working for older versions of > Xen, due to that "coincidence". The conclusion back then was that if it > was put back it should at least *work* consistently, and he was going > to send a patch "shortly". This is a that patch :) Right, I am not arguing about usefulness of the fix, only of the new boot option. > >>> Add a 'no_vector_callback' command line argument to test it. >> This last one should be a separate patch I think. > Could do. > >>> + /* >>> + * It doesn't strictly *have* to run on CPU0 but it sure >>> + * as hell better process the event channel ports delivered >>> + * to CPU0. >>> + */ >>> + irq_set_affinity(pdev->irq, cpumask_of(0)); >>> + >> >> Is the concern here that it won't be handled at all? > Indeed, the events don't get handled at all if the PCI interrupt lands > on a CPU other than zero. When the handler calls > xen_hvm_evtchn_do_upcall() that processes pending events for whichever > CPU it happens to be running on, and *not* the events pending for CPU0. > And the boot stops in xs_reset_watches() waiting (without a timeout) > for an interrupt that never gets processed, as before. Yes, I see. Then please do it in a separate patch. > >> And is this related to the issue this patch is addressing? > It is required to fix the event channel callback via INTX/GSI, yes. > Although it could reasonably be lifted out into a separate patch too. > >>> static int __init xenbus_probe_initcall(void) >>> { >>> - if (!xen_domain()) >>> - return -ENODEV; >>> - >>> - if (xen_initial_domain() || xen_hvm_domain()) >>> - return 0; >>> + /* >>> + * Probe XenBus here in the XS_PV case, and also XS_HVM unless we >>> + * need to wait for the platform PCI device to come up, which is >>> + * the (XEN_PVPVM && !xen_have_vector_callback) case. >>> + */ >>> + if (xen_store_domain_type == XS_PV || >>> + (xen_store_domain_type == XS_HVM && >>> + (!IS_ENABLED(CONFIG_XEN_PVHVM) || xen_have_vector_callback))) >>> + xenbus_probe(); >>> >>> - xenbus_probe(NULL); >>> return 0; >>> } >>> - >>> device_initcall(xenbus_probe_initcall); >>> >>> +int xen_set_callback_via(uint64_t via) >>> +{ >>> + struct xen_hvm_param a; >>> + int ret; >>> + >>> + a.domid = DOMID_SELF; >>> + a.index = HVM_PARAM_CALLBACK_IRQ; >>> + a.value = via; >>> + >>> + ret = HYPERVISOR_hvm_op(HVMOP_set_param, &a); >>> + if (ret) >>> + return ret; >>> + >>> + /* >>> + * If xenbus_probe_initcall() deferred the xenbus_probe() >>> + * due to the callback not functioning yet, we can do it now. >>> + */ >>> + if (!xenstored_ready && xen_store_domain_type == XS_HVM && >>> + IS_ENABLED(CONFIG_XEN_PVHVM) && !xen_have_vector_callback) >> >> I'd create an is_callback_ready() (or something with a better name) >> helper. > I pondered that, and indeed dropping the XVM/vector conditions and > doing it literally based on whether xen_set_callback_via() had been > called at all (and not too early). But it looks like there are cases > where Arm doesn't call xen_set_callback_via() at all, and it made more > sense to me to live xen_set_callback_via() to sit right here and have > those two conditions within a page of each other in juxtaposition, with > suitable comments. I think that's probably easier to understand and > work with than a "helper". OK. > >>> + xenbus_probe(); >>> + >>> + return ret; >>> +} >>> +EXPORT_SYMBOL_GPL(xen_set_callback_via); >>> + >>> /* Set up event channel for xenstored which is run as a local process >>> * (this is normally used only in dom0) >>> */ >>> @@ -817,11 +851,17 @@ static int __init xenbus_init(void) >>> break; >>> } >>> >>> - /* Initialize the interface to xenstore. */ >>> - err = xs_init(); >>> - if (err) { >>> - pr_warn("Error initializing xenstore comms: %i\n", err); >>> - goto out_error; >>> + /* >>> + * HVM domains may not have a functional callback yet. In that >>> + * case let xs_init() be called from xenbus_probe(), which will >>> + * get invoked at an appropriate time. >>> + */ >>> + if (xen_store_domain_type != XS_HVM) { >> >> Can we delay xs_init() for !XS_HVM as well? In other words wait until >> after PCI platform device has been probed (on HVM) and then call >> xs_init() for everyone. > We're half-way there already, because xenbus_probe() *does* happen > later as a device_initcall, and I've just made it call xs_init(). > > We could make it avoid calling xs_init() from xenbus_init() in the > XS_HVM *and* XS_PV cases fairly easily, and let xenbus_probe() do it. Yes, that's along the lines of what I was thinking. > > But right now xenbus_probe() doesn't run for the other cases, so > there'd have to be a mode where it *only* calls xs_init() and doesn't > do the notifier chain. That seems like more churn that was needed so I > didn't do it. You think so? Yes, there would be a couple more places where you'd need to call xenbus_probe() but then you won't have to explain (with comments) why you call xs_init() here and not there and vice versa. It just looks to me a bit more complicated the way you do this but I suppose it's a matter of personal preference. -boris

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.