
Re: [RFC PATCH] virtio-mmio: add xenbus probing


  • To: Teddy Astie <teddy.astie@xxxxxxxxxx>, "Michael S. Tsirkin" <mst@xxxxxxxxxx>, Jason Wang <jasowang@xxxxxxxxxx>, Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>, Eugenio Pérez <eperezma@xxxxxxxxxx>
  • From: Val Packett <val@xxxxxxxxxxxxxxxxxxxxxx>
  • Date: Thu, 30 Apr 2026 15:50:05 -0300
  • Cc: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, Viresh Kumar <viresh.kumar@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, virtualization@xxxxxxxxxxxxxxx
  • Delivery-date: Thu, 30 Apr 2026 18:50:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 4/30/26 10:47 AM, Teddy Astie wrote:
> On 30/04/2026 at 10:51, Val Packett wrote:
>> On 4/30/26 5:11 AM, Teddy Astie wrote:
>>> On 30/04/2026 at 06:06, Val Packett wrote:
>>>> [..]
>>>> I'd like to get some early feedback for this patch, particularly
>>>> the general stuff:
>>>>
>>>> * is this whole thing acceptable in general?
>>>> * should it be extracted into a different file?
>>>> * (from the Xen side) any input on the xenstore keys, what goes where?
>>>> * anything else to keep in mind?
>>>>
>>>> It does seem simple enough, so hopefully this can be done?
>>>>
>>>> The corresponding userspace-side WIP is available at:
>>>> https://github.com/QubesOS/xen-vhost-frontend
>>>>
>>>> And the required DMOP for firing the evtchn events will be sent
>>>> to xen-devel shortly as well.
>>> Could that be done through evtchn_send (or its userland counterpart)?
>> Actually, yes… The use of DMOPs is only dictated by the current Linux
>> privcmd.c code (the irqfds created by the kernel react to events by
>> executing HYPERVISOR_dm_op with a stored operation), so we can avoid
>> the need to modify Xen by simply extending the privcmd driver to make
>> "evtchn fds". Sounds good, will do.

> The event channel used by device models is exposed through
> ioreq.vp_eport ("evtchn for notifications to/from device model"), so I
> don't think you need to extend the privcmd interface; you should be
> able to do this instead:
>
> open /dev/xen/evtchn
> perform IOCTL_EVTCHN_BIND_INTERDOMAIN (for each guest vCPU)
>     with remote_domain=guest_domid, remote_port=ioreq.vp_eport
>
> Then interact with the event channel through IOCTL_EVTCHN_NOTIFY (with
> the local port returned by IOCTL_EVTCHN_BIND_INTERDOMAIN) and
> read/write on the file descriptor.
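
For reference, that flow would look roughly like this (an untested
sketch going by the Linux uapi in xen/evtchn.h; the domid and port are
example values standing in for guest_domid and ioreq.vp_eport):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <xen/evtchn.h>

int main(void)
{
    int fd = open("/dev/xen/evtchn", O_RDWR);
    if (fd < 0)
        return 1;

    struct ioctl_evtchn_bind_interdomain bind = {
        .remote_domain = 1,  /* guest_domid (example value) */
        .remote_port   = 42, /* ioreq.vp_eport (example value) */
    };
    /* the ioctl's return value is the freshly bound local port */
    int local_port = ioctl(fd, IOCTL_EVTCHN_BIND_INTERDOMAIN, &bind);
    if (local_port < 0)
        return 1;

    /* notify the guest side... */
    struct ioctl_evtchn_notify notify = { .port = local_port };
    ioctl(fd, IOCTL_EVTCHN_NOTIFY, &notify);

    /* ...and wait for events: reads return pending local ports */
    unsigned int port;
    if (read(fd, &port, sizeof(port)) == (ssize_t)sizeof(port)) {
        printf("event on local port %u\n", port);
        /* writing the port back unmasks it for further events */
        write(fd, &port, sizeof(port));
    }

    close(fd);
    return 0;
}
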
So the reason there's currently an ioctl to bind an eventfd to fire a
stored DMOP is that the whole idea is to (efficiently!) support generic,
hypervisor-neutral device server implementations via the vhost-user
protocol.

Now of course, the current implementation isn't *entirely* hypervisor-
neutral as e.g. the vm-memory Rust crate (inside of the "neutral" vhost-
user device servers) does need to be built with the `xen` feature. But
still, that's how it works. What can be made generic is generic.

xen-vhost-frontend, which is the thing that integrates these with Xen,
actually used to handle the interrupts in userspace[1] by firing the
DMOP itself (which is where I could have "just replaced that with
IOCTL_EVTCHN_NOTIFY"), but that was offloaded to the kernel with the
introduction of IOCTL_PRIVCMD_IRQFD[2], similarly to KVM_IRQFD.
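
For context, binding through IOCTL_PRIVCMD_IRQFD looks roughly like the
sketch below (a hypothetical helper, going from memory of the uapi in
include/uapi/xen/privcmd.h; field names and types may be slightly off):

#include <stdint.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <xen/privcmd.h>

/* Bind an eventfd so the kernel replays a stored DMOP whenever it is
 * signalled. dmop_buf holds a pre-encoded struct xen_dm_op (e.g.
 * XEN_DMOP_set_irq_level on Arm). Returns the eventfd, which can then
 * be handed to a vhost-user backend as its interrupt ("call") fd. */
int bind_irqfd(int privcmd_fd, unsigned short dom, void *dmop_buf,
               uint32_t size)
{
    int efd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
    if (efd < 0)
        return -1;

    struct privcmd_irqfd irqfd;
    memset(&irqfd, 0, sizeof(irqfd));
    irqfd.dm_op = (uint64_t)(uintptr_t)dmop_buf; /* op to replay */
    irqfd.size  = size;                          /* size of that op */
    irqfd.fd    = efd;
    irqfd.dom   = dom;

    if (ioctl(privcmd_fd, IOCTL_PRIVCMD_IRQFD, &irqfd) < 0)
        return -1;

    return efd;
}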

> I think what would be preferable for your use case would be a way to
> bind an event channel to an eventfd object, which should be a
> primitive that lives in the evtchn device.

Yeah, it would be an ioctl on the evtchn device, definitely. I wasn't being exact when I said "extend privcmd", sorry. I just meant "handling it on the Linux side" generally!

> The current interface kinda assumes that you're looking to emulate a
> purely virtio device with no Xen specifics, and it looks like that's
> not exactly what you're implementing.
It's already implemented, and I'm not looking to change it much, just to make it work on x86_64. The only thing that wasn't already compatible was firing the host-to-guest interrupt, because on x86_64 we don't have anything like the (v)GIC with its massive arbitrary IRQ number space. Event channels are the only way to interrupt a PVH guest, hence using xenbus in the guest to provision the device.
> As you actually plan to switch to using event channels for notifying
> the guest, I think it would be preferable to do the same in the other
> direction (event channels to notify the host), so you only have event
> channels to worry about here.

The other direction is already implemented perfectly well in IOCTL_PRIVCMD_IOEVENTFD. The MMIO area is set up like so:

- the ioreq page is mapped with
  IOCTL_PRIVCMD_MMAP_RESOURCE(XENMEM_resource_ioreq_server, ..);
- the vp_eport event channels (one per vCPU) are bound to the current
  domain via IOCTL_EVTCHN_BIND_INTERDOMAIN;
- those are passed, along with the ioreq page itself, to
  IOCTL_PRIVCMD_IOEVENTFD to get an eventfd that fires when a virtqueue
  is ready;
- and that eventfd is what xen-vhost-frontend passes to the vhost-user
  device server.
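
Stitched together, the setup is roughly the following (a hypothetical
helper, from memory of the privcmd uapi; the resource index and struct
fields may be slightly off, and error paths leak for brevity):

#include <stdint.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <xen/privcmd.h>

#define XENMEM_resource_ioreq_server 0      /* from Xen's memory.h */
#define VIRTIO_MMIO_QUEUE_NOTIFY     0x050  /* virtio-mmio register */

/* ports[] must already hold the local ports returned by
 * IOCTL_EVTCHN_BIND_INTERDOMAIN, one per guest vCPU. Returns the
 * eventfd to hand to the vhost-user device server. */
int setup_vq_kick(int privcmd_fd, unsigned short dom, uint32_t ioreq_id,
                  uint32_t *ports, uint32_t nr_vcpus,
                  uint64_t mmio_base, uint32_t vq)
{
    /* 1. map the ioreq server page into our address space */
    void *ioreq = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_SHARED, privcmd_fd, 0);
    if (ioreq == MAP_FAILED)
        return -1;

    struct privcmd_mmap_resource res;
    memset(&res, 0, sizeof(res));
    res.dom  = dom;
    res.type = XENMEM_resource_ioreq_server;
    res.id   = ioreq_id;
    res.idx  = 1; /* the synchronous ioreq frame (0 is bufioreq) */
    res.num  = 1;
    res.addr = (uint64_t)(uintptr_t)ioreq;
    if (ioctl(privcmd_fd, IOCTL_PRIVCMD_MMAP_RESOURCE, &res) < 0)
        return -1;

    /* 2. the eventfd that the device server will poll for kicks */
    int efd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
    if (efd < 0)
        return -1;

    /* 3. tie them together: the kernel signals efd on QUEUE_NOTIFY
     * writes for this virtqueue */
    struct privcmd_ioeventfd io;
    memset(&io, 0, sizeof(io));
    io.ioreq    = (uint64_t)(uintptr_t)ioreq;
    io.ports    = (uint64_t)(uintptr_t)ports;
    io.addr     = mmio_base + VIRTIO_MMIO_QUEUE_NOTIFY;
    io.addr_len = 4;
    io.event_fd = efd;
    io.vcpus    = nr_vcpus;
    io.vq       = vq;
    if (ioctl(privcmd_fd, IOCTL_PRIVCMD_IOEVENTFD, &io) < 0)
        return -1;

    return efd;
}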

So for this direction, it's not a 1:1 mapping but rather a specific contraption designed to efficiently handle this use case:

- when an ioreq event channel (for any of the vCPUs) fires,
- the kernel handler (ioeventfd_interrupt) checks whether it's
  specifically an IOREQ_WRITE to the VIRTIO_MMIO_QUEUE_NOTIFY offset,
- and if so, it signals the eventfd for any virtqueue that has new data
  (waking the generic device server which holds the eventfd, thus
  bypassing xen-vhost-frontend), pings the guest back via evtchn, and
  returns IRQ_HANDLED;
- otherwise the request is handled in userspace by xen-vhost-frontend
  (virtio configuration register access).
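
The shape of that check, with standalone illustrative types (the real
thing lives in drivers/xen/privcmd.c and uses the kernel's own ioreq
structs):

#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the shared ioreq slot; the real layout
 * comes from the Xen headers. */
struct ioreq {
    uint64_t addr;  /* guest-physical address of the access */
    uint64_t data;  /* value written (the virtqueue index here) */
    uint32_t size;
    uint8_t  dir;   /* IOREQ_READ / IOREQ_WRITE */
};

#define IOREQ_WRITE              1
#define VIRTIO_MMIO_QUEUE_NOTIFY 0x050

/* Does this ioreq match a registered ioeventfd? If yes, the kernel can
 * signal the eventfd and complete the request without ever waking
 * xen-vhost-frontend. */
static bool ioeventfd_fast_path(const struct ioreq *req,
                                uint64_t mmio_base, uint64_t vq)
{
    return req->dir  == IOREQ_WRITE &&
           req->addr == mmio_base + VIRTIO_MMIO_QUEUE_NOTIFY &&
           req->size == 4 &&
           req->data == vq;
}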

It just works :)

~val