We wanted to bring up this small proposal regarding the lack of
parameterization of PV devices on Xen.

Currently users don't have a way to enforce and control what
features/queues/etc. the backend provides. So far there are only global
parameters on backends, and the specs do not mention anything in this regard.

The most obvious example is the netback/blkback max_queues module parameter,
which limits the maximum number of queues for all devices at once and is not
that flexible. Other examples include controlling the offloads visible to the
NIC (e.g. disabling checksum offload, disabling scatter-gather), others are
more about the I/O path (e.g. disabling blkif indirect descriptors, limiting
the number of pages for the ring), or reducing grant usage by minimizing the
number of queues/descriptors.

Of course there could be more examples, as this seems to be orthogonal to the
kinds of PV backends we have. And it seems like all features are published
in the same xenbus state?

The idea to address this would be very simple:

- The toolstack, when initializing device paths, writes additional entries in
the form of 'request-<feature-name>' = <feature-value>. These entries are only
visible to the backend and the toolstack;

- The backend reads these entries and uses <feature-value> as the value of
<feature-name>, which will then be visible to the frontend.

[ Removal of the 'request-*' xenstore entries could serve as a feedback loop
  showing that the backend indeed read and used the value. Or else it could
  simply be ignored. ]

And that's it.
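
To make the two steps concrete, here is a minimal sketch of the backend side.
It is not real backend code: a plain Python dict stands in for xenstore, and
seed_features() and backend_limits are hypothetical names. Clamping the
requested value to the backend's own limit is just one possible validation
policy (an assumption on our side); rejecting out-of-range values would work
too. Consumed 'request-*' entries are removed, as per the optional feedback
idea above.

```python
# Hypothetical sketch: a dict stands in for xenstore; no real Xen API is used.

REQUEST_PREFIX = "request-"


def seed_features(store, backend_path, backend_limits):
    """Turn request-<feature> entries under backend_path into <feature> entries.

    backend_limits maps a feature name to the maximum value the backend
    supports. Returns a dict of the features that were published.
    """
    published = {}
    prefix = backend_path + "/" + REQUEST_PREFIX
    # Snapshot the matching keys first, since the store is modified below.
    for key in sorted(k for k in store if k.startswith(prefix)):
        feature = key[len(prefix):]
        if feature not in backend_limits:
            continue  # unknown feature: leave the request untouched
        try:
            requested = int(store[key])
        except ValueError:
            continue  # malformed value: leave the request untouched
        if requested < 0:
            continue
        # Assumed validation policy: clamp to what the backend supports.
        value = min(requested, backend_limits[feature])
        store[backend_path + "/" + feature] = str(value)
        del store[key]  # removal signals that the request was consumed
        published[feature] = value
    return published


if __name__ == "__main__":
    vif = "/local/domain/0/backend/vif/8/0"
    store = {vif + "/request-multi-queue-max-queues": "2"}
    seed_features(store, vif, {"multi-queue-max-queues": 8})
    print(store)
```

Requests for features the backend does not know about are left in place here,
so the toolstack could notice that they were not honoured.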

In practice the user would do, e.g.:

name = "guest"
kernel = "bzImage"
vif = ["bridge=br0,queues=2"]
disk = [

Toolstack writes:

/local/domain/0/backend/vif/8/0/request-multi-queue-max-queues = 2
/local/domain/0/backend/vbd/8/51713/request-multi-queue-max-queues = 2
/local/domain/0/backend/vbd/8/51713/request-max-ring-page-order = 0

The backend reads these and seeds (assuming they pass backend validation, of course):

/local/domain/0/backend/vif/8/0/multi-queue-max-queues = 2
/local/domain/0/backend/vbd/8/51713/multi-queue-max-queues = 2
/local/domain/0/backend/vbd/8/51713/max-ring-page-order = 0

The XL configuration entries for controlling these tunables are just examples;
the general preference for this is not clear. An alternative could be:

vif = ["bridge=br0,features=queues:2\\;max-ring-page-order:0"]

This lets us have more generic feature control, without sticking to particular
feature names.
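
As a rough illustration of how a toolstack could turn such a generic string
into 'request-*' entries, here is a small sketch. parse_features() and the
alias table are hypothetical, not existing libxl code, and the input is
assumed to be the value of 'features=' after xl has already processed the
escaped ';'. Mapping a short name like 'queues' to 'multi-queue-max-queues'
is also an assumption; without an alias the name is passed through unchanged.

```python
# Hypothetical sketch of toolstack-side parsing; not existing libxl code.

def parse_features(spec, aliases=None):
    """Parse 'name:value;name:value' into a dict of request-* entries.

    aliases optionally maps short config names to xenstore feature names.
    """
    aliases = aliases or {}
    entries = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';'
        name, sep, value = item.partition(":")
        if not sep or not name or not value:
            raise ValueError("malformed feature item: %r" % item)
        feature = aliases.get(name, name)
        entries["request-" + feature] = value
    return entries


if __name__ == "__main__":
    aliases = {"queues": "multi-queue-max-queues"}
    print(parse_features("queues:2;max-ring-page-order:0", aliases))
```

The resulting keys would then be written under the device's backend path by
the toolstack, leaving validation of the values to the backend.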

Naturally libvirt could be a consumer of this (as it already has 'queues'
and host 'tso4', 'tso6', etc. in its XML schema).

Thoughts? Do folks think this is the correct way of handling this?


[0] https://github.com/qemu/qemu/blob/master/hw/net/virtio-net.c#L2102

