[Xen-devel] [PATCH 4/4] docs/sphinx: Technical Debt
This identifies various areas of technical debt, which either need to be, or
are being worked on, along with enough clarifying details for people to
follow.

Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
CC: Lars Kurth <lars.kurth@xxxxxxxxxx>
CC: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
CC: Ian Jackson <ian.jackson@xxxxxxxxxx>
CC: Jan Beulich <JBeulich@xxxxxxxx>
CC: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
CC: Tim Deegan <tim@xxxxxxx>
CC: Wei Liu <wl@xxxxxxx>
CC: Julien Grall <julien@xxxxxxx>
CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
 docs/conf.py            |  11 +++-
 docs/index.rst          |   8 +++
 docs/misc/tech-debt.rst | 130 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 148 insertions(+), 1 deletion(-)
 create mode 100644 docs/misc/tech-debt.rst

diff --git a/docs/conf.py b/docs/conf.py
index 50e41501db..0d2227f52e 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -53,7 +53,7 @@
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = []
+extensions = ['sphinx.ext.extlinks']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
@@ -192,3 +192,12 @@
 
 # A list of files that should not be packed into the epub file.
 epub_exclude_files = ['search.html']
+
+
+# -- Configuration for external links ----------------------------------------
+
+extlinks = {
+    'xen-cs':
+        ('https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=%s',
+         'Xen c/s '),
+}
diff --git a/docs/index.rst b/docs/index.rst
index b8ab13178c..0a2af2db9d 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -59,3 +59,11 @@ Miscellanea
 .. toctree::
 
    glossary
+
+Unsorted
+--------
+
+.. toctree::
+   :maxdepth: 2
+
+   misc/tech-debt
diff --git a/docs/misc/tech-debt.rst b/docs/misc/tech-debt.rst
new file mode 100644
index 0000000000..172ba3bd51
--- /dev/null
+++ b/docs/misc/tech-debt.rst
@@ -0,0 +1,130 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Technical Debt
+==============
+
+Hypervisor
+----------
+
+CONFIG_PDX
+~~~~~~~~~~
+
+Xen uses the term MFN for Machine Frame Number, which is synonymous with
+Linux's PFN, and maps linearly to system/host/machine physical addresses.
+
+For every page of RAM, a ``struct page_info`` is needed for tracking purposes.
+In the simple case, the frametable is an array of ``struct page_info[]``
+indexed by MFN.
+
+However, this is inefficient when a system has banks of RAM spread out in
+address space, as a large amount of space is wasted on frametable entries for
+non-existent frames. This wastes both virtual address space and RAM.
+
+As a consequence, Xen has a compression scheme known as PDX which removes
+unused bits out of the middle of MFNs, to make a more tightly packed Page
+inDeX, which in turn reduces the size of the frametable for the system.
+
+At the moment, PDX compression is unconditionally used.
+
+However, PDX compression does come with a cost in terms of the complexity to
+convert between MFNs and pages, which is a common operation in Xen.
+
+Typically, ARM32 systems do have RAM banks in discrete locations, and want to
+use PDX compression, while typically ARM64 and x86 systems have RAM packed
+from 0 with no holes.
+
+The goal of this work is to have ``CONFIG_PDX`` selected by ARM32 only. This
+requires slightly untangling the memory management code in ARM and x86 to give
+it a clean compile boundary where PDX conversions are used.
+
+
+Waitqueue infrastructure
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Livepatching safety in Xen depends on all CPUs rendezvousing on the return to
+guest path, with no stack frame. The vCPU waitqueue infrastructure undermines
+this safety by copying a stack frame sideways, and ``longjmp()``\-ing away.
+
+Waitqueues are only used by the introspection/mem_event/paging infrastructure,
+where the design of the rings causes some problems. There is a single 4k page
+used for the ring, which serves both synchronous requests, and lossless async
+requests. In practice, introspecting an 11-vcpu guest is sufficient to cause
+the waitqueue infrastructure to start to be used.
+
+A better design of ring would be to have a slot per vcpu for synchronous
+requests (simplifies producing and consuming of requests), and a multipage
+ring buffer (of negotiable size) with lossy semantics for async requests.
+
+A design such as this would guarantee that Xen never has to block waiting for
+userspace to create enough space on the ring for a vcpu to write state out.
+
+.. note::
+
+   There are other aspects of the existing ring infrastructure which are
+   driving a redesign, but these don't relate directly to the waitqueue
+   infrastructure and livepatching safety.
+
+   The most serious problem is that the ring infrastructure is GFN based,
+   which leaves the guest either able to mess with the ring, or a shattered
+   host superpage where the ring used to be, and the guest balloon driver able
+   to prevent the introspection agent from connecting/reconnecting the ring.
+
+As there are multiple compelling reasons to redesign the ring infrastructure,
+the plan is to introduce the new ring ABI, deprecate and remove the old ABI,
+and simply delete the waitqueue infrastructure at that point, rather than try
+to redesign livepatching from scratch in an attempt to cope with unwinding old
+stack frames.
+
+
+Dom0
+----
+
+Remove xenstored's dependencies on unstable interfaces
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Various xenstored implementations use libxc for two purposes. It would be a
+substantial advantage to move xenstored onto entirely stable interfaces, which
+disconnects it from the internals of libxc.
+
+1. Foreign mapping of the store ring
+
+   This is obsolete since :xen-cs:`6a2de353a9` (2012) which allocated grant
+   entries instead, to allow xenstored to function as a stub-domain without dom0
+   permissions. :xen-cs:`38eeb3864d` dropped foreign mapping for cxenstored.
+   However, there are no OCaml bindings for libxengnttab.
+
+   Work Items:
+
+   * Minimal ``tools/ocaml/libs/xg/`` binding for ``tools/libs/gnttab/``.
+   * Replicate :xen-cs:`38eeb3864d` for oxenstored as well.
+
+2. Figuring out which domain(s) have gone away
+
+   Currently, the handling of domains is asymmetric.
+
+   * When a domain is created, the toolstack explicitly sends an
+     ``XS_INTRODUCE(domid, store mfn, store evtchn)`` message to xenstored, to
+     cause xenstored to connect to the guest ring, and fire the
+     ``@introduceDomain`` watch.
+
+   * When a domain is destroyed, Xen fires ``VIRQ_DOM_EXC`` which is bound by
+     xenstored, rather than the toolstack. xenstored updates its idea of the
+     status of domains, and fires the ``@releaseDomain`` watch.
+
+   Xenstored uses ``xc_domain_getinfo()`` to work out which domain(s) have gone
+   away, and only cares about the shutdown status.
+
+   Furthermore, ``@releaseDomain`` (like ``VIRQ_DOM_EXC``) is a single-bit
+   message, which requires all listeners to evaluate whether the message applies
+   to them or not. This results in a flurry of ``xc_domain_getinfo()`` calls
+   from multiple entities in the system, which all serialise on the domctl lock
+   in Xen.
+
+   Work Items:
+
+   * Figure out how shutdown status can be expressed in a stable way from Xen.
+   * Figure out if ``VIRQ_DOM_EXC`` and ``@releaseDomain`` can be extended
+     or superseded to carry at least a domid, to make domain shutdown scale
+     better.
+   * Figure out if ``VIRQ_DOM_EXC`` would better be bound by the toolstack,
+     rather than xenstored.
-- 
2.11.0
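
For readers following the CONFIG_PDX section of the patch, here is a minimal
Python sketch of the idea of removing unused bits from the middle of an MFN.
It is illustrative only: the function names, the assumption of a single
contiguous run of unused bits, and the example memory layout are all
hypothetical, and it does not reflect Xen's actual implementation.

```python
# Illustrative sketch only; names and layout are assumptions, not Xen's code.
# Assumes one contiguous run of MFN bits that is zero in every valid MFN.

def mfn_to_pdx(mfn: int, hole_shift: int, hole_bits: int) -> int:
    """Squeeze `hole_bits` unused bits, starting at bit `hole_shift`, out of an MFN."""
    bottom = mfn & ((1 << hole_shift) - 1)   # bits below the hole, kept as-is
    top = mfn >> (hole_shift + hole_bits)    # bits above the hole
    return (top << hole_shift) | bottom      # close the gap


def pdx_to_mfn(pdx: int, hole_shift: int, hole_bits: int) -> int:
    """Inverse of mfn_to_pdx(): re-insert the zero bits."""
    bottom = pdx & ((1 << hole_shift) - 1)
    top = pdx >> hole_shift
    return (top << (hole_shift + hole_bits)) | bottom


# Hypothetical layout: two banks of 2**20 frames each, at MFN 0 and MFN 2**30.
# Bits 20..29 are therefore zero in every valid MFN.
HOLE_SHIFT, HOLE_BITS = 20, 10

last_mfn = (1 << 30) + (1 << 20) - 1
uncompressed_entries = last_mfn + 1
compressed_entries = mfn_to_pdx(last_mfn, HOLE_SHIFT, HOLE_BITS) + 1
```

With this assumed layout, a frametable indexed by MFN needs 2**30 + 2**20
entries, while one indexed by PDX needs only 2**21. The price is the extra
mask-and-shift work on every conversion, which is the complexity cost the
document refers to.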