Xen project Mailing List

This email will be a technical description of what I, Felix Schmoll, as a Google Summer of Code student, did over summer.

=== Introduction / What is the project? ===

Fuzzing is a recent trend in systematic testing of interfaces: more or less random inputs are tried and then iterated on. A subset of fuzzers uses code coverage as feedback when mutating and selecting inputs. The goal of this project was to test the hypercall interface of Xen in that way.

While this is a very broad problem overall, and a full-fledged test suite similar to OSS-Test is a desirable long-term goal, that was not realistic within the scope of this project. Instead, a generic mechanism to obtain code-coverage feedback was implemented, and its output was processed so that a particular fuzzer (AFL) could actually be run. In this way, the project helped to develop a better understanding of the problem space and lays the foundation for possible future endeavours in that direction.

=== Implementation ===

== Overview ==

It was clear from the beginning that American Fuzzy Lop (referred to as AFL below) was to be run against the hypervisor. As AFL is a coverage-based fuzzer for user-space programs, it had to be adapted in some way to target the hypervisor. The first step was to let it obtain coverage feedback from Xen by implementing a hypercall. In addition, a mechanism was needed to actually execute the hypercalls from a domain other than dom0 (there are many ways to stop the hypervisor from dom0, and this was not what was supposed to be tested). This is done by what will be referred to as the executor.

== Implementation of the hypercall ==

The hypercall for obtaining code-coverage feedback was implemented using the -fsanitize-coverage=trace-pc feature of gcc-6. This feature inserts a call to a customisable function at every basic block of the binary. That function was implemented to write the current program counter to a domain-specific location, which allows tracing of individual domains.

Because the instrumentation slows down the hypervisor in normal operation, it was added as a compile-time option, and the hypercall returns an error code if tracing is disabled.
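To make the mechanism more concrete, here is a minimal user-space sketch of such a callback. The only name taken from the toolchain is __sanitizer_cov_trace_pc, which is the function gcc calls in every basic block when -fsanitize-coverage=trace-pc is used. Everything else (struct trace_buf, active_trace, trace_start, trace_stop, trace_record) is hypothetical and chosen for illustration; it is not the code of the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-domain trace buffer; the layout is an assumption. */
struct trace_buf {
    uintptr_t *pcs;  /* storage for recorded program counters */
    size_t     cap;  /* capacity of pcs[] in entries */
    size_t     len;  /* number of entries recorded so far */
};

/* Stand-in for "the buffer of the domain currently being traced". */
static struct trace_buf *active_trace;

/* Begin tracing into buf, discarding anything recorded earlier. */
void trace_start(struct trace_buf *buf)
{
    assert(buf != NULL && buf->pcs != NULL);
    buf->len = 0;
    active_trace = buf;
}

/* Stop tracing; later callbacks become no-ops. */
void trace_stop(void)
{
    active_trace = NULL;
}

/* Append one program counter, silently dropping it if the buffer is full. */
void trace_record(uintptr_t pc)
{
    struct trace_buf *t = active_trace;

    if (t == NULL || t->len >= t->cap)
        return;
    t->pcs[t->len++] = pc;
}

/*
 * Called by instrumented code at every basic block. The return address
 * identifies the block that made the call. This file itself must be
 * compiled without the coverage flag, otherwise the callback would
 * call itself recursively.
 */
void __sanitizer_cov_trace_pc(void)
{
    trace_record((uintptr_t)__builtin_return_address(0));
}
```

Dropping entries once the buffer is full is just one possible policy; it keeps the callback cheap and free of allocation.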

== Executor ==

As mentioned, there are many ways to stop the hypervisor from the control domain (dom0), so the hypercalls were supposed to be executed from a different domain. XTF was chosen for this purpose because it is very lean and does not make any hypercalls in the background (which would make the tracing non-deterministic). While one could have used an isolated domain for every test case, this would have required recompiling the test case for every run, which would have been extremely slow and added little value overall. Instead, a server was written that runs in an endless loop, takes binary data (test cases) from AFL, parses it into hypercalls and executes them. The server also does some sanity checks, ignoring hypercalls that would kill the domain but not the hypervisor. It also informs the fuzzer once a hypercall has finished and it is ready to receive a new test case.
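The parsing and sanity-check steps of the server can be sketched roughly as follows. The wire format here is an assumption made purely for illustration (one little-endian 64-bit hypercall number followed by up to five 64-bit arguments, short inputs zero-padded), as are the names hc_request, hc_parse and hc_is_blocked and the caller-supplied denylist. The format actually used by the xtf-server may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HC_NUM_ARGS 5  /* assumption: number of argument slots per call */

/* Hypothetical decoded test case. */
struct hc_request {
    uint64_t op;                 /* hypercall number */
    uint64_t args[HC_NUM_ARGS];  /* raw argument values */
};

/* Read the idx-th little-endian 64-bit word, treating missing bytes as 0. */
static uint64_t word_at(const uint8_t *buf, size_t len, size_t idx)
{
    uint64_t v = 0;
    size_t off = idx * 8;

    for (size_t i = 0; i < 8 && off + i < len; i++)
        v |= (uint64_t)buf[off + i] << (8 * i);
    return v;
}

/* Turn the raw bytes of one test case into a hypercall request. */
void hc_parse(const uint8_t *buf, size_t len, struct hc_request *req)
{
    assert(buf != NULL && req != NULL);
    req->op = word_at(buf, len, 0);
    for (size_t i = 0; i < HC_NUM_ARGS; i++)
        req->args[i] = word_at(buf, len, i + 1);
}

/* Return 1 if op is on the denylist of calls the server refuses to issue. */
int hc_is_blocked(uint64_t op, const uint64_t *deny, size_t ndeny)
{
    for (size_t i = 0; i < ndeny; i++)
        if (deny[i] == op)
            return 1;
    return 0;
}
```

In a server loop, a request for which hc_is_blocked() returns 1 would be skipped, and the server would still report to the fuzzer that it is ready for the next test case.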

== Fuzzer (American Fuzzy Lop) ==

The fuzzer is executed in dom0, as this allows easier communication with XTF via xenconsole. For this purpose, the domain of the xtf-server needs to be passed to AFL on startup, and AFL needs to run with super-user rights. Changes were made to AFL to pass the test cases to the xtf-server via xenconsole instead of to a user-space program via the command line, as it normally does. This is quite a fundamental change to the functioning of AFL, but the overall philosophy behind the design was to keep the changes to as few functions as possible.
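AFL expects coverage as a 64 kB map of hit counts indexed by edges, whereas the hypercall yields a list of program counters. One way to bridge the two is sketched below. The edge update (map[cur ^ prev]++ followed by prev = cur >> 1) follows the scheme described in AFL's technical documentation; the function name fold_trace and the step that mixes a raw address into a 16-bit index are assumptions for illustration and not necessarily what the attached AFL patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE (1u << 16)  /* default coverage map size in AFL 2.x */

/*
 * Fold a trace of program counters into an AFL-style edge map.
 * map must point to MAP_SIZE bytes; it is cleared first so that every
 * test case is reported as if it had run in isolation.
 */
void fold_trace(const uintptr_t *pcs, size_t n, uint8_t *map)
{
    uint32_t prev = 0;

    assert(map != NULL);
    memset(map, 0, MAP_SIZE);

    for (size_t i = 0; i < n; i++) {
        /* Assumed mixing step: spread address bits over the index. */
        uint32_t cur = (uint32_t)((pcs[i] >> 4) ^ (pcs[i] << 8));

        cur &= MAP_SIZE - 1;
        map[cur ^ prev]++;  /* count the edge prev -> cur */
        prev = cur >> 1;    /* shift so that A->B and B->A differ */
    }
}
```

Clearing the map at the start of each call is what lets consecutive hypercalls on one long-running hypervisor be presented to AFL as separate runs.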

=== Discussion of the fork-server ===

In testing it is desirable to have the exact same conditions for every iteration, which is why AFL in user-space mode starts a new process for every test case, speeding it up by using fork(). The equivalent in this scenario would be to fork a complete machine running a hypervisor. This would however require VM-forking and nested virtualization, both of which are currently not supported by Xen. In this scenario, a machine (the host) would be running Xen and AFL in a domain on top of it. Another domain would run a nested version of Xen and would execute a series of hypercalls as a test. This VM could be forked before the tests in order to always have an identical environment of Xen.

Instead, the setup developed in this project just runs a single hypervisor and executes all hypercalls consecutively as one single large test-case, while still passing the information to AFL as if these were completely isolated. This is weird to some extent, but the best currently supported compromise.

=== Deliverables ===

Two minor patches were merged: one for Xen [1] and one for XTF [2].

The actual hypercall, meanwhile, has not been merged, even though it conforms to the requirements as laid out initially. The build system should be updated to allow a better specification of what to trace before the hypercall becomes useful. It is instead, together with the changes to AFL and XTF, attached to this email.

The main patches are based on the following versions:

* Xen, commit 6c9abf0e8022807bb7d677570d0775659950ff1a

* AFL 2.43b

* XTF, commit 8956f82ce1321b89deda6895d58e5788d2198477

=== How to fuzz the hypercall interface ===

* Run the XTF-server [sudo xtf_dir/xtf-runner xtf-server] and detach from the console [Ctrl-C] once the initialisation log has ended (three lines of "Executing ...")

* Create (empty) directories for findings and testcases

* Disable AFL-forkserver [export AFL_NO_FORKSRV=1]

* Start AFL [sudo ./afl-fuzz -i ~/testcase_dir -o ~/findings_dir -r [DOMID of XTF-server] /some/unused/path]

You can find the domain-id of the xtf-server using [sudo xl list].

The test case is also configured such that it only tests as pv64 (this was the only setup my hardware supported). It is possible that adjustments have to be made to run other modes.

=== Future work ===

No bugs have been found so far, and it is quite possible that none will be found without more sophisticated fuzzing (e.g. passing valid buffers). Possible areas of improvement are the following:

* Minor usability improvements, like starting the XTF-server from within AFL

* Increase coverage

* Solve remaining problems with determinism

* Make XTF server more sophisticated, encode more information about hypercalls (e.g. pass valid buffers into hypercalls)

* Improve speed

* Improve stability (there still seem to be some files that shouldn't be compiled with tracing, although the stability is 100% for most hypercalls)

* Implement the more complicated approach with complete hypervisor cloning

The patches for the hypercall, XTF and AFL will be sent in reply to this document.

=== References ===

Links to other documents:

* [1] xenconsole: Add option to xenconsole to always forward console input (https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=32e5bd5dcf6f45c2fc39d8d62b52b53d3e79ada7)

* [2] Implement pv_read_some (https://xenbits.xen.org/gitweb/?p=xtf.git;a=commit;h=34f052d41415cbd424f37c1a63a47464ce8f63e9)

* [3] The GSoC page of this project (https://summerofcode.withgoogle.com/projects/#5585891117498368)

* [4] Summary of the design session at the Xen summit (https://lists.xen.org/archives/html/xen-devel/2017-07/msg02138.html)

* [5] Design proposal for the hypercall (https://lists.xen.org/archives/html/xen-devel/2017-05/msg02210.html)

* [6] Design proposal for fuzzing the hypervisor (https://lists.xen.org/archives/html/xen-devel/2017-06/msg02924.html)

[Xen-devel] [GSoC] Fuzzing the Hypervisor