Xen project Mailing List

Re: [Xen-devel] Critique of the Xen Security Process

To: qubes-devel <qubes-devel@xxxxxxxxxxxxxxxx>

Date: Sat, 7 Nov 2015 08:51:14 -0800 (PST)

Cc: joanna@xxxxxxxxxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxx

Delivery-date: Sun, 08 Nov 2015 08:23:25 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

This is scary, but unfortunately, it's commonplace.Â There is a sad mentality of not including basic checks because it will generate extra instructions that will take more microseconds to execute.Â That may have been a concern when we were running 1MHz 8-bit processors, but now that code takes such an insignificant amount of time to execute that it doesn't matter.Â Coding should no longer be about making it as fast and efficient as possible -- it needs to be about making it stable and secure.

b. The XSA-123 [6] was another critical security bug in Xen, this time resulting
from one of the hypervisor developer's fetish to use an absolutely confusing
construct in order to save a few modest bytes in a structure which might have
been allocated by the system maybe a few tens of times at best. Even more
worrying was the way how Xen Security Team decided to fix the bug: again by
modifying some condition in the code further up the execution path, with the
hopes that this time they would ensure this puzzling construct would always be
used properly. We wrote more about this in our QSB #18 [7].

c. Finally, the way how Xen fixed the recent XSA-148 looks also very reactive,
IMHO. With a bug of this calibre, I would expect Xen to carefully review and
augment all its PV memory virtualization code with additional checks (ASSERTs),
ensuring certain invariants are always satisfied. Such as e.g. that none of the
pages containing PDEs or PTEs are becoming writeable by the VM.

I can't help but have a feeling that some of the Xen developers seem to be
overconfident in their belief they can fully understand all the possible
execution paths in their code. Well, the XSAs quoted above are an indisputable
prove that this is not quite always the case. Realizing that, each developer by
themselves, might be a great step towards a more secure hypervisor...

2. Another security-related aspect of the Xen project is how it totally ignores
problems related to the build process security. Those who don't believe me
should grep the sources for wget, which is now disguised as "FETCHER" shell
variable... (so grep for "FETCHER" string)

I feel embarrassed that I need to explain, at the end of 2015, why the build
process of any serious software project should not blindly download unsigned
components (sources) from the Internet, especially if it is about to execute
Makefiles from these components a moment later... Come on, guys!

(Of course we have been forced to get around this gapping security whole in
Qubes OS [8] ourselves, sadly with a method that is not well suited for
upstreaming).

Maybe it's time to split off the code into a new branch specifically for Qubes, and pick and choose what gets pulled over from their code?
Â

3. Another thing is, of course: stop adding features to the core hypervisor. We
really need Xen to finally mature, stabilize, and for its development process to
be slowing down over time (just the bug fixes). We need a long-term-supported
hypervisor, which doesn't change with subsonic speed. This would allow this core
code to be widely audited by many experts. If some users want features, these
should perhaps be maintained as additional modules (no, I don't mean dynamically
loaded modules, just compile-time included), preferably in separate repos.

Perhaps also to move all the non-hypervisor code, such as all the toolstacks,
stubdom, etc, into separate repos also. For hygiene, if for nothing else.

Admittedly, some of the features are a result of hardware evolution, such as
e.g. UEFI support. But many are not. Again, maintaining these as optional code
(in separate repos) would be a great step into getting the hypervisor maturing,
finally.

I have already written about it years ago [9], as a matter of fact.

4. Finally, I've been really surprised by the line of reasoning Ian expressed in
the above-mentioned blog post. TL;DR: "we're still doing pretty great, compared
to other projects, because: 1) we have smaller number of publicly disclosed
bugs, and 2) we actually publicly disclose these bugs which we are aware of".

The attitude presented in the blog post is so wrong, that I'm not even sure
where to start commenting on this...

The bigger question is how much testing is really being done to find vulnerabilities in the first place.Â Xen certainly isn't Windows with a base of hundreds of millions and thousands trying to rip it apart and break it.Â It sounds like we'll be in for more surprises to come.
Â

With a single bug like the XSA-148 which, let me repeat that one more time: had
been present in the hypervisor for the last 7 years, so with a bug like this it
really doesn't matter how many (i.e. what number) of critical bugs does the
competition have. Because only one bug of this calibre is enough for the
attacker to never really bother to find another one. The mere fact that
competing hypervisors might got 12 bugs during the same period, really doesn't
make Xen look any better, sorry.

Also, there is really nothing to be proud that you disclose the bugs. It would
be a problem if you didn't.

Hope the above comments might help improve the Xen security. Perhaps some would
perceive them as arrogant or rude. Too bad. Remember the actual attackers will
not be arrogant or rude -- they will just come and exploit bugs, silently.
Admittedly this might not hurt some of the developers ego, not in the short time
at least.

Can we, the Qubes OS project, or myself personally, help with implementing the
above suggestions? Sadly, no. While some of us do contribute occasional patches
to Xen (specifically Marek Marczykowski-GÃrecki), we really work for a different
project and have different tasks and responsibilities.

Regards,
joanna.

[1] http://xenbits.xen.org/xsa/advisory-148.html
[2] https://github.com/QubesOS/qubes-secpack/blob/master/QSBs/qsb-022-2015.txt
[3] https://blog.xenproject.org/2015/10/30/security-vs-features/
[4] https://twitter.com/xen_org/status/660151720463482880
[5] http://xenbits.xen.org/xsa/advisory-109.html
[6] http://xenbits.xen.org/xsa/advisory-123.html
[7] https://github.com/QubesOS/qubes-secpack/blob/master/QSBs/qsb-018-2015.txt
[8] https://github.com/QubesOS/qubes-vmm-xen/commit/dcd6c0a4f2c6226a9b706e62469d420579c86975
[9] http://lists.xen.org/archives/html/xen-devel/2013-09/msg01815.html
-----BEGIN PGP SIGNATURE-----

iQIcBAEBAgAGBQJWPOHUAAoJEDOT2L8N3GcYzpEP/An/PTnKDOaC4tPKw3+Y8VIL
n/xIfRPRnPRy18Tbx6EnrKzgohtVvtigtmd/FIxjYVuZ3Luw2B4RFSqUENg758Aa
ANLs4kUD+yaUO82Jfg1nq/6PXBNZlKFovQuYV20LEW9JV6DvMCbzYJ2evZ6t0XS/
EAhUOP1OqY4vb0kah4dwhQKepqwPcD5Tm5LLZn/qbO30e2zN9MkKB851vguQtVIz
k5I8pv+MSQp1efRG2eg470onGtU36IIYFsY1OLihJA9MYh+74FpIA1xoURenJg6+
NJhXEDnxxlz78BJaGOiSwHwB59yd2DXDJKAaUNV/H1LqQu3o1ED+8IZWUARkc0Wl
ckTfQz/++exDhyRcWVHF5GnxEHWdyu/gNZOCNjl4o4HiYS4SQrhTRn7rWwalbyBB
/jG3bAnU8m/Gtp95FtuWXCwuKeeOeBSfnxKMrksxu3JFSNevsYPZu5lrdtUEGLZA
97SwLj70GLesvMqEV3k7XmrQyt8LwyBzLCm+cCocaPEmOQAymeuslrs/RehjGSCQ
L34Ipjvg85GoND64N8X56NuvD+/LrteRhp8hS+aEWv2YpNVJB9tmcOTzLzf7faPK
KPyqa2lW7XKwm7n0WCbtPnrVHRbFPbKvkJRDnfPDwEAEiZcj0SjJ8h7fIDSQ4qac
vgYBTJr/2cRrtC4y1An5
=zQmw
-----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Recently Xen has released the XSA-148 advisory [1] addressing a fatal bug in the
hypervisor. The bug has been lurking there for the last 7 years! We, the Qubes
OS Project, have commented on this in our Security Bulletin #22 [2]. And far
from enthusiastic commentary that was (FWIW, it was me who wrote this QSB, as
evidenced in the commits log, in case some from the Xen community would like to
direct their rage towards a particular human being ;) Ian Jackson then wrote a
response on the Xen blog [3]. I was then asked to share some more thoughts about
how I thought Xen could actually improve its security process [4]. So, I share
some these below:

1. First of all, I wish Xen was somehow more defensively coded. To provide some
examples:

a. In XSA-109 [5] there was a problem with the hypervisor dereferencing a NULL
pointer. The problem was fixed by the Xen Security Team by applying a patch
which (hopefully) made sure the execution path that lead to this NULL pointer
dereferencing code was never taken. Back then I suggested (on the Xen
pre-disclosure list) to make this patch more explicit though:

> On Wed, Jan 21, 2015 at 02:31:51PM +0100, Joanna Rutkowska wrote: Â Â Â Â Â Â Â Â
> (...)
>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â
>> Wouldn't it be prudent to also check if: Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â
>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â
>> (v->arch.paging.mode>{write_guest_entry,cmpxchg_guest_entry} != NULL) Â Â Â Â Â
>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â
>> ... in the two affected functions, just before derefing these function Â Â Â Â
>> pointers? Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â
>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â
>> Going even a step further: how about replacing all Â Â Â Â Â Â Â Â Â Â Â Â Â Â
>> function-pointer-based calls with macros that first validates the Â Â Â Â Â Â Â
>> pointer before derefing it? At least when the system doesn't have SMEP? Â Â Â Â
>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â

...to which I got a reply from one of the Xen Security Team engineers that the
above might perhaps be justified in debug builds only, followed by a standard:
"feel free to contribute a patch".

b. The XSA-123 [6] was another critical security bug in Xen, this time resulting
from one of the hypervisor developer's fetish to use an absolutely confusing
construct in order to save a few modest bytes in a structure which might have
been allocated by the system maybe a few tens of times at best. Even more
worrying was the way how Xen Security Team decided to fix the bug: again by
modifying some condition in the code further up the execution path, with the
hopes that this time they would ensure this puzzling construct would always be
used properly. We wrote more about this in our QSB #18 [7].

c. Finally, the way how Xen fixed the recent XSA-148 looks also very reactive,
IMHO. With a bug of this calibre, I would expect Xen to carefully review and
augment all its PV memory virtualization code with additional checks (ASSERTs),
ensuring certain invariants are always satisfied. Such as e.g. that none of the
pages containing PDEs or PTEs are becoming writeable by the VM.

I can't help but have a feeling that some of the Xen developers seem to be
overconfident in their belief they can fully understand all the possible
execution paths in their code. Well, the XSAs quoted above are an indisputable
prove that this is not quite always the case. Realizing that, each developer by
themselves, might be a great step towards a more secure hypervisor...

2. Another security-related aspect of the Xen project is how it totally ignores
problems related to the build process security. Those who don't believe me
should grep the sources for wget, which is now disguised as "FETCHER" shell
variable... (so grep for "FETCHER" string)

I feel embarrassed that I need to explain, at the end of 2015, why the build
process of any serious software project should not blindly download unsigned
components (sources) from the Internet, especially if it is about to execute
Makefiles from these components a moment later... Come on, guys!

(Of course we have been forced to get around this gapping security whole in
Qubes OS [8] ourselves, sadly with a method that is not well suited for
upstreaming).

3. Another thing is, of course: stop adding features to the core hypervisor. We
really need Xen to finally mature, stabilize, and for its development process to
be slowing down over time (just the bug fixes). We need a long-term-supported
hypervisor, which doesn't change with subsonic speed. This would allow this core
code to be widely audited by many experts. If some users want features, these
should perhaps be maintained as additional modules (no, I don't mean dynamically
loaded modules, just compile-time included), preferably in separate repos.

Perhaps also to move all the non-hypervisor code, such as all the toolstacks,
stubdom, etc, into separate repos also. For hygiene, if for nothing else.

Admittedly, some of the features are a result of hardware evolution, such as
e.g. UEFI support. But many are not. Again, maintaining these as optional code
(in separate repos) would be a great step into getting the hypervisor maturing,
finally.

I have already written about it years ago [9], as a matter of fact.

4. Finally, I've been really surprised by the line of reasoning Ian expressed in
the above-mentioned blog post. TL;DR: "we're still doing pretty great, compared
to other projects, because: 1) we have smaller number of publicly disclosed
bugs, and 2) we actually publicly disclose these bugs which we are aware of".

The attitude presented in the blog post is so wrong, that I'm not even sure
where to start commenting on this...

With a single bug like the XSA-148 which, let me repeat that one more time: had
been present in the hypervisor for the last 7 years, so with a bug like this it
really doesn't matter how many (i.e. what number) of critical bugs does the
competition have. Because only one bug of this calibre is enough for the
attacker to never really bother to find another one. The mere fact that
competing hypervisors might got 12 bugs during the same period, really doesn't
make Xen look any better, sorry.

Also, there is really nothing to be proud that you disclose the bugs. It would
be a problem if you didn't.

Hope the above comments might help improve the Xen security. Perhaps some would
perceive them as arrogant or rude. Too bad. Remember the actual attackers will
not be arrogant or rude -- they will just come and exploit bugs, silently.
Admittedly this might not hurt some of the developers ego, not in the short time
at least.

Can we, the Qubes OS project, or myself personally, help with implementing the
above suggestions? Sadly, no. While some of us do contribute occasional patches
to Xen (specifically Marek Marczykowski-GÃrecki), we really work for a different
project and have different tasks and responsibilities.

Regards,
joanna.

[1] http://xenbits.xen.org/xsa/advisory-148.html
[2] https://github.com/QubesOS/qubes-secpack/blob/master/QSBs/qsb-022-2015.txt
[3] https://blog.xenproject.org/2015/10/30/security-vs-features/
[4] https://twitter.com/xen_org/status/660151720463482880
[5] http://xenbits.xen.org/xsa/advisory-109.html
[6] http://xenbits.xen.org/xsa/advisory-123.html
[7] https://github.com/QubesOS/qubes-secpack/blob/master/QSBs/qsb-018-2015.txt
[8] https://github.com/QubesOS/qubes-vmm-xen/commit/dcd6c0a4f2c6226a9b706e62469d420579c86975
[9] http://lists.xen.org/archives/html/xen-devel/2013-09/msg01815.html
-----BEGIN PGP SIGNATURE-----

iQIcBAEBAgAGBQJWPOHUAAoJEDOT2L8N3GcYzpEP/An/PTnKDOaC4tPKw3+Y8VIL
n/xIfRPRnPRy18Tbx6EnrKzgohtVvtigtmd/FIxjYVuZ3Luw2B4RFSqUENg758Aa
ANLs4kUD+yaUO82Jfg1nq/6PXBNZlKFovQuYV20LEW9JV6DvMCbzYJ2evZ6t0XS/
EAhUOP1OqY4vb0kah4dwhQKepqwPcD5Tm5LLZn/qbO30e2zN9MkKB851vguQtVIz
k5I8pv+MSQp1efRG2eg470onGtU36IIYFsY1OLihJA9MYh+74FpIA1xoURenJg6+
NJhXEDnxxlz78BJaGOiSwHwB59yd2DXDJKAaUNV/H1LqQu3o1ED+8IZWUARkc0Wl
ckTfQz/++exDhyRcWVHF5GnxEHWdyu/gNZOCNjl4o4HiYS4SQrhTRn7rWwalbyBB
/jG3bAnU8m/Gtp95FtuWXCwuKeeOeBSfnxKMrksxu3JFSNevsYPZu5lrdtUEGLZA
97SwLj70GLesvMqEV3k7XmrQyt8LwyBzLCm+cCocaPEmOQAymeuslrs/RehjGSCQ
L34Ipjvg85GoND64N8X56NuvD+/LrteRhp8hS+aEWv2YpNVJB9tmcOTzLzf7faPK
KPyqa2lW7XKwm7n0WCbtPnrVHRbFPbKvkJRDnfPDwEAEiZcj0SjJ8h7fIDSQ4qac
vgYBTJr/2cRrtC4y1An5
=zQmw
-----END PGP SIGNATURE-----

_______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel