
Re: [Xen-devel] Livepatching and Xen Security

On 18/05/17 17:40, George Dunlap wrote:
> There are four general areas I think there may be bugs.
> ## Unprivileged access to Livepatching hypercalls
> ## Bugs in the patch creation tools which create patches with vulnerabilities
> ## Bugs in the patch-application code such that vulnerabilities exist
> after application
> ## Bugs which allow a guest to prevent the application of a livepatch
> # Testing status and requirements
> I'm told we already test that unprivileged guests cannot access the
> Livepatch hypercalls in osstest; if so, that aspect should be
> covered.

Specifically, http://xenbits.xen.org/docs/xtf/test-livepatch-priv-check.html

(There is a docs bug I have noticed while grabbing that link, which I
have just pushed a fix for.  The live docs will be updated whenever cron
next runs.)

(I'd also like to take this opportunity to highlight an issue which
became apparent while writing that test; unstable hypercall ABIs,
wherever they reside, make this kind of testing prone to false negatives.)

> All that's needed would be for vendors to describe what kinds of
> testing they have done for Livepatching.  I think there are two
> factors which come into play:
> 1. Having tested live-patching thoroughly for at least some version of
> the codebase
> 2. Having tested live-patching for one of the Xen 4.9 RCs.
> Thoughts?

As a statement of what XenServer is doing:

XenServer 7.1 is based on Xen 4.7 and we are providing livepatches
(where applicable) with hypervisor hotfixes.  Thus far, XSAs 204, 207,
and 212-215 have been included in livepatch form, as well as a number of
other general bugfixes which were safe to livepatch.

For both of these hotfixes, we had to bugfix the livepatch creation
tools before they could generate a livepatch.  We also had to modify the
XSA 213 patch to create a livepatch.  The (pre-4.8) Xen code was buggy
and used the .fixup section where it should have used .text.unlikely,
which caused the livepatch tools to fail a cross-check of the exception
frame references when building the patch.

Thus, we are 0 for 2 on the tools being able to DTRT when given a set of
real-world fixes.
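To illustrate the kind of cross-check that tripped here, the sketch
below shows the general shape of such a validation pass.  The data
model, section whitelist, and function names are illustrative
assumptions, not the real tools' implementation:

```python
# Hypothetical sketch: a livepatch creation tool validating that every
# exception-frame fixup target lives in a section the tool knows how to
# carry into a livepatch.  Code landing in .fixup (as pre-4.8 Xen did)
# would fail this check.

EXPECTED_FIXUP_SECTIONS = {".text", ".text.unlikely"}

def check_exception_frames(entries):
    """Return the exception-table entries whose fixup target is in an
    unexpected section, causing patch generation to be refused."""
    return [e for e in entries
            if e["fixup_section"] not in EXPECTED_FIXUP_SECTIONS]

entries = [
    {"insn": "mov",      "fixup_section": ".text.unlikely"},  # fine
    {"insn": "rep movs", "fixup_section": ".fixup"},          # rejected
]

bad = check_exception_frames(entries)
print([e["fixup_section"] for e in bad])  # ['.fixup']
```

In the real tools the check operates on ELF relocations against the
exception table rather than a list of dicts, but the failure mode is
the same: a reference into an unexpected section aborts the build.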

Independent of this, the nature of what qualifies as "a correct patch"
is subjective and very context dependent.  Consider a scenario with two
users, the same version of the livepatch tools, an identical source
patch, and an identical source version of Xen.  There is a very real
possibility that these two users could get one valid and one invalid
patch based solely on something like the compiler settings used to build
the hypervisor they are patching.

From an XSA point of view, we do not want to be issuing advisories
saying "If you are on OS $A, with livepatch tools $B, Hypervisor $C
compiled with these specific build options, then trying to create a
livepatch for patch $D will appear to work properly but leave a timebomb
in your hypervisor".  ISTR an issue which hit during development where
CentOS released a minor update to GCC which caused chaos by altering how
the string literals got sorted.

What if a user creates a livepatch for a change which isn't remotely
safe to livepatch, uploads it, and their hypervisor goes bang?  This
would qualify under the definition of "correct" in so far as the patch
was correctly doing what it was told, and thus, fall within the security
criteria presented here.

There is already a very high user requirement in the first place to
evaluate whether patches are safe to livepatch.  This includes
interaction with other livepatches, interactions with patches in the
vendors patch queue, interaction with customer hardware, and there is no
way this can be decided automatically.

Therefore, I think it would be a mistake for us to include anything
pertaining to "creating a livepatch, correct or otherwise" within a
support statement.  There are many variables which we as upstream can't
control.
As for the fourth point, about what a guest can do to prevent the
application of a livepatch:

The default timeout is insufficient to quiesce Xen if a VM with a few
VCPUs is migrating.  In this scenario, I believe p2m_lock contention is
the underlying reason, but the point stands that there are plenty of
things a guest can do to prevent Xen being able to suitably quiesce.

As a host administrator attempting to apply the livepatch, you get
informed that Xen failed to quiesce and the livepatch application
failed.  Options range from upping the timeout on the next patching
attempt, to possibly even manually pausing the troublesome VM for a second.

I also think it unwise to consider any scenarios like this within the
security statement, otherwise we will have to issue an XSA stating
"Guests doing normal unprivileged things can cause Xen to be
insufficiently quiescent to apply livepatches with the deliberately
conservative defaults".  What remediation would we suggest for this?

On the points of unexpected access to the hypercalls, and Xen doing the
wrong thing when presented with a legitimate correct livepatch, I think
these are in principle fine for inclusion within a support statement.

I would ask, however, how confident we are that there are no ELF parsing
bugs in the code?  I think it would be very prudent to try and build a
userspace harness for it and let AFL have a go.
