Re: [PATCH v1 0/7] Implement support for external IPT monitoring
----- On 17 Jun 2020 at 18:27, Tamas K Lengyel tamas.k.lengyel@xxxxxxxxx wrote:

> On Wed, Jun 17, 2020 at 10:19 AM Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx> wrote:
>>
>> On 17/06/2020 04:02, Tamas K Lengyel wrote:
>> > On Tue, Jun 16, 2020 at 2:17 PM Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> > wrote:
>> >> On 16/06/2020 19:47, Michał Leszczyński wrote:
>> >>> ----- On 16 Jun 2020 at 20:17, Andrew Cooper andrew.cooper3@xxxxxxxxxx
>> >>> wrote:
>> >>>
>> >>>> Are there any restrictions on EPT being enabled in the first place? I'm
>> >>>> not aware of any, and in principle we could use this functionality for
>> >>>> PV guests as well (using the CPL filter). Therefore, I think it would
>> >>>> be helpful to not tie the functionality to HVM guests, even if that is
>> >>>> the only option enabled to start with.
>> >>> I think at the moment it's not required to have EPT. This patch series
>> >>> doesn't use any translation feature flags, so the output address is
>> >>> always a machine physical address, regardless of context. I will check
>> >>> if it could be easily used with PV.
>> >> If it's trivial to add PV support then please do. If it's not, then don't
>> >> feel obliged, but please do at least consider how PV support might look
>> >> in the eventual feature.
>> >>
>> >> (Generally speaking, considering "how would I make this work in other
>> >> modes where it is possible" leads to a better design.)
>> >>
>> >>>> The buffer mapping and creation logic is fairly problematic. Instead of
>> >>>> fighting with another opencoded example, take a look at the IOREQ
>> >>>> server's use of "acquire resource", which is a mapping interface that
>> >>>> supports allocating memory on behalf of the guest, outside of the guest
>> >>>> memory, for use by control tools.
>> >>>>
>> >>>> I think what this wants is a bit somewhere in domain_create to indicate
>> >>>> that external tracing is used for this domain (and allocate whatever
>> >>>> structures/buffers are necessary), acquire resource to map the buffers
>> >>>> themselves, and a domctl for any necessary runtime controls.
>> >>>>
>> >>> I will check this out. This sounds like a good option, as it would
>> >>> remove lots of complexity from the existing ipt_enable domctl.
>> >> Xen has traditionally opted for an "and turn this extra thing on
>> >> dynamically" model, but this has caused no end of security issues and
>> >> broken corner cases.
>> >>
>> >> You can see this still existing in the difference between
>> >> XEN_DOMCTL_createdomain and XEN_DOMCTL_max_vcpus (the latter being
>> >> required to choose the number of vcpus for the domain), and we're making
>> >> good progress undoing this particular wart (before 4.13, it was
>> >> concerningly easy to get Xen to fall over a NULL d->vcpu[] pointer by
>> >> issuing other hypercalls between these two).
>> >>
>> >> There are a lot of settings which should be immutable for the lifetime
>> >> of the domain, and external monitoring looks like another one of these.
>> >> Specifying it at createdomain time allows for far better runtime
>> >> behaviour (you are no longer in a situation where the first time you try
>> >> to turn tracing on, you end up with -ENOMEM because another VM booted in
>> >> the meantime and used the remaining memory), and it makes for rather
>> >> simpler code in Xen itself (at runtime, you can rely on it having been
>> >> set up properly, because a failure setting up will have killed the
>> >> domain already).
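To make that split concrete, a minimal sketch of the idea could look roughly like the following. This is plain C with placeholder names (vmtrace_buf, vmtrace_domain_init, vmtrace_set_enabled are invented for illustration, not the interface of this series): the only allocation that can hit -ENOMEM runs while the domain is being created, and the later runtime control is reduced to flipping a flag.

/*
 * Hypothetical sketch only -- placeholder names, not the code from this
 * series.  Size is requested at domain creation and assumed to be a
 * page-aligned power of two.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct vmtrace_buf {
    void *va;          /* output buffer, fixed for the domain's lifetime */
    size_t size;       /* chosen at domain creation, never changes */
    bool enabled;      /* runtime on/off toggle */
};

/* Called from the domain-creation path; size == 0 means "no tracing". */
int vmtrace_domain_init(struct vmtrace_buf *buf, uint32_t size)
{
    if ( !size )
        return 0;

    buf->va = aligned_alloc(4096, size);
    if ( !buf->va )
        return -ENOMEM;    /* fails before the guest ever runs */

    buf->size = size;
    buf->enabled = false;  /* off until explicitly enabled */
    return 0;
}

/* The later "turn tracing on/off" control cannot fail with -ENOMEM. */
int vmtrace_set_enabled(struct vmtrace_buf *buf, bool enable)
{
    if ( !buf->va )
        return -EINVAL;    /* domain was created without tracing */

    buf->enabled = enable;
    return 0;
}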
>> > I'm not in favor of this being a flag that gets set during domain
>> > creation time. It could certainly be the case that some users would
>> > want this to be on from start to end, but in other cases you may want
>> > to enable it intermittently, only for some time in between particular
>> > events. If it's an on/off flag during domain creation you pretty much
>> > force that choice on the users, and while the overhead of PT is better
>> > than, say, MTF, it's certainly not nothing. If there is an OOM
>> > situation when enabling IPT dynamically, the user can always just
>> > pause the VM and wait till memory becomes available.
>>
>> There is nothing wrong with having "turn tracing on/off at runtime"
>> hypercalls. It is specifically what I suggested two posts up in this
>> thread, but it should be limited to the TraceEn bit in RTIT_CTL.
>>
>> What isn't ok is trying to allocate the buffers, write the ToPA, etc. on
>> first-enable or first-map, because the runtime complexity of logic like
>> this is large, and far too easy to get wrong in security-relevant ways.
>>
>> The domain create flag would mean "I wish to use tracing with this
>> domain", and not "I want tracing enabled from the get-go".
>
> Gotcha, that's reasonable.
>

I think I also agree with this, i.e. to allocate buffers on domain creation and just enable/disable the feature at runtime. This would remove some complexity from the runtime path. I think it's usually (always?) known in advance whether we would like to use external monitoring on a domain or not. I will try to adopt this approach in patch v2.

>>
>> >>>> What semantics do you want for the buffer becoming full? Given that
>> >>>> debugging/tracing is the goal, I presume "pause vcpu on full" is the
>> >>>> preferred behaviour, rather than drop packets on full?
>> >>>>
>> >>> Right now this is a ring-style buffer, and when it becomes full it
>> >>> simply wraps and overwrites the old data.
>> >> How does the consumer spot that the data has wrapped? What happens if
>> >> data starts getting logged, but no one is listening? What happens if the
>> >> consumer exits/crashes/etc. and stops listening as a consequence?
>> >>
>> >> It's fine to simply state what will happen, and possibly even "don't do
>> >> that then", but the corner cases do at least need thinking about.
>> > AFAIU the current use-case is predominantly to be used in conjunction
>> > with VMI events, where you want to be able to see the trace leading up
>> > to a particular vmexit. So in the case where the buffer wraps in between
>> > events and data is lost, that's not really of concern.
>>
>> That's all fine. I imagine the output here is voluminous, and needs
>> help being cut down as much as possible.
>>
>> On a tangent, I presume you'd like to include VM-fork eventually, which
>> ought to include copying the trace buffer on fork?
>
> I would eventually like to use it to reconstruct the branch history so
> we can update AFL's coverage map with that instead of having to do the
> current breakpoint-singlestep dance. But for that I would only care
> about the trace starting after the fork, so copying the parent's PT
> buffer is not needed. We'll also probably only use PT if the branch
> history is larger than what LBR can hold. I asked Michal to name the
> hypercall interface "vmtrace" for this reason, so we can add other
> stuff like LBR later using the same interface (which I already
> implemented in https://github.com/tklengyel/xen/commits/lbr).
>
> Tamas
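On the "limited to the TraceEn bit in RTIT_CTL" point above: with the buffers and ToPA fixed when the domain is set up, the runtime enable/disable path boils down to roughly the sketch below. The MSR index and bit position are per the Intel SDM; the rdmsr/wrmsr stubs are placeholders standing in for whatever per-vCPU MSR accessors the hypervisor provides, so this is illustration only, not code from the series.

/*
 * Runtime on/off reduced to IA32_RTIT_CTL.TraceEn.  Everything else in
 * RTIT_CTL (ToPA, CPL filtering, ...) is assumed to have been configured
 * once at domain setup and left alone here.
 */
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_RTIT_CTL  0x570
#define RTIT_CTL_TRACEEN   (1ULL << 0)    /* bit 0: enable trace generation */

static uint64_t msr_shadow;               /* stand-in for real MSR state */
static uint64_t rdmsr(uint32_t msr) { (void)msr; return msr_shadow; }
static void wrmsr(uint32_t msr, uint64_t val) { (void)msr; msr_shadow = val; }

/* Enable or disable trace generation for the traced vCPU. */
static void vmtrace_set_traceen(bool enable)
{
    uint64_t ctl = rdmsr(MSR_IA32_RTIT_CTL);

    if ( enable )
        ctl |= RTIT_CTL_TRACEEN;
    else
        ctl &= ~RTIT_CTL_TRACEEN;

    wrmsr(MSR_IA32_RTIT_CTL, ctl);
}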