Xen project Mailing List

Re: [Xen-devel] [PATCH RFC V9 4/5] xen, libxc: Request page fault injection via libxc

From: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>

Date: Tue, 02 Sep 2014 12:18:06 +0300

Cc: kevin.tian@xxxxxxxxx, eddie.dong@xxxxxxxxx, stefano.stabellini@xxxxxxxxxxxxx, andrew.cooper3@xxxxxxxxxx, tim@xxxxxxx, jun.nakajima@xxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, ian.jackson@xxxxxxxxxxxxx, ian.campbell@xxxxxxxxxx

Comment: DomainKeys? See http://domainkeys.sourceforge.net/

Delivery-date: Tue, 02 Sep 2014 09:18:21 +0000

Domainkey-signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=bitdefender.com; b=Ru6nboVN292xa1wPDH2dSzj/Vj8+rmUxt5ZjNB9M6H7SEx3VQu0XXTVaX939bfV0r9I18RJ0DuIUvIgh28kCOD2CUZpe6XiaFmupnBufVF7cCOYyw9dTfcQ2jBvSVQctPjFM5DgrNmx5oKW9S0yyBDCpHZn7JxeEWxPVTY9S70M8vx8OZdRPPfgWfDwEW4qNkFHPpeCSH8bBHRHbOVGz8fNP2T++pbt2D+mHpg0DVQLOR/5wkh27rZGKKbdj5tGo5PrpbHrMWLQ4rdz1z1gbQW+0KIv313Er3m1K6TLZwUaRtqSNNNIKtkZ4PfojC9UFezbLGBSbako++NsNrfWesw==; h=Received:Received:Received:Received:Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject:References:In-Reply-To:Content-Type:Content-Transfer-Encoding:X-BitDefender-Scanner:X-BitDefender-Spam:X-BitDefender-SpamStamp:X-BitDefender-CF-Stamp;

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 09/01/2014 03:05 PM, Jan Beulich wrote: >>>> On 01.09.14 at 13:54, <rcojocaru@xxxxxxxxxxxxxxx> wrote: >> On 09/01/2014 12:08 PM, Jan Beulich wrote: >>>>>> On 01.09.14 at 09:36, <rcojocaru@xxxxxxxxxxxxxxx> wrote: >>>> On 08/29/2014 12:27 PM, Jan Beulich wrote: >>>>>>>> On 29.08.14 at 09:44, <rcojocaru@xxxxxxxxxxxxxxx> wrote: >>>>>> I do understand the preference for a VCPU-based mechanism from a >>>>>> concurrency point of view, but that would simply potentially fail for >>>>>> us, hence defeating the purpose of the patch. I'm also not sure how that >>>>>> would be useful in the general case either, since the same problem that >>>>>> applies to us would seem to apply to the general case as well. >>>>> >>>>> Yeah, the whole thing probably needs a bit more thinking so that the >>>>> interface doesn't end up being a BitDefender-special. Indeed together >>>>> with the address space qualification, the interface might not be very >>>>> useful when made vCPU-bound. And taking it a little further into the >>>>> "generic" direction, allowing this to only inject #PF doesn't make a >>>>> very nice interface either. Plus we already have HVMOP_inject_trap, >>>>> i.e. your first line of thinking (and eventual explaining as the >>>>> motivation for a patch) should be why that can't be used. >>>> >>>> I'd say that it's memory-introspection specific rather than 3rd-party >>>> vendor specific. Without this this patch, memory-introspection support >>>> in general is impacted / less flexible, since there's no other way to >>>> bring swapped out pages back in. >>>> >>>> For all the reasons you've explained (at least as far as I understand >>>> it) there's not much room to go more generic - so maybe just renaming >>>> the libxc wrapper to something more specific ( >>>> xc_domain_request_usermode_pagefault?) is the solution here? >>> >>> Maybe, but only after you explained why the existing interface can >>> neither be used nor suitably extended. >> >> As far as I understand the HVMOP_inject_trap interface, it is simply (in >> this case) a way to trigger the equivalent of hvm_inject_page_fault() >> from userspace (via libxc). >> >> We need two additional checks: >> >> 1. That CR3 matches, because the way the application works, we need to >> inject a page fault related to the address space of whatever process is >> interesting at the time, and >> >> 2. That we're in usermode (the CPL check), because we know that it's >> safe to inject a page fault when we're in user mode, but it's not always >> safe to do so in kernel mode. >> >> The current interface seems to be a low-level, basic wrapper around >> hvm_inject_trap(). >> >> What we're trying to do is ask for a page fault when we're both A) in >> usermode, and B) when a target process matches - and are only interested >> in page faults, no other trap vector. >> >> Technically, the checks could, probably, be moved into (and new >> parameters added to) hvm_inject_trap() & friends, but it seems unlikely >> that users other than memory introspection applications would be >> interested in them, while they would possibly add to the complexity of >> the interface. The rest of the clients would have to carry dummy >> parameters around to use it. > > I'm not sure: Injecting faults with certain restrictions (like in your > case CPL and CR3) does seem to make quite some sense in > general when the fault isn't intended to be terminal (of a process > or the entire VM). It's just that so far we used fault injection only > to indicate error conditions. > > But as always - let's see if others are of differing opinion before > settling on either route. While waiting for other opinions, I've started looking at the HVMOP_inject_trap code, and it uses a per-VCPU data-saving model similar to the one we've been discussing (and agreed that does not work for the explained use-cases). The first giveaway is the libxc function: /* * Injects a hardware/software CPU trap, to take effect the next time the HVM * resumes. */ int xc_hvm_inject_trap( xc_interface *xch, domid_t dom, int vcpu, uint32_t vector, uint32_t type, uint32_t error_code, uint32_t insn_len, uint64_t cr2); which takes an "int vcpu" parameter. Then, this happens in xen/arch/x86/hvm/hvm.c: case HVMOP_inject_trap: { xen_hvm_inject_trap_t tr; struct domain *d; struct vcpu *v; if ( copy_from_guest(&tr, arg, 1 ) ) return -EFAULT; rc = rcu_lock_remote_domain_by_id(tr.domid, &d); if ( rc != 0 ) return rc; rc = -EINVAL; if ( !is_hvm_domain(d) ) goto param_fail8; rc = xsm_hvm_control(XSM_DM_PRIV, d, op); if ( rc ) goto param_fail8; rc = -ENOENT; if ( tr.vcpuid >= d->max_vcpus || (v = d->vcpu[tr.vcpuid]) == NULL ) goto param_fail8; if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ) rc = -EBUSY; else { v->arch.hvm_vcpu.inject_trap.vector = tr.vector; v->arch.hvm_vcpu.inject_trap.type = tr.type; v->arch.hvm_vcpu.inject_trap.error_code = tr.error_code; v->arch.hvm_vcpu.inject_trap.insn_len = tr.insn_len; v->arch.hvm_vcpu.inject_trap.cr2 = tr.cr2; rc = 0; } param_fail8: rcu_unlock_domain(d); break; } The "v->arch.hvm_vcpu.inject_trap.<member> = initializer;" code again points to a per-VCPU implementation. And finally, hvm_do_resume() is also VCPU-dependent: void hvm_do_resume(struct vcpu *v) { struct domain *d = v->domain; struct hvm_ioreq_server *s; check_wakeup_from_wait(); if ( is_hvm_vcpu(v) ) pt_restore_timer(v); list_for_each_entry ( s, &d->arch.hvm_domain.ioreq_server.list, list_entry ) { struct hvm_ioreq_vcpu *sv; list_for_each_entry ( sv, &s->ioreq_vcpu_list, list_entry ) { if ( sv->vcpu == v ) { if ( !hvm_wait_for_io(sv, get_ioreq(s, v)) ) return; break; } } } /* Inject pending hw/sw trap */ if ( v->arch.hvm_vcpu.inject_trap.vector != -1 ) { hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap); v->arch.hvm_vcpu.inject_trap.vector = -1; } } While we need to set the data per-domain and have whatever VCPU inject the page fault - _but_only_if_ it's in usermode and its CR3 points to something interesting. Hope I've been following the code correctly. Thanks, Razvan Cojocaru _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.