Re: [Xen-devel] [PATCH v2 REPOST 12/12] x86/hvm/ioreq: add a new mappable resource type...
> -----Original Message-----
> From: Roger Pau Monne
> Sent: 25 August 2017 10:32
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Stefano Stabellini
> <sstabellini@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; George Dunlap
> <George.Dunlap@xxxxxxxxxx>; Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>;
> Ian Jackson <Ian.Jackson@xxxxxxxxxx>; Tim (Xen.org) <tim@xxxxxxx>;
> Jan Beulich <jbeulich@xxxxxxxx>
> Subject: Re: [Xen-devel] [PATCH v2 REPOST 12/12] x86/hvm/ioreq: add a
> new mappable resource type...
>
> On Tue, Aug 22, 2017 at 03:51:06PM +0100, Paul Durrant wrote:
> > ... XENMEM_resource_ioreq_server
> >
> > This patch adds support for a new resource type that can be mapped
> > using the XENMEM_acquire_resource memory op.
> >
> > If an emulator makes use of this resource type then, instead of
> > mapping gfns, the IOREQ server will allocate pages from the heap.
> > These pages will never be present in the P2M of the guest at any
> > point and so are not vulnerable to any direct attack by the guest.
> > They are only ever accessible by Xen and any domain that has mapping
> > privilege over the guest (which may or may not be limited to the
> > domain running the emulator).
> >
> > NOTE: Use of the new resource type is not compatible with use of
> >       XEN_DMOP_get_ioreq_server_info unless the XEN_DMOP_no_gfns
> >       flag is set.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> > ---
> > Cc: Jan Beulich <jbeulich@xxxxxxxx>
> > Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> > Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > Cc: Tim Deegan <tim@xxxxxxx>
> > Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> > ---
> >  xen/arch/x86/hvm/ioreq.c        | 136 ++++++++++++++++++++++++++++++++++++++++
> >  xen/arch/x86/mm.c               |  27 ++++++++
> >  xen/include/asm-x86/hvm/ioreq.h |   2 +
> >  xen/include/public/hvm/dm_op.h  |   4 ++
> >  xen/include/public/memory.h     |   3 +
> >  5 files changed, 172 insertions(+)
> >
> > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > index 795c198f95..9e6838dab6 100644
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -231,6 +231,15 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
> >      struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
> >      int rc;
> >
> > +    if ( iorp->page )
> > +    {
> > +        /* Make sure the page has not been allocated */
> > +        if ( gfn_eq(iorp->gfn, INVALID_GFN) )
> > +            return -EPERM;
> > +
> > +        return 0;
>
> EEXIST? (See comment below, which I think also applies here).
>
> > +    }
> > +
> >      if ( d->is_dying )
> >          return -EINVAL;
> >
> > @@ -253,6 +262,60 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
> >      return rc;
> >  }
> >
> > +static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
> > +{
> > +    struct domain *currd = current->domain;
> > +    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
> > +
> > +    if ( iorp->page )
> > +    {
> > +        /* Make sure the page has not been mapped */
> > +        if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
> > +            return -EPERM;
> > +
> > +        return 0;
>
> Shouldn't this return EEXIST? Page has already been allocated by a
> previous call AFAICT, and it seems like a possible error/misbehavior
> to try to do it twice.

The checks are there to prevent a caller from trying to mix the legacy
and new methods of mapping ioreq server pages, so EPERM (i.e.
'operation not permitted') seems like the correct error. I agree that
it's not obvious at this inner level, but I do think this is right.
I'm open to debate about this though.
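To make the intent of those checks concrete, here is a minimal,
self-contained model of the mutual exclusion (illustrative C only;
the types and names below are simplified stand-ins, not the actual
Xen code):

#include <errno.h>
#include <stddef.h>

#define INVALID_GFN (~0UL)

struct ioreq_page {
    unsigned long gfn; /* valid gfn <=> legacy (P2M) method in use */
    void *page;        /* non-NULL <=> some backing page exists */
};

/* Legacy method: the page lives in the guest's P2M at iorp->gfn. */
static int map_ioreq_gfn(struct ioreq_page *iorp)
{
    if ( iorp->page )
    {
        if ( iorp->gfn == INVALID_GFN )
            return -EPERM; /* page was heap-allocated: new method in use */

        return 0;          /* already mapped by this method: success */
    }

    /* ... acquire a gfn, setting iorp->gfn and iorp->page ... */
    return 0;
}

/* New method: the page comes from the heap, never entering the P2M. */
static int alloc_ioreq_mfn(struct ioreq_page *iorp)
{
    if ( iorp->page )
    {
        if ( iorp->gfn != INVALID_GFN )
            return -EPERM; /* page is in the P2M: legacy method in use */

        return 0;          /* already allocated by this method: success */
    }

    /* ... allocate a heap page, leaving iorp->gfn as INVALID_GFN ... */
    return 0;
}

In both entry points a pre-existing page is only an error when it was
provided by the other method; a repeat call via the same method
succeeds idempotently.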
> > +    }
> > +
> > +    /*
> > +     * Allocated IOREQ server pages are assigned to the emulating
> > +     * domain, not the target domain. This is because the emulator is
> > +     * likely to be destroyed after the target domain has been torn
> > +     * down, and we must use MEMF_no_refcount otherwise page
> > +     * allocation could fail if the emulating domain has already
> > +     * reached its maximum allocation.
> > +     */
> > +    iorp->page = alloc_domheap_page(currd, MEMF_no_refcount);
>
> I don't really like the fact that the page is not accounted for any
> domain, but I can see the point in doing it like that (which you
> argue in the comment).
>
> IIRC there were talks about tightening the accounting of memory
> pages, so that ideally everything would be accounted for in the
> memory assigned to the domain.
>
> Just some random thought, but could the toolstack set aside some
> memory pages (ie: not map them into the domain p2m), that could then
> be used by this? (not asking you to do this here)
>
> And how many pages are we expecting to use for each domain? I assume
> the number will be quite low.

Yes, I agree the use of MEMF_no_refcount is not ideal, and you do
highlight an issue: I don't think there is currently an upper limit on
the number of ioreq servers, so an emulating domain could exhaust
memory using the new scheme. I'll need to introduce a limit to avoid
that.
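Such a limit might look something like the sketch below (hypothetical:
MAX_NR_IOREQ_SERVERS and the surrounding structure are invented names
for illustration, not part of this patch):

#include <errno.h>

/* Invented names for illustration; not part of this patch. */
#define MAX_NR_IOREQ_SERVERS 8

struct ioreq_server_list {
    unsigned int nr_servers; /* servers currently registered */
};

static int create_ioreq_server(struct ioreq_server_list *list)
{
    /* Refuse creation rather than let an emulating domain exhaust
     * the heap via unaccounted page allocations. */
    if ( list->nr_servers >= MAX_NR_IOREQ_SERVERS )
        return -ENOSPC;

    list->nr_servers++;
    /* ... the existing ioreq server creation logic would follow ... */
    return 0;
}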
> > +    if ( !iorp->page )
> > +        return -ENOMEM;
> > +
> > +    get_page(iorp->page, currd);
>
> Do you really need this get_page? (page is already assigned to currd).

Hmm. I thought MEMF_no_refcount modified that behaviour, but I may
have got confused with MEMF_no_owner.

> > +
> > +    iorp->va = __map_domain_page_global(iorp->page);
> > +    if ( !iorp->va )
> > +    {
> > +        put_page(iorp->page);
> > +        iorp->page = NULL;
> > +        return -ENOMEM;
> > +    }
> > +
> > +    clear_page(iorp->va);
> > +    return 0;
> > +}
> > +
> > +static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
> > +{
> > +    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
> > +
> > +    if ( !iorp->page )
> > +        return;
> > +
> > +    unmap_domain_page_global(iorp->va);
> > +    iorp->va = NULL;
> > +
> > +    put_page(iorp->page);
> > +    iorp->page = NULL;
> > +}
> > +
> >  bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
> >  {
> >      const struct hvm_ioreq_server *s;
> > @@ -457,6 +520,27 @@ static void hvm_ioreq_server_unmap_pages(struct hvm_ioreq_server *s)
> >          hvm_unmap_ioreq_gfn(s, false);
> >  }
> >
> > +static int hvm_ioreq_server_alloc_pages(struct hvm_ioreq_server *s)
> > +{
> > +    int rc = -ENOMEM;
> > +
> > +    rc = hvm_alloc_ioreq_mfn(s, false);
> > +
> > +    if ( !rc && (s->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF) )
> > +        rc = hvm_alloc_ioreq_mfn(s, true);
> > +
> > +    if ( rc )
> > +        hvm_free_ioreq_mfn(s, false);
> > +
> > +    return rc;
> > +}
> > +
> > +static void hvm_ioreq_server_free_pages(struct hvm_ioreq_server *s)
> > +{
> > +    hvm_free_ioreq_mfn(s, true);
> > +    hvm_free_ioreq_mfn(s, false);
> > +}
> > +
> >  static void hvm_ioreq_server_free_rangesets(struct hvm_ioreq_server *s)
> >  {
> >      unsigned int i;
> > @@ -583,7 +667,18 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
> >
> >   fail_add:
> >      hvm_ioreq_server_remove_all_vcpus(s);
> > +
> > +    /*
> > +     * NOTE: It is safe to call both hvm_ioreq_server_unmap_pages() and
> > +     *       hvm_ioreq_server_free_pages() in that order.
> > +     *       This is because the former will do nothing if the pages
> > +     *       are not mapped, leaving the page to be freed by the latter.
> > +     *       However if the pages are mapped then the former will set
> > +     *       the page_info pointer to NULL, meaning the latter will do
> > +     *       nothing.
> > +     */
> >      hvm_ioreq_server_unmap_pages(s);
> > +    hvm_ioreq_server_free_pages(s);
> >
> >      return rc;
> >  }
> > @@ -593,6 +688,7 @@ static void hvm_ioreq_server_deinit(struct hvm_ioreq_server *s)
> >      ASSERT(!s->enabled);
> >      hvm_ioreq_server_remove_all_vcpus(s);
> >      hvm_ioreq_server_unmap_pages(s);
> > +    hvm_ioreq_server_free_pages(s);
> >      hvm_ioreq_server_free_rangesets(s);
> >  }
> >
> > @@ -745,6 +841,9 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
> >              rc = hvm_ioreq_server_map_pages(s);
> >              if ( rc )
> >                  break;
> > +
> > +            gdprintk(XENLOG_INFO, "d%d ioreq server %u using gfns\n",
> > +                     d->domain_id, s->id);
> >          }
> >
> >          if ( ioreq_gfn )
> > @@ -767,6 +866,43 @@ int hvm_get_ioreq_server_info(struct domain *d, ioservid_t id,
> >      return rc;
> >  }
> >
> > +mfn_t hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
> > +                                 unsigned int idx)
> > +{
> > +    struct hvm_ioreq_server *s;
> > +    mfn_t mfn = INVALID_MFN;
> > +
> > +    spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> > +
> > +    list_for_each_entry ( s,
> > +                          &d->arch.hvm_domain.ioreq_server.list,
> > +                          list_entry )
> > +    {
> > +        int rc;
> > +
> > +        if ( s == d->arch.hvm_domain.default_ioreq_server )
> > +            continue;
>
> s->is_default
>
> > +
> > +        if ( s->id != id )
> > +            continue;
> > +
> > +        rc = hvm_ioreq_server_alloc_pages(s);
> > +        if ( rc )
> > +            break;
> > +
> > +        if ( idx == 0 )
> > +            mfn = _mfn(page_to_mfn(s->bufioreq.page));
> > +        else if ( idx == 1 )
> > +            mfn = _mfn(page_to_mfn(s->ioreq.page));
> > +
> > +        break;
> > +    }
> > +
> > +    spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> > +
> > +    return mfn;
> > +}
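As an aside for emulator authors: the function above fixes the meaning
of each frame index, something like the following (illustrative naming
only; these enumerators are not part of the patch):

/*
 * Illustrative naming for the fixed index convention in
 * hvm_get_ioreq_server_frame() above; not part of the patch.
 */
enum ioreq_server_frame_index {
    IOREQ_SERVER_FRAME_BUFIOREQ = 0, /* idx 0: buffered ioreq page */
    IOREQ_SERVER_FRAME_IOREQ    = 1, /* idx 1: synchronous ioreq page */
};

Any other idx leaves the result as INVALID_MFN, which callers need to
check for.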
> > +
> >  int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> >                                       uint32_t type, uint64_t start,
> >                                       uint64_t end)
> > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> > index 4e86f0a2ab..3e845af0e4 100644
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -122,6 +122,7 @@
> >  #include <asm/fixmap.h>
> >  #include <asm/io_apic.h>
> >  #include <asm/pci.h>
> > +#include <asm/hvm/ioreq.h>
> >
> >  /* Mapping of the fixmap space needed early. */
> >  l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
> > @@ -4744,6 +4745,27 @@ static int xenmem_acquire_grant_table(struct domain *d,
> >      return 0;
> >  }
> >
> > +static int xenmem_acquire_ioreq_server(struct domain *d,
> > +                                       unsigned int id,
> > +                                       unsigned long frame,
> > +                                       unsigned long nr_frames,
> > +                                       unsigned long mfn_list[])
>
> mfn_t maybe?

No. The list is mapped in, so use of typesafe variants is not
appropriate.

> > +{
> > +    unsigned int i;
>
> Either nr_frames wants to be unsigned int, or this needs to be
> unsigned long. Not sure which.

I'll check.

> > +
> > +    for ( i = 0; i < nr_frames; i++ )
> > +    {
> > +        mfn_t mfn = hvm_get_ioreq_server_frame(d, id, frame + i);
>
> Here you use unsigned long as the last parameter to
> hvm_get_ioreq_server_frame, while the function takes an unsigned int.

Ok.
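For what it's worth, one way the types could be made consistent is to
use unsigned int throughout, since only a handful of frames are ever
exposed per ioreq server. A sketch only, not the committed fix, and
the error handling shown is invented (the rest of the loop body is not
quoted above):

/* Sketch: consistent unsigned int frame indexing. Error handling
 * here is invented; only the first line of the loop is quoted above. */
static int xenmem_acquire_ioreq_server(struct domain *d,
                                       unsigned int id,
                                       unsigned int frame,
                                       unsigned int nr_frames,
                                       unsigned long mfn_list[])
{
    unsigned int i;

    for ( i = 0; i < nr_frames; i++ )
    {
        mfn_t mfn = hvm_get_ioreq_server_frame(d, id, frame + i);

        if ( mfn_eq(mfn, INVALID_MFN) )
            return -EINVAL;

        mfn_list[i] = mfn_x(mfn);
    }

    return 0;
}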
  Thanks,

    Paul

>
> Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel