Re: [Xen-devel] [PATCH v12 5/6] x86/ioreq server: Asynchronously reset outstanding p2m_ioreq_server entries.
On 07/04/17 15:05, Yu Zhang wrote:
>
>
> On 4/7/2017 9:56 PM, George Dunlap wrote:
>> On 07/04/17 12:31, Jan Beulich wrote:
>>>>>> On 07.04.17 at 12:55, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>> On 4/7/2017 6:26 PM, Jan Beulich wrote:
>>>>>>>> On 07.04.17 at 11:53, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>>> On 4/7/2017 5:40 PM, Jan Beulich wrote:
>>>>>>>>>> On 06.04.17 at 17:53, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>>>>> @@ -965,7 +987,7 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>>>>>>>>      if ( is_epte_valid(ept_entry) )
>>>>>>>>      {
>>>>>>>>          if ( (recalc || ept_entry->recalc) &&
>>>>>>>> -             p2m_is_changeable(ept_entry->sa_p2mt) )
>>>>>>>> +             p2m_check_changeable(ept_entry->sa_p2mt) )
>>>>>>> I think the distinction between these two is rather arbitrary, and I
>>>>>>> also think this is part of the problem above: Distinguishing
>>>>>>> log-dirty from ram-rw requires auxiliary data to be consulted. The
>>>>>>> same ought to apply to ioreq-server, and then there wouldn't be a
>>>>>>> need to have two p2m_*_changeable() flavors.
>>>>>> Well, I think we have also discussed this quite long ago; here is
>>>>>> the link:
>>>>>> https://lists.xen.org/archives/html/xen-devel/2016-09/msg01017.html
>>>>> That was a different discussion, namely not considering this ...
>>>>>
>>>>>>> Of course the subsequent use of p2m_is_logdirty_range() may then
>>>>>>> need amending.
>>>>>>>
>>>>>>> In the end it looks like you have the inverse problem here compared
>>>>>>> to above: You should return ram-rw when the reset was already
>>>>>>> initiated. At least that's how I would see the logic to match up
>>>>>>> with the log-dirty handling (where the _effective_ rather than the
>>>>>>> last stored type is being returned).
>>>>> ... at all.
>>>> Sorry, I still do not get it.
>>> Leaving ioreq-server out of the picture, what the function returns
>>> to the caller is the type as it would be if a re-calculation had
>>> already been done on the entry, even if that hasn't happened yet
>>> (the function here shouldn't change any entries, after all). I think
>>> we want the same behavior here for what have been ioreq-server
>>> entries (and which would become ram-rw after the next re-calc).
>> How about something like the attached? (In-lined for convenience.)
>>
>>  -George
>>
>> After an ioreq server has unmapped, the remaining p2m_ioreq_server
>> entries need to be reset back to p2m_ram_rw. This patch does this
>> asynchronously with the current p2m_change_entry_type_global()
>> interface.
>>
>> A new field, entry_count, is introduced in struct p2m_domain, to
>> record the number of p2m_ioreq_server p2m page table entries. One
>> property of these entries is that they only point to 4K-sized page
>> frames, because all p2m_ioreq_server entries originate from
>> p2m_ram_rw ones in p2m_change_type_one(); we therefore do not need to
>> worry about counting 2M/1G-sized pages.
>>
>> This patch disallows mapping of an ioreq server while any
>> p2m_ioreq_server entries remain, in case another mapping occurs right
>> after the current one is unmapped and releases its lock, with the p2m
>> table not yet synced.
>>
>> This patch also disallows live migration while p2m_ioreq_server
>> entries remain in the p2m table. The core reason is that our current
>> implementation of p2m_change_entry_type_global() lacks the
>> information to resync p2m_ioreq_server entries correctly if
>> global_logdirty is on.
>>
>> We still need to handle other recalculations, however; which means
>> that when doing a recalculation, if the current type is
>> p2m_ioreq_server, we check whether p2m->ioreq.server is valid. If it
>> is, we leave the entry as type p2m_ioreq_server; if not, we reset it
>> to p2m_ram as appropriate.
>>
>> To avoid code duplication, lift recalc_type() out of p2m-pt.c and use
>> it for all type recalculations (both in p2m-pt.c and p2m-ept.c).
>>
>> Signed-off-by: Yu Zhang <yu.c.zhang@xxxxxxxxxxxxxxx>
>> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxx>
>> Reviewed-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
>> ---
>>  xen/arch/x86/hvm/ioreq.c  |  8 ++++++
>>  xen/arch/x86/mm/hap/hap.c |  9 ++++++
>>  xen/arch/x86/mm/p2m-ept.c | 73 +++++++++++++++++++++++++++++------------------
>>  xen/arch/x86/mm/p2m-pt.c  | 58 ++++++++++++++++++++++++-------------
>>  xen/arch/x86/mm/p2m.c     |  9 ++++++
>>  xen/include/asm-x86/p2m.h | 26 ++++++++++++++++-
>>  6 files changed, 135 insertions(+), 48 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
>> index 5bf3b6d..07a6c26 100644
>> --- a/xen/arch/x86/hvm/ioreq.c
>> +++ b/xen/arch/x86/hvm/ioreq.c
>> @@ -955,6 +955,14 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>>
>>      spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>>
>> +    if ( rc == 0 && flags == 0 )
>> +    {
>> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +
>> +        if ( read_atomic(&p2m->ioreq.entry_count) )
>> +            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
>> +    }
>> +
>>      return rc;
>>  }
>>
>> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
>> index a57b385..4b591fe 100644
>> --- a/xen/arch/x86/mm/hap/hap.c
>> +++ b/xen/arch/x86/mm/hap/hap.c
>> @@ -187,6 +187,15 @@ out:
>>   */
>>  static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
>>  {
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +
>> +    /*
>> +     * Refuse to turn on global log-dirty mode if
>> +     * there are outstanding p2m_ioreq_server pages.
>> +     */
>> +    if ( log_global && read_atomic(&p2m->ioreq.entry_count) )
>> +        return -EBUSY;
>> +
>>      /* turn on PG_log_dirty bit in paging mode */
>>      paging_lock(d);
>>      d->arch.paging.mode |= PG_log_dirty;
>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
>> index cc1eb21..9f41067 100644
>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -533,6 +533,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>>      {
>>          for ( gfn -= i, i = 0; i < EPT_PAGETABLE_ENTRIES; ++i )
>>          {
>> +            p2m_type_t nt;
>>              e = atomic_read_ept_entry(&epte[i]);
>>              if ( e.emt == MTRR_NUM_TYPES )
>>                  e.emt = 0;
>> @@ -542,10 +543,15 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>>                                                 _mfn(e.mfn), 0, &ipat,
>>                                                 e.sa_p2mt == p2m_mmio_direct);
>>              e.ipat = ipat;
>> -            if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
>> -            {
>> -                e.sa_p2mt = p2m_is_logdirty_range(p2m, gfn + i, gfn + i)
>> -                            ? p2m_ram_logdirty : p2m_ram_rw;
>> +            nt = p2m_recalc_type(e.recalc, e.sa_p2mt, p2m, gfn + i);
>> +            if ( nt != e.sa_p2mt )
>> +            {
>> +                if ( e.sa_p2mt == p2m_ioreq_server )
>> +                {
>> +                    ASSERT(p2m->ioreq.entry_count > 0);
>> +                    p2m->ioreq.entry_count--;
>> +                }
>> +
>> +                e.sa_p2mt = nt;
>>                  ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
>>              }
>>              e.recalc = 0;
>> @@ -562,23 +568,24 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>>
>>              if ( recalc && p2m_is_changeable(e.sa_p2mt) )
>>              {
>> -                 unsigned long mask = ~0UL << (level * EPT_TABLE_ORDER);
>> -
>> -                 switch ( p2m_is_logdirty_range(p2m, gfn & mask,
>> -                                                gfn | ~mask) )
>> -                 {
>> -                 case 0:
>> -                     e.sa_p2mt = p2m_ram_rw;
>> -                     e.recalc = 0;
>> -                     break;
>> -                 case 1:
>> -                     e.sa_p2mt = p2m_ram_logdirty;
>> -                     e.recalc = 0;
>> -                     break;
>> -                 default: /* Force split. */
>> -                     emt = -1;
>> -                     break;
>> -                 }
>> +                unsigned long mask = ~0UL << (level * EPT_TABLE_ORDER);
>> +
>> +                ASSERT(e.sa_p2mt != p2m_ioreq_server);
>> +                switch ( p2m_is_logdirty_range(p2m, gfn & mask,
>> +                                               gfn | ~mask) )
>> +                {
>> +                case 0:
>> +                    e.sa_p2mt = p2m_ram_rw;
>> +                    e.recalc = 0;
>> +                    break;
>> +                case 1:
>> +                    e.sa_p2mt = p2m_ram_logdirty;
>> +                    e.recalc = 0;
>> +                    break;
>> +                default: /* Force split. */
>> +                    emt = -1;
>> +                    break;
>> +                }
>
>
> So here I guess only the "ASSERT(e.sa_p2mt != p2m_ioreq_server);" is
> new, right?

Yes, that's right -- the whole block was indented wrong; when I added
the ASSERT() my editor aligned it correctly, and I couldn't very well
leave it that way. :-)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
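[Editor's note: the "effective type" rule Jan describes -- and which the
patch implements via p2m_recalc_type() -- can be sketched in isolation.
The code below is NOT Xen source; the struct, the enum values, and the
p2m_is_logdirty_range() stand-in are simplified assumptions kept only to
mirror the patch's naming. It shows the two cases: a stale
p2m_ioreq_server entry (server already unmapped) reads back as
p2m_ram_rw, while ram-rw/log-dirty follow the log-dirty setting.]

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the Xen types used by the patch. */
typedef enum {
    p2m_ram_rw,
    p2m_ram_logdirty,
    p2m_ioreq_server,
    p2m_mmio_direct,
} p2m_type_t;

struct p2m_domain {
    struct {
        void *server;          /* non-NULL while an ioreq server is mapped */
    } ioreq;
    bool global_logdirty;      /* stand-in for the log-dirty rangeset */
};

/* Stand-in for p2m_is_logdirty_range(): just the global flag here. */
static bool p2m_is_logdirty_range(const struct p2m_domain *p2m,
                                  unsigned long s, unsigned long e)
{
    (void)s; (void)e;
    return p2m->global_logdirty;
}

/*
 * The type an entry would have after recalculation, without modifying
 * the entry itself:
 *  - p2m_ioreq_server with no server mapped decays to p2m_ram_rw;
 *  - ram-rw/ram-logdirty follow the log-dirty range;
 *  - all other types are returned unchanged.
 */
static p2m_type_t p2m_recalc_type(bool recalc, p2m_type_t t,
                                  const struct p2m_domain *p2m,
                                  unsigned long gfn)
{
    if ( !recalc )
        return t;

    switch ( t )
    {
    case p2m_ioreq_server:
        return p2m->ioreq.server ? p2m_ioreq_server : p2m_ram_rw;
    case p2m_ram_rw:
    case p2m_ram_logdirty:
        return p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                    : p2m_ram_rw;
    default:
        return t;
    }
}
```

Under these assumptions, resolve_misconfig() only needs to compare the
returned type with the stored one to know whether an ioreq-server entry
went away (and hence whether entry_count must be decremented).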