Re: [Xen-devel] [PATCH v12 5/6] x86/ioreq server: Asynchronously reset outstanding p2m_ioreq_server entries.
>>> On 07.04.17 at 12:22, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>
> On 4/7/2017 6:22 PM, George Dunlap wrote:
>> On 07/04/17 10:53, Yu Zhang wrote:
>>>
>>> On 4/7/2017 5:40 PM, Jan Beulich wrote:
>>>>>>> On 06.04.17 at 17:53, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
>>>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>>>> @@ -544,6 +544,12 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>>>>>                  e.ipat = ipat;
>>>>>                  if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
>>>>>                  {
>>>>> +                    if ( e.sa_p2mt == p2m_ioreq_server )
>>>>> +                    {
>>>>> +                        ASSERT(p2m->ioreq.entry_count > 0);
>>>>> +                        p2m->ioreq.entry_count--;
>>>>> +                    }
>>>>> +
>>>>>                      e.sa_p2mt = p2m_is_logdirty_range(p2m, gfn + i, gfn + i)
>>>>>                                  ? p2m_ram_logdirty : p2m_ram_rw;
>>>> I don't think this can be right: Why would it be valid to change the
>>>> type from p2m_ioreq_server to p2m_ram_rw (or p2m_ram_logdirty)
>>>> here, without taking into account further information? This code
>>>> can run at any time, not just when you want to reset things. So at
>>>> the very least there is a check missing whether a suitable ioreq
>>>> server still exists (and only if it doesn't you want to do the type
>>>> reset).
>>> Sorry, Jan. I think we discussed this quite a while ago.
>>> Indeed, there is information lacking here, and that's why global_logdirty
>>> is disallowed when there are remaining p2m_ioreq_server entries. :-)
>>>
>>>>> @@ -816,6 +822,22 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>>>>      new_entry.suppress_ve = is_epte_valid(&old_entry) ?
>>>>>                                  old_entry.suppress_ve : 1;
>>>>> +    /*
>>>>> +     * p2m_ioreq_server is only used for 4K pages, so the
>>>>> +     * count shall only happen on ept page table entries.
>>>>> +     */
>>>>> +    if ( p2mt == p2m_ioreq_server )
>>>>> +    {
>>>>> +        ASSERT(i == 0);
>>>>> +        p2m->ioreq.entry_count++;
>>>>> +    }
>>>>> +
>>>>> +    if ( ept_entry->sa_p2mt == p2m_ioreq_server )
>>>>> +    {
>>>>> +        ASSERT(p2m->ioreq.entry_count > 0 && i == 0);
>>>> I think this would better be two ASSERT()s, so if one triggers it's
>>>> clear what problem it was right away. The two conditions aren't
>>>> really related to one another.
>>>>
>>>>> @@ -965,7 +987,7 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
>>>>>      if ( is_epte_valid(ept_entry) )
>>>>>      {
>>>>>          if ( (recalc || ept_entry->recalc) &&
>>>>> -             p2m_is_changeable(ept_entry->sa_p2mt) )
>>>>> +             p2m_check_changeable(ept_entry->sa_p2mt) )
>>>> I think the distinction between these two is rather arbitrary, and I
>>>> also think this is part of the problem above: Distinguishing log-dirty
>>>> from ram-rw requires auxiliary data to be consulted. The same
>>>> ought to apply to ioreq-server, and then there wouldn't be a need
>>>> to have two p2m_*_changeable() flavors.
>>> Well, I think we also discussed this quite a while ago; here is the link:
>>> https://lists.xen.org/archives/html/xen-devel/2016-09/msg01017.html
>>>
>>>> Of course the subsequent use of p2m_is_logdirty_range() may then
>>>> need amending.
>>>>
>>>> In the end it looks like you have the inverse problem here compared
>>>> to above: You should return ram-rw when the reset was already
>>>> initiated. At least that's how I would see the logic to match up with
>>>> the log-dirty handling (where the _effective_ rather than the last
>>>> stored type is being returned).
>>>>
>>>>> @@ -606,6 +615,8 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>>>>      if ( page_order == PAGE_ORDER_4K )
>>>>>      {
>>>>> +        p2m_type_t p2mt_old;
>>>>> +
>>>>>          rc = p2m_next_level(p2m, &table, &gfn_remainder, gfn,
>>>>>                              L2_PAGETABLE_SHIFT - PAGE_SHIFT,
>>>>>                              L2_PAGETABLE_ENTRIES, PGT_l1_page_table, 1);
>>>>> @@ -629,6 +640,21 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>>>>          if ( entry_content.l1 != 0 )
>>>>>              p2m_add_iommu_flags(&entry_content, 0, iommu_pte_flags);
>>>>> +        p2mt_old = p2m_flags_to_type(l1e_get_flags(*p2m_entry));
>>>>> +
>>>>> +        /*
>>>>> +         * p2m_ioreq_server is only used for 4K pages, so
>>>>> +         * the count shall only be performed for level 1 entries.
>>>>> +         */
>>>>> +        if ( p2mt == p2m_ioreq_server )
>>>>> +            p2m->ioreq.entry_count++;
>>>>> +
>>>>> +        if ( p2mt_old == p2m_ioreq_server )
>>>>> +        {
>>>>> +            ASSERT(p2m->ioreq.entry_count > 0);
>>>>> +            p2m->ioreq.entry_count--;
>>>>> +        }
>>>>> +
>>>>>          /* level 1 entry */
>>>>>          p2m->write_p2m_entry(p2m, gfn, p2m_entry, entry_content, 1);
>>>> I think to match up with EPT you also want to add
>>>>
>>>>     ASSERT(p2mt_old != p2m_ioreq_server);
>>>>
>>>> to the 2M and 1G paths.
>>> Is this really necessary? 2M and 1G pages do not have p2mt_old;
>>> defining one and peeking at the p2m type just to have an ASSERT does
>>> not seem very useful - and will hurt performance.
>>>
>>> As to EPT, there's already a variable 'i', which may be greater
>>> than 0, so I added an ASSERT there.
>> Yes, that's Jan's point -- that for EPT, there is effectively an ASSERT()
>> that 2M and 1G entries are not p2m_ioreq_server; but for SVM, because of
>> the code duplication, there is not.
>>
>> ASSERT()s are:
>> 1. There to double-check that the assumptions you're making (i.e., "2M
>> and 1G entries can never be of type p2m_ioreq_server") are valid
>> 2. Only enabled when debug=y, and so are generally not a performance
>> consideration.
>>
>> You're making an assumption, so an ASSERT is useful; and it's only a
>> one-line check that will be removed for non-debug builds, so the
>> performance is not a consideration.
>
> Thanks, George.
> I do not worry about the cost of the ASSERT() itself, but the effort of
> peeking at the p2m type:
>     p2m_flags_to_type(l1e_get_flags(*p2m_entry));
> And this cannot be removed at runtime.

The l1e_get_flags() is not needed - both paths latch that into "flags"
already. And p2m_flags_to_type() is a simple inline function, for which
the compiler should be able to see that its result is not used, and
hence all code generated from it can be deleted.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel