
Re: [Xen-devel] [PATCH RFC 1/4] x86/mm: Shadow and p2m changes for PV mem_access



>>>> On 01.05.14 at 00:20, <aravindp@xxxxxxxxx> wrote:
>>>I think implying _n to mean "default" is at least a latent bug anyway
>>>(since that precludes setting that type). And _if_ "get" really does
>>>any modification, that minimally would require a big comment.
>>
>> For PV domains access_n is precluded.
>>
>>>Furthermore, with it being possible for the default type to change,
>>>the net effect might easily be different between setting the default
>>>type at initialization time or when the first lookup happens.
>>
>> I am not sure that I follow your concern here. Setting default access
>> does not mean that the access table gets cleared. So if a particular
>> MFN entry's access value changed and the default value was changed,
>> then the MFN should retain the access value it was changed to and not
>> the default value. The new default value will only be returned for
>> MFNs on which set_entry was not called, and I think that is correct.
>
>It might still make a difference when the designated access takes effect:
>I would assume that a request for the default type should obtain the default
>set at the time of the request, not the one in effect at the time the entry
>gets first used.

Ah, I understand now. Given that I am moving away from using the lookup table,
this should no longer be a concern.
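
To make sure we are talking about the same thing, here is how I now understand
the required semantics (a sketch with made-up access_table_* helper names, not
code from the patch): the default must be sampled when the entry is set, not
lazily when it is first looked up:

/* Illustration only -- the access_table_* helpers are hypothetical. */
void access_set_to_default(struct p2m_domain *p2m, unsigned long pfn)
{
    /* Correct: record the default in effect at the time of the request. */
    access_table_set(p2m, pfn, p2m->default_access);
}

p2m_access_t access_get(struct p2m_domain *p2m, unsigned long pfn)
{
    /* Risky: an entry that was never explicitly set silently picks up
     * any later change to p2m->default_access. */
    return access_table_present(p2m, pfn) ? access_table_get(p2m, pfn)
                                          : p2m->default_access;
}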

>>>>>> +int p2m_mem_access_init(struct p2m_domain *p2m)
>>>>>> +{
>>>>>> +    struct domain *d = p2m->domain;
>>>>>> +    mfn_t *access_lookup_table;
>>>>>> +    uint32_t nr_access_table_pages;
>>>>>> +    uint32_t ctr;
>>>>>> +
>>>>>> +    nr_access_table_pages = get_domain_nr_access_table_pages(d);
>>>>>> +    access_lookup_table = xzalloc_array(mfn_t, nr_access_table_pages);
>>>>>
>>>>>This surely is going to be an order > 0 allocation, and you even
>>>>>risk it being an order > MAX_ORDER one. The former is disallowed at
>>>>>runtime by convention/agreement, the latter is an outright bug.
>>>>
>>>> Sorry, I was not aware of the convention. What should I be doing
>>>> here instead?
>>>
>>>Especially with the above note about GFN space being effectively
>>>unbounded for PV guests, you need to find a different data structure
>>>to represent the information you need. Perhaps something similar to
>>>what log-dirty mode uses, except that you would need more levels in
>>>the extreme case, and hence it might be worthwhile making the actually
>>>used number of levels dynamic?
>>
>> I am a little confused here. In my initial discussions about this I
>> was told that PV guests have bounded and contiguous memory. This is
>> why I took the approach of an indexable array.
>>
>> " OTOH, all you need is a byte per pfn, and the great thing is that in
>> PV domains, the physmap is bounded and continuous. Unlike HVM and its
>> PCI holes, etc, which demand the sparse tree structure. So you can
>> allocate an easily indexable array, notwithstanding super page concerns (I
>> think/hope)."
>>
>> http://lists.xenproject.org/archives/html/xen-devel/2013-11/msg03860.html
>>
>> Did I misunderstand something here?
>
>What he stated is the situation at domain start. Due to ballooning and other
>kinds of manipulations (as said, pv-ops re-arranges things to avoid having
>memory [PFNs] overlap MMIO [MFNs]) this may change immediately after
>the domain started.
>
>And even if it was a plain contiguous linear address space, you still wouldn't be
>permitted to allocate a flat one-byte-per-page array, as for large domains this
>might still overflow the MAX_ORDER allocation constraint.
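
To put numbers on that (my own back-of-the-envelope, assuming 4KiB pages and
an 8-byte mfn_t on x86-64):

/* Illustration: one access byte per PFN for a 512GiB PV guest.
 *   pfns               = 512GiB / 4KiB        = 2^27
 *   access table       = 2^27 bytes           = 2^15 pages
 *   MFN lookup array   = 2^15 * sizeof(mfn_t) = 256KiB = 64 pages
 * So the xzalloc_array() above is already an order-6 allocation, well
 * past the order-0 runtime convention, and the order keeps growing
 * with guest size.
 */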

OK, the flat array is a bad idea. I am looking into some of Tim's suggestions 
for this now: either stash the values in the shadow_flags or reuse the p2m-pt 
implementation. I am going to try reusing the shadow_flags first, and if that 
does not work I will go with the p2m-pt implementation.
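
For reference, here is roughly what the log-dirty-style structure Jan suggested
could look like (a sketch with made-up names, not the shadow_flags or p2m-pt
route I am trying first): one access byte per PFN, held in a radix tree whose
nodes are all single pages allocated on demand, so no allocation ever exceeds
order 0:

/* Sketch only. */
#define NODE_SLOTS (PAGE_SIZE / sizeof(void *))   /* 512 on x86-64 */
#define NODE_SHIFT 9                              /* log2(NODE_SLOTS) */

static uint8_t *access_slot(void **root, unsigned long pfn,
                            unsigned int levels, bool alloc)
{
    void **slot = root;
    unsigned int l;

    /* Walk 'levels' interior nodes; the PFN bits above PAGE_SHIFT
     * index them, NODE_SHIFT bits per level. */
    for ( l = 0; l < levels; l++ )
    {
        unsigned int shift = PAGE_SHIFT + (levels - 1 - l) * NODE_SHIFT;

        if ( *slot == NULL )
        {
            if ( !alloc ||
                 (*slot = xzalloc_array(void *, NODE_SLOTS)) == NULL )
                return NULL;
        }
        slot = &((void **)*slot)[(pfn >> shift) & (NODE_SLOTS - 1)];
    }

    /* Leaf: one page holding PAGE_SIZE access bytes. */
    if ( *slot == NULL )
    {
        if ( !alloc || (*slot = xzalloc_array(uint8_t, PAGE_SIZE)) == NULL )
            return NULL;
    }
    return &((uint8_t *)*slot)[pfn & (PAGE_SIZE - 1)];
}

One level covers 512 * 4096 = 2^21 PFNs (8GiB of guest memory), two levels
2^30 PFNs, so the level count could be picked from the guest's current maximum
PFN, as Jan suggests.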

>>>>>> @@ -2443,7 +2443,8 @@ int sh_remove_all_mappings(struct vcpu *v, mfn_t gmfn)
>>>>>>          if ( !(shadow_mode_external(v->domain)
>>>>>>                 && (page->count_info & PGC_count_mask) <= 3
>>>>>>                 && ((page->u.inuse.type_info & PGT_count_mask)
>>>>>> -                   == !!is_xen_heap_page(page))) )
>>>>>> +                       == !!is_xen_heap_page(page)))
>>>>>> +                    && !(shadow_mode_mem_access(v->domain)) )
>>>>>
>>>>>You're breaking indentation, there are pointless parentheses again,
>>>>>but most importantly - why?
>>>>
>>>> Sorry, I meant to ask a question about this in my patch message and
>>>> mark this as a workaround. I am seeing the message on all
>>>> sh_remove_all_mappings() and I was not able to figure out why this
>>>> was happening. I just added this as a work around. I was hoping you
>>>> or Tim would shed more light on this.
>>>
>>>I'm afraid that without detail on which pages this triggers on, and you
>>>at least having spent some time finding out where the stray/extra
>>>references may be coming from, it's going to be hard to help.
>>
>> Here are some of the prints I saw. I typically saw it for every
>> set_entry() call.
>
>There are some rather large usage counts among the examples you gave -
>these surely need understanding rather than blindly ignoring.

Tim gave me the reason for this:
"This is because you are not in shadow_mode_refcounts() (i.e. the page's 
refcount and typecount are based on the _guest_ pagetables rather than the 
_shadow_ ones).  That means that sh_remove_all_mappings() can't use the 
refcounts to figure out when it's removed the last shadow mapping."
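
So, as I read it (my paraphrase, for the record):

/* Illustration only: the early-out in sh_remove_all_mappings() assumes
 * the shadow code itself owns the references it is counting:
 *
 *   (page->count_info & PGC_count_mask) <= 3 &&
 *   (page->u.inuse.type_info & PGT_count_mask) == !!is_xen_heap_page(page)
 *
 * Under shadow_mode_refcounts() those counts track the shadow mappings,
 * so reaching this baseline proves all shadows of gmfn are gone.  For a
 * PV guest with mem_access enabled the counts follow the guest's own
 * pagetables instead, stay above the baseline, and the "can't find all
 * mappings" message fires on every call even though nothing is wrong.
 */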

Thanks,
Aravindh




 

