[Xen-devel] Possible bug/question in xen-hptool?
Hi,

I was looking at using xen-hptool (tools/misc/xen-hptool.c) to take one page of a guest domain offline. I created a guest domain on Xen unstable.

First I dumped the p2m to get the mfn of dom1's pfn 0x1d:

    # xen-mfndump dump-p2m 1
    pfn=0x1d ==> mfn=0x14ee17 (type 0x0)

Then I ran `lookup-pte` to find the mfn of the PTE for mfn 0x14ee17:

    # xen-mfndump lookup-pte 1 0x14ee17
    --- Lookig for PTEs mapping mfn 0x14ee17 for domain 1 ---
    Guest Width: 8, PT Levels: 4 P2M size: = 262144
    0x14ee17 <-- [0xd948e][29]: 0x1000014ee17027

Now I use xen-hptool to take mfn 0x14ee17 offline:

    # xen-hptool mem-offline 0x14ee17
    Prepare to offline MEMORY mfn 14ee17
    DOM1: No suspend port, try live migration
    Failed to suspend guest 1 for mfn 14ee17

(Comment: I modified the code to bypass the suspension of dom1. I should use libxl to suspend dom1, or use the event channel to notify dom1 to suspend as the original code does. But that is not the issue I'm asking about here, and I don't think it affects the following discussion or conclusion.)

    xc: error: Failure when submitting mmu updates: Internal error
    xc: error: clear pte failed: Internal error
    Memory mfn 14ee17 offlined successfully , this page is DOM1 page yet failed to be exchanged. current state is [PG_OFFLINE_PENDING, PG_OFFLINE_OWNED]

    (XEN) mm.c:2004:d0v0 Error pfn d948e: rd=ffff83015d446000, od=ffff83017d8d0000, caf=8000000000000004, taf=1400000000000002
    (XEN) mm.c:3544:d0v0 Could not get page for normal update

I looked into do_mmu_update() in xen/arch/x86/mm.c. The mmu_update fails because, inside do_mmu_update(), the owner of the page table of mfn 0x14ee17 (denoted pt_dom) is domain 0, while the owner of the page of mfn 0x14ee17 is domain 1.

After digging into it, I found the following code confusing/suspicious. Inside do_mmu_update() in xen/arch/x86/mm.c, pt_dom is assigned by this line:

    if ( (pt_dom = foreigndom >> 16) != 0 )
However, in flush_mmu_updates() in tools/libxc/xc_private.c, foreigndom is assigned by the following line:

    hypercall.arg[3] = mmu->subject;

where mmu->subject is the guest domain id of the page table.

The first question is: why should we use "foreigndom >> 16" instead of "foreigndom" to get pt_dom? (When a page is marked offline, we can get the domid of the page from status, using status >> PG_OFFLINE_OWNER_SHIFT. But why should we shift right by 16 bits again in do_mmu_update()?) (I think this explains why pt_owner is treated as domain 0: pt_owner was just using the default value, which is the domain of the current vcpu that runs the hypercall.)

pt_owner is retrieved by the following line:

    if ( (pt_owner = rcu_lock_domain_by_id(pt_dom - 1)) == NULL )
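To make the decode quoted above concrete, here is a minimal standalone C sketch of those two lines. It is only a model under stated assumptions, not the hypervisor code: model_pt_owner(), check_examples() and CALLER_DOMID are hypothetical names introduced for illustration, and the real function takes an RCU reference on the domain rather than returning a domid.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the hypercall is issued from domain 0, as in the run above. */
#define CALLER_DOMID 0u

/*
 * Model of the two quoted lines from do_mmu_update():
 *   if ( (pt_dom = foreigndom >> 16) != 0 )
 *       pt_owner = rcu_lock_domain_by_id(pt_dom - 1);
 * Returns the domid that would be used as pt_owner.
 */
static unsigned int model_pt_owner(uint32_t foreigndom)
{
    unsigned int pt_dom = foreigndom >> 16;   /* upper 16 bits only */

    if (pt_dom != 0)
        return pt_dom - 1;                    /* upper half minus one */
    return CALLER_DOMID;                      /* zero: fall back to caller */
}

/* A few inputs worked through by hand. */
static void check_examples(void)
{
    /* Plain domid 1, as libxc passes it: upper half is 0. */
    assert(model_pt_owner(1u) == CALLER_DOMID);

    /* (1 << 16) | 1: upper half is 1, so pt_owner is 1 - 1 = 0. */
    assert(model_pt_owner((1u << 16) | 1u) == 0u);

    /* ((1 + 1) << 16) | 1: upper half is 2, so pt_owner is 1. */
    assert(model_pt_owner(((1u + 1u) << 16) | 1u) == 1u);
}
```

Under this model, passing the plain domid 1 leaves the upper half zero, so pt_owner falls back to the caller (domain 0 here). With the lines unmodified, the upper half would have to hold domid + 1 (0x00020001 for domain 1) for domain 1 to be selected as pt_owner.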
My second question is: why should we use "pt_dom - 1" instead of "pt_dom" here?

If I set the old foreigndom (1) to (foreigndom << 16 | foreigndom), pass the new foreigndom as the last parameter of do_mmu_update(), and change "pt_dom - 1" to "pt_dom", xen-hptool successfully takes the mfn offline. Here is the output after issuing the command:

    Memory mfn 0x14ee17 offlined successfully, this page is DOM1 page and being swapped successfully, current state is [PG_OFFLINE_OFFLINED, PG_OFFLINE_OWNED]

I'm wondering whether this is a bug in do_mmu_update(), or at least an inconsistency in the do_mmu_update() code. Of course, this could also be because I misunderstood something. If so, could you please let me know what I misunderstood and how I should correct it?

Thank you very much for your time!

Meng
-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel