[PATCH] Fix libxc and pm_timer (Was: [Xen-ia64-devel] Maybe doman_destroy() was not called?)
Tue, 21 Aug 2007 09:27:45 +0900, Masaki Kanno wrote:
>Hi all,
>
>I tested xm create command with latest xen-ia64-unstable and the
>attached patch. The attached patch intentionally causes contiguous
>memory shortage in VHPT allocation for HVM domain. On the test,
>I wanted to confirm that the release proceeding of domain resources
>is working correctly when HVM domain creation failed. But I could
>not confirm that it is working correctly. It seemed to be not
>calling domain_destroy().
>The following messages are the result of the test. Different RID
>was allocated whenever I created a HVM domain.
>Do you think where a bug hides?
>
> (XEN) domain.c:546: arch_domain_create:546 domain 1 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002fd350000 hash_size 512
> (XEN) regionreg.c:193: ### domain f0000000040fc080: rid=80000-c0000 mp_rid=2000
> (XEN) domain.c:583: arch_domain_create: domain=f0000000040fc080
> (XEN) vpd base: 0xf000000007be0000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
> (XEN) domain.c:546: arch_domain_create:546 domain 2 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002f6f8c000 hash_size 512
> (XEN) regionreg.c:193: ### domain f000000004109380: rid=c0000-100000 mp_rid=3000
> (XEN) domain.c:583: arch_domain_create: domain=f000000004109380
> (XEN) vpd base: 0xf000000007b90000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
> (XEN) domain.c:546: arch_domain_create:546 domain 3 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002f676c000 hash_size 512
> (XEN) regionreg.c:193: ### domain f000000007bf1380: rid=100000-140000 mp_rid=4000
> (XEN) domain.c:583: arch_domain_create: domain=f000000007bf1380
> (XEN) vpd base: 0xf000000007b50000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
>

Hi,

I found two bugs behind this problem.

Bug.1:
  copy_from_GFW_to_nvram() in libxc does not call munmap() when the
  NVRAM data is invalid. It also skips free() and close() on that path.

Bug.1 is fixed by munmap_nvram_page.patch.

I ran the test again after fixing Bug.1, but this time the hypervisor
panicked. The following messages are the result of the test.

(XEN) domain.c:546: arch_domain_create:546 domain 2 pervcpu_vhpt 1
(XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
(XEN) tlb_track.c:115: hash 0xf0000002fad00000 hash_size 512
(XEN) regionreg.c:193: ### domain f0000000040fc080: rid=80000-c0000 mp_rid=2000
(XEN) domain.c:583: arch_domain_create: domain=f0000000040fc080
(XEN) *** xen_handle_domain_access: exception table lookup failed, iip=0xf00000000403f530, addr=0x0, spinning...
ip=0xf00000000403f530, addr=0x0, spinning...
(XEN) d 0xf000000007c5c080 domid 0
(XEN) vcpu 0xf000000007c40000 vcpu 0
(XEN)
(XEN) CPU 0
(XEN) psr : 0000101008226018 ifs : 800000000000058d ip : [<f00000000403f530>]
(XEN) ip is at timer_softirq_action+0x170/0x2e0
(XEN) unat: 0000000000000000 pfs : 000000000000058d rsc : 0000000000000003
(XEN) rnat: 0000000000004000 bsps: f000000007c47e20 pr : 00000000006a9969
(XEN) ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0 : f00000000403f4f0 b6 : f000000004038b80 b7 : a000000100018570
(XEN) f6 : 1003e000001b932157960 f7 : 1003e0000000281bd3682
(XEN) f8 : 000000000000000000000 f9 : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1 : f00000000438ca40 r2 : 0000007da3766757 r3 : f000000007c47fe8
(XEN) r8 : 0000000000000001 r9 : 0000000000000000 r10 : 0000000000000000
(XEN) r11 : 0009804c0270033f r12 : f000000007c47e00 r13 : f000000007c40000
(XEN) r14 : 0000000000000000 r15 : f0000000040fc9b0 r16 : 0000000000000001
(XEN) r17 : f000000007ceaf18 r18 : 0000000000000002 r19 : 0000000000000001
(XEN) r20 : f000000007ceb508 r21 : f0000000040fc9b8 r22 : 0000000000000001
(XEN) r23 : 0000000000000001 r24 : f000000007ceaf18 r25 : f000000007c47e28
(XEN) r26 : 0000000000000000 r27 : 0000000000000000 r28 : 0000000000000000
(XEN) r29 : 0000000000000000 r30 : 0000000000000000 r31 : f000000004400100
(XEN)
(XEN) Call Trace:
(XEN) [<f0000000040af150>] show_stack+0x80/0xa0
(XEN) sp=f000000007c478b0 bsp=f000000007c41668
(XEN) [<f000000004087640>] panic_domain+0x120/0x170
(XEN) sp=f000000007c47a80 bsp=f000000007c41600
(XEN) [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN) sp=f000000007c47bc0 bsp=f000000007c41568
(XEN) [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN) sp=f000000007c47c00 bsp=f000000007c41568
(XEN) [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN) sp=f000000007c47e00 bsp=f000000007c41500
(XEN) [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN) sp=f000000007c47e00 bsp=f000000007c41480
(XEN) [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN) sp=f000000007c47e00 bsp=f000000007c41480
(XEN) domain_crash_sync called from xenmisc.c:152
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) d 0xf000000007c5c080 domid 0
(XEN) vcpu 0xf000000007c40000 vcpu 0
(XEN)
(XEN) CPU 0
(XEN) psr : 00001011085a6010 ifs : 8000000000000307 ip : [<a0000001000a6540>]
(XEN) ip is at ???
(XEN) unat: 0000000000000000 pfs : 400000000000038a rsc : 0000000000000007
(XEN) rnat: 0000000000000000 bsps: e000000162a90f70 pr : 00000000006a9a59
(XEN) ldrs: 0000000002300000 ccv : 0000000000000000 fpsr: 0009804c0270033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0 : a0000001000a6960 b6 : a000000100018610 b7 : a000000100018570
(XEN) f6 : 000000000000000000000 f7 : 000000000000000000000
(XEN) f8 : 000000000000000000000 f9 : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1 : a0000001011225d0 r2 : e0000001781f3154 r3 : e000000164d58198
(XEN) r8 : e000000164cc8198 r9 : e000000164cc8018 r10 : 0000000000000000
(XEN) r11 : 0000000000000000 r12 : e000000162a97df0 r13 : e000000162a90000
(XEN) r14 : 0000000000000000 r15 : e000000164cc8dd0 r16 : 0000000000001000
(XEN) r17 : e000000164d58dd0 r18 : e000000164cc8da8 r19 : 0000000000000000
(XEN) r20 : e0000001781f3138 r21 : 0000000000000018 r22 : e000000162a90f70
(XEN) r23 : 0000000000000001 r24 : 0000000000000000 r25 : 0000000000000000
(XEN) r26 : 0000000000000000 r27 : 0000000000000000 r28 : 0000000000000000
(XEN) r29 : 0000000000000000 r30 : 0000000000000018 r31 : 400000000000038a
(XEN)
(XEN) Call Trace:
(XEN) [<f0000000040af150>] show_stack+0x80/0xa0
(XEN) sp=f000000007c478b0 bsp=f000000007c416b8
(XEN) [<f000000004017300>] __domain_crash+0x100/0x140
(XEN) sp=f000000007c47a80 bsp=f000000007c41690
(XEN) [<f000000004017380>] __domain_crash_synchronous+0x40/0xf0
(XEN) sp=f000000007c47a80 bsp=f000000007c41668
(XEN) [<f000000004087680>] panic_domain+0x160/0x170
(XEN) sp=f000000007c47a80 bsp=f000000007c41600
(XEN) [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN) sp=f000000007c47bc0 bsp=f000000007c41568
(XEN) [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN) sp=f000000007c47c00 bsp=f000000007c41568
(XEN) [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN) sp=f000000007c47e00 bsp=f000000007c41500
(XEN) [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN) sp=f000000007c47e00 bsp=f000000007c41480
(XEN) [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN) sp=f000000007c47e00 bsp=f000000007c41480
(XEN)
(XEN) Call Trace:
(XEN) [<f0000000040af150>] show_stack+0x80/0xa0
(XEN) sp=f000000007c478b0 bsp=f000000007c416b8
(XEN) [<f000000004017310>] __domain_crash+0x110/0x140
(XEN) sp=f000000007c47a80 bsp=f000000007c41690
(XEN) [<f000000004017380>] __domain_crash_synchronous+0x40/0xf0
(XEN) sp=f000000007c47a80 bsp=f000000007c41668
(XEN) [<f000000004087680>] panic_domain+0x160/0x170
(XEN) sp=f000000007c47a80 bsp=f000000007c41600
(XEN) [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN) sp=f000000007c47bc0 bsp=f000000007c41568
(XEN) [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN) sp=f000000007c47c00 bsp=f000000007c41568
(XEN) [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN) sp=f000000007c47e00 bsp=f000000007c41500
(XEN) [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN) sp=f000000007c47e00 bsp=f000000007c41480
(XEN) [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN) sp=f000000007c47e00 bsp=f000000007c41480
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

Bug.2:
  The domain resource release path freed the domain structure without
  stopping (or killing) the PM timer. The VMX flag of VCPU#0 was not
  set because VHPT allocation for VCPU#0 failed. For this reason,
  domain_relinquish_resources() did not call
  vmx_relinquish_guest_resources(), yet the domain structure was still
  freed. As a result, timer_softirq_action() lost track of the callback
  function for the PM timer.

Bug.2 is fixed by kill_pm_timer.patch.

Signed-off-by: Masaki Kanno <kanno.masaki@xxxxxxxxxxxxxx>

Best regards,
 Kan

Attachment: kill_pm_timer.patch
Attachment: munmap_nvram_page.patch

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel
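The ordering problem in Bug.2 can be illustrated in the same spirit. This is a minimal user-space sketch, not the Xen timer code and not the content of kill_pm_timer.patch: every sketch_* name is hypothetical, and the pending list is a plain singly linked list. It shows why a timer embedded in a structure should be removed from the pending list before that structure is freed, independent of any flag such as the VMX one.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical one-shot timer kept on a global pending list. */
struct sketch_timer {
    void (*fn)(void *data);
    void *data;
    struct sketch_timer *next;
    int active;
};

static struct sketch_timer *sketch_pending;

static void sketch_set_timer(struct sketch_timer *t,
                             void (*fn)(void *), void *data)
{
    t->fn = fn;
    t->data = data;
    t->active = 1;
    t->next = sketch_pending;
    sketch_pending = t;
}

/* Unlink the timer so the softirq-like loop can never see it again. */
static void sketch_kill_timer(struct sketch_timer *t)
{
    struct sketch_timer **pp;

    if (!t->active)
        return;
    for (pp = &sketch_pending; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;
            break;
        }
    }
    t->active = 0;
    t->next = NULL;
}

/* Fire every pending timer once; returns how many callbacks ran. */
static int sketch_run_timers(void)
{
    int fired = 0;

    while (sketch_pending != NULL) {
        struct sketch_timer *t = sketch_pending;

        sketch_pending = t->next;
        t->active = 0;
        t->next = NULL;
        t->fn(t->data);
        fired++;
    }
    return fired;
}

struct sketch_domain {
    int is_vmx;                    /* stand-in for the VMX flag */
    int ticks;
    struct sketch_timer pm_timer;  /* timer lives inside the domain */
};

static void sketch_pm_tick(void *data)
{
    ((struct sketch_domain *)data)->ticks++;
}

static struct sketch_domain *sketch_domain_create(void)
{
    struct sketch_domain *d = calloc(1, sizeof(*d));

    if (d != NULL)
        sketch_set_timer(&d->pm_timer, sketch_pm_tick, d);
    return d;
}

static void sketch_domain_destroy(struct sketch_domain *d)
{
    /* Kill the timer unconditionally, whatever is_vmx says, then free. */
    sketch_kill_timer(&d->pm_timer);
    free(d);
}
```

If sketch_domain_destroy() skipped the kill when is_vmx was 0, sketch_run_timers() would later follow a pointer into freed memory, which is the user-space analogue of the fault in timer_softirq_action() shown in the trace.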