
[PATCH] Fix libxc and pm_timer (Was: [Xen-ia64-devel] Maybe doman_destroy() was not called?)



Tue, 21 Aug 2007 09:27:45 +0900, Masaki Kanno wrote:

>Hi all,
>
>I tested the xm create command with the latest xen-ia64-unstable and 
>the attached patch.  The patch intentionally causes a contiguous 
>memory shortage in VHPT allocation for an HVM domain.  With this test 
>I wanted to confirm that domain resources are released correctly 
>when HVM domain creation fails, but I could not confirm it.  It seems 
>that domain_destroy() is not called. 
>The following messages are the result of the test.  A different RID 
>range was allocated each time I created an HVM domain. 
>Where do you think the bug is hiding? 
>
> (XEN) domain.c:546: arch_domain_create:546 domain 1 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002fd350000 hash_size 512
> (XEN) regionreg.c:193: ### domain f0000000040fc080: rid=80000-c0000 mp_rid=2000
> (XEN) domain.c:583: arch_domain_create: domain=f0000000040fc080
> (XEN) vpd base: 0xf000000007be0000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
> (XEN) domain.c:546: arch_domain_create:546 domain 2 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002f6f8c000 hash_size 512
> (XEN) regionreg.c:193: ### domain f000000004109380: rid=c0000-100000 mp_rid=3000
> (XEN) domain.c:583: arch_domain_create: domain=f000000004109380
> (XEN) vpd base: 0xf000000007b90000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
> (XEN) domain.c:546: arch_domain_create:546 domain 3 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002f676c000 hash_size 512
> (XEN) regionreg.c:193: ### domain f000000007bf1380: rid=100000-140000 mp_rid=4000
> (XEN) domain.c:583: arch_domain_create: domain=f000000007bf1380
> (XEN) vpd base: 0xf000000007b50000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
>

Hi,

I found two bugs behind this problem. 

Bug.1:
 copy_from_GFW_to_nvram() in libxc does not call munmap() when the 
 NVRAM data is invalid.  It also skips free() and close() on that path. 
 Bug.1 is fixed by munmap_nvram_page.patch. 
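
 To illustrate the class of leak in Bug.1, here is a minimal sketch 
 of an error path that always releases its mapping, buffer, and file 
 descriptor.  It is not the code from munmap_nvram_page.patch: the 
 function copy_nvram_sketch(), the NVRAM_SIZE and NVRAM_SIGNATURE 
 values, and the live_* counters are all assumptions made up for this 
 example. 

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NVRAM_SIZE      4096     /* hypothetical size for this sketch */
#define NVRAM_SIGNATURE "NVRM"   /* hypothetical validity marker */

/* Counters so a caller can check that every resource was released. */
static int live_maps, live_bufs, live_fds;

/*
 * Hypothetical stand-in for a copy routine like copy_from_GFW_to_nvram():
 * map the source file, validate it, and copy it out through a staging
 * buffer.  Returns 0 on success and -1 on failure.  Every exit after
 * open() goes through the cleanup labels, so the "data invalid" path
 * cannot skip munmap(), free(), or close().
 */
static int copy_nvram_sketch(const char *path, char *out)
{
    int rc = -1;
    int fd;
    void *page;
    char *buf;

    assert(path != NULL && out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    live_fds++;

    buf = malloc(NVRAM_SIZE);
    if (buf == NULL)
        goto out_close;
    live_bufs++;

    page = mmap(NULL, NVRAM_SIZE, PROT_READ, MAP_PRIVATE, fd, 0);
    if (page == MAP_FAILED)
        goto out_free;
    live_maps++;

    /* Invalid data: report failure, but still run the cleanup below. */
    if (memcmp(page, NVRAM_SIGNATURE, strlen(NVRAM_SIGNATURE)) != 0)
        goto out_unmap;

    memcpy(buf, page, NVRAM_SIZE);
    memcpy(out, buf, NVRAM_SIZE);
    rc = 0;

out_unmap:
    munmap(page, NVRAM_SIZE);
    live_maps--;
out_free:
    free(buf);
    live_bufs--;
out_close:
    close(fd);
    live_fds--;
    return rc;
}
```

 The single chain of cleanup labels is a common C idiom: a newly added 
 early exit only needs to jump to the right label instead of repeating 
 the release calls. 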

I ran the test again with Bug.1 fixed, but this time the hypervisor 
panicked.  The following messages are the result of that test. 

(XEN) domain.c:546: arch_domain_create:546 domain 2 pervcpu_vhpt 1
(XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
(XEN) tlb_track.c:115: hash 0xf0000002fad00000 hash_size 512
(XEN) regionreg.c:193: ### domain f0000000040fc080: rid=80000-c0000 mp_rid=2000
(XEN) domain.c:583: arch_domain_create: domain=f0000000040fc080
(XEN) *** xen_handle_domain_access: exception table lookup failed, iip=0xf00000000403f530, addr=0x0, spinning...
ip=0xf00000000403f530, addr=0x0, spinning...
(XEN) d 0xf000000007c5c080 domid 0
(XEN) vcpu 0xf000000007c40000 vcpu 0
(XEN) 
(XEN) CPU 0
(XEN) psr : 0000101008226018 ifs : 800000000000058d ip  : [<f00000000403f530>]
(XEN) ip is at timer_softirq_action+0x170/0x2e0
(XEN) unat: 0000000000000000 pfs : 000000000000058d rsc : 0000000000000003
(XEN) rnat: 0000000000004000 bsps: f000000007c47e20 pr  : 00000000006a9969
(XEN) ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0  : f00000000403f4f0 b6  : f000000004038b80 b7  : a000000100018570
(XEN) f6  : 1003e000001b932157960 f7  : 1003e0000000281bd3682
(XEN) f8  : 000000000000000000000 f9  : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1  : f00000000438ca40 r2  : 0000007da3766757 r3  : f000000007c47fe8
(XEN) r8  : 0000000000000001 r9  : 0000000000000000 r10 : 0000000000000000
(XEN) r11 : 0009804c0270033f r12 : f000000007c47e00 r13 : f000000007c40000
(XEN) r14 : 0000000000000000 r15 : f0000000040fc9b0 r16 : 0000000000000001
(XEN) r17 : f000000007ceaf18 r18 : 0000000000000002 r19 : 0000000000000001
(XEN) r20 : f000000007ceb508 r21 : f0000000040fc9b8 r22 : 0000000000000001
(XEN) r23 : 0000000000000001 r24 : f000000007ceaf18 r25 : f000000007c47e28
(XEN) r26 : 0000000000000000 r27 : 0000000000000000 r28 : 0000000000000000
(XEN) r29 : 0000000000000000 r30 : 0000000000000000 r31 : f000000004400100
(XEN) 
(XEN) Call Trace:
(XEN)  [<f0000000040af150>] show_stack+0x80/0xa0
(XEN)                                 sp=f000000007c478b0 bsp=f000000007c41668
(XEN)  [<f000000004087640>] panic_domain+0x120/0x170
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41600
(XEN)  [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN)                                 sp=f000000007c47bc0 bsp=f000000007c41568
(XEN)  [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN)                                 sp=f000000007c47c00 bsp=f000000007c41568
(XEN)  [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41500
(XEN)  [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN)  [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN) domain_crash_sync called from xenmisc.c:152
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) d 0xf000000007c5c080 domid 0
(XEN) vcpu 0xf000000007c40000 vcpu 0
(XEN) 
(XEN) CPU 0
(XEN) psr : 00001011085a6010 ifs : 8000000000000307 ip  : [<a0000001000a6540>]
(XEN) ip is at ???
(XEN) unat: 0000000000000000 pfs : 400000000000038a rsc : 0000000000000007
(XEN) rnat: 0000000000000000 bsps: e000000162a90f70 pr  : 00000000006a9a59
(XEN) ldrs: 0000000002300000 ccv : 0000000000000000 fpsr: 0009804c0270033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0  : a0000001000a6960 b6  : a000000100018610 b7  : a000000100018570
(XEN) f6  : 000000000000000000000 f7  : 000000000000000000000
(XEN) f8  : 000000000000000000000 f9  : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1  : a0000001011225d0 r2  : e0000001781f3154 r3  : e000000164d58198
(XEN) r8  : e000000164cc8198 r9  : e000000164cc8018 r10 : 0000000000000000
(XEN) r11 : 0000000000000000 r12 : e000000162a97df0 r13 : e000000162a90000
(XEN) r14 : 0000000000000000 r15 : e000000164cc8dd0 r16 : 0000000000001000
(XEN) r17 : e000000164d58dd0 r18 : e000000164cc8da8 r19 : 0000000000000000
(XEN) r20 : e0000001781f3138 r21 : 0000000000000018 r22 : e000000162a90f70
(XEN) r23 : 0000000000000001 r24 : 0000000000000000 r25 : 0000000000000000
(XEN) r26 : 0000000000000000 r27 : 0000000000000000 r28 : 0000000000000000
(XEN) r29 : 0000000000000000 r30 : 0000000000000018 r31 : 400000000000038a
(XEN) 
(XEN) Call Trace:
(XEN)  [<f0000000040af150>] show_stack+0x80/0xa0
(XEN)                                 sp=f000000007c478b0 bsp=f000000007c416b8
(XEN)  [<f000000004017300>] __domain_crash+0x100/0x140
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41690
(XEN)  [<f000000004017380>] __domain_crash_synchronous+0x40/0xf0
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41668
(XEN)  [<f000000004087680>] panic_domain+0x160/0x170
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41600
(XEN)  [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN)                                 sp=f000000007c47bc0 bsp=f000000007c41568
(XEN)  [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN)                                 sp=f000000007c47c00 bsp=f000000007c41568
(XEN)  [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41500
(XEN)  [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN)  [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN) 
(XEN) Call Trace:
(XEN)  [<f0000000040af150>] show_stack+0x80/0xa0
(XEN)                                 sp=f000000007c478b0 bsp=f000000007c416b8
(XEN)  [<f000000004017310>] __domain_crash+0x110/0x140
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41690
(XEN)  [<f000000004017380>] __domain_crash_synchronous+0x40/0xf0
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41668
(XEN)  [<f000000004087680>] panic_domain+0x160/0x170
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41600
(XEN)  [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN)                                 sp=f000000007c47bc0 bsp=f000000007c41568
(XEN)  [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN)                                 sp=f000000007c47c00 bsp=f000000007c41568
(XEN)  [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41500
(XEN)  [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN)  [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.


Bug.2:
 The domain resource release path freed the domain structure 
 without stopping (or killing) the PM timer. 
 The VMX flag of VCPU#0 was not set when VHPT allocation for VCPU#0 
 failed.  For this reason, domain_relinquish_resources() did not 
 call vmx_relinquish_guest_resources(), yet the domain structure 
 was still freed.  As a result, timer_softirq_action() lost track of 
 the callback function for the PM timer. 
 Bug.2 is fixed by kill_pm_timer.patch. 
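
 The failure mode in Bug.2 can be sketched outside the hypervisor as a 
 timer embedded in a structure that is freed while the timer is still 
 queued.  The sketch below is a simplified user-space model, not Xen 
 code and not the content of kill_pm_timer.patch: struct sketch_timer, 
 struct sketch_domain, and every sketch_* function are hypothetical. 
 It shows one way to avoid the problem, which is to remove the timer 
 from the pending list before freeing, regardless of whether the VMX 
 setup completed. 

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified timer; not Xen's struct timer. */
struct sketch_timer {
    void (*function)(void *data);
    void *data;
    int active;
    struct sketch_timer *next;
};

/* List walked by the model of timer_softirq_action(). */
static struct sketch_timer *pending;

struct sketch_domain {
    int vmx_initialised;           /* stays 0 if setup fails part-way */
    int ticks;
    struct sketch_timer pm_timer;  /* lives inside the domain structure */
};

static void pm_timer_fn(void *data)
{
    ((struct sketch_domain *)data)->ticks++;
}

static void sketch_set_timer(struct sketch_timer *t)
{
    if (!t->active) {
        t->active = 1;
        t->next = pending;
        pending = t;
    }
}

/* Unlink the timer so the softirq model can no longer reach it. */
static void sketch_kill_timer(struct sketch_timer *t)
{
    struct sketch_timer **pp;

    for (pp = &pending; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;
            break;
        }
    }
    t->active = 0;
    t->next = NULL;
}

/* Run every pending callback once; returns how many were called. */
static int sketch_run_timers(void)
{
    struct sketch_timer *t;
    int n = 0;

    for (t = pending; t != NULL; t = t->next) {
        t->function(t->data);
        n++;
    }
    return n;
}

/* Models a creation that arms the PM timer but never sets the VMX flag. */
static struct sketch_domain *sketch_domain_create(void)
{
    struct sketch_domain *d = calloc(1, sizeof(*d));

    if (d == NULL)
        return NULL;
    d->pm_timer.function = pm_timer_fn;
    d->pm_timer.data = d;
    sketch_set_timer(&d->pm_timer);
    return d;
}

static void sketch_domain_destroy(struct sketch_domain *d)
{
    /*
     * Kill the timer unconditionally.  Guarding this with
     * "if (d->vmx_initialised)" would leave a pointer into freed
     * memory on the pending list, which is the bug being modelled.
     */
    sketch_kill_timer(&d->pm_timer);
    free(d);
}
```

 After sketch_domain_destroy() returns, the pending list no longer 
 refers to the freed structure, so a later sketch_run_timers() call 
 has nothing stale to dereference. 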


Signed-off-by: Masaki Kanno <kanno.masaki@xxxxxxxxxxxxxx>

Best regards,
 Kan

Attachment: kill_pm_timer.patch
Description: Binary data

Attachment: munmap_nvram_page.patch
Description: Binary data

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel

 

