
[Xen-ia64-devel] [PATCH] pal_halt_light emulate for domU TAKE3



This is TAKE3 of the pal_halt_light emulation patch for domU.

Status

With this patch, a 16-hour continuous Linux kernel compile test
(4 CPU, 2 DomUs) passed.
To pass this test, I changed two main points from the previous TAKE2.
When I run the previous TAKE2 with the same configuration,
DomU shows a memory leak after 2-3 hours of running.


Changes from previous TAKE2 (2 points):

1) For set_timer/do_block correctness (vcpu migration case):
  the timer now migrates when the vcpu migrates.

  Add a timer migration for hlt_timer_fn in context_switch.

2) For VIRQ_ITC behavior correctness (vcpu migration case):
  Change the order of
  send_guest_vcpu_virq(vcpu, VIRQ_ITC)
  and
  PSCBX(vcpu, domain_itm_last) = PSCBX(vcpu, domain_itm)
  in vcpu_pend_timer.

These changes affect only domU/0.


Reasons for the changes
1) The previous TAKE2 patch did not include timer migration.
  The hlt_timer should be on the same pcpu as the one the vcpu is allocated to.
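
A minimal sketch of the idea in 1), written as a standalone model and not
as the patch itself. The types toy_timer and toy_vcpu and the functions
toy_migrate_timer and toy_context_switch are hypothetical stand-ins; only
the rule "the hlt_timer follows the vcpu to its pcpu at context switch"
comes from the description above.

```c
#include <assert.h>

/* Hypothetical stand-ins; these are not the Xen structures. */
struct toy_timer {
    unsigned int cpu;            /* pcpu on which the timer is queued */
};

struct toy_vcpu {
    unsigned int processor;      /* pcpu the vcpu is allocated to */
    struct toy_timer hlt_timer;  /* per-vcpu halt timer */
};

/* Move the timer to new_cpu. Returns 1 if it actually moved, 0 otherwise. */
int toy_migrate_timer(struct toy_timer *t, unsigned int new_cpu)
{
    if (t->cpu == new_cpu)
        return 0;
    t->cpu = new_cpu;
    return 1;
}

/* Runs on the pcpu (this_cpu) that is about to execute 'next'. */
void toy_context_switch(struct toy_vcpu *next, unsigned int this_cpu)
{
    next->processor = this_cpu;
    /* Keep the halt timer on the same pcpu as the vcpu. */
    toy_migrate_timer(&next->hlt_timer, next->processor);
    assert(next->hlt_timer.cpu == next->processor);
}
```

In this model, a vcpu that was last on pcpu 0 and is switched in on pcpu 2
ends up with its hlt_timer on pcpu 2 as well, so the timer handler never
fires on a pcpu the vcpu has already left.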


2) The VIRQ_ITC cycle is broken
  if a vcpu migration occurs while vcpu_pend_timer (called by
  hlt_timer_fn) is running.

  The VIRQ_ITC cycle is (simplified):
   ia64_get_itc reaches domain_itm in Xen.
   Xen sends VIRQ_ITC to the guest OS.
   The guest OS handles it and sets the next itm by hypercall to Xen.
   Repeat.
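
The cycle above can be sketched as a small standalone model. This is only
an illustration: cycle_state, cycle_step and cycle_run are hypothetical
names, the itc field stands in for ia64_get_itc(), and a constant guest
tick period is an assumption made for the example.

```c
/* Hypothetical model of the VIRQ_ITC cycle; not Xen code. */
struct cycle_state {
    unsigned long itc;         /* stands in for ia64_get_itc() */
    unsigned long domain_itm;  /* deadline set by the guest OS */
    unsigned long period;      /* guest tick interval (assumed constant) */
    int virq_count;            /* number of VIRQ_ITC deliveries */
};

/* Advance time by one unit. Returns 1 if VIRQ_ITC was sent in this step. */
int cycle_step(struct cycle_state *s)
{
    s->itc++;
    if (s->itc < s->domain_itm)
        return 0;                        /* deadline not reached yet */
    s->virq_count++;                     /* Xen sends VIRQ_ITC */
    s->domain_itm = s->itc + s->period;  /* guest sets next itm by hypercall */
    return 1;
}

/* Run the cycle for 'steps' time units and return the delivery count. */
int cycle_run(struct cycle_state *s, int steps)
{
    int i;

    for (i = 0; i < steps; i++)
        cycle_step(s);
    return s->virq_count;
}
```

The point of the model is that each delivery depends on the guest having
programmed a new deadline; if that hand-off is disturbed, no further
VIRQ_ITC is sent.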
   
  
  Currently, the order in vcpu_pend_timer is
  a) send VIRQ_ITC (in vcpu_pend_timer)
  b) then record that the signal was sent (update domain_itm_last).
  With this order, the relative timing of
  the domain_itm_last update and the hypercall that sets itm (triggered by VIRQ_ITC)
  matters.
  
  This order is problematic when a vcpu migration occurs as follows
  (the following event occurs after roughly 1-2 hours of running
  the Linux kernel compile test with 2 DomUs x 3 vcpus (6 domU vcpus in total) on 4 pcpus):
  
   vcpu_pend_timer@A starts (the guest OS is not running at this moment).
   An interrupt occurs @B.
   vcpu_pend_timer@A is paused.
   The credit scheduler steals the vcpu from A to B.
   vcpu_pend_timer@B starts and finishes.
   
   ****** The following step should happen after vcpu_pend_timer *****
   domain_itm is updated by the guest OS hypercall.
   
   vcpu_pend_timer@A resumes, but domain_itm has already been updated.

   
  To avoid this, the order should be swapped to b) then a).
  With this fix, we no longer need to consider the timing between the
  VIRQ_ITC cycle and the domain_itm_last update
  (the domain_itm_last update must come before the hypercall that sets itm).
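
A minimal sketch of why the order matters, written as a standalone model.
Only the field names domain_itm and domain_itm_last come from the patch
description; the struct, the toy_* functions, and the rule "a deadline
equal to domain_itm_last is treated as already delivered" are assumptions
made for this illustration. The race_window flag models the guest's set-itm
hypercall landing between steps a) and b).

```c
/* Toy state; the struct itself is hypothetical. */
struct toy_vcpu {
    unsigned long domain_itm;       /* next deadline requested by the guest */
    unsigned long domain_itm_last;  /* deadline for which VIRQ_ITC was sent */
    int virqs_sent;
};

/* Guest reaction to VIRQ_ITC: program the next deadline by hypercall. */
void toy_guest_set_itm(struct toy_vcpu *v, unsigned long next)
{
    v->domain_itm = next;
}

/* Assumed rule: a deadline equal to domain_itm_last counts as delivered. */
int toy_tick_pending(const struct toy_vcpu *v)
{
    return v->domain_itm != v->domain_itm_last;
}

/* Old order: a) send, then b) stamp. */
void toy_pend_send_then_stamp(struct toy_vcpu *v, int race_window,
                              unsigned long next)
{
    v->virqs_sent++;                     /* a) send VIRQ_ITC */
    if (race_window)
        toy_guest_set_itm(v, next);      /* hypercall slips in before b) */
    v->domain_itm_last = v->domain_itm;  /* b) stamp; may record NEW deadline */
    if (!race_window)
        toy_guest_set_itm(v, next);      /* normal case: hypercall after b) */
}

/* New order: b) stamp, then a) send. */
void toy_pend_stamp_then_send(struct toy_vcpu *v, unsigned long next)
{
    v->domain_itm_last = v->domain_itm;  /* b) stamp the current deadline */
    v->virqs_sent++;                     /* a) send VIRQ_ITC */
    toy_guest_set_itm(v, next);          /* hypercall can only come after b) */
}
```

In the model, the old order with race_window set leaves domain_itm_last
equal to the new deadline, so the next tick looks already delivered and
the cycle stops. With the new order the stamp always records the old
deadline, whenever the hypercall arrives.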

Comments on TIMER_SLOP
 
 The parameter Anthony suggested (TIMER_SLOP) is not used on x86,
so I do not use it.

Comments on the note (for 2)

  After Yamahata's report (domU hang), I tested many times.
Every time, only a memory error (oom-killer) was detected on DomU (it did not hang).
While I was trying to solve this problem,
somebody suggested that this kind of error occurs when
the CPU is running but the timer is stopped,

because RCU (Read-Copy-Update) garbage collection in Linux is driven by the timer,
while memory allocation continues as long as the CPU is running
(this leads to the oom-killer running).

I checked the behavior of the vcpu and the timer and found the timing problem in
vcpu_pend_timer.

References for this modification
(for 2)
about RCU(Read Copy Update) 
http://en.wikipedia.org/wiki/Read-copy-update

(Other fixes)
Yamahata's fix for VTI migrate timer.
http://lists.xensource.com/archives/html/xen-ia64-devel/2006-07/msg00375.html
stop_timer
http://lists.xensource.com/archives/html/xen-ia64-devel/2006-07/msg00171.html


Signed-off-by: Atsushi SAKAI <sakaia@xxxxxxxxxxxxxx>








Attachment: pal_halt_light_emulate_take3.patch
Description: Binary data

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel

 

