
Re: [Xen-devel] [PATCH 0/2] Time-related fixes for migration



> From: Boris Ostrovsky [mailto:boris.ostrovsky@xxxxxxxxxx]
> Sent: Sunday, March 30, 2014 11:06 AM
> 
> Two patches to address time-related problems that we discovered during
> migration testing.
> 
> * The first patch loads HVM parameters from the configuration file during
> restore. To fix the actual problem that we saw, only timer_mode needed to be
> restored, but it seems to me that other parameters are needed as well, since
> at least some of them are used at runtime.
> 
> The bug can be demonstrated with a Solaris guest, but I haven't been able to
> trigger it with Linux, possibly because Solaris' gethrtime() routine (which is
> what the test was using) traps to the kernel's hrtimer, which reports global
> time and performs some adjustments to the per-CPU clock.

This looks like the right thing to fix.

> 
> * The second patch keeps TSCs synchronized across VCPUs after save/restore.
> Currently TSC values diverge after migration because during both save and
> restore we calculate them separately for each VCPU and base each calculation
> on a newly-read host TSC.
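
To check that I read the failure mode correctly, here is a toy model of it. This is not Xen code: the host_clock structure, host_read(), tsc_spread_after_migration() and the TSC_MODEL_MAIN macro are all hypothetical names, and the model assumes the guest TSC is simply the host TSC plus a per-VCPU offset. It suggests that the post-migration skew between VCPUs is the difference between the save-side and restore-side gaps between host reads, and that a single shared reading in each phase removes it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NR_VCPUS 2

/* Hypothetical stand-in for a host TSC: advances by `step` on every read. */
struct host_clock {
    uint64_t now;
    uint64_t step;
};

static uint64_t host_read(struct host_clock *h)
{
    h->now += h->step;
    return h->now;
}

/*
 * Model one save/restore cycle and return the spread (max - min) of the
 * guest TSC values seen across VCPUs afterwards. With use_snapshot set,
 * one host reading is shared by all VCPUs in each phase; otherwise every
 * VCPU uses a fresh host reading.
 */
static uint64_t tsc_spread_after_migration(int use_snapshot,
                                           uint64_t save_step,
                                           uint64_t restore_step)
{
    struct host_clock src = { 1000, save_step };
    struct host_clock dst = { 500000, restore_step };
    uint64_t saved[NR_VCPUS], offset[NR_VCPUS];
    uint64_t snap, t, g, lo = UINT64_MAX, hi = 0;
    int v;

    assert(save_step > 0 && restore_step > 0);

    /* Save: the guest starts synchronized (offset 0 on every VCPU). */
    snap = host_read(&src);
    for (v = 0; v < NR_VCPUS; v++)
        saved[v] = use_snapshot ? snap : host_read(&src);

    /* Restore: choose an offset so the guest TSC resumes at the saved value. */
    snap = host_read(&dst);
    for (v = 0; v < NR_VCPUS; v++) {
        t = use_snapshot ? snap : host_read(&dst);
        offset[v] = saved[v] - t;   /* arithmetic is modulo 2^64 */
    }

    /* Compare what each VCPU would report at one common host instant. */
    t = host_read(&dst);
    for (v = 0; v < NR_VCPUS; v++) {
        g = t + offset[v];
        if (g < lo)
            lo = g;
        if (g > hi)
            hi = g;
    }
    return hi - lo;
}

#ifdef TSC_MODEL_MAIN
int main(void)
{
    printf("per-VCPU reads: spread %llu cycles\n",
           (unsigned long long)tsc_spread_after_migration(0, 100, 30));
    printf("shared snapshot: spread %llu cycles\n",
           (unsigned long long)tsc_spread_after_migration(1, 100, 30));
    return 0;
}
#endif
```

Building with -DTSC_MODEL_MAIN prints both cases. In this model equal save and restore gaps happen to cancel out, so the skew on a real system would depend on how the two sides' timings differ.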
> 
> The problem can be easily demonstrated with this program for a 2-VCPU guest
> (I am assuming here invariant TSC so, for example,
> tsc_mode="always_emulate" (*)):
> 
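> #define _GNU_SOURCE   /* for sched_setaffinity() and the CPU_* macros */
> #include <sched.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <x86intrin.h>
> 
> /* Userspace stand-in for the kernel's __native_read_tsc(). */
> static inline unsigned long long __native_read_tsc(void)
> {
>   return __rdtsc();
> }
> 
> int
> main(int argc, char* argv[])
> {
> 
>   unsigned long long h = 0LL;   /* TSC value from the previous iteration */
>   int proc = 0;
>   cpu_set_t set;
> 
>   for(;;) {
>     unsigned long long n = __native_read_tsc();
>     /* Report any reading that went backwards relative to the last one. */
>     if(h && n < h)
>         printf("prev 0x%llx cur 0x%llx\n", h, n);
>     /* Bounce to the other VCPU before taking the next reading. */
>     CPU_ZERO(&set);
>     proc = (proc + 1) & 1;
>     CPU_SET(proc, &set);
>     if (sched_setaffinity(0, sizeof(cpu_set_t), &set)) {
>         perror("sched_setaffinity");
>         exit(1);
>     }
> 
>     h = n;
>   }
> }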
> 

What is the range of the backward drift reported by the above program?
Dozens of cycles? Hundreds of cycles?
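
One way to get that number would be to have the test record how far each reading goes backwards rather than only printing the raw values. The following is a rough sketch along those lines, not something I have run against an affected guest. The drift_stats structure, drift_record() and the DRIFT_PROBE_MAIN macro are names I made up for illustration, the iteration count is arbitrary, and the loop assumes an x86 Linux guest with two VCPUs.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for sched_setaffinity() and the CPU_* macros */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Running statistics on backward TSC jumps (hypothetical helper type). */
struct drift_stats {
    uint64_t events;  /* number of samples where cur < prev */
    uint64_t min;     /* smallest backward jump, in cycles */
    uint64_t max;     /* largest backward jump, in cycles */
};

/* Update the statistics if `cur` is behind `prev`; otherwise do nothing. */
static void drift_record(struct drift_stats *s, uint64_t prev, uint64_t cur)
{
    uint64_t d;

    assert(s != NULL);
    if (cur >= prev)
        return;
    d = prev - cur;
    if (s->events == 0 || d < s->min)
        s->min = d;
    if (d > s->max)
        s->max = d;
    s->events++;
}

#if defined(DRIFT_PROBE_MAIN) && (defined(__x86_64__) || defined(__i386__))
#include <sched.h>
#include <stdlib.h>
#include <x86intrin.h>

int main(void)
{
    struct drift_stats s = { 0, 0, 0 };
    uint64_t prev = 0, cur;
    cpu_set_t set;
    int proc = 0;
    long i;

    for (i = 0; i < 1000000; i++) {
        cur = __rdtsc();
        if (prev)
            drift_record(&s, prev, cur);
        /* Move to the other VCPU before the next reading. */
        CPU_ZERO(&set);
        proc = (proc + 1) & 1;
        CPU_SET(proc, &set);
        if (sched_setaffinity(0, sizeof(set), &set)) {
            perror("sched_setaffinity");
            exit(1);
        }
        prev = cur;
    }
    printf("backward jumps: %llu, min %llu cycles, max %llu cycles\n",
           (unsigned long long)s.events,
           (unsigned long long)s.min,
           (unsigned long long)s.max);
    return 0;
}
#endif
```

Built with -DDRIFT_PROBE_MAIN and run inside the guest after a migration, the final line would give the minimum and maximum backward jump directly.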

> 
> (*) Which brings up another observation: when we are in the default tsc_mode
> we start off with vtsc=0 and thus clear the TSC_Invariant bit in the guest's
> CPUID. After migration vtsc is 1 and the TSC_Invariant bit is set, so the
> guest may observe different values of CPUID. This technically reflects the
> fact that TSC became "safe", but I think it may be problematic for some guests.
> 
> 
> Boris Ostrovsky (2):
>   libxl: Set guest parameters from config file during a restore
>   x86/HVM: Use fixed TSC value when saving or restoring domain
> 
>  tools/libxl/libxl_dom.c       |   51 +++++++++++++++++++++++++----------------
>  xen/arch/x86/hvm/hvm.c        |   18 ++++++++++-----
>  xen/arch/x86/hvm/save.c       |   36 +++++++++++++++++++++--------
>  xen/arch/x86/hvm/svm/svm.c    |    4 ++--
>  xen/arch/x86/hvm/vmx/vmx.c    |    4 ++--
>  xen/arch/x86/hvm/vpt.c        |   16 ++++++++-----
>  xen/arch/x86/time.c           |    7 ++++--
>  xen/common/hvm/save.c         |    5 ++++
>  xen/include/asm-x86/domain.h  |    2 ++
>  xen/include/asm-x86/hvm/hvm.h |    9 +++++---
>  xen/include/xen/hvm/save.h    |    2 ++
>  xen/include/xen/time.h        |    3 ++-
>  12 files changed, 105 insertions(+), 52 deletions(-)
> 
> --
> 1.7.10.4


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

