Re: [PATCH v5 04/34] KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration
On Mon, 2026-06-15 at 23:47 -0700, Dongli Zhang wrote:
> I tested patches 02, 03, 04, and 26 by customizing QEMU to support kexec live
> updates (LUO and KHO), preserving the memfd across kexec.

Thank you.

> For my use case, I used KVM_[GS]ET_CLOCK_GUEST instead of the existing
> KVM_[GS]ET_CLOCK. I didn't account the downtime in my QEMU code, although host
> TSC never resets across kexec.
>
> Clock drift was zero, and I did not observe any unnecessary master clock
> updates after KVM_SET_CLOCK_GUEST completed.

The kvmclock drift won't have been *zero*; it will have been a
nanosecond or two. Which most people won't notice, but is annoying me.

I believe it comes from both pvclock_update_vm_gtod_copy() and
kvm_vcpu_ioctl_set_clock_guest() rounding *down*. I think we should
tweak the latter to round *up* so they're at least not biasing in the
same direction. We could also do better at picking a snapshot cycle
count which *doesn't* lose in the rounding.

But those are definitely improvements for another day; this series is
long and complex enough and has already gained a dependency on fixes in
core timekeeping snapshots.

> Another interesting observation from my experiments is that tsc_khz changes
> across kexec. Since the TSC value itself does not reset across kexec, I'm
> wondering whether there is any reason to switch to the new tsc_khz value after
> the kexec.

This is the host timekeeping, yes? We really ought to pass over *all*
the NTP synchronization data across KHO, not just the frequency.
There's no excuse for the new kernel not reporting *precisely* the same
time that the old kernel would, for a given TSC reading.

The work I've been doing at
https://git.infradead.org/?p=users/dwmw2/linux.git;a=shortlog;h=refs/heads/ffclock
lays the groundwork for exporting and importing the full reference
data, and maybe I should use KHO as the example use case while we
continue to bikeshed the userspace and vmclock parts.
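As a rough illustration of the rounding bias described earlier in this
message, here is a simplified model of moving a pvclock-style reference
point. It is a sketch only and is not the kernel code: struct clock_ref,
clock_read() and rebase() are hypothetical names, the pvclock shift is
assumed to be zero, and unsigned __int128 is a GCC/Clang extension. Each
rebase that rounds the elapsed nanoseconds down can make later readings
lag by up to 1ns; rounding up errs by up to 1ns in the other direction,
so mixing the two does not accumulate in one direction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified reference point (not pvclock_vcpu_time_info). */
struct clock_ref {
	uint64_t tsc_timestamp;	/* TSC value at the reference point */
	uint64_t system_time;	/* clock value in ns at the reference point */
	uint32_t mul_frac;	/* ns per cycle as a 0.32 fixed-point fraction */
};

/* floor(delta * mul_frac / 2^32) */
static uint64_t scale_floor(uint64_t delta, uint32_t mul_frac)
{
	return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
}

/* ceil(delta * mul_frac / 2^32) */
static uint64_t scale_ceil(uint64_t delta, uint32_t mul_frac)
{
	unsigned __int128 product = (unsigned __int128)delta * mul_frac;

	return (uint64_t)((product + 0xffffffffu) >> 32);
}

/* Read the clock for a given TSC value; the read itself rounds down. */
static uint64_t clock_read(const struct clock_ref *ref, uint64_t tsc)
{
	assert(tsc >= ref->tsc_timestamp);
	return ref->system_time +
	       scale_floor(tsc - ref->tsc_timestamp, ref->mul_frac);
}

/* Move the reference point to new_tsc, rounding the elapsed ns up or down. */
static struct clock_ref rebase(const struct clock_ref *ref, uint64_t new_tsc,
			       int round_up)
{
	uint64_t delta = new_tsc - ref->tsc_timestamp;
	struct clock_ref out = *ref;

	out.tsc_timestamp = new_tsc;
	out.system_time = ref->system_time +
			  (round_up ? scale_ceil(delta, ref->mul_frac)
				    : scale_floor(delta, ref->mul_frac));
	return out;
}
```

With mul_frac = 1431655765 (about a third of a nanosecond per cycle) and
a reference of system_time 5000 at TSC 1000, reading at TSC 1004 gives
5001. After a round-down rebase at TSC 1001 the same read gives 5000;
after a round-up rebase it gives 5001.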
> While live migration involves two different machines, kexec is performed on
> the same machine. Given that the TSC value itself is preserved across kexec,
> would it make sense to reuse the pre-kexec tsc_khz value instead of using the
> new tsc_khz after kexec?
>
> I tested this by using LUO to preserve tsc_khz across kexec, and the results
> looked good.

Of course, what we should really be doing is exporting the timekeeping
reference to see what frequency the source host TSC is *actually*
running at, at the time of migration. That gives us a function of guest
TSC to TAI.

Then we can restore the TSC on the destination host as if it has been
running at precisely that frequency during the migration.

The TSC might be at a slightly different frequency on the new host, but
we provide vmclock and the guest can clamp its timekeeping to that
fairly much immediately (see qemu patch I've been posting with the
ffclock/timekeeping series).
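The restore step described above can be sketched as simple arithmetic.
This is a minimal illustration under stated assumptions, not the
interface from the series: struct guest_tsc_ref and both helpers are
hypothetical, the frequency is taken as a whole number of Hz, and
unsigned __int128 is a GCC/Clang extension. Given the guest TSC, the TAI
time and the measured TSC frequency captured on the source, the
destination computes what the guest TSC should read at its own current
TAI time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot exported by the source host at migration time. */
struct guest_tsc_ref {
	uint64_t guest_tsc;	/* guest TSC at the reference instant */
	uint64_t tai_ns;	/* TAI in ns at the same instant */
	uint64_t tsc_hz;	/* measured guest TSC frequency in Hz */
};

/* Guest TSC value to restore for a given TAI time on the destination. */
static uint64_t guest_tsc_at(const struct guest_tsc_ref *ref,
			     uint64_t tai_now_ns)
{
	assert(tai_now_ns >= ref->tai_ns);
	unsigned __int128 elapsed_ns = tai_now_ns - ref->tai_ns;

	/* cycles = elapsed_ns * Hz / 1e9, truncated toward zero */
	return ref->guest_tsc +
	       (uint64_t)(elapsed_ns * ref->tsc_hz / 1000000000u);
}

/* Inverse mapping: the TAI time that corresponds to a guest TSC value. */
static uint64_t tai_at(const struct guest_tsc_ref *ref, uint64_t guest_tsc)
{
	assert(guest_tsc >= ref->guest_tsc);
	unsigned __int128 cycles = guest_tsc - ref->guest_tsc;

	return ref->tai_ns + (uint64_t)(cycles * 1000000000u / ref->tsc_hz);
}
```

Because the result depends only on the exported reference and the
destination's TAI, the downtime is accounted for without measuring it
separately. Both divisions truncate, so this sketch carries the same
kind of sub-nanosecond and sub-cycle rounding loss discussed earlier in
the thread.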