Re: [PATCH v3 01/11] xen/manage: keep track of the on-going suspend mode
On Wed, May 26, 2021 at 02:29:53PM -0400, Boris Ostrovsky wrote:
> CAUTION: This email originated from outside of the organization. Do not click
> links or open attachments unless you can confirm the sender and know the
> content is safe.
>
>
>
> On 5/26/21 12:40 AM, Anchal Agarwal wrote:
> > On Tue, May 25, 2021 at 06:23:35PM -0400, Boris Ostrovsky wrote:
> >>
> >> On 5/21/21 1:26 AM, Anchal Agarwal wrote:
> >>>>> What I meant there wrt VCPU info was that VCPU info is not unregistered
> >>>>> during hibernation,
> >>>>> so Xen still remembers the old physical addresses for the VCPU
> >>>>> information, created by the
> >>>>> booting kernel. But since the hibernation kernel may have different
> >>>>> physical
> >>>>> addresses for VCPU info and if mismatch happens, it may cause issues
> >>>>> with resume.
> >>>>> During hibernation, the VCPU info register hypercall is not invoked
> >>>>> again.
> >>>> I still don't think that's the cause but it's certainly worth having a
> >>>> look.
> >>>>
> >>> Hi Boris,
> >>> Apologies for picking this up after last year.
> >>> I did a deep dive on the above statement, and that is indeed what is
> >>> happening.
> >>> I did some debugging around KASLR and hibernation using reboot mode.
> >>> I observed in my debug prints that whenever vcpu_info* address for
> >>> secondary vcpu assigned
> >>> in xen_vcpu_setup at boot is different than what is in the image, resume
> >>> gets stuck for that vcpu
> >>> in bringup_cpu(). That means we have different addresses for
> >>> &per_cpu(xen_vcpu_info, cpu) at boot and after
> >>> control jumps into the image.
> >>>
> >>> I failed to get any prints after it got stuck in bringup_cpu() and
> >>> I do not have an option to send a sysrq signal to the guest or rather get
> >>> a kdump.
> >>
> >> xenctx and xen-hvmctx might be helpful.
> >>
> >>
> >>> This change is not observed in every hibernate-resume cycle. I am not
> >>> sure if this is a bug or an
> >>> expected behavior.
> >>> Also, I am contemplating the idea that it may be a bug in xen code
> >>> getting triggered only when
> >>> KASLR is enabled but I do not have substantial data to prove that.
> >>> Is this a coincidence that this always happens for 1st vcpu?
> >>> Moreover, since the hypervisor is not aware that the guest is hibernated
> >>> and it looks like a regular shutdown to dom0 during reboot mode,
> >>> would re-registering vcpu_info for secondary vcpus even be plausible?
> >>
> >> I think I am missing how this is supposed to work (maybe we've talked
> >> about this but it's been many months since then). You hibernate the guest
> >> and it writes the state to swap. The guest is then shut down? And what's
> >> next? How do you wake it up?
> >>
> >>
> >> -boris
> >>
> > To resume a guest, the guest boots up as a fresh guest and then
> > software_resume() is called, which, if it finds a stored hibernation
> > image, quiesces the devices and loads the memory contents from the image.
> > Control then transfers to the targeted kernel.
> > This further disables the non-boot cpus, and the syscore_suspend/resume
> > callbacks are invoked, which set up the shared_info, pvclock, grant
> > tables etc. Since the vcpu_info pointer for each non-boot cpu is already
> > registered, the hypercall does not happen again when bringing up the
> > non-boot cpus. This leads to inconsistencies, as pointed out earlier,
> > when KASLR is enabled.
>
>
> I'd think the 'if' condition in the code fragment below should always fail
> since hypervisor is creating new guest, resulting in the hypercall. Just like
> in the case of save/restore.
>
That only fails during boot, but not after control jumps into the image. The
non-boot cpus are brought offline (freeze_secondary_cpus) and then online via
the cpu hotplug path. In that case xen_vcpu_setup doesn't invoke the hypercall
again.
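
To illustrate the point, here is a minimal user-space sketch of that early
return. It is not kernel code: struct kernel_state, struct hypervisor_state
and the model_* helpers are hypothetical stand-ins for the per-cpu variables
and for what Xen remembers, and the two kernel_state objects simply stand for
the booting kernel and the restored image living at different addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8

/* Hypothetical stand-in for the real vcpu_info structure. */
struct vcpu_info { int dummy; };

/* Hypothetical model of one kernel's per-cpu state. */
struct kernel_state {
	struct vcpu_info  xen_vcpu_info[NR_CPUS]; /* per-cpu storage */
	struct vcpu_info *xen_vcpu[NR_CPUS];      /* per-cpu pointer */
};

/* Hypothetical model of what the hypervisor remembers per vcpu. */
struct hypervisor_state {
	const struct vcpu_info *registered[NR_CPUS];
};

/*
 * Models the HVM check quoted in this thread. Returns true if the
 * (modelled) registration hypercall was issued, false on early return.
 */
static bool model_vcpu_setup(struct kernel_state *k,
			     struct hypervisor_state *xen, unsigned int cpu)
{
	assert(cpu < NR_CPUS);

	/* Guest-side state says "already registered": skip the hypercall. */
	if (k->xen_vcpu[cpu] == &k->xen_vcpu_info[cpu])
		return false;

	xen->registered[cpu] = &k->xen_vcpu_info[cpu];
	k->xen_vcpu[cpu] = &k->xen_vcpu_info[cpu];
	return true;
}

/* True when guest and hypervisor agree on where vcpu_info lives. */
static bool model_consistent(const struct kernel_state *k,
			     const struct hypervisor_state *xen,
			     unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return xen->registered[cpu] == k->xen_vcpu[cpu];
}
```

In this model, calling model_vcpu_setup() on a zeroed "boot" state registers
the boot kernel's address. If an "image" state then arrives with its pointer
already set to its own xen_vcpu_info, model_vcpu_setup() returns early and
model_consistent() reports a mismatch, which is the situation I believe I am
seeing.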
>
> Do you call xen_vcpu_info_reset() on resume? That will re-initialize
> per_cpu(xen_vcpu). Maybe you need to add this to xen_syscore_resume().
>
Yes, coincidentally I did. The registration of vcpu_info then fails with error
-22 (-EINVAL), basically because nobody unregistered them and Xen does not know
that the guest hibernated in the first place.
Moreover, syscore_resume is also called on the hibernation path, i.e. after the
image is created. Everything is resumed and thawed before the image is finally
written out and the machine shuts down. So syscore_resume can only invoke
xen_vcpu_info_reset when it is actually resuming from the image. Luckily, I was
able to use the in_suspend variable to detect that.
Another line of thought is to do what kexec does to get around this problem:
abuse soft_reset and issue it during syscore_resume, or maybe before the image
gets loaded.
I haven't experimented with that yet, as I am assuming there has to be a way to
re-register vcpus during resume.
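
For clarity, a minimal user-space sketch of the in_suspend gating described
above. This is not the actual patch: struct model_guest and
model_xen_syscore_resume() are hypothetical, and the sketch assumes in_suspend
is nonzero while the image is being created and zero once control has jumped
into the restored image.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the guest state relevant to the resume decision. */
struct model_guest {
	int in_suspend;               /* assumed: nonzero while creating the image */
	unsigned int vcpu_info_resets; /* counts modelled xen_vcpu_info_reset calls */
};

/*
 * Models a syscore resume callback that runs on both paths:
 *  - right after the image is created (guest is about to write it and shut down)
 *  - after the image has been restored in a freshly booted guest
 * Only the second case should reset the vcpu_info state.
 */
static void model_xen_syscore_resume(struct model_guest *g)
{
	assert(g != NULL);

	/* Image just created: still the original guest, nothing to reset. */
	if (g->in_suspend)
		return;

	/* Resuming from the image: reset so vcpu_info can be set up again. */
	g->vcpu_info_resets++;
}
```

With in_suspend set, the modelled callback leaves the counter untouched; with
in_suspend cleared, it performs exactly one reset per call.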
Thanks,
Anchal
>
> -boris
>
>
> >
> > Thanks,
> > Anchal
> >>
> >>> I could definitely use some advice to debug this further.
> >>>
> >>>
> >>> Some printk's from my debugging:
> >>>
> >>> At Boot:
> >>>
> >>> xen_vcpu_setup: xen_have_vcpu_info_placement=1 cpu=1,
> >>> vcpup=0xffff9e548fa560e0, info.mfn=3996246 info.offset=224,
> >>>
> >>> Image Loads:
> >>> It ends up in the condition:
> >>> xen_vcpu_setup()
> >>> {
> >>> ...
> >>> if (xen_hvm_domain()) {
> >>> if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
> >>> return 0;
> >>> }
> >>> ...
> >>> }
> >>>
> >>> xen_vcpu_setup: checking mfn on resume cpu=1, info.mfn=3934806
> >>> info.offset=224, &per_cpu(xen_vcpu_info, cpu)=0xffff9d7240a560e0
> >>>
> >>> This is tested on c4.2xlarge [8vcpu 15GB mem] instance with 5.10 kernel
> >>> running
> >>> in the guest.
> >>>
> >>> Thanks,
> >>> Anchal.
> >>>> -boris
> >>>>
> >>>>