RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@xxxxxxxxxxxx]
> Sent: 13 April 2007 17:56
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] A different probklem with
> save/restore on C/S 14823.
> On 13/4/07 17:47, "Petersson, Mats" <Mats.Petersson@xxxxxxx> wrote:
> > See my other reply, although you may have a point about mapping - my
> > guest is running with the HVMloader's map, which probably maps all memory
> > available to guest linearly, including address zero (as that's where
> > real-mode puts the interrupt vector table, which can be useful to have
> > mapped - just a little bit ;-) ).
> > So maybe we need an earlier/different test to kill guest? Or do you
> > think this is such a critical error that hypervisor should die?
> The NULL dereference is inside the hypervisor in hvm_do_resume(). At that
> point you are running in Xen's address space, not the guest's. And Xen
> should have no mapping at address zero.
Yes, of course - me not thinking right - sorry [it is late-ish on a
Friday, that's my excuse and I'm sticking to it].
> The issue here is that shared_page_va is not initialised, so it contains 0.
> hvm_do_resume() should be getting a pointer derived from this value via
> get_vio(). When it dereferences it, Xen should crash. That didn't happen for
> you and that is scarily inexplicable.
Yes, I follow that.
However, my guest does A LOT of IOIO exits (it's an IDE test-app), with
some HLT and IRQ exits thrown in for good measure. So if the guest is
doing an IOIO exit, would it end up in platform.c:844 before it gets to
hvm_do_resume()? Or are you saying that we should crash as soon as the
guest restarts, because that's done through hvm_do_resume()?
> I suggest adding some tracing to hvm_do_resume() to find out whether it is
> being called at all and, if it is, what value it sets its local variable 'p'
> to. Also what value is in v->domain->arch.hvm_domain.shared_page_va.
Would a check for zero in get_vio() with domain_crash_synchronous() be a
"good thing" here, or is that too time-consuming in a relatively
time-critical path of HVM?
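
To make the question concrete, here is a rough, standalone sketch of the kind of check I mean. It is NOT the real Xen code: the struct names, get_vio_checked() and crash_domain_model() are all made up for illustration, and only shared_page_va comes from the discussion above. In the hypervisor the crash helper would be domain_crash_synchronous() rather than a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the Xen structures. Only the field name
 * shared_page_va is taken from the thread; everything else is invented. */
struct ioreq_model { uint64_t data; };
struct shared_page_model { struct ioreq_model vcpu_ioreq[4]; };
struct hvm_domain_model { uintptr_t shared_page_va; };
struct vcpu_model { int vcpu_id; struct hvm_domain_model *hvm; };

/* Set when the model "crashes" the domain, so the effect can be observed. */
static int domain_crashed = 0;

/* Stand-in for domain_crash_synchronous(): record the crash and log it. */
static void crash_domain_model(const struct vcpu_model *v)
{
    fprintf(stderr, "vcpu %d: shared_page_va is 0, crashing domain\n",
            v->vcpu_id);
    domain_crashed = 1;
}

/* get_vio()-style lookup with the zero check added. Returns NULL after
 * crashing the domain if the shared page was never set up. */
static struct ioreq_model *get_vio_checked(struct vcpu_model *v)
{
    assert(v != NULL && v->hvm != NULL);

    uintptr_t va = v->hvm->shared_page_va;
    if (va == 0) {
        crash_domain_model(v);
        return NULL;
    }
    return &((struct shared_page_model *)va)->vcpu_ioreq[v->vcpu_id];
}
```

In this sketch the extra work on the normal path is a single compare and a branch that is not taken. Whether that is acceptable in the real path is exactly what I'm asking.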
I will look at it on Monday (before I update to the new version, just to
make sure I can reproduce it still ;-) ).
> The bugs that cause this condition should all be fixed in xen-unstable
> staging tip, by the way. I just think this situation should be investigated
> before you upgrade in case you've uncovered another latent bug. Because you
> really should be crashing in hvm_do_resume() in this scenario.
> -- Keir