
Re: [Xen-devel] PoD code killing domain before it really gets started



>>> On 06.08.12 at 18:03, George Dunlap <George.Dunlap@xxxxxxxxxxxxx> wrote:
> I guess there are two problems with that:
> * As you've seen, apparently dom0 may access these pages before any
> faults happen.
> * If it happens that reclaim_single is below the only zeroed page, the
> guest will crash even when there is reclaim-able memory available.
> 
> Two ways we could fix this:
> 1. Remove dom0 accesses (what on earth could be looking at a
> not-yet-created VM?)

I'm told it's a monitoring daemon, and yes, they intend to adjust
it to first query the GFN's type (and to skip the access when the
page is not yet populated). But wait, I didn't check the code
when I recommended this - XEN_DOMCTL_getpageframeinfo{2,3}
also call get_page_from_gfn() with P2M_ALLOC, so would also
trigger the PoD code (in -unstable at least) - Tim, was that really
a correct adjustment in 25355:974ad81bb68b? It looks to be a
1:1 translation, but is that really necessary? If one wanted to
find out whether a page is PoD to avoid getting it populated,
how would that be done from outside the hypervisor? Would
we need XEN_DOMCTL_getpageframeinfo4 for this?
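
To illustrate the distinction being asked about, here is a toy model (not
Xen code) of a lookup that can either just report a GFN's type or populate
it on demand. All names here (toy_p2m, toy_lookup, TOY_QUERY, TOY_ALLOC) are
made up for the sketch; the assumption is only that a query-style lookup
leaves a PoD entry alone while an alloc-style lookup draws on the PoD cache
and kills the domain when the cache is empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page types for a toy p2m; not Xen's real type enum. */
enum toy_type { TOY_RAM, TOY_POD, TOY_INVALID };

/* Hypothetical lookup modes: QUERY never populates, ALLOC may. */
enum toy_query { TOY_QUERY, TOY_ALLOC };

#define TOY_NR_GFNS 8

struct toy_p2m {
    enum toy_type type[TOY_NR_GFNS];
    unsigned int pod_cache;   /* pages left in the toy PoD cache */
    bool crashed;             /* set when a populate attempt finds no page */
};

/* Start with every GFN marked PoD and the given number of cached pages. */
static void toy_p2m_init(struct toy_p2m *p2m, unsigned int cache_pages)
{
    assert(p2m != NULL);
    for (size_t i = 0; i < TOY_NR_GFNS; i++)
        p2m->type[i] = TOY_POD;
    p2m->pod_cache = cache_pages;
    p2m->crashed = false;
}

/*
 * Return the type of a GFN.  With TOY_ALLOC a PoD entry is populated
 * from the cache first; with TOY_QUERY the entry is left untouched.
 */
static enum toy_type toy_lookup(struct toy_p2m *p2m, size_t gfn,
                                enum toy_query q)
{
    assert(p2m != NULL);
    if (gfn >= TOY_NR_GFNS)
        return TOY_INVALID;

    if (p2m->type[gfn] == TOY_POD && q == TOY_ALLOC) {
        if (p2m->pod_cache == 0) {
            /* Nothing to populate with: the toy domain "crashes". */
            p2m->crashed = true;
            return TOY_INVALID;
        }
        p2m->pod_cache--;
        p2m->type[gfn] = TOY_RAM;
    }
    return p2m->type[gfn];
}
```

With an empty cache, toy_lookup(&p2m, gfn, TOY_QUERY) reports TOY_POD and
has no side effects, whereas the same call with TOY_ALLOC marks the toy
domain as crashed. A monitoring tool would want the former behaviour.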

> 2. Allocate the PoD cache before populating the p2m table
> 3. Make it so that some accesses fail w/o crashing the guest?  I don't
> see how that's really practical.

What's wrong with telling control tools that a certain page is
unpopulated (from which they will be able to infer that it's all
clear from the guest's pov)? Even outside of the current problem,
I would think that's more efficient than allocating the page. Of
course, the control tools need to be able to cope with that. And
it may also be necessary to distinguish between read and
read/write mappings being established (and for r/w ones the
option of populating at access time rather than at creation time
would need to be explored).
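
A rough sketch of the policy described above, again as a toy model rather
than anything in the tree: a read-only mapping of an unpopulated page is
answered with "unpopulated" and allocates nothing, while a read/write
mapping has to be backed by a real page. The names (toy_map_foreign,
MAP_UNPOPULATED, etc.) are hypothetical, and for simplicity the sketch
populates r/w mappings at creation time; populating at access time is the
alternative that would still need exploring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical access modes requested by a control tool. */
enum map_access { MAP_RO, MAP_RW };

/* Hypothetical results of a mapping request. */
enum map_result { MAP_OK, MAP_UNPOPULATED, MAP_FAILED };

struct toy_page {
    bool populated;
};

/*
 * Decide how to satisfy a mapping request for one guest page.
 * free_pages models the memory available for populating on demand.
 */
static enum map_result toy_map_foreign(struct toy_page *pg,
                                       enum map_access acc,
                                       unsigned int *free_pages)
{
    assert(pg != NULL && free_pages != NULL);

    if (pg->populated)
        return MAP_OK;

    /* Read-only: tell the tool the page is unpopulated, allocate nothing. */
    if (acc == MAP_RO)
        return MAP_UNPOPULATED;

    /* Read/write: the page must exist (assumption: populate right away). */
    if (*free_pages == 0)
        return MAP_FAILED;
    (*free_pages)--;
    pg->populated = true;
    return MAP_OK;
}
```

A tool receiving MAP_UNPOPULATED can treat the page as all clear from the
guest's point of view without any memory having been consumed.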

> 4. Change the sweep routine so that the lower 2MiB gets swept
> 
> #2 would require us to use all PoD entries when building the p2m
> table, thus addressing the mail you mentioned from 25 July*.  Given
> that you don't want #1, it seems like #2 is the best option.
> 
> No matter what we do, the sweep routine for 4.2 should be re-written
> to search all of memory at least once (maybe with a timeout for
> watchdogs), since it's only called in an actual emergency.
> 
> Let me take a look...

Thanks!

Jan


