Re: [Xen-devel] PoD code killing domain before it really gets started
>>>> On 06.08.12 at 18:03, George Dunlap <George.Dunlap@xxxxxxxxxxxxx> wrote:
>> I guess there are two problems with that:
>> * As you've seen, apparently dom0 may access these pages before any
>> faults happen.
>> * If it happens that reclaim_single is below the only zeroed page, the
>> guest will crash even when there is reclaim-able memory available.
>>
>> Two ways we could fix this:
>> 1. Remove dom0 accesses (what on earth could be looking at a
>> not-yet-created VM?)
>
> I'm told it's a monitoring daemon, and yes, they are intending to
> adjust it to first query the GFN's type (and don't do the access
> when it's not populated, yet). But wait, I didn't check the code
> when I recommended this - XEN_DOMCTL_getpageframeinfo{2,3)
> also call get_page_from_gfn() with P2M_ALLOC, so would also
> trigger the PoD code (in -unstable at least) - Tim, was that really
> a correct adjustment in 25355:974ad81bb68b? It looks to be a
> 1:1 translation, but is that really necessary? If one wanted to
> find out whether a page is PoD to avoid getting it populated,
> how would that be done from outside the hypervisor? Would
> we need XEN_DOMCTL_getpageframeinfo4 for this?
>
>> 2. Allocate the PoD cache before populating the p2m table
>> 3. Make it so that some accesses fail w/o crashing the guest? I don't
>> see how that's really practical.
>
> What's wrong with telling control tools that a certain page is
> unpopulated (from which they will be able to imply that's it's all
> clear from the guest's pov)? Even outside of the current problem,
> I would think that's more efficient than allocating the page. Of
> course, the control tools need to be able to cope with that. And
> it may also be necessary to distinguish between read and
> read/write mappings being established (and for r/w ones the
> option of populating at access time rather than at creation time
> would need to be explored).

I wouldn't be opposed to some form of getpageframeinfo4.
It's not just PoD we are talking about here. Is the page paged out? Is
the page shared?

Right now we have global per-domain queries (domaininfo). Or individual
gfn debug memctl's. A batched interface with richer information would be
a blessing for debugging or diagnosis purposes.

The first order of business is exposing the type. Do we really want to
expose the whole range of p2m_* types or just "really useful" ones like
is_shared, is_pod, is_paged, is_normal? An argument for the former is
that the mem event interface already pumps the p2m_* type up the stack.

The other useful bit of information I can think of is exposing the
shared ref count.

My two cents
Andres

>
>> 4. Change the sweep routine so that the lower 2MiB gets swept
>>
>> #2 would require us to use all PoD entries when building the p2m
>> table, thus addressing the mail you mentioned from 25 July*. Given
>> that you don't want #1, it seems like #2 is the best option.
>>
>> No matter what we do, the sweep routine for 4.2 should be re-written
>> to search all of memory at least once (maybe with a timeout for
>> watchdogs), since it's only called in an actual emergency.
>>
>> Let me take a look...
>
> Thanks!
>
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
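
As a rough illustration of the batched, richer per-gfn query discussed in this message, here is a minimal sketch in C. Nothing in it is an existing Xen interface: the sketch_* names, the GFNINFO_* flags, the struct layout and the callback are all hypothetical, and the demo lookup is a toy table standing in for a p2m query that reports a type without populating the page. It only shows the "really useful ones" variant (is_normal, is_pod, is_paged, is_shared) plus a shared ref count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a few p2m_* types; not Xen's real enum. */
enum sketch_p2m_type {
    SKETCH_P2M_RAM_RW,
    SKETCH_P2M_POD,
    SKETCH_P2M_PAGED,
    SKETCH_P2M_SHARED,
    SKETCH_P2M_OTHER
};

/* Hypothetical coarse flags a reduced interface might report. */
#define GFNINFO_NORMAL (1u << 0)
#define GFNINFO_POD    (1u << 1)
#define GFNINFO_PAGED  (1u << 2)
#define GFNINFO_SHARED (1u << 3)

/* One batch entry: caller fills gfn, the query fills the rest. */
struct sketch_gfn_info {
    uint64_t gfn;
    uint32_t flags;
    uint32_t shared_refs;   /* meaningful only when GFNINFO_SHARED is set */
};

/* Assumed lookup that reports a type WITHOUT populating the page. */
typedef enum sketch_p2m_type (*sketch_query_fn)(uint64_t gfn,
                                                uint32_t *shared_refs);

static uint32_t sketch_type_to_flags(enum sketch_p2m_type t)
{
    switch (t) {
    case SKETCH_P2M_RAM_RW: return GFNINFO_NORMAL;
    case SKETCH_P2M_POD:    return GFNINFO_POD;
    case SKETCH_P2M_PAGED:  return GFNINFO_PAGED;
    case SKETCH_P2M_SHARED: return GFNINFO_SHARED;
    default:                return 0;   /* type not exposed by this sketch */
    }
}

/* Fill flags for every entry; return how many entries are still PoD. */
static size_t sketch_getpageframeinfo4(sketch_query_fn query,
                                       struct sketch_gfn_info *batch,
                                       size_t count)
{
    size_t i, pod = 0;

    assert(query != NULL && (batch != NULL || count == 0));
    for (i = 0; i < count; i++) {
        uint32_t refs = 0;
        enum sketch_p2m_type t = query(batch[i].gfn, &refs);

        batch[i].flags = sketch_type_to_flags(t);
        batch[i].shared_refs = (t == SKETCH_P2M_SHARED) ? refs : 0;
        if (t == SKETCH_P2M_POD)
            pod++;
    }
    return pod;
}

/* Toy lookup for demonstration: gfn 0..3 map to the four exposed types. */
static enum sketch_p2m_type sketch_demo_query(uint64_t gfn,
                                              uint32_t *shared_refs)
{
    switch (gfn) {
    case 0: return SKETCH_P2M_RAM_RW;
    case 1: return SKETCH_P2M_POD;
    case 2: return SKETCH_P2M_PAGED;
    case 3: *shared_refs = 2; return SKETCH_P2M_SHARED;
    default: return SKETCH_P2M_OTHER;
    }
}
```

A monitoring tool along these lines would skip any entry with GFNINFO_POD set instead of mapping it, which is the behaviour Jan describes for the daemon. Exposing the full p2m_* range instead would simply mean returning the raw type and dropping the flag mapping.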