Re: [Xen-devel] LRU list of domids
We had this irc conversation about this. c&p here for our (mostly my) reference.

11:03 <xadimgnik> Diziet: I wasn't really considering keeping 2^15 history... I was thinking much shorter... after all there's nothing to actually prevent re-creation of exactly the same domid now (e.g. create, followed by 2^15 - 1 create/destroy, destroy the first one and then create)
11:04 <xadimgnik> Diziet: so keeping history is actually a new thing... so I was thinking something like a 64 entry list
11:05 <Diziet> Is there a problem with keeping 2^15 history ? My personal approach is normally to solve a problem Properly when I get my teeth into it...
11:05 <Diziet> I agree that the existing situation is as you describe but I think it is not good.
11:05 <xadimgnik> well I still don't know what we're trying to actually prevent other than a fairly immediate re-creation of a domid
11:06 <xadimgnik> so 2^15 seems like massive overkill
11:06 * Diziet thinks.
11:07 <xadimgnik> and it will make searching valid domid space when using a randomly generated id almost impossible eventually
11:08 <Diziet> The (usually latent) bug I am seeing with the current setup is that a domid might be reused while someone else still has a reference to it.
11:08 <Diziet> I'm not sure how we would demonstrate that it was actually safe to reuse a domid. Certainly as you say "fairly immediate" is obviously hazardous.
11:08 <xadimgnik> well a zombie domain will prevent re-use of that id
11:09 <Diziet> I guess we have to hope that everyone who has domids in places that would matter has to listen for @releaseDomain and if we give them all "enough time to act on that" then we are safe ?
11:09 <Diziet> xadimgnik: zombie> Indeed, but I think we have to consider actually destroyed domains too.
11:10 <xadimgnik> yes, I think prevention of immediate re-use is prudent but I'd guess a default history of e.g. 64 would probably be ok
11:10 <Diziet> So I think I am halfway to constructing an argument that a shorter recent list is OK if the list's length is longer than "the number of domains we might create in enough time for everyone to act on earlier domain destructions" ?
11:11 <xadimgnik> ok, cool... so we just need to agree on that length
11:11 <Diziet> I'm not sure how we would assess the right number. In a very disaggregated system you might well create a dozen or so domains per guest and you might have orchestration software which creates bunches of guests all at once.
11:12 <xadimgnik> one option would be to leave the /libxl/<domid> path around in xenstore containing a timestamp of when the domain was destroyed
11:12 <Diziet> As for your argument about random selection becoming impossible, I was imagining you would do something like "pick randomly from the least recent 25% of the LRU list" (assuming that the list was a complete list of to-be-allocated-dynamically domids).
11:13 <Diziet> xadimgnik: And have some kind of garbage collection process which when we want to write an entry discards the "too old to care" entries ?
11:14 <xadimgnik> yes, destruction of a domain would purge 'old' entries before writing its own
11:14 <Diziet> Your timestamp idea has the virtue of having a comprehensible tuning parameter (the expiry delay) which can be safely set to a conservative value. It also means that if you run out of "safe" domids because you have a rabbit domain, or something, you get an error rather than unsafety.
11:15 <xadimgnik> yes
11:15 <Diziet> I still think some people will want to be able to reserve areas of the domid space for particular purposes, or particular hosts, or something.
11:16 <Diziet> But maybe just confining the automatically allocated domids to a [min,max> range would be sufficient.
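
For reference, the timestamp scheme discussed above (purge "too old to care" entries on destruction, allocate randomly within a confined range, fail with an error when no safe domid is left) could be modelled roughly as follows. This is only a sketch in Python, not libxl code: the class name, method names, in-memory dict and the explicit `now` argument are all assumptions made for illustration.

```python
import random


class DomidRetirementList:
    """Hypothetical model of a recently-destroyed-domid list; not a libxl API."""

    def __init__(self, expiry_seconds, domid_min, domid_max):
        self.expiry_seconds = expiry_seconds  # the "reuse delay" tuning parameter
        self.domid_min = domid_min            # inclusive lower bound
        self.domid_max = domid_max            # exclusive upper bound
        self.retired = {}                     # domid -> destruction timestamp

    def purge(self, now):
        """Discard entries whose reuse delay has already elapsed."""
        self.retired = {d: t for d, t in self.retired.items()
                        if now - t < self.expiry_seconds}

    def record_destroy(self, domid, now):
        """Destruction purges old entries before writing its own."""
        self.purge(now)
        self.retired[domid] = now

    def is_retired(self, domid, now):
        """True while the domid is still inside its reuse delay."""
        t = self.retired.get(domid)
        return t is not None and now - t < self.expiry_seconds

    def allocate(self, in_use, now, rng=random):
        """Pick a random domid in [min, max) that is neither live nor retired."""
        candidates = [d for d in range(self.domid_min, self.domid_max)
                      if d not in in_use and not self.is_retired(d, now)]
        if not candidates:
            # Running out of safe domids is an error rather than unsafe reuse.
            raise RuntimeError("no safe domid available")
        return rng.choice(candidates)
```

A caller would pass something like `time.time()` as `now`. Reserving part of the domid space then amounts to choosing `domid_min` and `domid_max` so that the reserved ids fall outside the range.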
11:16 <Diziet> And I don't think that is a necessary feature for your current work, in the sense that provided we leave room for it to be added later your series doesn't have to contain it.
11:17 <Diziet> I think the ARM folks will want it because they want to be able to have domids statically allocated in the system configuration.
11:17 <Diziet> s/ARM/embedded/
11:17 <xadimgnik> Diziet: ok, yes I think that could be done later... I'll work on the retirement delay (for specified domids) first
11:18 <Diziet> OK so if we are going to have a reuse delay, and I think we're agreed on the data to be stored in abstract terms, can we talk a bit about representation / storage location ?
11:19 <Diziet> I still think a file is a better plan than xenstore. xenstore is rather clumsier than a file for something that only one library which is part of one domain needs to care about.
11:19 <xadimgnik> Diziet: xenstore seems like the logical place... we just leave the /libxl nodes around
11:20 <Diziet> I think a few dozen /libxl/blah left over in xenstore will be an inconvenience to xenstore-ls.
11:20 <Diziet> It also makes it hard to spot leaks and wreckage (both for the human sysadmin, and for CI tools like osstest's leak detector).
11:20 <Diziet> Obviously we can teach the CI a more complicated leak rule, but the humans are harder :-).
11:20 <xadimgnik> Diziet: really, I would have thought the timestamp would be helpful
11:21 <xadimgnik> it will tell you what recent domids will be in your logs and when they were killed
11:22 <Diziet> So I think a file is easier to write the code to handle, and easier to reason about for people consuming either the normal xenstore listing or the LRU list, and uses fewer host resources.
11:22 <Diziet> So I don't understand why you think xenstore is better.
11:22 <xadimgnik> because it avoids the need to purge on boot, and it is actually easier to code up
11:22 <Diziet> (Incidentally now we have your timestamp idea there isn't a definite need to launder the list at boot, if there ever was.)
11:23 <Diziet> I'm surprised that it's easier to write.
11:23 <xadimgnik> Diziet: I find files a PITA... needing to flock etc.
11:23 <Diziet> We have all the utility functions for that in libxl. With xenstore you need the transaction loop.
11:24 <xadimgnik> Diziet: I've coded up a basic retirement list in both styles and the xenstore code is shorter
11:24 <xadimgnik> even with transactions
11:25 <Diziet> You'll need to iterate over /libxl on every domain destruction. Both of these schemes introduce an additional O( (domain churn rate)^2 ) algorithm but the constant of proportionality is much higher for xenstore.
11:25 <Diziet> xadimgnik: Cor. Can you post those ?
11:25 <xadimgnik> ok... given the use of absolute timestamps then a file is probably not so bad
11:26 <xadimgnik> Diziet: they are just keeping lists and are not what I'd call tidy
11:26 <Diziet> I think xen-init-dom0 should probably launder the list just in case of clock problems.
11:26 <Diziet> xadimgnik: I'm sure they're not.
11:26 <Diziet> That's fine :-).
11:26 <Diziet> I just want to see why I am wrote about the code size.
11:26 <Diziet> s/wrote/wrong/
11:26 <xadimgnik> Diziet: ok, I'll dig them out of my stash
11:26 <Diziet> If you are too embarrassed you can send them to me privately...
11:27 <xadimgnik> Diziet: I was planning on it ;-)
11:27 <Diziet> I don't feel I want to put my foot down about the representation but given this conversation it would be worth seeing if anthonyper or liuw had an opinion.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
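
To make the file-based option from the conversation a little more concrete, here is a rough Python sketch of a flock-protected retirement file holding absolute timestamps. It is not the code either participant wrote and does not use libxl's own locking helpers: the function names, the "domid timestamp" line format and the reading of "launder at boot" as "empty the file" are all assumptions.

```python
import fcntl
import os


def retire_domid(path, domid, now, expiry_seconds):
    """Record `domid` as destroyed at `now`, dropping expired entries first.

    Hypothetical helper. Assumed file format: one "<domid> <timestamp>" per line.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # serialise concurrent writers
        entries = []
        for line in f:
            if not line.strip():
                continue
            d, t = (int(x) for x in line.split())
            if now - t < expiry_seconds:  # still inside the reuse delay
                entries.append((d, t))
        entries.append((domid, now))
        # Rewrite the file in place with the surviving entries plus the new one.
        f.seek(0)
        f.truncate()
        for d, t in entries:
            f.write(f"{d} {t}\n")
        f.flush()
        # The lock is released when the file is closed.
    return entries


def launder_at_boot(path):
    """Empty the list; assumes no domains exist yet, so nothing needs protecting."""
    with open(path, "w"):
        pass
```

Each call rereads and rewrites the whole list, which is the per-destruction cost mentioned in the 11:25 message. The in-place rewrite is not crash-safe; a real implementation would more likely write a temporary file and rename it.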