Re: [Xen-devel] Re: mem-event interface
At 23:25 +0100 on 23 Jun (1277335526), Grzegorz Milos wrote:
> However, I'm a bit wary about putting anything non-essential in libxc, and it seems like the event demux might be quite complex and dependent on the type of events you are handling. Therefore we don't want to end up with a really complex daemon in libxc. Instead I think we should try to make use of multiple rings in order to alleviate some of the demux headaches (sharing-related events would go to the memshr daemon through one ring, paging to the pager through another, introspection events to XenAccess etc.), and then do further demux in the relevant daemon.

I agree that multiple rings are a good idea here - especially if we want to disaggregate and have event handlers in multiple domains. Maybe the ring-registering interface could take a type and a rangeset - that would reduce the amount of extra chatter at the cost of some more overhead in Xen.
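As a very rough illustration, a registration interface along those lines might look something like the sketch below. All names are hypothetical, not the existing mem_event API:

    /* Sketch only -- hypothetical names, not the real mem_event API.
     * Each consumer registers its own ring, tagged with an event type
     * and a set of gfn ranges, so Xen can route and filter events
     * without a userspace demux daemon in the middle. */

    #include <stdint.h>

    typedef enum {
        MEM_EVENT_KIND_PAGING,   /* page-in/out, consumed by the pager  */
        MEM_EVENT_KIND_SHARING,  /* share/unshare, memshr daemon        */
        MEM_EVENT_KIND_ACCESS    /* r/w/x notifications, e.g. XenAccess */
    } mem_event_kind_t;

    typedef struct {
        uint64_t first_gfn;      /* inclusive */
        uint64_t last_gfn;       /* inclusive */
    } gfn_range_t;

    /* Register one ring for one event type over a rangeset.  Xen would
     * keep (kind, ranges) per ring and match events against them --
     * less control-plane chatter, at the cost of a range lookup in the
     * hypervisor on each event. */
    int mem_event_ring_register(uint32_t domid,
                                mem_event_kind_t kind,
                                const gfn_range_t *ranges,
                                unsigned int nr_ranges,
                                void *shared_ring_page);

Multiple rings then fall out naturally: the pager, the sharing daemon and an introspection agent would each call this once with their own type and ranges.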
> This could potentially introduce some inefficiencies (e.g. one memory access could generate multiple events), and could cause the daemons to step on each other's toes, but I don't think that's going to be a problem in practice, because the types of events we are interested in intercepting at the moment seem to be disjoint enough.
>
> Also, the complexity of handling sync vs. async events, as well as supporting batching and out-of-order replies, may already be complex enough without having to worry about demultiplexing ;). So let's do things in small steps. I think the priority should be teaching Xen to handle multiple rings (the last time I looked at the mem_event code it couldn't). What do you think?
>
> Thanks
> Gregor
>
> On Wed, Jun 23, 2010 at 11:25 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> > [From Patrick]
> >
> > Ah. Well, as long as it's in its own library or API or whatever so other applications can take advantage of it, then it's fine by me :) libintrospec or something like that.
> >
> > Patrick
> >
> > On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >> [From Bryan]
> >>
> >>> I guess I'm more envisioning integrating all this with libxc and having XenAccess et al. use that. Keeping it as a separate VM introspection library makes sense too. In any case, I think having XenAccess as part of Xen is a good move. VM introspection is a useful thing to have and I think a lot of projects could benefit from it.
> >>
> >> From my experience, the address translations can actually be pretty tricky. This is a big chunk of what XenAccess does, and it requires some memory analysis in the domU to find the necessary page tables and such. So it may be more than you really want to add to libxc. But if you go down this route, then I could certainly simplify the XenAccess code, so I wouldn't complain about that :-)
> >>
> >> -bryan
> >>
> >> On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >>> [From Patrick]
> >>>
> >>> I guess I'm more envisioning integrating all this with libxc and having XenAccess et al. use that. Keeping it as a separate VM introspection library makes sense too. In any case, I think having XenAccess as part of Xen is a good move. VM introspection is a useful thing to have and I think a lot of projects could benefit from it.
> >>>
> >>> Patrick
> >>>
> >>> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >>>> [From Bryan]
> >>>>
> >>>>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn translation code out into the library and have the mem_event daemon use that? I do remember reading through and borrowing XenAccess code
> >>>>
> >>>> This is certainly doable. But if we decide to make a Xen library depend on XenAccess, then it would make sense to include XenAccess as part of the Xen distribution, IMHO. This probably isn't too unreasonable to consider, but we'd want to make sure that the XenAccess configuration is either simplified or eliminated, to avoid causing headaches for the average person using this stuff. Something to think about...
> >>>>
> >>>> -bryan
> >>>>
> >>>> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>> [From Patrick]
> >>>>>
> >>>>>> I like this idea as it keeps Xen as simple as possible and should also help to reduce the number of notifications sent from Xen up to user space (e.g., one notification to the daemon could then be pushed out to multiple clients that care about it).
> >>>>>
> >>>>> Yeah, that was my general thinking as well. So the immediate change to the mem_event interface for this would be a way to specify sub-page-level stuff. The best way to approach this is probably by specifying a start and end range (or, more likely, a start address and size). This way things like swapping and sharing would specify the start address of the page they're interested in and PAGE_SIZE (or, more realistically, there would be an additional lib call to do page-level stuff, which would just take the pfn and do this translation under the hood).
> >>>>>
> >>>>>> For what it's worth, I'd be happy to build such a daemon into XenAccess. This may be a logical place for it since XenAccess is already doing address translations and such, so it would be easier for a client app to specify an address range of interest as a virtual address or physical address. This would prevent the need to repeat some of that address translation functionality in yet another library.
> >>>>>>
> >>>>>> Alternatively, we could provide the daemon functionality in libxc or some other Xen library and only provide support for low-level addresses (e.g., pfn + offset). Then XenAccess could build on top of that to offer higher-level addresses (e.g., pa or va) using its existing translation mechanisms. This approach would more closely mirror the current division of labor between XenAccess and libxc.
> >>>>>
> >>>>> This sounds good to me. I'd lean towards the second approach as I think it's the better long-term solution. I'm a bit rusty on my XenAccess, but how feasible is it to even move some of the gva/pfn/mfn translation code out into the library and have the mem_event daemon use that? I do remember reading through and borrowing XenAccess code (or at least the general mechanism) to do address translation stuff for other projects, so it seems like having a general way to do that would be a win. I think I did it with the CoW stuff, which I actually want to port to the mem_event interface as well, both to have it available and as another example of neat things we can do with the interface.
> >>>>>
> >>>>> Patrick
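A minimal sketch of the pair of library calls Patrick describes above -- a byte-granular watch plus a page-level convenience wrapper. The names and signatures are hypothetical, not an existing libxc or XenAccess API:

    #include <stdint.h>

    #define PAGE_SIZE 4096ULL    /* x86 page size assumed */

    /* Hypothetical: watch [start, start + size) of guest-physical
     * memory; events would carry the exact byte(s) accessed. */
    int mem_event_watch_range(uint32_t domid, uint64_t start,
                              uint64_t size, uint32_t client_id);

    /* Page-level convenience call: takes a pfn and does the
     * translation to (start, size) under the hood, as suggested. */
    static inline int mem_event_watch_pfn(uint32_t domid, uint64_t pfn,
                                          uint32_t client_id)
    {
        return mem_event_watch_range(domid, pfn * PAGE_SIZE,
                                     PAGE_SIZE, client_id);
    }

Swapping and sharing would then use the pfn wrapper, while introspection clients could register arbitrary byte ranges through the lower-level call.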
> >>>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>> [From Bryan]
> >>>>>>
> >>>>>>> needs to know to do sync notification). What are everybody's thoughts on this? Does it seem reasonable or have I gone completely mad?
> >>>>>>
> >>>>>> I like this idea as it keeps Xen as simple as possible and should also help to reduce the number of notifications sent from Xen up to user space (e.g., one notification to the daemon could then be pushed out to multiple clients that care about it).
> >>>>>>
> >>>>>> For what it's worth, I'd be happy to build such a daemon into XenAccess. This may be a logical place for it since XenAccess is already doing address translations and such, so it would be easier for a client app to specify an address range of interest as a virtual address or physical address. This would prevent the need to repeat some of that address translation functionality in yet another library.
> >>>>>>
> >>>>>> Alternatively, we could provide the daemon functionality in libxc or some other Xen library and only provide support for low-level addresses (e.g., pfn + offset). Then XenAccess could build on top of that to offer higher-level addresses (e.g., pa or va) using its existing translation mechanisms. This approach would more closely mirror the current division of labor between XenAccess and libxc.
> >>>>>>
> >>>>>> -bryan
> >>>>>>
> >>>>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>>> [From Patrick]
> >>>>>>>
> >>>>>>>> Since I'm coming in the middle of this discussion, forgive me if I've missed something. But is the idea here to create a more general interface that could support various different types of memory events + notification? And the two events listed below are just a subset of the events that could / would be supported?
> >>>>>>>
> >>>>>>> That's correct.
> >>>>>>>
> >>>>>>>> In general, I like the sound of where this is going, but I would like to see support for notification of events such as when a domU reads / writes / execs a pre-specified byte(s) of memory. As such, there would need to be a notification path (as discussed below) and also a control path to set up the memory regions that the user app cares about.
> >>>>>>>
> >>>>>>> Sub-page events are something I would like to have included as well. Currently the control path is basically just "nominating" a page (for either swapping or sharing). It's not entirely clear to me the best way to go about this. With swapping and sharing we have code in Xen to handle both cases. However, to just receive notifications (like "read", "write", "execute") I don't think we need specialised support for each case - the notification handling only needs to be implemented once. I'm thinking it might be good to have a daemon to handle these events in user-space and register clients with the user-space daemon. Each client would get a unique client ID which could be used to identify who should get the response. This way, we could just register that somebody is interested in that page (or byte, etc.) and let the user-space tool handle most of the complex logic, i.e. which of the clients a particular notification should go to. This requires some notion of priority for memory areas (e.g. if one client requests notification for access to a byte of page foo and another requests notification for access to any of page foo, then we only need Xen to store that it should notify for page foo and just send along which byte(s) of the page were accessed as well; the user-space daemon can then determine if both clients should be notified or just the one) (e.g. if one client requests async notification and another requests sync notification, then Xen only needs to know to do sync notification). What are everybody's thoughts on this? Does it seem reasonable or have I gone completely mad?
> >>>>>>>
> >>>>>>> Patrick
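A sketch of how such a daemon might collapse per-client registrations into the single coarse watch Xen stores, following the merging rules Patrick gives above (ranges are unioned, and sync wins over async). Types and names are hypothetical:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Per-client registration, kept by the user-space daemon. */
    typedef struct {
        uint32_t client_id;
        uint64_t start, len;     /* byte range within the page     */
        bool     sync;           /* pause the vCPU until we reply? */
    } client_watch_t;

    /* The single, coarse watch Xen would store for the page. */
    typedef struct {
        uint64_t start, len;
        bool     sync;
    } xen_watch_t;

    /* Union the byte ranges and take the strictest delivery mode:
     * if any client wants sync, Xen does sync.  The daemon keeps
     * the per-client detail and demuxes each event itself.
     * Assumes nr >= 1. */
    static xen_watch_t collapse_watches(const client_watch_t *w, size_t nr)
    {
        xen_watch_t x = { w[0].start, w[0].len, w[0].sync };
        for (size_t i = 1; i < nr; i++) {
            uint64_t end  = x.start + x.len;
            uint64_t wend = w[i].start + w[i].len;
            if (w[i].start < x.start)
                x.start = w[i].start;
            if (wend > end)
                end = wend;
            x.len  = end - x.start;
            x.sync = x.sync || w[i].sync;   /* sync wins over async */
        }
        return x;
    }

On each notification, the daemon would compare the reported byte(s) against each client's registered range and forward the event only to the clients that match.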
> >>>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>>>> [From Bryan]
> >>>>>>>>
> >>>>>>>> Patrick, thanks for the inclusion.
> >>>>>>>>
> >>>>>>>> Since I'm coming in the middle of this discussion, forgive me if I've missed something. But is the idea here to create a more general interface that could support various different types of memory events + notification? And the two events listed below are just a subset of the events that could / would be supported?
> >>>>>>>>
> >>>>>>>> In general, I like the sound of where this is going, but I would like to see support for notification of events such as when a domU reads / writes / execs a pre-specified byte(s) of memory. As such, there would need to be a notification path (as discussed below) and also a control path to set up the memory regions that the user app cares about.
> >>>>>>>>
> >>>>>>>> -bryan
> >>>>>>>>
> >>>>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>>>>> [From Patrick]
> >>>>>>>>>
> >>>>>>>>> I think the idea of multiple rings is a good one. We'll register the clients in Xen and when a mem_event is raised, we can just iterate through the list of listeners to see who needs a notification.
> >>>>>>>>>
> >>>>>>>>> The person working on the anti-virus stuff is Bryan Payne from Georgia Tech. I've CCed him as well so we can get his input on this stuff too. It's better to hash out a proper interface now rather than continually changing it around.
> >>>>>>>>>
> >>>>>>>>> Patrick
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 23, 2010 at 11:19 PM, Grzegorz Milos <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>>>>>> [From Gregor]
> >>>>>>>>>>
> >>>>>>>>>> There are two major events that the memory sharing code needs to communicate over the hypervisor/userspace boundary:
> >>>>>>>>>> 1. GFN unsharing failed due to lack of memory. This will be called the 'OOM event' from now on.
> >>>>>>>>>> 2. MFN is no longer shareable (actually an opaque sharing handle would be communicated instead of the MFN). 'Handle invalidate event' from now on.
> >>>>>>>>>>
> >>>>>>>>>> The requirements on the OOM event are relatively similar to the page-in event. The way this should operate is that the faulting VCPU is paused, and the pager is requested to free up some memory. When it does so, it should generate an appropriate response, and wake the VCPU up again using a domctl. The event is going to be low volume, and since it is going to be handled synchronously, likely in tens of ms, there are no particular requirements on the efficiency.
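A sketch of that synchronous OOM flow from the pager's side, matching the pause / free / respond / unpause sequence described above. All helper names are hypothetical stand-ins for the ring and domctl plumbing, not the real interfaces:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical types and helpers -- not the real mem_event API. */
    typedef struct { uint64_t id; int domid; int vcpu; } oom_event_t;

    bool ring_get_request(oom_event_t *ev);        /* dequeue, if any  */
    void ring_put_response(uint64_t id);           /* ack back to Xen  */
    void wait_for_event_channel(void);             /* block until kick */
    void evict_pages_to_disk(int domid, int nr);   /* pager's real job */
    void domctl_unpause_vcpu(int domid, int vcpu);

    /* Synchronous OOM handling: the faulting vCPU is already paused
     * by Xen, so tens of ms of latency here is acceptable for this
     * low-volume event. */
    void oom_event_loop(void)
    {
        for (;;) {
            oom_event_t ev;
            if (!ring_get_request(&ev)) {
                wait_for_event_channel();
                continue;
            }
            evict_pages_to_disk(ev.domid, 16);       /* free memory     */
            ring_put_response(ev.id);                /* respond first...*/
            domctl_unpause_vcpu(ev.domid, ev.vcpu);  /* ...then wake up */
        }
    }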
> >>>>>>>>>> The handle invalidate event type is less important in the short term because the userspace sharing daemon is designed to be resilient to stale sharing state. However, if it is missing, it will make the sharing progressively less effective as time goes on. The idea is that the hypervisor communicates which sharing handles are no longer valid, such that the sharing daemon only attempts to share pages in the correct state. This would be a relatively high-volume event, but it doesn't need to be accurate (i.e. events can be dropped if they are not consumed quickly enough). As such this event should be batch delivered, in an asynchronous fashion.
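A sketch of a producer-side ring put matching that lossy, batched delivery: if the consumer lags, events are dropped rather than blocking the producer. The structure is hypothetical and memory barriers are omitted for brevity:

    #include <stdint.h>

    #define INVAL_RING_SIZE 1024u   /* power of two */

    /* Hypothetical single-producer ring of invalidated sharing
     * handles, shared between Xen (producer) and the daemon. */
    typedef struct {
        uint64_t handles[INVAL_RING_SIZE];  /* opaque sharing handles */
        volatile uint32_t prod, cons;       /* free-running counters  */
    } inval_ring_t;

    /* Producer side: never blocks.  Returns -1 and drops the event
     * if the consumer has fallen behind -- safe here, because a
     * missed invalidation only makes sharing less effective, never
     * incorrect. */
    static int inval_ring_put(inval_ring_t *r, uint64_t handle)
    {
        if (r->prod - r->cons >= INVAL_RING_SIZE)
            return -1;                               /* ring full: drop */
        r->handles[r->prod & (INVAL_RING_SIZE - 1)] = handle;
        r->prod++;                                   /* publish entry   */
        return 0;
    }

The daemon would then drain many entries per event-channel wakeup, giving the batching Gregor asks for without any ordering or reliability guarantees from Xen.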
> >>>>>>>>>> The OOM event is coded up in Xen, but it will not be consumed properly in the pager. If I remember correctly, I didn't want to interfere with the page-in events because the event interface assumed that mem-event responses are inserted onto the ring in precisely the same order as the requests. This may not be the case when we start mixing different event types. WRT the handle invalidation, the relevant hooks exist in Xen, and in the mem sharing daemon, but there is no way to communicate events to two different consumers at the moment.
> >>>>>>>>>>
> >>>>>>>>>> Since the requirements on the two different sharing event types are substantially different, I think it may be easier if separate channels (i.e. separate rings) were used to transfer them. This would also fix the multiple-consumers issue relatively easily. Of course you may know of some other mem events that wouldn't fit in that scheme.
> >>>>>>>>>>
> >>>>>>>>>> I remember that there was someone working on external anti-virus software, which prompted the whole mem-event work. I don't remember his/her name or affiliation (could you remind me?), but maybe he/she would be interested in working on some of this?
> >>>>>>>>>>
> >>>>>>>>>> Thanks
> >>>>>>>>>> Gregor

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)