
Re: [Xen-devel] question about io path in the front/backend



> > The domU maintains its own caches and dom0 just supplies it with
> > data.  dom0 doesn't cache stuff on behalf of the domU - in fact, the
> > block backend driver bypasses the page cache in dom0 entirely and
> > transfers data directly into the domU's memory.
>
> But isn't that a performance hit?
> Caching in dom0 on behalf of the domU might give better performance,
> mightn't it? After all, dom0 is the domain which actually does I/O with the
> disk/IO device. Keeping the domU's cache in dom0 sounds like it increases
> complexity though, I must admit. But I guess solutions to this could be
> found.
> Thoughts?

Thing is, the domU *knows* what data is useful to it and therefore which bits 
it wants to cache and which to throw away.  So in principle it can make 
better decisions.  Also, Linux in the domU is basically going to *want* to 
cache things anyhow, so unless you actually disabled that there may not be 
much point in adding another level of cache in dom0 (though I can imagine 
that could be useful in some circumstances).

Not caching stuff in dom0 also stops domUs from interfering so much 
(performance-wise) with each other and with dom0 applications by pushing 
useful data out of a common cache.  Because each domain has its own private 
cache, there is no contention for cache space.
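
As a rough illustration of the interference point, here is a toy simulation.  
It is only a sketch: plain LRU is a simplification (the real Linux page cache 
does not evict this way), and the cache sizes, the domain names "A" and "B" 
and the workload are all made up for the example.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache of block ids (a stand-in for a page cache)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, insert and evict the oldest entry."""
        if key in self.blocks:
            self.blocks.move_to_end(key)
            return True
        self.blocks[key] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return False

def workload(rounds=50):
    """Domain A re-reads 4 hot blocks; domain B streams through new blocks."""
    scan_block = 0
    for _ in range(rounds):
        for hot in range(4):
            yield ("A", hot)
        for _ in range(8):
            yield ("B", scan_block)
            scan_block += 1

def hit_rate_for_a(shared):
    """Hit rate seen by domain A with one shared cache or two private ones."""
    if shared:
        one = LRUCache(8)                      # same total memory, pooled
        caches = {"A": one, "B": one}
    else:
        caches = {"A": LRUCache(4), "B": LRUCache(4)}
    hits = total = 0
    for dom, block in workload():
        hit = caches[dom].access((dom, block))
        if dom == "A":
            total += 1
            hits += hit
    return hits / total

print("private caches:", hit_rate_for_a(shared=False))
print("shared cache:  ", hit_rate_for_a(shared=True))
```

With these invented numbers, B's streaming reads push A's hot blocks out of 
the shared cache every round, whereas with private caches A only misses on 
its first pass.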

Finally, it's worth noting that for applications the Linux page cache is able 
to make use of sharing between applications accessing the same data.  For 
virtual machines in Xen, if they're all using different virtual disks then 
there's not really anything readily shareable - they're accessing different 
data.  So as things are, it's harder to get the same benefit from the page 
cache that you would get for applications running in the same domain.

However...

Having private caches for each domU may be less efficient overall than having 
one big shared cache, even though it ensures better performance isolation.  
And if domains are using a copy-on-write block device (or something like 
that) then they may actually be accessing the same data sometimes.
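
To make the copy-on-write case concrete, here is a minimal sketch that just 
counts cached pages.  The image names ("base", "overlay1", "overlay2") and the 
block numbers are hypothetical; the only point is that blocks read from a 
common base image get cached once per domain with private caches, but only 
once in total with a shared cache.

```python
def cached_pages(reads_by_domain, shared_cache):
    """Count pages held in cache; each block is an (image, block_no) pair."""
    if shared_cache:
        # One cache for everybody: identical blocks are stored once.
        return len({blk for reads in reads_by_domain.values() for blk in reads})
    # Private caches: every domain keeps its own copy of what it read.
    return sum(len(set(reads)) for reads in reads_by_domain.values())

# Two guests on a hypothetical CoW setup: same base image, private overlays.
reads = {
    "domU1": [("base", 0), ("base", 1), ("overlay1", 7)],
    "domU2": [("base", 0), ("base", 1), ("overlay2", 3)],
}

print("private caches:", cached_pages(reads, shared_cache=False))
print("shared cache:  ", cached_pages(reads, shared_cache=True))
```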

For this reason, various people, including myself, are exploring ways of 
getting domains to share their page caches in order to improve memory use and 
performance - particularly in the case where they're using some kind of 
shared storage.

Other VMMs (e.g. qemu, VMware, kvm) do things differently, either by 
explicitly sitting *on top* of the Linux page cache or by implementing a 
custom page sharing mechanism (in the case of VMware ESX, which is a 
hypervisor rather different from Xen).  This means that those VMMs can or do 
use shared caching in order to improve memory usage and performance.

Cheers,
Mark

> Thanks,
>       Pradeep
>
> > > And what about
> > > the grant table's function? Does the grant table (or the shared page
> > > filled with I/O requests or I/O data) function as the cache or buffer
> > > in the native Linux I/O path?
> >
> > The pages which the frontend domain granted to the backend are the
> > sources or destinations of the IO data.  They are not part of the
> > page cache in dom0. dom0 maps them, then creates BIOs pointing to
> > them and submits these to the block layer.  dom0's block layer gets
> > the data from the device and puts it into this mapped domU memory (or
> > takes data from memory and puts it on disk).
> >
> > In the domU, the pages form part of that domain's own private page
> > cache but dom0 does not know about this.
> >
> > > Or does the shared page between the front/backend act
> > > only to transfer the data and I/O requests, or does it have a
> > > cache or buffer function like the bio in
> > > native Linux?
> >
> > The shared ring page between front and back ends is just used for
> > transferring details of requests (like "get this data in memory, and
> > put it here on disk").  i.e. the shared ring contains request
> > metadata.  The pages that are temporarily mapped from the domU into
> > dom0 contain the actual data.
> >
> > Cheers,
> > Mark
> >
> > > Thanks in advance
> > >
> > > Mark Williamson wrote:
> > > > >>   I have read some documents and the wiki about the split driver
> > > > >> in Xen, and I am confused about the I/O path that a sys_read()
> > > > >> takes through the domU and dom0. Does sys_read() in the domU pass
> > > > >> through the VFS and, say, ext3 in the domU, and insert a request
> > > > >> into the request_queue of the frontend driver? Is that right?
> > > >
> > > > Sounds like you have the right idea.  Requests get queued with the
> > > > frontend driver in terms of Linux structures.  IO requests to
> > > > satisfy these are then placed into the shared memory ring so that
> > > > the backend can find out what we're asking for.
> > > >
> > > > >>   And then, say the domU is set up with a *.img file in dom0, then
> > > > >> what do the frontend and backend drivers do?
> > > > >> Does the frontend transmit the request to the backend? Is that right?
> > > >
> > > > Yes, the frontend does this by putting requests into the shared
> > > > memory ringbuffer which is also accessible by the backend.  The
> > > > > frontend then sends an event to the backend; this causes an
> > > > interrupt in the backend so that it knows it must check the
> > > > shared memory.
> > > >
> > > > >>   And then what does the backend driver do? Does the backend
> > > > >> transfer the request to the physical driver in dom0? Is that right?
> > > >
> > > > Yes.  The backend responds to the interrupt by checking the
> > > > > shared memory for new requests, then it maps parts of the domU's
> > > > memory so that dom0 will be able to write data into it.  Then it
> > > > submits requests to the Linux block IO subsystem to fill that
> > > > memory with data.  The Linux block IO system eventually sends
> > > > these requests to the device driver, to do the IO directly into
> > > > the mapped domU memory.
> > > >
> > > > >>  Or does the backend turn the request into some
> > > > >> read() operation, submit it to the VFS and, say, ext3 in
> > > > >> dom0, and do another relatively complete I/O path in dom0? Is
> > > > >> that right?
> > > >
> > > > If you're just exporting a phy: device to the guest, then the
> > > > block IO requests go down to the block device driver for that
> > > > > device and are serviced there.  e.g. if I export the IDE disk
> > > > > phy:/dev/hda to my guest, then the IDE driver will satisfy the IO
> > > > > requests directly. Requests go backend -> block layer -> real
> > > > > device driver.
> > > >
> > > > If you're using a file: device then you have to go through the
> > > > filesystem layer...  So the IO requests go backend -> block layer
> > > > -> loopback block device -> ext3 -> block layer (again) -> real
> > > > device driver
> > > >
> > > > If you're using blktap then the requests take a trip via
> > > > userspace before getting submitted.
> > > >
> > > > >>  Or if the backend transfers the request to the physical driver
> > > > >> directly, how does the backend deal with the request's virtual
> > > > >> address, and how does the backend manage the bio buffer? Do the
> > > > >> physical driver, backend and frontend share the bio buffer in some
> > > > >> way, or how does Xen deal with it?
> > > >
> > > > I hope what I've said clarifies things a bit.
> > > >
> > > > Cheers,
> > > > Mark

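The split between the shared ring (request metadata only) and the granted 
pages (the actual data) described in the quoted replies can be sketched 
roughly as follows.  This is a toy model, not the real blkfront/blkback code: 
`Request`, `GrantTable`, `backend_service` and the 512-byte `SECTOR` are names 
and values invented for the illustration, and event channels, grant unmapping 
and the Linux block layer are left out.

```python
from collections import deque
from dataclasses import dataclass

SECTOR = 512  # assumed sector size for the example

@dataclass
class Request:
    """What travels on the shared ring: metadata only, no data."""
    op: str         # "read" or "write"
    sector: int     # where on the (virtual) disk
    grant_ref: int  # which granted domU page is the source/destination

class GrantTable:
    """Hypothetical stand-in: grant refs name pages owned by the domU."""
    def __init__(self):
        self.pages = {}

    def grant(self, page):
        ref = len(self.pages)
        self.pages[ref] = page
        return ref

    def map(self, ref):
        # The backend gets the domU's own page, not a copy of it.
        return self.pages[ref]

def backend_service(ring, grants, disk):
    """Drain the ring, doing I/O directly to/from the mapped domU pages."""
    done = []
    while ring:
        req = ring.popleft()
        page = grants.map(req.grant_ref)
        off = req.sector * SECTOR
        if req.op == "read":
            page[:SECTOR] = disk[off:off + SECTOR]   # device -> domU memory
        else:
            disk[off:off + SECTOR] = page[:SECTOR]   # domU memory -> device
        done.append(req.grant_ref)
    return done

# Frontend side: grant a page, queue a read request for sector 1.
disk = bytearray(4 * SECTOR)
disk[SECTOR:SECTOR + 5] = b"hello"
grants = GrantTable()
page = bytearray(4096)
ring = deque([Request("read", 1, grants.grant(page))])

completed = backend_service(ring, grants, disk)
print(bytes(page[:5]))
```

Note that nothing here caches the data on the backend side: it lands in the 
domU's page and stays there, which is the point made in the thread.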


-- 
Dave: Just a question. What use is a unicycle with no seat?  And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!



 

