On Tue, 2011-11-08 at 17:13 +0000, Shriram Rajagopalan wrote:
> On Tue, Nov 8, 2011 at 9:02 AM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> On Tue, 2011-11-08 at 16:51 +0000, Shriram Rajagopalan wrote:
> > On Fri, Nov 4, 2011 at 12:21 PM, Shriram Rajagopalan <rshriram@xxxxxxxxx> wrote:
> > Why posix_memalign?
> >
> > The compression code involves a lot of memcpys at 4K granularity
> > (dirty pages copied from domU's memory to internal cache/page
> > buffers etc). I would like to keep these memcpys page aligned for
> > purposes of speed. The source pages (from domU) are already aligned.
> > The destination pages allocated by the compression code need to be
> > page aligned.
> >
> > Correct me if I am wrong: mallocing a huge buffer for this purpose
> > is not optimal. malloc aligns allocations on 16-byte (or 8-byte)
> > granularity, but if a 4K region straddles two physical memory
> > frames, then the memcpy is going to be suboptimal. OTOH, memalign
> > ensures that we are dealing with just 2 memory frames as opposed to
> > 3 (possible) frames in malloc.
> >
> > A simple 8MB memcpy test shows an average of 500us overhead for
> > malloc based allocation compared to posix_memalign based allocation.
> > While this might seem low, the checkpoints are being taken at high
> > frequency (every 20ms for instance).
> >
> > Is it okay to use malloc on other platforms? I simply don't have
> > access to other platforms to test their equivalent versions, short
> > of using something like the qemu_memalign function.
> >
> > I am open to suggestions :)
>
>
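The "2 frames versus 3 frames" argument in the quoted message can be
checked with a little address arithmetic. The following is only a sketch:
frames_touched() and PAGE_SIZE are hypothetical names that do not appear in
the patch series, and a 4K page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed 4K pages, as in the quoted message */

/* Number of page frames covered by the byte range [addr, addr + len). */
static size_t frames_touched(uintptr_t addr, size_t len)
{
    uintptr_t first, last;

    assert(len > 0);
    first = addr / PAGE_SIZE;              /* frame holding the first byte */
    last = (addr + len - 1) / PAGE_SIZE;   /* frame holding the last byte */
    return (size_t)(last - first + 1);
}
```

A page-aligned 4K destination covers one frame, so a copy from an aligned
source page involves 2 frames in total. A destination that is only 16-byte
aligned can cover two frames, giving 3.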
> This is due to minios (aka stubdoms) not having posix_memalign, right?
>
> minios (or rather newlib) does appear to have memalign though, which
> if true would also work, right? You could potentially also implement
> posix_memalign in terms of memalign on minios and avoid the ifdef.
>
>
> Sounds good. In that case, can I just post a patch to minios
> implementing posix_memalign, and will you then directly take the
> previous version (V4) of this patch series (the one without #ifdefs)?
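
For reference, a wrapper of the kind discussed above might look roughly
like the sketch below. It is not the patch that was posted. It assumes
memalign() is declared in <malloc.h> (as on glibc, and apparently newlib),
and it uses the hypothetical name compat_posix_memalign so that it does not
clash with a libc that already provides posix_memalign.

```c
#include <assert.h>
#include <errno.h>
#include <malloc.h>   /* memalign(); assumed available here */
#include <stddef.h>
#include <stdlib.h>   /* free(), for callers releasing the buffer */

/* Hypothetical stand-in for posix_memalign(), built on memalign(). */
static int compat_posix_memalign(void **memptr, size_t alignment, size_t size)
{
    void *p;

    assert(memptr != NULL);

    /* POSIX wants a power-of-two alignment that is a multiple of
     * sizeof(void *); anything else is reported as EINVAL. */
    if (alignment == 0 || (alignment & (alignment - 1)) != 0 ||
        alignment % sizeof(void *) != 0)
        return EINVAL;

    p = memalign(alignment, size);
    if (p == NULL && size != 0)
        return ENOMEM;

    /* On failure *memptr is left untouched; errors are returned,
     * not signalled through errno. */
    *memptr = p;
    return 0;
}
```

With something like this in place, the compression code could request
4K-aligned buffers on every platform through the same call, with no
#ifdef at the call sites.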