On Tue, 2011-11-08 at 16:51 +0000, Shriram Rajagopalan wrote:
> On Fri, Nov 4, 2011 at 12:21 PM, Shriram Rajagopalan
> <rshriram@xxxxxxxxx> wrote:
> Why posix_memalign?
>
> The compression code involves a lot of memcpys at 4K granularity (dirty
> pages copied from domU's memory to internal cache/page buffers etc.).
> I would like to keep these memcpys page aligned for purposes of speed.
> The source pages (from domU) are already aligned. The destination pages
> allocated by the compression code need to be page aligned.
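
A minimal sketch of the page-aligned destination buffer described above,
assuming a POSIX system and 4K pages. The names alloc_page_buffer,
copy_page and PAGE_SIZE_BYTES are hypothetical and not taken from the
patch under discussion.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096u  /* assumption: 4K pages, as in the mail */

/* Hypothetical helper: allocate npages page-aligned destination pages.
 * Returns NULL on failure. Release the buffer with free(). */
static void *alloc_page_buffer(size_t npages)
{
    void *buf = NULL;

    assert(npages > 0);
    if (npages > SIZE_MAX / PAGE_SIZE_BYTES)
        return NULL;  /* size computation would overflow */
    if (posix_memalign(&buf, PAGE_SIZE_BYTES, npages * PAGE_SIZE_BYTES) != 0)
        return NULL;
    return buf;
}

/* Copy one source page into slot idx of a buffer from alloc_page_buffer().
 * Every slot starts on a page boundary because the base is page aligned. */
static void copy_page(void *buf, size_t idx, const void *src_page)
{
    memcpy((char *)buf + idx * PAGE_SIZE_BYTES, src_page, PAGE_SIZE_BYTES);
}
```

Because the base address is page aligned and each slot is exactly one
page long, every destination slot stays within a single page.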
>
> Correct me if I am wrong:
> mallocing a huge buffer for this purpose is not optimal. malloc aligns
> allocations on 16-byte (or 8-byte) granularity, but if a 4K region
> straddles two physical memory frames, then the memcpy is going to be
> suboptimal. OTOH, memalign ensures that we are dealing with just 2
> memory frames (one source, one destination) as opposed to 3 (possible)
> frames in malloc.
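
The frame counting above can be sketched as a small calculation. This
assumes 4K frames and that each 4K-aligned virtual page is backed by one
physical frame; frames_touched and FRAME_SIZE are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE 4096u  /* assumption: 4K frames */

/* Number of 4K frames overlapped by the byte range [addr, addr + len). */
static size_t frames_touched(uintptr_t addr, size_t len)
{
    assert(len > 0);
    uintptr_t first = addr / FRAME_SIZE;             /* frame of first byte */
    uintptr_t last  = (addr + len - 1) / FRAME_SIZE; /* frame of last byte  */
    return (size_t)(last - first + 1);
}
```

With an aligned source page (1 frame), a page-aligned destination adds 1
more frame for a total of 2. A destination that is only 16-byte aligned,
for example at offset 0x10 into a page, overlaps 2 frames, for a total
of 3.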
>
> A simple 8MB memcpy test shows an average of 500us overhead for malloc
> based allocation compared to posix_memalign based allocation. While
> this might seem low, the checkpoints are being taken at high frequency
> (every 20ms for instance).
>
> It is not okay to use malloc on other platforms. I simply don't have
> access to other platforms to test their equivalent versions. Short of
> using something like the qemu_memalign function.
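
For platforms without posix_memalign, one possible fallback is to
over-allocate with malloc and round the pointer up by hand. The sketch
below is hypothetical: fallback_memalign and fallback_free are made-up
names, and this is not how qemu_memalign is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical portable fallback: returns size bytes aligned to
 * alignment, or NULL on failure. Must be released with fallback_free(). */
static void *fallback_memalign(size_t alignment, size_t size)
{
    /* alignment must be a power of two and able to hold a pointer */
    assert(alignment >= sizeof(void *) &&
           (alignment & (alignment - 1)) == 0);
    if (size > SIZE_MAX - alignment - sizeof(void *))
        return NULL;  /* total size would overflow */

    /* Extra room for the alignment slack and for the stashed raw pointer. */
    void *raw = malloc(size + alignment + sizeof(void *));
    if (raw == NULL)
        return NULL;

    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~((uintptr_t)alignment - 1);

    /* Remember the malloc pointer just below the aligned block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Free a block obtained from fallback_memalign(). */
static void fallback_free(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

The cost is up to one extra page plus a pointer per allocation, which is
small for a single large buffer but adds up if each page is allocated
separately.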
>
> I am open to suggestions :)