keeping memory usage at a certain limit

Jason Evans jasone at canonware.com
Wed Apr 30 19:43:31 PDT 2014


On Apr 28, 2014, at 4:08 AM, Antony Dovgal <antony.dovgal at gmail.com> wrote:
> I'm currently working on a daemon that processes a lot of data and has to store the most recent portion of it.
> Unfortunately, memory is allocated and freed in small blocks and in a totally random (from the allocator's point of view) manner.
> I use "stats.allocated" to measure how much memory is currently in use, delete the oldest data when the memory limit is reached and purge thread caches with "arena.N.purge" from time to time.

Use "thread.tcache.flush" to flush thread caches; "arena.<i>.purge" merely uses madvise(2) to inform the kernel that it can recycle dirty pages that contain unused data.

> The problem is that keeping "stats.allocated" at a certain level doesn't keep the process from growing until it's killed by the OOM killer.
> I suspect that this is caused by memory fragmentation issues, though I've no idea how to prove it (or at least all my ideas involve complex stats and are quite inefficient).
> 
> So my main questions are:
> is there any way to see how much memory is currently being (under)used because of fragmentation in jemalloc?

There are two statistics jemalloc tracks that directly allow you to measure external fragmentation: "stats.allocated" [1] and "stats.active" [2].  The ratio allocated/active tells you the proportion of memory in active pages that is actually allocated; one minus that ratio is the external fragmentation.
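
For example, a small routine that samples both counters (the stats are cached snapshots, so write to "epoch" first to refresh them; both counters require a build with --enable-stats):

    #include <stdint.h>
    #include <stdio.h>
    #include <jemalloc/jemalloc.h>

    static void report_external_fragmentation(void) {
        /* Refresh the cached statistics. */
        uint64_t epoch = 1;
        size_t len = sizeof(epoch);
        mallctl("epoch", &epoch, &len, &epoch, sizeof(epoch));

        size_t allocated, active, sz = sizeof(size_t);
        mallctl("stats.allocated", &allocated, &sz, NULL, 0);
        mallctl("stats.active", &active, &sz, NULL, 0);

        printf("allocated=%zu active=%zu fragmentation=%.1f%%\n",
            allocated, active,
            100.0 * (1.0 - (double)allocated / (double)active));
    }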

In a later email you report merged arena stats (which exclude huge allocations), for which allocated/active is 0.918, i.e. roughly 8% external fragmentation.  The application has 1534 active allocating threads, which may be retaining a few GiB in their thread caches, depending on how the application behaves.  There are some top-level statistics that might be relevant, in particular the total number of chunks.  The application has roughly 20 GiB of large allocations, and it's possible that chunk-level fragmentation is causing a huge amount of virtual memory usage (as well as chunk metadata overhead).
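
The quickest way to inspect those is a full stats dump; the chunk counters can also be read directly (a sketch using the 3.x stats names, again requiring --enable-stats):

    #include <stdio.h>
    #include <jemalloc/jemalloc.h>

    static void report_chunks(void) {
        /* Human-readable dump of all statistics, including per-arena
         * and chunk stats, written to stderr by default. */
        malloc_stats_print(NULL, NULL, NULL);

        /* Or read the chunk counters programmatically.  (Refresh
         * "epoch" first, as in the previous sketch.) */
        size_t current, lg_chunk, sz = sizeof(size_t);
        mallctl("stats.chunks.current", &current, &sz, NULL, 0);
        mallctl("opt.lg_chunk", &lg_chunk, &sz, NULL, 0);
        printf("chunks=%zu (%zu bytes of virtual memory)\n",
            current, current << lg_chunk);
    }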

> is there a way to prevent it or force some garbage collection?

jemalloc's worst case fragmentation behavior is pretty straightforward to reason about for small objects.  Each size class [3] can be considered independently.  The worst thing that can possibly happen is that after the application reaches its maximum usage, it then frees all but one allocated region in each page run.  However, your application is presumably reaching a stable number of allocations, then replacing old data with new.  If the total number of allocated regions for each size class remains stable in the steady state, then your application should suffer very little fragmentation.  However, if your application maintains the same total memory usage, but shifts from, say, mostly 48-byte regions to mostly 64-byte regions, it can end up with highly fragmented runs that contain the few remaining 48-byte allocations.  Given 28 small size classes, it's possible for this to be a terrible fragmentation situation, but I have yet to see this happen in a real application.
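
A contrived sketch of that pathological shift (the stride of 85 is only a rough stand-in for the number of regions per run; actual run geometry varies by size class):

    #include <stdlib.h>

    enum { N = 1000000, STRIDE = 85 };
    static void *p[N];

    int main(void) {
        /* Reach peak usage with 48-byte regions. */
        for (int i = 0; i < N; i++)
            p[i] = malloc(48);

        /* Free all but roughly one region per run; each survivor pins
         * its run, so those pages stay active. */
        for (int i = 0; i < N; i++) {
            if (i % STRIDE != 0) {
                free(p[i]);
                p[i] = NULL;
            }
        }

        /* New 64-byte allocations cannot reuse the holes in the 48-byte
         * runs, so stats.active stays high while the 48-byte size class
         * accounts for almost none of stats.allocated. */
        for (int i = 0; i < N; i++) {
            if (p[i] == NULL)
                p[i] = malloc(64);
        }
        return 0;
    }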

Please let us know what you discover, especially if there might be some general improvement to jemalloc that would help your application.

Thanks,
Jason

[1] http://www.canonware.com/download/jemalloc/jemalloc-latest/doc/jemalloc.html#stats.allocated
[2] http://www.canonware.com/download/jemalloc/jemalloc-latest/doc/jemalloc.html#stats.active
[3] http://www.canonware.com/download/jemalloc/jemalloc-latest/doc/jemalloc.html#size_classes
