keeping memory usage at a certain limit

Antony Dovgal antony.dovgal at gmail.com
Thu May 1 01:36:37 PDT 2014


Hello Jason,

On 05/01/2014 06:43 AM, Jason Evans wrote:
> Use "thread.tcache.flush" to flush thread caches; "arena.<i>.purge" merely uses madvise(2)
> to inform the kernel that it can recycle dirty pages that contain unused data.

According to the docs, "thread.tcache.flush" only flushes the cache of the calling thread,
and I have a lot of threads running in thread pools that are created at startup and never destroyed.
Or did you mean to call it periodically from every thread?

> There are two statistics jemalloc tracks that directly allow you to measure external fragmentation: "stats.allocated" [1] and "stats.active" [2].

Right, I've tried using both of them.
Do I understand it correctly that stats.active decreases only when an entire page is freed?

> jemalloc's worst case fragmentation behavior is pretty straightforward to reason about for small objects.  Each size class [3] can be considered independently.  The worst thing that can possibly happen is that after the application reaches its maximum usage, it then frees all but one allocated region in each page run.  However, your application is presumably reaching a stable number of allocations, then replacing old data with new.  If the total number of allocated regions for each size class remains stable in the steady state, then your application should suffer very little fragmentation.  However, if your application maintains the same total memory usage, but shifts from, say, mostly 48-byte regions to mostly 64-byte regions, it can end up with highly fragmented runs that contain the few remaining 48-byte allocations.
> Given 28 small size classes, it's possible for this to be a terrible fragmentation situation, but I have yet to see this happen in a real application.

So far, using Salvatore's method and code I can see about 3% difference between RSS and allocated memory
when using jemalloc and ~9% difference when using Hoard.
But I expect these values to change since the processes haven't started removing outdated records yet.

I also have a control process without jemalloc (i.e. using plain libc malloc()) using the same code to compute fragmentation
and it shows about 20% difference (and it's growing).
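Salvatore's code isn't pasted here, but the RSS side of the comparison is the usual Linux trick
of reading the second field (resident pages) of /proc/self/statm. A sketch of that part
(Linux-specific, and only an approximation of his method):

```c
#include <stdio.h>
#include <unistd.h>

/* Return the process's resident set size in bytes by reading the
 * resident-pages field of /proc/self/statm.  Returns -1 on failure.
 * Linux-specific. */
static long rss_bytes(void)
{
    long size_pages, resident_pages;
    FILE *fp = fopen("/proc/self/statm", "r");

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld %ld", &size_pages, &resident_pages) != 2) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return resident_pages * sysconf(_SC_PAGESIZE);
}
```

Dividing this by the allocator's reported allocated bytes gives the percentages quoted above.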


What baffles me most is that stats.allocated keeps returning the same value while RSS grows constantly.
Could it be because of the number of threads I use?
Say, I free memory in one thread and try to allocate it in another, but the second thread
doesn't have it cached and has to do the actual allocation?

-- 
Wbr,
Antony Dovgal
---
http://pinba.org - realtime profiling for PHP


More information about the jemalloc-discuss mailing list