jemalloc 3 performance vs. mozjemalloc

Jason Evans jasone at canonware.com
Tue Feb 3 16:22:35 PST 2015


On Feb 3, 2015, at 4:00 PM, Mike Hommey <mh at glandium.org> wrote:
> On Wed, Feb 04, 2015 at 07:51:17AM +0900, Mike Hommey wrote:
>> 
>> - The average number of mutex locks per alloc/dealloc is close to 1 with
>>  mozjemalloc (1.001), but 1.13 with jemalloc 3 (same testcase as above).
>>  Fortunately, contention is likely lower (I measured it to be lower, but
>>  the instrumentation had so much overhead that it may have skewed the
>>  results), but pthread_mutex_lock/unlock are not free as far as
>>  instruction count is concerned.
> 
> Forgot to mention, this is with tcache disabled. Tcache does make
> instruction count significantly lower and does much less mutex locking,
> but at the cost of more memory overhead. We'll investigate the
> tradeoffs, but we're not ready for that yet.

Oh!  mozjemalloc has only one mutex per arena, whereas jemalloc 1+ has per-bin mutexes as well.  In the fast path, a small allocation/deallocation needs only the bin mutex, but if a page run has to be allocated/deallocated, additional locking occurs.  In the absence of tcache this increase in locking makes sense, though it's a bit higher than I'd normally expect.
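
To make the counting argument concrete, here is a toy model in C.  It is a sketch only, not jemalloc or mozjemalloc source: every name in it (counted_mutex_t, toy_arena_t, locks_per_alloc, and so on) is made up for illustration, it models allocation only, and it nests the arena lock inside the bin lock for simplicity, which may not match how either allocator actually orders its locks.  The run size is an arbitrary parameter, so the ratios it produces are not a reproduction of the numbers measured above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical toy model: a mutex that counts how often it is acquired. */
typedef struct {
    pthread_mutex_t lock;
    unsigned long lock_count;
} counted_mutex_t;

static void counted_init(counted_mutex_t *m) {
    pthread_mutex_init(&m->lock, NULL);
    m->lock_count = 0;
}

static void counted_lock(counted_mutex_t *m) {
    pthread_mutex_lock(&m->lock);
    m->lock_count++;
}

static void counted_unlock(counted_mutex_t *m) {
    pthread_mutex_unlock(&m->lock);
}

/* One arena with a single small-size bin (assumed layout, for illustration). */
typedef struct {
    counted_mutex_t arena_lock;
    counted_mutex_t bin_lock;
    unsigned regs_free;     /* free regions left in the bin's current run */
    unsigned regs_per_run;  /* regions provided by each fresh page run */
} toy_arena_t;

/* Single-lock design: the arena mutex covers bin and run bookkeeping. */
static void alloc_single_lock(toy_arena_t *a) {
    counted_lock(&a->arena_lock);
    if (a->regs_free == 0)
        a->regs_free = a->regs_per_run;  /* new run, same lock */
    a->regs_free--;
    counted_unlock(&a->arena_lock);
}

/* Per-bin design: fast path takes only the bin mutex; the slow path
 * (bin's run is exhausted) also takes the arena mutex to get a new run. */
static void alloc_per_bin_lock(toy_arena_t *a) {
    counted_lock(&a->bin_lock);
    if (a->regs_free == 0) {
        counted_lock(&a->arena_lock);
        a->regs_free = a->regs_per_run;
        counted_unlock(&a->arena_lock);
    }
    a->regs_free--;
    counted_unlock(&a->bin_lock);
}

/* Average mutex acquisitions per allocation over n_allocs allocations. */
static double locks_per_alloc(int per_bin, unsigned regs_per_run,
                              unsigned long n_allocs) {
    assert(regs_per_run > 0 && n_allocs > 0);
    toy_arena_t a;
    counted_init(&a.arena_lock);
    counted_init(&a.bin_lock);
    a.regs_free = 0;  /* start with no run, so the first alloc refills */
    a.regs_per_run = regs_per_run;

    for (unsigned long i = 0; i < n_allocs; i++) {
        if (per_bin)
            alloc_per_bin_lock(&a);
        else
            alloc_single_lock(&a);
    }

    unsigned long total = a.arena_lock.lock_count + a.bin_lock.lock_count;
    pthread_mutex_destroy(&a.arena_lock.lock);
    pthread_mutex_destroy(&a.bin_lock.lock);
    return (double)total / (double)n_allocs;
}
```

In this model the single-lock design always averages exactly one acquisition per allocation, while the per-bin design averages 1 + 1/regs_per_run once the allocation count is a multiple of the run size; for example, locks_per_alloc(1, 4, 800) comes out at 1.25.  The point is only that the extra locking scales with how often the slow path is hit.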

Thanks,
Jason

