jemalloc Suitable for embedded environments
Mayank Kumar (mayankum)
mayankum at cisco.com
Mon May 11 13:19:39 PDT 2015
Thanks, Jason. I don't think code size should be a big issue for us; if it is, I will disable stats/profiling in production. I have a few more questions, and some of them may be dumb, but I will risk asking them here:
-Our processes use setrlimit to limit their virtual memory usage. Do you think jemalloc could in some way overshoot that limit, or do something that is not tracked through setrlimit (for example, not going through brk/mmap/mremap)? Please excuse my limited understanding here.
-Someone pointed me to this link: http://locklessinc.com/benchmarks_allocator.shtml
It says the following:
This is a very good allocator when there is a large amount of contention, performing similarly to the Lockless memory allocator as the number of threads grows larger than the number of processors. However, when the number of allocating threads is smaller than the total number of cpus, it isn't quite as fast. The disadvantage of the jemalloc allocator is its memory usage. It uses power-of-two sized bins, which leads to a greatly increased memory footprint compared to other allocators. This can affect real-world performance due to excess cache and TLB misses.
Do you think this is still true? It might be an old link, or just my limited understanding. Of course, they are selling a product here, but I just wanted your opinion. In our case, though, the number of allocating threads will always be larger than the number of cores.
-Has anyone seen issues with using jemalloc on Wind River Linux, or with compiling/linking with its toolchain?
I am just trying to research this a little more while I test it in different scenarios. Thanks for your help.
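For the setrlimit question above, one way to check empirically is to cap RLIMIT_AS in a child process and see whether an oversized malloc is refused. The kernel applies RLIMIT_AS to brk, mmap and mremap, so an allocator that obtains memory through those calls should hit the limit. The following is only a minimal sketch in C: the helper name malloc_succeeds_under_limit and the sizes in the demo are illustrative assumptions, not part of jemalloc or of any existing test suite.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: in a forked child, lower the soft RLIMIT_AS to
 * limit_bytes and attempt one malloc(request_bytes).
 * Returns 1 if malloc succeeded, 0 if it returned NULL, -1 on any error. */
int malloc_succeeds_under_limit(rlim_t limit_bytes, size_t request_bytes)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit rl;
        if (getrlimit(RLIMIT_AS, &rl) != 0)
            _exit(2);
        rl.rlim_cur = limit_bytes;      /* soft limit only; hard limit untouched */
        if (setrlimit(RLIMIT_AS, &rl) != 0)
            _exit(2);

        /* volatile keeps the compiler from optimizing the malloc/free pair away */
        void *volatile p = malloc(request_bytes);
        int ok = (p != NULL);
        free(p);
        _exit(ok ? 1 : 0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 2 ? -1 : code;
}

#ifdef SKETCH_MAIN
int main(void)
{
    /* Illustrative sizes: 64 MiB limit, 256 MiB request. */
    int r = malloc_succeeds_under_limit((rlim_t)64 << 20, (size_t)256 << 20);
    printf("malloc under limit: %d (1 = succeeded, 0 = refused, -1 = error)\n", r);
    return 0;
}
#endif
```

Building with -DSKETCH_MAIN gives a standalone program. Running the same check with jemalloc linked in would show whether the limit holds for that allocator in a given environment; the sketch itself makes no claim about the result.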
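The footprint concern in the benchmark text quoted above can be made concrete. If an allocator really rounded every request up to the next power of two, a request just past a boundary would waste almost half of its slot. The sketch below models only that quoted description; pow2_class and pow2_waste are hypothetical names, and nothing here is a statement about how jemalloc actually sizes its bins.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical size-class rule: round a request up to the next power of two.
 * Returns 0 if the rounded size would not fit in size_t. */
size_t pow2_class(size_t request)
{
    if (request > (SIZE_MAX >> 1) + 1)
        return 0;
    size_t cls = 1;
    while (cls < request)
        cls <<= 1;
    return cls;
}

/* Bytes lost to rounding for one request under the rule above. */
size_t pow2_waste(size_t request)
{
    size_t cls = pow2_class(request);
    return cls == 0 ? 0 : cls - request;
}

#ifdef SKETCH_MAIN
int main(void)
{
    /* Illustrative request sizes chosen to sit on and just past boundaries. */
    size_t samples[] = { 64, 65, 4096, 4097 };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        printf("request %zu -> class %zu, waste %zu\n",
               samples[i], pow2_class(samples[i]), pow2_waste(samples[i]));
    return 0;
}
#endif
```

Under this model a 65-byte request occupies a 128-byte slot and wastes 63 bytes, while a 64-byte request wastes nothing. How much this matters in practice depends on the allocator's real size classes and on the application's request-size distribution.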
From: Jason Evans [mailto:jasone at canonware.com]
Sent: Thursday, May 07, 2015 7:08 PM
To: Mayank Kumar (mayankum)
Cc: jemalloc-discuss at canonware.com
Subject: Re: jemalloc Suitable for embedded environments
On May 7, 2015, at 4:01 PM, Mayank Kumar (mayankum) <mayankum at cisco.com> wrote:
> --what specifically causes the code size bloat ?
jemalloc implements several features that aren't strictly necessary, which is counter to the nature of highly constrained embedded systems. Thread caches, extensive statistics collection, heap profiling, etc. all require extra code. Additionally, the core algorithms are more sophisticated than those of simpler allocators, which also requires extra code. I just built a dev version of jemalloc on FreeBSD as such:
$ EXTRA_CFLAGS="-Os" ./autogen.sh --disable-stats --disable-tcache --disable-fill
$ strip -g lib/libjemalloc.so.2
$ ls -l lib/libjemalloc.so.2
-rwxr-xr-x 1 jasone wheel 182856 May 7 19:02 lib/libjemalloc.so.2
179 KiB is by no means svelte for a malloc implementation.
> --it is comforting to hear that jemalloc is already part of FreeBSD. I would like to know which version of jemalloc is part of FreeBSD releases now? Also, does the FreeBSD distribution of jemalloc include all the enhancements done for Facebook, or is it some stripped-down version?
IIRC it's somewhere in the 3.5.1-3.6.0 range for FreeBSD 10. I plan to commit version 4 to FreeBSD-11 CURRENT within the next month or so.