RFC: TCMalloc-style new/delete hooks

Wed Oct 15 02:14:08 PDT 2014

My 2 cents: at least for Redis, creating application-level
wrappers was the better strategy.
For memory tracking, the application-level approach abstracts the
logic away from the allocator itself, so that different allocators
can be used; when the allocator in use offers certain low-level
functionality (for example, je_malloc_usable_size), it is exploited.
When the task is instead object cleanup, my opinion is that most of
the time you also need some way to deal with multiple references,
that is, reference counting, so a higher layer is again the best
fit.
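
A minimal sketch of what such an application-level tracking wrapper
can look like. It is an illustration only, not the actual Redis code:
the names tracked_malloc, tracked_free and tracked_used, and the
HAVE_USABLE_SIZE build flag, are hypothetical. When the flag is set
the wrapper assumes the allocator provides malloc_usable_size() (as
glibc does via <malloc.h>); otherwise it stores the requested size in
a small prefix in front of each block.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#ifdef HAVE_USABLE_SIZE                 /* hypothetical build flag */
#include <malloc.h>                     /* malloc_usable_size() on glibc */
#define PREFIX 0
#define BLOCK_SIZE(raw) malloc_usable_size(raw)
#else
/* Fallback: remember the requested size in a prefix. Using
 * sizeof(max_align_t) keeps the pointer handed to the caller aligned. */
#define PREFIX sizeof(max_align_t)
#define BLOCK_SIZE(raw) (*(size_t *)(raw))
#endif

/* Total bytes currently accounted for by the wrapper. */
static atomic_size_t used_memory;

void *tracked_malloc(size_t size) {
    if (size > SIZE_MAX - PREFIX) return NULL;   /* overflow guard */
    void *raw = malloc(size + PREFIX);
    if (raw == NULL) return NULL;
#ifndef HAVE_USABLE_SIZE
    *(size_t *)raw = size;                       /* record size in the prefix */
#endif
    atomic_fetch_add(&used_memory, BLOCK_SIZE(raw));
    return (char *)raw + PREFIX;
}

void tracked_free(void *ptr) {
    if (ptr == NULL) return;
    void *raw = (char *)ptr - PREFIX;
    atomic_fetch_sub(&used_memory, BLOCK_SIZE(raw));
    free(raw);
}

size_t tracked_used(void) {
    return atomic_load(&used_memory);
}
```

In the fallback path the counter reflects the sizes the application
asked for; with malloc_usable_size() it reflects what the allocator
actually handed out. A per-bucket variant could index the counter by
a thread-local bucket ID instead of using a single global.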

Salvatore

On Tue, Oct 14, 2014 at 7:55 PM, Jason Evans <jasone at canonware.com> wrote:
> On Oct 14, 2014, at 9:13 AM, David Rigby <daver at couchbase.com> wrote:
>> We are currently using TCMalloc as our memory allocator; however, the significantly better fragmentation characteristics and deterministic lowest-available address selection of jemalloc mean we want to switch to jemalloc in the near future.
>>
>> One of (the only?) sticking points however is the lack of a direct equivalent to TCMalloc’s new/delete hooks, which allow an application to register callbacks when memory is allocated/freed by the application.
>>
>> We use this feature to essentially perform sub-heap memory tracking, to determine how much memory different buckets (think tables/databases) are using. To be more specific, as a worker thread is assigned to a particular bucket the bucket ID is stored in TLS, and then when a new/delete callback is invoked we look up the thread’s current bucket from TLS and increment/decrement the total used as appropriate.
>>
>> To allow us to work with jemalloc, I’ve implemented[1] equivalent functionality in jemalloc.
>>
>> I did consider making use of the arena functionality in jemalloc for this, but I was concerned about the potential increase in fragmentation with many arenas, which is exactly one of the reasons why we want to move away from TCMalloc (I’m proposing setting narenas=1 when we deploy).
>>
>> How would you (Jason?) feel about merging this patch, or something conceptually similar into upstream?
>>
>> [1]: https://github.com/daverigby/jemalloc/commit/bbf3877d785417f03671bd1aed94723d750937d5
>
> I have some concerns about this functionality that have kept me from adding it so far:
>
> - It adds yet another branch to the fast path, whereas if you create your own wrappers and mangle jemalloc's API, it imposes no cost on applications which don't need hooks.
> - It's really tricky (and requires a messy API) to support hooks that get called for all allocations from the beginning of program execution.  I don't know of a way to pull this off short of exposing weak function pointer symbols that can be overridden during static linking or dynamic loading.
> - It can result in really surprising "impossible" behavior if the compiler makes assumptions about globally visible side effects, as does gcc.  In order to make hooks generally safe, the application must be compiled with -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free.  Other compilers potentially have similar issues, possibly without escape hatches.  I worry that hooks add a documentation burden on jemalloc, and that people will repeatedly fail to take note of this requirement, leaving them with the impression that jemalloc is somehow flakey.
>
> Jason
> _______________________________________________
> jemalloc-discuss mailing list
> jemalloc-discuss at canonware.com
> http://www.canonware.com/mailman/listinfo/jemalloc-discuss

-- 
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

"Fear makes the wolf bigger than he is."
       — German proverb