Does arena_tcache_fill_small() ever end up bypassing custom chunk allocation?

Wed Feb 11 06:26:59 PST 2015

Okay, sorry for the stream of consciousness last night. I was completely wrong about the source of our problem. Specifying MALLOCX_ARENA() used to implicitly bypass the cache, but now I see MALLOCX_TCACHE_NONE as a separate option. Adding that to our flags fixes the immediate issue we were having.

I’ve included some responses to the earlier discussion inline.

> On Feb 11, 2015, at 12:45 AM, Jason Evans <jasone at canonware.com> wrote:
> 
> On Feb 10, 2015, at 8:29 PM, D'Alessandro, Luke K <ldalessa at indiana.edu> wrote:
>>> On Feb 10, 2015, at 11:00 PM, D'Alessandro, Luke K <ldalessa at indiana.edu> wrote:
>>>> On Feb 10, 2015, at 10:52 PM, D'Alessandro, Luke K <ldalessa at indiana.edu> wrote:
>>>> We have an arena that we are using for a specific bit of memory that we are managing with a custom chunk allocator.
>>>> 
>>>> We’re seeing initial small allocations through this arena succeed without calling out to our custom chunk allocator. This behavior appears to be new since b617df8. While tracing this behavior, I’m seeing the initial allocations miss in the cache, and forward to arena_tcache_fill_small(), which does some difficult-to-dissect stuff including something to do with runs. After this call, the cache succeeds in supplying the allocation, without ever getting our chunk allocator involved.
> 
> 10aff3f3e1b8b3ac0348b259c439c9fe870a6b95 had a lot of a0-related changes in it, which may be related to the behavior change you're seeing.
> 
>>>> This causes us issues because the memory we’re getting back doesn’t seem to come from the place we’re expecting it to come from.
>>>> 
>>>> Is arena_tcache_fill_small() going somewhere special to find memory that was allocated with some other chunk allocator, previous to our initialization of our arena?
>>>> 
>>> Ah, actually, I see one call happening to the default chunk allocator for another arena during initialization.
>>> 
>>> We’re calling mallctl with opt.lg_chunk, and that triggers a really early a0malloc() inside of ctl_init(), which cascades into the chunk_alloc_default before we’ve had a chance to set our custom allocator on the default arena. I guess that arena_tcache_fill_small() uses this chunk to satisfy misses for small objects, which makes sense.
> 
> Is the problem that the region that is allocated from arena 0 ends up being freed back to the tcache, and you're depending on the tcache only containing regions from your custom arena?  

I think that could be a problem, but I’m not sure it’s actually happening.

> In that case you could flush the tcache once you have the arena fully configured.

I do that, right before swapping in a custom arena, but I was worried that jemalloc was filling the cache from the arena 0 chunk that gets allocated right away. If that ever happens I’m out of luck. I’ve added some asserts to make sure that we’re never getting memory back from a chunk that we didn’t provide. It may be that this is a complete non-issue.
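
For illustration, a containment assert of this kind might look like the sketch below. The names region_t and region_contains are hypothetical; they are not part of jemalloc or of the code discussed in this thread. The sketch only assumes that the custom chunk allocator hands out memory from one contiguous region whose base and size are known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for the contiguous region that backs the
 * custom chunk allocator (for example, a network-registered buffer). */
typedef struct {
    uintptr_t base; /* first byte of the region */
    size_t    size; /* length of the region in bytes */
} region_t;

/* Returns true when the n bytes starting at p lie entirely inside r.
 * The comparison is done on offsets so that it cannot overflow. */
static bool region_contains(const region_t *r, const void *p, size_t n)
{
    uintptr_t addr = (uintptr_t)p;
    if (addr < r->base) {
        return false;
    }
    size_t off = (size_t)(addr - r->base);
    return off <= r->size && n <= r->size - off;
}
```

A caller would then wrap each allocation from the custom arena in something like assert(region_contains(&region, ptr, size)), so that a pointer served from the arena 0 chunk trips the assert immediately instead of failing later.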

> I'm a bit confused about what your code is trying to do, but maybe you're hitting one of these problems:
> 
> - You're trying to change the chunk allocator for one of the automatic arenas (arena 0?).  This is unlikely to ever work reliably for arena 0, though it would potentially be possible for other automatic arenas prior to launching any threads.

We swap in custom arenas for all of the threads to avoid this issue. I was simply confused about seeing the default chunk allocator firing, but now I understand when and why that happens, and it wasn’t the cause of this issue.

> - You're calling the thread.arena mallctl to refer to a newly created arena before you've finished setting up the arena to use your chunk allocator.
> 
> I still have a bit of work to do on making sure that no metadata are allocated from non-auto arenas, so that they can be reset (see https://github.com/jemalloc/jemalloc/issues/146).  Is that related to what you're hitting?

Not directly. As long as our cache never contains objects from the “primordial” chunk, we’re okay. We never free anything from that chunk, so the only remaining risk is jemalloc trying to fill the cache from it, and given that I’m no longer using its arena I don’t think that should happen.

As it is, linking our prefixed jemalloc to manage our network-registered memory region has been working quite well up to this point.

Thank you,
Luke