Memory usage regression

Tue Oct 30 11:01:21 PDT 2012

On Tue, Oct 30, 2012 at 05:03:38PM +0100, Mike Hommey wrote:
> So, what seems to be happening is that because of that fragmentation, when
> requesting big allocations, jemalloc has to allocate and use new chunks.
> When these big allocations are freed, the new chunk tends to be used
> more often than the other free chunks, adding to the fragmentation, thus
> requiring more new chunks for big allocations.
> 
> The following dumb patch essentially plugs the leak for the Firefox use case:
> 
> diff --git a/src/arena.c b/src/arena.c
> index 1e6964a..38079d7 100644
> --- a/src/arena.c
> +++ b/src/arena.c
> @@ -471,7 +471,7 @@ arena_run_alloc_helper(arena_t *arena, size_t size, bool large, size_t binind,
>         arena_chunk_map_t *mapelm, key;
>  
>         key.bits = size | CHUNK_MAP_KEY;
> -       mapelm = arena_avail_tree_nsearch(&arena->runs_avail_dirty, &key);
> +       mapelm = arena_avail_tree_nsearch(&arena->runs_avail_clean, &key);
>         if (mapelm != NULL) {
>                 arena_chunk_t *run_chunk = CHUNK_ADDR2BASE(mapelm);
>                 size_t pageind = (((uintptr_t)mapelm -
> @@ -483,7 +483,7 @@ arena_run_alloc_helper(arena_t *arena, size_t size, bool large, size_t binind,
>                 arena_run_split(arena, run, size, large, binind, zero);
>                 return (run);
>         }
> -       mapelm = arena_avail_tree_nsearch(&arena->runs_avail_clean, &key);
> +       mapelm = arena_avail_tree_nsearch(&arena->runs_avail_dirty, &key);
>         if (mapelm != NULL) {
>                 arena_chunk_t *run_chunk = CHUNK_ADDR2BASE(mapelm);
>                 size_t pageind = (((uintptr_t)mapelm -
> 
> 
> My test program changed in the meantime, so I can't do accurate
> comparisons with mozjemalloc without re-running more tests. I'll post
> again when I have more data.
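
For anyone not familiar with arena.c, here is a toy sketch of what the
reordering in the patch amounts to. It is not jemalloc code: run_set_t,
run_set_nsearch() and run_alloc() are hypothetical names, and a linear scan
over an array stands in for the real tree search. It only illustrates that
the first set searched wins whenever it holds a run that fits.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one tree of available runs (hypothetical type). */
typedef struct {
	const size_t *sizes;	/* sizes of the free runs, in bytes */
	size_t count;
} run_set_t;

/* Best fit: smallest run that is >= size, or 0 when nothing fits. */
static size_t
run_set_nsearch(const run_set_t *set, size_t size)
{
	size_t best = 0;

	assert(set != NULL);
	for (size_t i = 0; i < set->count; i++) {
		size_t s = set->sizes[i];
		if (s >= size && (best == 0 || s < best))
			best = s;
	}
	return (best);
}

/*
 * Search `first`, and only fall back to `second` when `first` has no
 * fit. *used_second reports which set the returned run came from.
 */
static size_t
run_alloc(const run_set_t *first, const run_set_t *second, size_t size,
    int *used_second)
{
	size_t run = run_set_nsearch(first, size);

	*used_second = 0;
	if (run == 0) {
		run = run_set_nsearch(second, size);
		*used_second = (run != 0);
	}
	return (run);
}
```

Calling run_alloc(&dirty, &clean, ...) models the order before the patch,
and run_alloc(&clean, &dirty, ...) models the order after it.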

Here is a comparison between jemalloc3 (tcache=false, narenas=1,
lg_chunk=20) with the patch above and mozjemalloc.
The graph shows (mozjemalloc - jemalloc3) / jemalloc3 for each value.
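
To make the sign convention explicit, the plotted quantity can be sketched
as below. relative_diff() is a hypothetical helper name, not something from
my test program.

```c
#include <assert.h>

/*
 * Relative difference of a mozjemalloc measurement against the
 * jemalloc3 baseline: (mozjemalloc - jemalloc3) / jemalloc3.
 */
static double
relative_diff(double mozjemalloc, double jemalloc3)
{
	assert(jemalloc3 != 0.0);	/* the baseline must be non-zero */
	return ((mozjemalloc - jemalloc3) / jemalloc3);
}
```

A negative value means mozjemalloc is lower than jemalloc3 for that
measurement, a positive value means it is higher.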

mozjemalloc has a lower RSS after closing tabs, but jemalloc3 has a
lower RSS when all tabs are open. The VmData difference is within an
acceptable range.
"Usable" is the sum of malloc_usable_size() over all allocations; by that
measure, jemalloc3 hands out more "unrequested" memory than mozjemalloc.
That could certainly contribute to some additional fragmentation and,
by extension, to the RSS difference.
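
As a rough sketch of how such a "usable" total can be tracked next to the
requested total: alloc_totals_t, tracked_malloc() and tracked_free() are
hypothetical names, and I am assuming a platform where malloc_usable_size()
is declared in <malloc.h> (glibc on Linux, for instance).

```c
#include <assert.h>
#include <malloc.h>	/* malloc_usable_size(); assumed glibc/Linux */
#include <stdlib.h>

/* Running totals over the allocations made through the wrappers. */
typedef struct {
	size_t requested;	/* bytes the caller asked for */
	size_t usable;		/* bytes the allocator reports as usable */
} alloc_totals_t;

static void *
tracked_malloc(alloc_totals_t *totals, size_t size)
{
	void *p = malloc(size);

	assert(totals != NULL);
	if (p != NULL) {
		totals->requested += size;
		totals->usable += malloc_usable_size(p);
	}
	return (p);
}

/* The caller passes back the size it originally requested. */
static void
tracked_free(alloc_totals_t *totals, void *p, size_t size)
{
	assert(totals != NULL);
	if (p == NULL)
		return;
	totals->requested -= size;
	totals->usable -= malloc_usable_size(p);
	free(p);
}
```

The gap between usable and requested is the "unrequested" memory mentioned
above.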

Mike