Diagnosing an out-of-memory issue

Thu Jul 7 14:28:01 PDT 2016

On Jun 26, 2016, at 1:00 PM, Matthew Fleming <mdf at purestorage.com> wrote:
> I'm not sure which details will be relevant so I may be including too much info below.  I'm using jemalloc with custom hooks to manage about 54GB of virtual space under Linux on x86_64. The hooks manage the address space in 2MB chunks so I can use HUGETLB for the mappings. Slightly simplified, the hooks do the following (roughly as expected, I think):
> 
>     alloc: mmap size bytes, aligned appropriately
>     dalloc: munmap the space (not really, I recycle the memory internally, but it's logically the same)
>     commit: return false
>     decommit: return true
>     purge: return true
>     split: return false
>     merge: return false
> 
> We're experiencing some out-of-memory issues, mostly due to a known runaway allocation site that we're working to fix. However, while debugging this I'm seeing some numbers for jemalloc usage that leave me concerned.  At the time of the OOM, I can see that we indeed have 54GB of virtual space used (we're using rlimit to set a 54GB limit for the process).
> 
> However, I also see the following from je_malloc_stats_print at the time we cross the 54GB virtual threshold which seems low to me:
> 
> Allocated: 38825889776, active: 47743795200, metadata: 1304838720, resident: 50520985600, mapped: 56247713792
> Current active ceiling: 47758442496
> 
> [...]
> 
> Where I'm wondering if I'm misusing jemalloc is in the allocated vs. active vs. mapped numbers. Allocated/active implies 0.813 utilization; is this expected? Active/mapped further gives 0.848 utilization; is this expected? It seems somewhere between unfortunate and buggy that jemalloc calls my alloc hook for more virtual/physical space when only 69% of the total mapped space is used. This turns my 54GB vmem limit into something like a 37GB limit on actual allocations, a loss of 17GB!
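
For reference, the ratios in the question can be reproduced from the quoted stats line. The following is only a small arithmetic sketch; the helper name `ratio` is hypothetical and not part of jemalloc.

```c
#include <stdint.h>

/*
 * Fraction of `whole` that `part` occupies. Hypothetical helper used
 * only to check the arithmetic on the quoted byte counts.
 */
double ratio(uint64_t part, uint64_t whole)
{
    return (double)part / (double)whole;
}

/*
 * With the figures from the stats line:
 *   allocated = 38825889776, active = 47743795200, mapped = 56247713792
 *
 *   ratio(allocated, active)  is roughly 0.81
 *   ratio(active, mapped)     is roughly 0.85
 *   ratio(allocated, mapped)  is roughly 0.69
 *
 * The last value is the product of the first two, which is where the
 * "69% of the total mapped space" figure comes from.
 */
```

Applied to the 54GB limit, the combined ratio of about 0.69 gives the roughly 37GB of usable allocations mentioned above.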

0.813 utilization isn't great, but it isn't awful either.  You may be able to improve this substantially by decreasing the number of arenas, so that threads cause less per-arena usage fluctuation.
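
As a minimal sketch of how the arena count might be lowered: jemalloc reads an option string from a global named malloc_conf at startup, and `narenas` is one of the options it accepts. The `je_` prefix below is an assumption based on the `je_malloc_stats_print` call in the question, and the value 4 is purely illustrative, not a recommendation.

```c
/*
 * Compiled-in jemalloc options. In an unprefixed build the symbol is
 * named malloc_conf; a build configured with a "je_" prefix is assumed
 * here, so the symbol becomes je_malloc_conf.
 *
 * "narenas:4" caps the number of automatically managed arenas at 4.
 * The right value depends on thread count and contention and has to be
 * found by measurement.
 */
const char *je_malloc_conf = "narenas:4";
```

Fewer arenas means more threads share each one, which trades some lock contention for less memory stranded in lightly used arenas.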

Regarding low virtual memory utilization, I'm guessing that the chunks are too fragmented to service the 16 KiB requests that appear to be the most common size class in your application.

> One thing I think I did wrong that I am fixing is that I had set opt.lg_tcache_max: 21; based on the actual use of the system I don't think I need to have a per-thread cache for anything over 16kB. I have no visibility (even slightly laggy) into how much memory is held in the tcache, though. This would be a nice addition to the available stats, even if the number isn't completely accurate.
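
The option value is the base-2 logarithm of the largest size class the thread cache will hold, so 21 corresponds to 2 MiB. A rough sketch of deriving the value for a 16 KiB cutoff follows, assuming "16kB" means 16384 bytes; the helper names `lg_floor` and `format_tcache_opt` are hypothetical and not part of jemalloc.

```c
#include <stddef.h>
#include <stdio.h>

/* Floor of log2(n) for n > 0, for example 16384 -> 14. */
unsigned lg_floor(size_t n)
{
    unsigned lg = 0;
    while (n >>= 1) {
        lg++;
    }
    return lg;
}

/*
 * Write "lg_tcache_max:<lg>" into buf for a largest cached size of
 * max_bytes. Returns the snprintf result (number of characters that
 * the full string needs, excluding the terminating NUL).
 */
int format_tcache_opt(char *buf, size_t buflen, size_t max_bytes)
{
    return snprintf(buf, buflen, "lg_tcache_max:%u", lg_floor(max_bytes));
}
```

For a 16384-byte cutoff this yields the string "lg_tcache_max:14", which could then be placed in the malloc_conf option string in place of the earlier value of 21.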

We will hopefully have such stats in 5.x (see https://github.com/jemalloc/jemalloc/pull/380).

Thanks,
Jason