Enabling profiling causes hang?

Jason Evans jasone at canonware.com
Fri Feb 6 22:47:05 PST 2015

On Feb 6, 2015, at 6:40 PM, Brock Pytlik <brock.pytlik at gmail.com> wrote:
> I'm trying to track down a possible memory leak in ardb, which uses jemalloc, when running on an ARM processor.
> The problem I have is that when I set 'MALLOC_CONF=prof_leak:true' and start ardb, the program hangs.
> When I strace it, this is the system call it is blocked in:
> futex(0xb6db3c0c, FUTEX_WAIT_PRIVATE, 2, NULL
> Here's the stack trace gdb reports:
> (gdb) bt 
> #0  0xb6e1af4c in __lll_lock_wait_private () from /lib/libc.so.6
> #1  0xb6d79de4 in __new_exitfn () from /lib/libc.so.6
> #2  0xb6d79e34 in __internal_atexit () from /lib/libc.so.6
> #3  0x0016d75c in je_prof_boot2 () at src/prof.c:1349
> #4  0x00143418 in malloc_init_hard () at src/jemalloc.c:767
> #5  0x0014db70 in malloc_init () at src/jemalloc.c:292
> #6  calloc (num=1, size=520) at src/jemalloc.c:1123
> #7  0xb6d79da4 in __new_exitfn () from /lib/libc.so.6
> #8  0xb6d79e34 in __internal_atexit () from /lib/libc.so.6
> #9  0x00013410 in __static_initialization_and_destruction_0 (
>     __initialize_p=<optimized out>, __priority=<optimized out>)
>     at <>/include/c++/4.7.3/iostream:75
> #10 _GLOBAL__sub_I_channel.cpp(void) () at common/channel/channel.cpp:738
> #11 0x001786f8 in __libc_csu_init ()
> #12 0xb6d61834 in __libc_start_main () from /lib/libc.so.6
> #13 0x0001424c in _start ()
> I'll admit at this point I'm kind of stumped as to how to proceed. I thought I'd start by asking here in case anyone had seen similar behavior or knew what the problem was. I checked the issues on the GitHub page and couldn't find anything similar. This is using version 3.6.0 of jemalloc.
> When I looked at the jemalloc.xml user manual in the gate tip (not the 3.6.0 branch I'm using) I saw some discussion of atexit being problematic in the context of prof_final. If that's the problem, is there a way to generate profiling information while the program is running, avoiding the atexit issue? I didn't quite follow the comment in the user manual that says that an application can register its own atexit function with equivalent functionality. Is there an example someplace I could crib from, or could someone help me understand that a bit?

Yes, it looks like you're hitting the same atexit() issue that caused me to disable prof_final by default in the dev branch of jemalloc.  If you instead use MALLOC_CONF=prof_final:false,lg_prof_interval:20 (see http://www.canonware.com/download/jemalloc/jemalloc-latest/doc/jemalloc.html#opt.lg_prof_interval and choose an appropriate dump interval), you should be able to get useful heap profiles without modifying the ardb source code.  The prof_gdump option provides an alternative mechanism for triggering heap dumps.  If you want to go as far as modifying the ardb source code, you can call the prof.dump mallctl to trigger a dump.  None of these options will cause jemalloc to print a leak report during exit, but the resulting heap profiles are what you really need anyway.
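
For the source-modifying route, here is a minimal sketch of calling the prof.dump mallctl, plus an application-registered exit handler of the kind the manual alludes to.  It is an illustration under assumptions, not code from ardb or jemalloc: the helper names (dump_heap_profile, dump_heap_profile_at_exit, register_exit_heap_dump) are made up, and it assumes jemalloc's public API is exported unprefixed as mallctl (a build configured with a je_ prefix would export je_mallctl instead).  The weak declaration is only there so the file still links when jemalloc is absent.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Assumption: jemalloc's public API is exported without a prefix. With a
 * je_ prefix the symbol would be je_mallctl. Declared weak so that this
 * file links (and mallctl compares equal to NULL) when jemalloc is absent.
 */
extern int mallctl(const char *name, void *oldp, size_t *oldlenp,
                   void *newp, size_t newlen) __attribute__((weak));

/*
 * Ask jemalloc to write a heap profile. A NULL path leaves the file name
 * to jemalloc (derived from its configured prefix). Returns 0 on success,
 * -1 if mallctl is not linked in, otherwise mallctl's error code.
 */
int dump_heap_profile(const char *path)
{
    if (mallctl == NULL)
        return -1;
    /* prof.dump is write-only and takes a (const char *) value. */
    return mallctl("prof.dump", NULL, NULL, &path, sizeof(path));
}

/* Hypothetical exit handler: dump once when the process exits normally. */
void dump_heap_profile_at_exit(void)
{
    int rc = dump_heap_profile(NULL);
    if (rc != 0)
        fprintf(stderr, "heap profile dump failed: %d\n", rc);
}

/*
 * Intended to be called from main(), after static initialization has
 * finished, so atexit() is not entered from inside allocator startup.
 */
int register_exit_heap_dump(void)
{
    return atexit(dump_heap_profile_at_exit);
}
```

Calling register_exit_heap_dump() early in main() would give a final dump at exit, and dump_heap_profile() can also be called at any point of interest while the program runs.  Profiling still has to be enabled at startup for prof.dump to succeed; as far as I can tell that means including prof:true in MALLOC_CONF alongside prof_final:false, but check the manual for your version.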


