Help with a segfault

Jeff Hammond jeff.science at gmail.com
Tue Oct 7 10:36:50 PDT 2014


You might ask NERSC about this directly, assuming you are using NERSC
Cray systems.

I know that other memory allocators in software I use on Cray need to
be explicitly aware of the hugepages feature; otherwise they segfault.
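
As an illustration only, one thing "aware of hugepages" can mean is that
the allocator requests and releases memory in whole huge-page units. The
sketch below is not jemalloc code and is not a diagnosis of this crash.
The 64 MiB page size is an assumption taken from the hugepages64M module
name, and round_up_to_page() and map_page_aligned() are hypothetical
helpers made up for this example.

```c
#define _GNU_SOURCE	/* for MAP_ANONYMOUS on strict compilers */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 64 MiB pages, matching the hugepages64M module name. */
#define HUGE_PAGE_SIZE ((size_t)64 << 20)

/*
 * Round size up to the next multiple of page_size, which must be a
 * power of two.  Sizes within page_size of SIZE_MAX would overflow;
 * a real allocator has to reject those first.
 */
static size_t
round_up_to_page(size_t size, size_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return ((size + page_size - 1) & ~(page_size - 1));
}

/*
 * Hypothetical helper: map anonymous memory in whole huge-page units.
 * Returns NULL on failure and stores the rounded length in *mapped_size
 * so the caller can pass the same length to munmap() later.
 */
static void *
map_page_aligned(size_t size, size_t *mapped_size)
{
	void *p;

	*mapped_size = round_up_to_page(size, HUGE_PAGE_SIZE);
	p = mmap(NULL, *mapped_size, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return (p == MAP_FAILED ? NULL : p);
}
```

With these assumptions, a 2048-byte request like the one in the backtrace
would be rounded up to one full 64 MiB page before reaching the kernel.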

Jeff

On Mon, Oct 6, 2014 at 10:59 PM, Marcin Zalewski
<marcin.zalewski at gmail.com> wrote:
> I am using the jemalloc from 4dcf04bfc03b9e9eb50015a8fc8735de28c23090 on a
> Cray system. We use jemalloc for all allocations, and I get a strange issue
> with Cray's hugepages implementation. When I do not use the Cray hugepages
> module, my code runs fine. However, when I load hugepages64M, I get the
> following segmentation fault:
>
> Program received signal SIGSEGV, Segmentation fault.
> je_chunk_alloc_default (size=2048, alignment=0, zero=0x7fffffffa96f,
>     arena_ind=0) at chunk.c:254
> 254             return (chunk_alloc_core(size, alignment, false, zero,
> (gdb) bt
> #0  je_chunk_alloc_default (size=2048, alignment=0, zero=0x7fffffffa96f,
>     arena_ind=0) at chunk.c:254
> #1  0x000000002001586f in je_huge_palloc (tsd=0x2aaab02092d0,
>     arena=<optimized out>, size=size@entry=2048, alignment=0,
>     zero=zero@entry=true) at huge.c:50
> #2  0x0000000020015908 in je_huge_malloc (tsd=<optimized out>,
>     arena=<optimized out>, size=size@entry=2048, zero=zero@entry=true)
>     at huge.c:19
> #3  0x0000000020018c90 in je_icalloct (arena=<optimized out>,
>     try_tcache=<optimized out>, size=2048, tsd=<optimized out>)
>     at ../../../contrib/jemalloc/include/jemalloc/internal/jemalloc_internal.h:662
> #4  imallocx_flags (arena=<optimized out>, try_tcache=<optimized out>,
>     zero=true, alignment=0, usize=2048, tsd=<optimized out>)
>     at jemalloc.c:1450
> #5  imallocx_no_prof (usize=<synthetic pointer>, flags=<optimized out>,
>     size=<optimized out>, tsd=<optimized out>) at jemalloc.c:1531
> #6  libxxx_mallocx (size=<optimized out>, flags=<optimized out>)
>     at jemalloc.c:1550
> #7  0x00002aaaaf6b9445 in register_printf_type () from /lib64/libc.so.6
> #8  0x00002aaaabf019c0 in register_printf_flt128 ()
>     at ../../../cray-gcc-4.9.0/libquadmath/printf/quadmath-printf.c:390
> #9  0x00002aaaabf09de6 in __do_global_ctors_aux ()
>    from /opt/gcc/4.9.0/snos/lib64/libquadmath.so.0
> #10 0x00002aaaabee51fb in _init ()
>    from /opt/gcc/4.9.0/snos/lib64/libquadmath.so.0
> #11 0x00007fffffffaaf8 in ?? ()
> #12 0x00002aaaaaab91b8 in call_init () from /lib64/ld-linux-x86-64.so.2
> #13 0x00002aaaaaab92e7 in _dl_init_internal ()
>    from /lib64/ld-linux-x86-64.so.2
> #14 0x00002aaaaaaabb3a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
> #15 0x0000000000000001 in ?? ()
> #16 0x00007fffffffb209 in ?? ()
> #17 0x0000000000000000 in ?? ()
>
> I know that this is not very much info to go on, but I wonder if it rings a
> bell for someone immediately. As far as I can understand, the Cray hugepages
> module silently changes all the pages to hugepages of a chosen size:
>
> http://www.nersc.gov/users/computational-systems/hopper/programming/tuning-options/
>
> What could be an obvious reason to cause the segmentation fault on that
> line? The line in question is this:
>
>         return (chunk_alloc_core(size, alignment, false, zero,
>             arenas[arena_ind]->dss_prec));
>
> It seems that "arenas" is not properly initialized, but only with hugepages.
>
> Thank you for any help.



-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/

