<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>On May 7, 2013, at 1:16 PM, Thomas W Savage wrote:</div><blockquote type="cite"><div>My team is having trouble determining how to address increasing internal fragmentation (sizeable diff b/w Jm allocated and active) for a particular workload. We are allocating objects into three small bins (48, 320, 896). We start with an insertion phase in which we continually allocate "entries", which are made up of four allocations: 2x 48-byte objects, 1x 320 obj, and 1x 896 obj. Once we have inserted entries up to a certain threshold, we begin an eviction phase in which we have some threads continuing insertion and another thread freeing 320's and 896's (not touching the 48's). By the end of this run, we observe significant internal fragmentation as demonstrated in the stats below. Is there anything that can be done to mitigate this internal frag?<br><br><tt>Version: 3.3.1-0-g9ef9d9e8c271cdf14f664b871a8f98c827714784<br>
Assertions disabled<br>
Run-time option settings:<br>
opt.abort: false<br>
opt.lg_chunk: 21<br>
opt.dss: "secondary"<br>
opt.narenas: 96<br>
opt.lg_dirty_mult: 1<br>
opt.stats_print: false<br>
opt.junk: false<br>
opt.quarantine: 0<br>
opt.redzone: false<br>
opt.zero: false<br>
CPUs: 24<br>
Arenas: 96<br>
Pointer size: 8<br>
Quantum size: 16<br>
Page size: 4096<br>
Min active:dirty page ratio per arena: 2:1<br>
Chunk size: 2097152 (2^21)<br>
Allocated: 7574200736, active: 8860864512, mapped: 9013559296<br>
Current active ceiling: 8963227648<br>
chunks: nchunks highchunks curchunks<br>
4553 4298 4298<br>
huge: nmalloc ndalloc allocated<br>
16 15 35651584<br>
<br>
Merged arenas stats:<br>
assigned threads: 79<br>
dss allocation precedence: N/A<br>
dirty pages: 2154593:0 active:dirty, 0 sweeps, 0 madvises, 0 purged<br>
allocated nmalloc ndalloc nrequests<br>
small: 7515054496 29540988 3552884 29540988<br>
large: 23494656 1432 0 1432<br>
total: 7538549152 29542420 3552884 29542420<br>
active: 8825212928<br>
mapped: 8973713408<br>
bins: bin size regs pgs allocated nmalloc ndalloc newruns reruns curruns<br>
0 8 501 1 176 22 0 11 0 11<br>
[1]<br>
2 32 126 1 68448 2187 48 22 0 21<br>
3 48 84 1 13880077 0 165272 0 165272<br>
[4]<br>
5 80 50 1 1760 22 0 11 0 11<br>
6 96 84 2 2112 22 0 11 0 11<br>
[7..12]<br>
13 320 63 5 2221154560 8717502 1776394 125156 701794 125156<br>
[14..18]<br>
19 896 45 10 4627583744 6941156 1776442 135776 692084 135774<br>
[20..27]<br>
large: size pages nmalloc ndalloc nrequests curruns<br>
[1]<br>
8192 2 22 0 22 22<br>
[1]<br>
16384 4 1408 0 1408 1408<br>
[13]<br>
73728 18 1 0 1 1<br>
[23]<br>
172032 42 1 0 1 1<br>
[467]<br>
--- End jemalloc statistics --- </tt> </div></blockquote></div>The external fragmentation for 320- and 896-byte region runs is about 12% and 15%, respectively, measured as the share of region capacity in the current runs (curruns * regs * size) that is not allocated. For the 320-byte bin, 125156 runs * 63 regions * 320 bytes gives 2523144960 bytes of capacity, of which 2221154560 are allocated. First off, that doesn't strike me as terrible, depending on the details of what's going on in the application. There are two possible explanations (not mutually exclusive): 1) the application's memory usage is below its high water mark, so some runs that were filled at the peak are now only partially used, and 2) the eviction thread does not evict in a pattern that impacts the allocating threads proportionally to their allocation volumes. As an example of (2), say that there are two arenas, and 75% of the evictions are objects allocated from arena 0, but arenas 0 and 1 are utilized equally by the allocating threads. The result will be substantial external fragmentation in arena 0 in the equilibrium state. You can figure out whether (2) is a factor by running with one arena, i.e. opt.narenas set to 1 (which will surely impact performance, since you have thread caching disabled). If fragmentation remains the same with one arena, then (1) is the entire explanation.<div> </div><div>One possible solution that should be allocator-agnostic would be to interleave eviction with normal allocation in all threads, such that each thread evicts its own previous allocations at a rate proportional to its allocation rate. This changes the eviction policy from a global one to a distributed one, though, so it may not be appropriate, depending on what your application does.</div><div> </div><div>Jason</div></body></html>