Guillaume Holley
gholley at cebitec.uni-bielefeld.de
Wed Aug 6 08:55:15 PDT 2014
On 06/08/2014 01:10, Jason Evans wrote:
> On Aug 5, 2014, at 10:35 AM, gholley at CeBiTec.Uni-Bielefeld.DE wrote:
>> I’m currently working on a data structure allowing the storage of a
>> dynamic set of short DNA sequences plus annotations.
>> Here are a few details: the data structure is written in C, tests are
>> currently run on Ubuntu 14.04 64-bit, everything is single-threaded, and
>> Valgrind indicates that the program which manipulates the data structure
>> has no memory leaks.
>>
>> I’ve started to use Jemalloc in an attempt to reduce memory fragmentation
>> (by using one arena, disabling the thread caching system and using a high
>> ratio of dirty pages). On small data sets (30 million insertions), the
>> results are very good compared to Glibc: about 150 MB less when using
>> tuned Jemalloc.
>>
>> Now, I’ve started tests with much bigger data sets (3 to 10 billion
>> insertions) and I realized that Jemalloc is using more memory than Glibc.
>> I have generated a data set of 200 million entries which I tried to
>> insert into the data structure, and when the memory used reached 1 GB, I
>> stopped the program and reported the number of entries inserted.
>> When using Jemalloc, no matter the tuning parameters (1 or 4 arenas,
>> tcache activated or not, lg_dirty = 3 or 8 or 16, lg_chunk = 14 or 22 or
>> 30), the number of entries inserted varies between 120 million and 172
>> million, whereas with the standard Glibc I am able to insert 187 million
>> entries.
>> And on billions of entries, Glibc (I don’t have precise numbers
>> unfortunately) uses a few gigabytes less than Jemalloc.
>>
>> So I would like to know if there is an explanation for this, and if I can
>> do something to make Jemalloc at least as efficient as Glibc on these
>> tests. Maybe I’m not using Jemalloc correctly?
> There are a few possible issues, mainly related to fragmentation, but I can't make many specific guesses because I don't know what the allocation/deallocation patterns are in your application. It sounds like your application just does a bunch of allocation, with very little interspersed deallocation, in which case I'm surprised by your results unless you happen to be allocating lots of objects that are barely larger than the nearest size class boundaries (e.g. 17 bytes). Have you taken a close look at the output of malloc_stats_print()?
>
> Jason
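The size-class rounding Jason refers to can be observed directly with a minimal
sketch like the one below (my example, not part of the data structure; it
assumes the binary is linked against jemalloc, e.g. with -ljemalloc, so that
malloc_usable_size reports jemalloc's size classes): a 17-byte request is
served from the 32-byte class, and anything between 49 and 64 bytes ends up in
the 64-byte bin.

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* malloc_usable_size(); jemalloc provides it as well */

int main(void) {
    /* Request sizes straddling the 16/32/48/64 size-class boundaries. */
    size_t requests[] = {16, 17, 48, 49, 64, 65};
    for (size_t i = 0; i < sizeof(requests) / sizeof(requests[0]); i++) {
        void *p = malloc(requests[i]);
        if (p == NULL)
            return 1;
        size_t usable = malloc_usable_size(p);
        printf("requested %3zu bytes -> reserved %3zu bytes (%.0f%% overhead)\n",
               requests[i], usable,
               100.0 * (double)(usable - requests[i]) / (double)requests[i]);
        free(p);
    }
    return 0;
}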
Hi and thank you for the help.
As you said, my application does a lot of allocations with very little
deallocation, but the allocated memory is very often reallocated. However,
my application never allocates more than 600 KB in a single allocation, so
there are no huge objects.
I had a look at the output of malloc_stats_print() (enclosed at the end of
this answer) and I see that for the 64-byte bin I make a huge number of
allocations/reallocations for a small amount of memory allocated; maybe
this generates a lot of fragmentation, but I don't know to what extent. Do
you think my problem could be linked to this, and if so, is there something
I can do on the Jemalloc side to solve it?
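For completeness, the dump below was produced along these lines (a sketch
rather than my actual code; it assumes a jemalloc build without the "je_"
symbol prefix, otherwise the symbols are je_malloc_conf and
je_malloc_stats_print). The options can either be baked in through the
malloc_conf global, as shown here, or passed at run time via the MALLOC_CONF
environment variable.

#include <stdlib.h>
#include <jemalloc/jemalloc.h>

/* Compiled-in option string; MALLOC_CONF="narenas:1,tcache:false,lg_dirty_mult:8"
 * in the environment has the same effect. */
const char *malloc_conf = "narenas:1,tcache:false,lg_dirty_mult:8";

int main(void) {
    void *p = malloc(1 << 20);   /* placeholder for the real workload */
    free(p);
    /* NULL callback/opaque -> write to stderr; NULL opts -> full report. */
    malloc_stats_print(NULL, NULL, NULL);
    return 0;
}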
Here are the Jemalloc statistics for 200 million insertions, with Jemalloc
tuned using the following parameters:
"narenas:1,tcache:false,lg_dirty_mult:8"
___ Begin jemalloc statistics ___
Version: 3.6.0-0-g46c0af68bd248b04df75e4f92d5fb804c3d75340
Assertions disabled
Run-time option settings:
opt.abort: false
opt.lg_chunk: 22
opt.dss: "secondary"
opt.narenas: 1
opt.lg_dirty_mult: 8
opt.stats_print: false
opt.junk: false
opt.quarantine: 0
opt.redzone: false
opt.zero: false
opt.tcache: false
opt.lg_tcache_max: 15
CPUs: 4
Arenas: 1
Pointer size: 8
Quantum size: 16
Page size: 4096
Min active:dirty page ratio per arena: 256:1
Maximum thread-cached size class: 32768
Chunk size: 4194304 (2^22)
Allocated: 1196728936, active: 1212567552, mapped: 1287651328
Current active ceiling: 16416505856
chunks: nchunks highchunks curchunks
307 307 307
huge: nmalloc ndalloc allocated
0 0 0
arenas[0]:
assigned threads: 1
dss allocation precedence: disabled
dirty pages: 296037:66 active:dirty, 25210 sweeps, 90046 madvises, 627169 purged
allocated nmalloc ndalloc nrequests
small: 1116037736 372536523 370598419 372536523
large: 80691200 223900 220617 223900
total: 1196728936 372760423 370819036 372760423
active: 1212567552
mapped: 1283457024
bins: bin  size regs pgs  allocated    nmalloc    ndalloc  nrequests nfills nflushes newruns     reruns curruns
        0     8  501   1      78696    1052422    1042585    1052422      0        0     127     694344      20
        1    16  252   1    1578192    1017985     919348    1017985      0        0     533     708540     409
        2    32  126   1    3514624    1123313    1013481    1123313      0        0    1210     867880     872
        3    48   84   1    3035520    1016243     953003    1016243      0        0    1108     807914     754
        4    64   63   1   10854720  352069628  351900023  352069628      0        0   54018  291399155    2981
        5    80   50   1    3181120     911052     871288     911052      0        0    1096     734143     804
        6    96   84   2    3104352     872936     840599     872936      0        0     536     678759     386
        7   112   72   2    2807728     846038     820969     846038      0        0     505     664524     349
        8   128   63   2    6818048    1020930     967664    1020930      0        0    1273     864533     846
        9   160   51   2   36769280    1169391     939583    1169391      0        0    6277     854027    4507
       10   192   63   3    6427584     995330     961853     995330      0        0     999     783827     544
       11   224   72   4    6205472     915020     887317     915020      0        0     759     741292     422
       12   256   63   4    7763456    1448444    1418118    1448444      0        0    1011    1402929     524
       13   320   63   5   20894720    1038961     973665    1038961      0        0    1579     854341    1062
       14   384   63   6   21324672     972166     916633     972166      0        0    1378     792419     900
       15   448   63   7   22307264     916731     866938     916731      0        0    1283     737804     811
       16   512   63   8   23352832     867656     822045     867656      0        0    1285     686944     749
       17   640   51   8   52571520     826774     744631     826774      0        0    2379     614914    1616
       18   768   47   9   53935872     937259     867030     937259      0        0    3608     810030    1495
       19   896   45  10   51936640     677539     619574     677539      0        0    2016     473346    1289
       20  1024   63  16   52584448     832395     781043     832395      0        0    1368     742251     816
       21  1280   51  16  600357120     641800     172771     641800      0        0    9221     184785    9197
       22  1536   42  16   62310912     174403     133836     174403      0        0     992     136077     966
       23  1792   38  17   19763968      66986      55957      66986      0        0     295      52269     291
       24  2048   65  33   12324864      45615      39597      45615      0        0     109      30775      93
       25  2560   52  33   16240640      35532      29188      35532      0        0     131      22898     122
       26  3072   43  33    8377344      24578      21851      24578      0        0      69      15804      64
       27  3584   39  35    5616128      19396      17829      19396      0        0      51      11992      41
large: size pages nmalloc ndalloc nrequests curruns
4096 1 15792 15119 15792 673
8192 2 10900 9456 10900 1444
12288 3 4568 4413 4568 155
16384 4 3405 3344 3405 61
20480 5 2905 2848 2905 57
24576 6 2618 2571 2618 47
28672 7 57010 56965 57010 45
32768 8 2703 2666 2703 37
36864 9 2656 2613 2656 43
40960 10 2745 2703 2745 42
45056 11 2919 2862 2919 57
49152 12 3034 2990 3034 44
53248 13 3136 3090 3136 46
57344 14 3347 3302 3347 45
61440 15 3645 3606 3645 39
65536 16 3864 3832 3864 32
69632 17 3879 3844 3879 35
73728 18 3889 3861 3889 28
77824 19 3937 3908 3937 29
81920 20 3956 3910 3956 46
86016 21 3985 3958 3985 27
90112 22 4005 3974 4005 31
94208 23 3924 3896 3924 28
98304 24 3815 3786 3815 29
102400 25 3775 3751 3775 24
106496 26 3767 3740 3767 27
110592 27 4004 3980 4004 24
114688 28 4289 4273 4289 16
118784 29 4274 4259 4274 15
122880 30 4160 4150 4160 10
126976 31 4096 4089 4096 7
131072 32 3934 3925 3934 9
135168 33 3792 3785 3792 7
139264 34 3704 3701 3704 3
143360 35 3571 3566 3571 5
147456 36 3559 3557 3559 2
151552 37 3412 3410 3412 2
155648 38 3155 3155 3155 0
159744 39 2826 2823 2826 3
163840 40 2423 2422 2423 1
167936 41 2174 2173 2174 1
172032 42 1883 1881 1883 2
176128 43 1563 1563 1563 0
180224 44 1145 1143 1145 2
184320 45 736 735 736 1
188416 46 492 491 492 1
192512 47 300 300 300 0
196608 48 149 149 149 0
200704 49 56 56 56 0
204800 50 20 20 20 0
208896 51 2 2 2 0
212992 52 1 1 1 0
217088 53 1 0 1 1
[965]
--- End jemalloc statistics ---
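In case it is useful for future runs: the per-bin counters above can also be
read from inside the program between insertion batches, for example to watch
how the 64-byte bin (bin 4) evolves. This is a sketch, not part of my program;
the mallctl names are the ones documented for jemalloc 3.x ("epoch",
"stats.arenas.<i>.bins.<j>.*", requiring a build with stats enabled and no
"je_" symbol prefix), and error checking is omitted.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <jemalloc/jemalloc.h>

/* Refresh the cached statistics, then print the counters of bin 4
 * (the 64-byte size class) of arena 0. */
static void print_bin64_stats(void) {
    uint64_t epoch = 1;
    size_t sz = sizeof(epoch);
    mallctl("epoch", &epoch, &sz, &epoch, sz);  /* stats are snapshotted per epoch */

    size_t allocated, curruns;
    uint64_t nmalloc, ndalloc;
    sz = sizeof(allocated);
    mallctl("stats.arenas.0.bins.4.allocated", &allocated, &sz, NULL, 0);
    sz = sizeof(curruns);
    mallctl("stats.arenas.0.bins.4.curruns", &curruns, &sz, NULL, 0);
    sz = sizeof(nmalloc);
    mallctl("stats.arenas.0.bins.4.nmalloc", &nmalloc, &sz, NULL, 0);
    sz = sizeof(ndalloc);
    mallctl("stats.arenas.0.bins.4.ndalloc", &ndalloc, &sz, NULL, 0);

    printf("bin 4 (64 B): allocated=%zu bytes, live regions=%llu, curruns=%zu\n",
           allocated, (unsigned long long)(nmalloc - ndalloc), curruns);
}

int main(void) {
    void *p[1000];
    for (int i = 0; i < 1000; i++)
        p[i] = malloc(50);          /* 49-64 byte requests land in bin 4 */
    print_bin64_stats();
    for (int i = 0; i < 1000; i++)
        free(p[i]);
    return 0;
}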
Guillaume