mremap with modern Linux kernel

Fri Apr 25 15:21:53 PDT 2014

On Apr 25, 2014, at 2:55 AM, Daniel Micay <danielmicay at gmail.com> wrote:
> This option was originally disabled by default due to fragmentation
> issues. It provides a significant performance win for Rust's vectors at
> very large sizes, so I'm curious about the severity of this issue and
> whether it is still around in the latest Linux kernel releases.

As far as I know, this problem still exists in Linux.  The problem is that Linux doesn't have a reliable way to find the first fit for an mmap() request other than a linear scan, so it uses heuristics to decide where to start the scan.  It's quite easy to trigger pathological behavior where a chunk of memory is unmapped, but the kernel doesn't revise its scan start point, and the resulting hole in the VM map remains indefinitely.  The more holes there are, the more mapped regions there are to scan linearly.  I don't remember what the common triggers of linear scans are, but they definitely happen often enough to cause a performance issue, at least for some of the heavily loaded network server applications Facebook runs.

One way to reduce the impact of huge reallocs would be to use exponential size class increases, rather than linear increases.  jemalloc will always round up to the nearest multiple of the chunk size, but if it were to instead use e.g. [4, 8, 16, 32, 64, ...] MiB as size classes, the realloc overhead would amortize away.  I've been thinking about exploring this strategy for large size classes, [4 KiB .. 4 MiB), and I just wrote up a tracking issue that also keeps your use case in mind:

	https://github.com/jemalloc/jemalloc/issues/77
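
To make the amortization argument concrete, here is a rough sketch (not jemalloc code) that counts how much data gets copied while a buffer grows 1 MiB at a time under each rounding policy.  The function names are made up for illustration, the 4 MiB chunk size is taken from the example above, and the model pessimistically assumes that every move to a new size class copies the old contents, i.e. that neither mremap() nor in-place growth helps.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed chunk size in MiB, matching the 4 MiB used in the example above. */
#define CHUNK_MIB 4u

/* Linear policy: round up to the nearest multiple of the chunk size. */
uint64_t round_linear(uint64_t mib) {
    return (mib + CHUNK_MIB - 1) / CHUNK_MIB * CHUNK_MIB;
}

/* Exponential policy: round up to the next class in [4, 8, 16, 32, ...] MiB. */
uint64_t round_exponential(uint64_t mib) {
    uint64_t cls = CHUNK_MIB;
    while (cls < mib)
        cls *= 2;
    return cls;
}

/*
 * Grow a buffer from 1 MiB to target_mib in 1 MiB steps and return the
 * total number of MiB copied.  Hypothetical cost model: whenever the
 * requested size exceeds the current size class, the old contents are
 * copied into a new allocation of the rounded-up size.
 */
uint64_t copied_mib(uint64_t target_mib, uint64_t (*round_up)(uint64_t)) {
    assert(target_mib >= 1);
    uint64_t copied = 0;
    uint64_t capacity = round_up(1);
    for (uint64_t used = 2; used <= target_mib; used++) {
        if (used > capacity) {
            copied += used - 1;        /* move the existing contents */
            capacity = round_up(used); /* new size class */
        }
    }
    return copied;
}
```

Under these assumptions, growing to 64 MiB copies 480 MiB in total with linear classes (copied_mib(64, round_linear)) but only 60 MiB with exponential classes (copied_mib(64, round_exponential)).  The linear total grows quadratically with the final size, while the exponential total stays below the final size.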

Thanks,
Jason