Memory allocation/release hooks
shamisp at ornl.gov
Tue Oct 20 12:18:41 PDT 2015
Thanks for the link, seems like a very useful library.
Our goal is a bit different (and very simple/basic).
We are looking for a malloc library that we can use for integration with our registration cache.
Essentially, it redirects the application's malloc() calls (through LD_PRELOAD or rpath) to a jemalloc that is hooked up to a cache (just like in HPX).
At this stage we don't play with locality.
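To make the cache side of this concrete, here is a minimal sketch in C. It assumes the allocator can call back into the communication library when it returns memory to the system. All names (reg_cache, reg_cache_insert, reg_cache_on_release, hw_deregister) are hypothetical and are not part of jemalloc or UCX. The actual wiring of the hook into the allocator is not shown, and the hardware unpin is simulated by a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG_CACHE_SLOTS 16

/* One cached registration: an address range assumed to be pinned. */
struct reg_entry {
    uintptr_t start;
    size_t    len;
    int       valid;
};

struct reg_cache {
    struct reg_entry slots[REG_CACHE_SLOTS];
    unsigned deregistered; /* counts simulated unpin operations */
};

/* Stand-in for the real hardware de-registration (unpin) call. */
static void hw_deregister(struct reg_cache *c, struct reg_entry *e)
{
    e->valid = 0;
    c->deregistered++;
}

/* Remember that [addr, addr + len) is registered. Returns 0, or -1 if full. */
int reg_cache_insert(struct reg_cache *c, const void *addr, size_t len)
{
    assert(c != NULL);
    for (int i = 0; i < REG_CACHE_SLOTS; i++) {
        if (!c->slots[i].valid) {
            c->slots[i].start = (uintptr_t)addr;
            c->slots[i].len   = len;
            c->slots[i].valid = 1;
            return 0;
        }
    }
    return -1;
}

/* Returns 1 if [addr, addr + len) lies inside a cached registration. */
int reg_cache_lookup(const struct reg_cache *c, const void *addr, size_t len)
{
    assert(c != NULL);
    uintptr_t a = (uintptr_t)addr;
    for (int i = 0; i < REG_CACHE_SLOTS; i++) {
        const struct reg_entry *e = &c->slots[i];
        if (e->valid && a >= e->start && a + len <= e->start + e->len)
            return 1;
    }
    return 0;
}

/* Hypothetical release hook: the allocator would call this before handing
 * [addr, addr + len) back to the OS. Every overlapping registration is
 * dropped so that a stale pinning is never reused. */
void reg_cache_on_release(struct reg_cache *c, const void *addr, size_t len)
{
    assert(c != NULL);
    uintptr_t a = (uintptr_t)addr;
    for (int i = 0; i < REG_CACHE_SLOTS; i++) {
        struct reg_entry *e = &c->slots[i];
        if (e->valid && a < e->start + e->len && e->start < a + len)
            hw_deregister(c, e);
    }
}
```

The point of the sketch is only the ordering: registrations stay cached after the user frees a buffer, and they are invalidated when the allocator actually releases the underlying memory.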
Pavel (Pasha) Shamis
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory
On Oct 20, 2015, at 11:31 AM, Jeff Hammond <jeff.science at gmail.com> wrote:
You may find http://memkind.github.io/memkind/ relevant. In particular, sections 2.2 and 2.3 of http://memkind.github.io/memkind/memkind_arch_20150318.pdf discuss exactly the issues you raise. We also note that memkind is intended to support multiple types of memory within a node, such as one might encounter in a platform such as Knights Landing. You are free to imagine how it might map to OpenPOWER based upon your superior knowledge of that platform :-)
While I recognize that the origins of memkind at Intel may pose a challenge for some in the OpenPOWER family, it would be tremendously valuable to the community if it were reused for MPI and OpenSHMEM projects, rather than the UCX team trying to implement something new. As you know, both MPI and OpenSHMEM should run on a range of platforms, and it doubles the implementation effort in all relevant projects (MPICH, OpenMPI, OpenSHMEM reference, etc.) if UCX goes in a different direction.
I would be happy to introduce you to the memkind developers (I am not one of them, just someone who helps them understand user/client requirements).
On Thu, Oct 15, 2015 at 8:45 AM, Shamis, Pavel <shamisp at ornl.gov> wrote:
Dear Jemalloc Community,
We are developers of the UCX project, and as part of that effort
we are looking for a malloc library that supports hooks for chunk allocation/deallocation and can be used for the following:
(a) Allocation of memory that can be shared transparently between processes on the same node. For this purpose we would like to mmap memory with MAP_SHARED. This is very useful for implementing Remote Memory Access (RMA) operations in MPI-3 one-sided and OpenSHMEM communication libraries. It allows a remote process to map user-allocated memory and provide RMA operations through memcpy().
(b) Implementation of memory de-allocation hooks for RDMA hardware (InfiniBand, RoCE, iWARP, etc.). As an optimization, we implement a lazy memory de-registration (memory unpinning) policy, and we use the hook to notify the communication library about memory release events. On such an event, we clean up our registration cache and de-register (unpin) the memory on the hardware.
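As a rough illustration of (a), the sketch below maps a region with MAP_SHARED and lets a second process fill it with memcpy(). It is a simplification under stated assumptions: the helper names shared_alloc and shared_put_demo are hypothetical, and an anonymous mapping inherited across fork() stands in for the real case. Unrelated processes on the same node would need a named object (for example one created with shm_open()) so that each side can map it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: allocate a region that remains shared after fork(). */
void *shared_alloc(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical demo: the child plays the "remote" process and performs a
 * put with memcpy(); the parent then observes the data in its own mapping.
 * Returns 0 on success, -1 on failure. */
int shared_put_demo(void)
{
    static const char msg[] = "put via memcpy";
    const size_t len = 4096;

    char *region = shared_alloc(len);
    if (region == NULL)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(region, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: write directly into the shared pages, then exit. */
        memcpy(region, msg, sizeof(msg));
        _exit(0);
    }

    /* Parent: wait for the child, then check that its write is visible. */
    int status = 0;
    int rc = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        memcmp(region, msg, sizeof(msg)) == 0)
        rc = 0;

    munmap(region, len);
    return rc;
}
```

With a private (MAP_PRIVATE) mapping the child's write would not be visible to the parent, which is why the allocator has to obtain its chunks with MAP_SHARED for this use case.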
Based on these requirements, we would like to understand the best approach for integrating this functionality into jemalloc.
Pasha & Yossi
OpenUCX: https://github.com/openucx/ucx or http://www.openucx.org/
MPI SPEC: http://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf
OpenSHMEM SPEC: http://bongo.cs.uh.edu/site/sites/default/site_files/openshmem-specification-1.2.pdf