- Mar 25, 2011
Jason Evans authored
Add inline assembly implementations of atomic_{add,sub}_uint{32,64}() for x86/x64, in order to support compilers that are missing the relevant gcc intrinsics.
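A minimal sketch of the lock xadd approach for the 32-bit add case; the real code in jemalloc's atomic.[ch] also covers 64-bit and subtraction, and may differ in detail:

    #include <stdint.h>

    /* Atomically add x to *p and return the new value, without relying on a
     * __sync_add_and_fetch() intrinsic. xadd leaves the old value of *p in t. */
    static inline uint32_t
    atomic_add_uint32(uint32_t *p, uint32_t x)
    {
        uint32_t t = x;

        __asm__ volatile (
            "lock; xaddl %0, %1;"
            : "+r" (t), "+m" (*p)
            :
            : "memory");
        return (t + x);
    }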
- Mar 23, 2011
Jason Evans authored
This reverts commit adc675c8. The original commit added support for a non-standard libunwind API, so it was not of general utility.
- Mar 24, 2011
Jason Evans authored
Jason Evans authored
arena_purge() may be called even when there are no dirty pages, so loosen an assertion accordingly.
je@facebook.com authored
Use libunwind's unw_tdep_trace() if it is available.
- Mar 23, 2011
Jason Evans authored
sa2u() returns 0 on overflow, but the profiling code was blindly calling sa2u() and allowing the error to silently propagate, ultimately ending in a later assertion failure. Refactor all ipalloc() callers to call sa2u(), check for overflow before calling ipalloc(), and pass usize rather than size. This allows ipalloc() to avoid calling sa2u() in the common case.
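The caller-side pattern looks roughly like the sketch below; sa2u() and ipalloc() are jemalloc internals, so the prototypes and the wrapper name here are assumptions for illustration rather than the exact signatures:

    #include <stdbool.h>
    #include <stddef.h>

    size_t sa2u(size_t size, size_t alignment);                /* Assumed prototype. */
    void *ipalloc(size_t usize, size_t alignment, bool zero);  /* Assumed prototype. */

    void *
    prof_aligned_alloc(size_t size, size_t alignment)          /* Hypothetical caller. */
    {
        size_t usize = sa2u(size, alignment);

        if (usize == 0) {
            /* size/alignment overflowed; fail here instead of letting the
             * bogus 0 propagate into a later assertion failure. */
            return NULL;
        }
        return ipalloc(usize, alignment, false);
    }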
Jason Evans authored
Add code to set *rsize even when profiling is enabled.
Jason Evans authored
Initialize arenas_tsd earlier, so that the non-TLS case works when profiling is enabled.
- Mar 22, 2011
Jason Evans authored
Jason Evans authored
Fix a regression due to commit 2a6f2af6 ("Remove an arena_bin_run_size_calc() constraint."). The removed constraint required that small run headers fit in one page, which indirectly limited runs such that they would not cause overflow in arena_run_regind(). Add an explicit constraint to arena_bin_run_size_calc() based on the largest number of regions that arena_run_regind() can handle (2^11 as currently configured).
- Mar 21, 2011
Jason Evans authored
Dynamically adjust tcache fill count (number of objects allocated per tcache refill) such that if GC has to flush inactive objects, the fill count gradually decreases. Conversely, if refills occur while the fill count is depressed, the fill count gradually increases back to its maximum value.
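A toy model of the adjustment; the field and function names below are illustrative rather than jemalloc's exact internals:

    /* The next fill brings in ncached_max >> lg_fill_div objects. */
    typedef struct {
        unsigned ncached_max;   /* Cache capacity for this size class. */
        unsigned lg_fill_div;   /* Larger value => smaller fill count. */
    } tbin_model_t;

    static unsigned
    fill_count(const tbin_model_t *tbin)
    {
        return tbin->ncached_max >> tbin->lg_fill_div;
    }

    static void
    on_gc_flush_inactive(tbin_model_t *tbin)
    {
        /* GC had to flush objects that were never reused: halve the next fill. */
        if (fill_count(tbin) > 1)
            tbin->lg_fill_div++;
    }

    static void
    on_refill(tbin_model_t *tbin)
    {
        /* A refill while the fill count is depressed: grow it back toward
         * its maximum. */
        if (tbin->lg_fill_div > 0)
            tbin->lg_fill_div--;
    }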
- Mar 19, 2011
Jason Evans authored
pthread_mutex_lock() can call malloc() on OS X (!!!), which causes deadlock. Work around this by using spinlocks built from more primitive operations.
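A sketch of the kind of primitive spinlock this implies, built on the gcc __sync builtins; jemalloc's actual OS X workaround may use a different primitive:

    typedef volatile int spinlock_t;    /* 0 = unlocked, 1 = locked. */

    static inline void
    spin_lock(spinlock_t *lock)
    {
        /* __sync_lock_test_and_set() has acquire semantics and never calls
         * into malloc(). */
        while (__sync_lock_test_and_set(lock, 1)) {
            while (*lock)
                ;   /* Spin on plain reads to avoid hammering the bus. */
        }
    }

    static inline void
    spin_unlock(spinlock_t *lock)
    {
        __sync_lock_release(lock);  /* Stores 0 with release semantics. */
    }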
Jason Evans authored
Jason Evans authored
Import updated pprof from google-perftools 1.7.
Jason Evans authored
Add atomic.[ch], which should have been part of the previous commit.
Jason Evans authored
Add the "stats.cactive" mallctl, which can be used to efficiently and repeatedly query approximately how much active memory the application is utilizing.
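A usage sketch, assuming the standard mallctl() calling convention; the mallctl yields a pointer to the counter, so repeated reads are just pointer dereferences:

    #include <stdio.h>
    #include <jemalloc/jemalloc.h>

    int
    main(void)
    {
        size_t *cactive;
        size_t sz = sizeof(cactive);

        /* "stats.cactive" returns a size_t * pointing at the counter. */
        if (mallctl("stats.cactive", &cactive, &sz, NULL, 0) != 0)
            return 1;
        printf("approximate active bytes: %zu\n", *cactive);
        return 0;
    }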
- Mar 18, 2011
Jason Evans authored
Rather than blindly assigning threads to arenas in round-robin fashion, choose the lowest-numbered arena that currently has the smallest number of threads assigned to it. Add the "stats.arenas.<i>.nthreads" mallctl.
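The selection policy boils down to a linear scan for the minimum; a simplified stand-alone version (jemalloc does this under a lock in its arena-choosing slow path):

    /* Return the lowest-numbered arena index with the fewest assigned threads.
     * nthreads[i] holds the per-arena thread count; narenas is the array length. */
    static unsigned
    choose_arena_index(const unsigned *nthreads, unsigned narenas)
    {
        unsigned i, choose = 0;

        for (i = 1; i < narenas; i++) {
            /* Strict '<' keeps the lowest-numbered arena on ties. */
            if (nthreads[i] < nthreads[choose])
                choose = i;
        }
        return choose;
    }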
Jason Evans authored
Refill the thread cache such that low regions get used first. This fixes a regression due to the recent transition to bitmap-based region management.
Jason Evans authored
The previous free list implementation, which embedded singly linked lists in available regions, had the unfortunate side effect of causing many cache misses during thread cache fills. Fix this in two places:

- arena_run_t: Use a new bitmap implementation to track which regions are available. Furthermore, revert to preferring the lowest available region (as jemalloc did with its old bitmap-based approach).

- tcache_t: Move read-only tcache_bin_t metadata into tcache_bin_info_t, and add a contiguous array of pointers to tcache_t in order to track cached objects. This substantially increases the size of tcache_t, but results in much higher data locality for common tcache operations. As a side benefit, it is again possible to efficiently flush the least recently used cached objects, so this changes flushing from MRU to LRU.

The new bitmap implementation uses a multi-level summary approach to make finding the lowest available region very fast. In practice, bitmaps only have one or two levels, though the implementation is general enough to handle extremely large bitmaps, mainly so that large page sizes can still be entertained.

Fix tcache_bin_flush_large() to always flush statistics, in the same way that tcache_bin_flush_small() was recently fixed.

Use JEMALLOC_DEBUG rather than NDEBUG.

Add dassert(), and use it for debug-only asserts.
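The multi-level summary idea mentioned above, reduced to two fixed levels; this is purely illustrative, and the real bitmap.[ch] generalizes it to an arbitrary number of levels:

    #include <stdint.h>

    #define NGROUPS 64  /* Up to 64 * 64 = 4096 regions in this sketch. */

    typedef struct {
        uint64_t summary;           /* Bit g set => groups[g] has a set bit. */
        uint64_t groups[NGROUPS];   /* Bit set => region is available. */
    } bitmap2_t;

    /* Find the lowest available region, mark it unavailable, and return its
     * index (or -1 if none); two ctz operations instead of a linear scan. */
    static int
    bitmap2_sfu(bitmap2_t *b)
    {
        int g, bit;

        if (b->summary == 0)
            return -1;
        g = (int)__builtin_ctzll(b->summary);      /* Lowest non-empty group. */
        bit = (int)__builtin_ctzll(b->groups[g]);  /* Lowest free region in it. */
        b->groups[g] &= ~((uint64_t)1 << bit);
        if (b->groups[g] == 0)
            b->summary &= ~((uint64_t)1 << g);     /* Group is now exhausted. */
        return g * 64 + bit;
    }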
- Mar 16, 2011
Jason Evans authored
Clean up configuration for backtracing when profiling is enabled, and document the configuration logic in INSTALL. Disable libgcc-based backtracing except on x64 (where it is known to work). Add the --disable-prof-gcc option.
Jason Evans authored
Fix a couple of problems related to the addition of arena_bin_info_t.
- Mar 15, 2011
Jason Evans authored
Add missing error checks for pthread_mutex_init() calls. In practice, mutex initialization never fails, so this is merely good hygiene.
Jason Evans authored
Move read-only fields from arena_bin_t into arena_bin_info_t, primarily in order to avoid false cacheline sharing.
Jason Evans authored
Fix the automatic header dependency generation to handle the .pic.o suffix. This regression was due to commit af5d6987 ("Build both PIC and no PIC static libraries").
Jason Evans authored
Convert all direct small_size2bin[...] accesses to SMALL_SIZE2BIN(...) macro calls, and use a couple of cheap math operations to allow compacting the table by 4X or 8X, on 32- and 64-bit systems, respectively.
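The compaction trick, approximately; the constant and macro below mirror the idea but are not guaranteed to match jemalloc's exact definitions:

    #include <stdint.h>

    /* Small sizes are multiples of the minimum size class, so the table needs
     * one entry per 2^LG_TINY_MIN bytes rather than one entry per byte. */
    #define LG_TINY_MIN     3   /* 8-byte steps on 64-bit; 2 (4-byte steps) on 32-bit. */

    extern const uint8_t small_size2bin[];

    #define SMALL_SIZE2BIN(s)   (small_size2bin[((s) - 1) >> LG_TINY_MIN])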
Jason Evans authored
Jason Evans authored
Compile with -fvisibility=hidden rather than -fvisibility=internal, in order to avoid PLT lookups for internal functions. Also fix a regression, due to commit 2dbecf1f ("Port to Mac OS X."), that caused the -fvisibility flag to be omitted.
Jason Evans authored
- Mar 14, 2011
Jason Evans authored
When a thread cache flushes objects to their arenas due to an abundance of cached objects, it merges the allocation request count for the associated size class, and increments a flush counter. If none of the flushed objects came from the thread's assigned arena, then the merging wouldn't happen (though the counter would typically eventually be merged), nor would the flush counter be incremented (a hard bug). Fix this via extra conditional code just after the flush loop.
Jason Evans authored
Fix a variable reversal bug in mallctl("thread.arena", ...).
- Mar 07, 2011
Jason Evans authored
Fix a cpp logic error that was introduced by the recent commit: Fix "thread.{de,}allocatedp" mallctl.
- Mar 02, 2011
je authored
Arun Sharma authored
When jemalloc is linked into an executable (as opposed to a shared library), compiling with -fno-pic can have significant advantages, mainly because we don't have to go through the GOT (global offset table). Users who want to link jemalloc into a shared library that could be dlopened need to link with libjemalloc_pic.a or libjemalloc.so.
- Feb 14, 2011
Jason Evans authored
Jason Evans authored
For the non-TLS case (as on OS X), if the "thread.{de,}allocatedp" mallctl was called before any allocation occurred for that thread, the TSD was still NULL, thus putting the application at risk of dereferencing NULL. Fix this by refactoring the initialization code, and making it part of the conditional logic for all per thread allocation counter accesses.
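A simplified illustration of the resulting pattern for the non-TLS case; the type, the TSD key, and the use of calloc() here are stand-ins, since the real code goes through jemalloc's own initialization and allocation paths:

    #include <pthread.h>
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
        uint64_t allocated;
        uint64_t deallocated;
    } thread_allocated_t;

    static pthread_key_t thread_allocated_tsd;      /* Created once at startup. */

    /* Every per-thread counter access funnels through here, so the pointers
     * returned by "thread.{de,}allocatedp" are never NULL, even before the
     * thread's first allocation. */
    static thread_allocated_t *
    thread_allocated_get(void)
    {
        thread_allocated_t *ta = pthread_getspecific(thread_allocated_tsd);

        if (ta == NULL) {
            ta = calloc(1, sizeof(*ta));             /* Stand-in allocator. */
            pthread_setspecific(thread_allocated_tsd, ta);
        }
        return ta;
    }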
- Feb 08, 2011
Jason Evans authored
- Feb 01, 2011
Jason Evans authored
Jason Evans authored
Fix huge_ralloc() to call huge_palloc() only if alignment requires it. This bug caused under-sized allocation for aligned huge reallocation (via rallocm()) if the requested alignment was less than the chunk size (4 MiB by default).
- Jan 26, 2011
Jason Evans authored
Fix ALLOCM_LG_ALIGN to take a parameter and use it. Apparently, an editing error left ALLOCM_LG_ALIGN with the same definition as ALLOCM_LG_ALIGN_MASK.
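After the fix, the macro takes the log2 alignment as its argument. A hedged usage sketch against the experimental allocm() API of that era:

    #include <stddef.h>
    #include <jemalloc/jemalloc.h>

    /* Ask for 64 bytes aligned to 2^4 = 16 bytes. ALLOCM_LG_ALIGN(la) encodes
     * the request; ALLOCM_LG_ALIGN_MASK is only used to decode it internally. */
    void *
    alloc_16_aligned(void)
    {
        void *p = NULL;

        if (allocm(&p, NULL, 64, ALLOCM_LG_ALIGN(4)) != ALLOCM_SUCCESS)
            return NULL;
        return p;
    }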
- Jan 15, 2011
Jason Evans authored
s/=/==/ in several assertions, and fix spelling errors.