Commits · 213667fe26ee30cb724d390c6047821960c57a34 · ALEIX ROCA NONELL / jemalloc-mod

Nov 04, 2016

Update ChangeLog for 4.3.0. · 213667fe
Jason Evans authored Nov 04, 2016

213667fe

Fix large allocation to search optimal size class heap. · 32896a90

Jason Evans authored Nov 03, 2016

Fix arena_run_alloc_large_helper() to not convert size to usize when
searching for the first best fit via arena_run_first_best_fit().  This
allows the search to consider the optimal quantized size class, so that
e.g. allocating and deallocating 40 KiB in a tight loop can reuse the
same memory.

This regression was nominally caused by
5707d6f9 (Quantize szad trees by size
class.), but it did not commonly cause problems until
8a03cf03 (Implement cache index
randomization for large allocations.).  These regressions were first
released in 4.0.0.

This resolves #487.

32896a90

Fix chunk_alloc_cache() to support decommitted allocation. · e9012630

Jason Evans authored Nov 03, 2016

Fix chunk_alloc_cache() to support decommitted allocation, and use this
ability in arena_chunk_alloc_internal() and arena_stash_dirty(), so that
chunks don't get permanently stuck in a hybrid state.

This resolves #487.

e9012630

Nov 03, 2016
- Update symbol mangling. · dd3ed23a
  Jason Evans authored Nov 03, 2016
  
  dd3ed23a
- Update ChangeLog for 4.3.0. · 1ceae2f8
  Jason Evans authored Nov 02, 2016
  
  1ceae2f8
- Update project URL. · 62de7680
  Jason Evans authored Sep 12, 2016
  
  62de7680
- Check for existence of CPU_COUNT macro before using it. · 6c56e194
  Dave Watson authored Nov 02, 2016
```
This resolves #485.
```
  6c56e194
- Fix syscall(2) configure test for Linux. · eca3bc01
  Jason Evans authored Nov 02, 2016
  
  eca3bc01
- Do not use syscall(2) on OS X 10.12 (deprecated). · da206df1
  Jason Evans authored Nov 02, 2016
  
  da206df1
- Add os_unfair_lock support. · 3f2b8d9c
  Jason Evans authored Nov 02, 2016
```
OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended
replacement.
```
  3f2b8d9c
- Fix/refactor zone allocator integration code. · a99e0fa2
  Jason Evans authored Nov 02, 2016
```
Fix zone_force_unlock() to reinitialize, rather than unlocking mutexes,
since OS X 10.12 cannot tolerate a child unlocking mutexes that were
locked by its parent.

Refactor; this was a side effect of experimenting with zone
{de,re}registration during fork(2).
```
  a99e0fa2
- Call _exit(2) rather than exit(3) in forked child. · 31db315f
  Jason Evans authored Nov 02, 2016
```
_exit(2) is async-signal-safe, whereas exit(3) is not.
```
  31db315f
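
The _exit(2) versus exit(3) distinction in commit 31db315f can be illustrated with a minimal POSIX sketch. The helper name `demo_run_child_and_wait` is hypothetical and is not part of jemalloc; it only shows a forked child terminating through `_exit()` so that no `atexit()` handlers or inherited stdio buffers are processed in the child.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not jemalloc code): fork a child that terminates
 * immediately with `status`, then report that status back to the parent.
 * Returns -1 if fork()/waitpid() fails or the child did not exit normally.
 */
int
demo_run_child_and_wait(int status)
{
	/* Exit statuses are truncated to 8 bits, so keep the input in range. */
	assert(status >= 0 && status <= 255);

	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/*
		 * Child: _exit(2) is async-signal-safe. exit(3) would run
		 * atexit() handlers and flush stdio buffers copied from the
		 * parent, which is not safe in this context.
		 */
		_exit(status);
	}

	int wstatus;
	if (waitpid(pid, &wstatus, 0) != pid)
		return -1;
	return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Calling `demo_run_child_and_wait(7)` in the parent should yield the child's status once the child has been reaped.
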
Nov 02, 2016

Force no lazy-lock on Windows. · 07ee4c5f

Jason Evans authored Nov 02, 2016

Monitoring thread creation is unimplemented for Windows, which means
lazy-lock cannot function correctly.

This resolves #310.

07ee4c5f

Nov 01, 2016
- Use <quote>...</quote> rather than “...” or "..." in XML. · f19bedb0
  Jason Evans authored Nov 01, 2016
  
  f19bedb0
- Add "J" (JSON) support to malloc_stats_print(). · b599b322
  Jason Evans authored Nov 01, 2016
```
This resolves #474.
```
  b599b322
Oct 31, 2016
- Refactor witness_unlock() to fix undefined test behavior. · 4752a54e
  Jason Evans authored Oct 31, 2016
```
This resolves #396.
```
  4752a54e
Oct 30, 2016

Use CLOCK_MONOTONIC_COARSE rather than CLOCK_MONOTONIC_RAW. · 1d57c03e

Jason Evans authored Oct 29, 2016

The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC),
whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but
still has resolution (~1ms) that is adequate for our purposes.

This resolves #479.

1d57c03e
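
As a rough illustration of the clock choice described in 1d57c03e, the sketch below prefers CLOCK_MONOTONIC_COARSE where the platform defines it and otherwise falls back to CLOCK_MONOTONIC. The helper name `demo_now_ns` is an assumption made for this example; jemalloc's own nstime code is structured differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: current monotonic time in nanoseconds.
 * CLOCK_MONOTONIC_COARSE is Linux-specific, so it is guarded at compile
 * time, and plain CLOCK_MONOTONIC is used if the coarse read fails.
 */
uint64_t
demo_now_ns(void)
{
	struct timespec ts;
#ifdef CLOCK_MONOTONIC_COARSE
	if (clock_gettime(CLOCK_MONOTONIC_COARSE, &ts) != 0)
#endif
	{
		int ret = clock_gettime(CLOCK_MONOTONIC, &ts);
		assert(ret == 0);
		(void)ret;
	}
	return (uint64_t)ts.tv_sec * UINT64_C(1000000000) +
	    (uint64_t)ts.tv_nsec;
}
```

Two successive calls should return non-decreasing values, although with the coarse clock they may well be equal because of its limited resolution.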

Use syscall(2) rather than {open,read,close}(2) during boot. · c443b675

Jason Evans authored Oct 29, 2016

Some applications wrap various system calls, and if they call the
allocator in their wrappers, unexpected reentry can result.  This is not
a general solution (many other syscalls are spread throughout the code),
but this resolves a bootstrapping issue that is apparently common.

This resolves #443.

c443b675
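
The idea behind c443b675 can be sketched as follows, assuming Linux. The helper `demo_raw_read_file` is hypothetical; it reads a file through syscall(2) directly, so wrappers that an application has interposed around open/read/close are not entered. It uses SYS_openat rather than SYS_open only because the latter is absent on some architectures; the actual jemalloc change may differ in these details.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes the syscall() declaration in glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (Linux-only sketch): read up to `len` bytes from
 * `path` using raw system calls. Returns the number of bytes read, or -1
 * on error.
 */
long
demo_raw_read_file(const char *path, void *buf, size_t len)
{
	assert(path != NULL && buf != NULL);

	long fd = syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
	if (fd < 0)
		return -1;

	long nread = syscall(SYS_read, (int)fd, buf, len);
	/* Close unconditionally so the descriptor is not leaked on error. */
	syscall(SYS_close, (int)fd);
	return (nread < 0) ? -1 : nread;
}
```

As the commit message notes, this only sidesteps reentry for the calls made this way; any other wrapped system call can still reach the allocator.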

Fix EXTRA_CFLAGS to not affect configuration. · 35a108c8
Jason Evans authored Oct 29, 2016

35a108c8

Oct 29, 2016

Do not mark malloc_conf as weak on Windows. · e46f8f97

Jason Evans authored Oct 28, 2016

This works around malloc_conf not being properly initialized by at least
the cygwin toolchain.  Prior build system changes to use
-Wl,--[no-]whole-archive may be necessary for malloc_conf resolution to
work properly as a non-weak symbol (not tested).

e46f8f97

Do not mark malloc_conf as weak for unit tests. · 35799a50

Jason Evans authored Oct 28, 2016

This is generally correct (no need for weak symbols since no jemalloc
library is involved in the link phase), and avoids linking problems
(apparently uninitialized non-NULL malloc_conf) when using cygwin with
gcc.

35799a50

Support static linking of jemalloc with glibc · ed84764a

Dave Watson authored Oct 28, 2016

glibc defines its malloc implementation with several weak and strong
symbols:

strong_alias (__libc_calloc, __calloc) weak_alias (__libc_calloc, calloc)
strong_alias (__libc_free, __cfree) weak_alias (__libc_free, cfree)
strong_alias (__libc_free, __free) strong_alias (__libc_free, free)
strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc)

The issue is not with the weak symbols, but that other parts of glibc
depend on __libc_malloc explicitly.  Defining them in terms of jemalloc
APIs allows the linker to drop glibc's malloc.o completely from the link,
and static linking no longer results in symbol collisions.

Another wrinkle: during initialization, jemalloc calls sysconf to
get the number of CPUs.  glibc allocates for the first time before
setting up isspace (and other related) tables, which are used by
sysconf.  Instead, use the pthread API to get the number of
CPUs with glibc, which seems to work.

This resolves #442.

ed84764a

Oct 28, 2016

Reduce memory requirements for regression tests. · b99c72f3

Jason Evans authored Oct 28, 2016

This is intended to drop memory usage to a level that AppVeyor test
instances can handle.

This resolves #393.

b99c72f3

Periodically purge in memory-intensive integration tests. · eaecaad8

Jason Evans authored Oct 28, 2016

This resolves #393.

eaecaad8

Periodically purge in memory-intensive integration tests. · 2c53faf3

Jason Evans authored Oct 28, 2016

This resolves #393.

2c53faf3

Only link with libm (-lm) if necessary. · e7d67799

Jason Evans authored Oct 27, 2016

This fixes warnings when building with MSVC.

e7d67799

Only use --whole-archive with gcc. · 875ff15e

Jason Evans authored Oct 27, 2016

Conditionalize use of --whole-archive on the platform plus compiler,
rather than on the ABI.  This fixes a regression caused by
7b24c6e5 (Use --whole-archive when
linking integration tests on MinGW.).

875ff15e

Do not force lazy lock on Windows. · 1eb801bc

Jason Evans authored Oct 27, 2016

This reverts 13473c7c, which was
intended to work around bootstrapping issues when linking statically.
However, this actually causes problems in various other configurations,
so this reversion may force a future fix for the underlying problem, if
it still exists.

1eb801bc

Fix over-sized allocation of rtree leaf nodes. · dc553d52

Jason Evans authored Oct 28, 2016

Use the correct level metadata when allocating child nodes so that leaf
nodes don't end up over-sized (2^16 elements vs 2^4 elements).

dc553d52

Oct 26, 2016

Use --whole-archive when linking integration tests on MinGW. · 5569b4a4

Jason Evans authored Oct 25, 2016

Prior to this change, the malloc_conf weak symbol provided by the
jemalloc dynamic library is always used, even if the application
provides a malloc_conf symbol.  Use the --whole-archive linker option
to allow the weak symbol to be overridden.

5569b4a4

Oct 21, 2016

Do not (recursively) allocate within tsd_fetch(). · 962a2979

Jason Evans authored Oct 20, 2016

Refactor tsd so that tsdn_fetch() does not trigger allocation, since
allocation could cause infinite recursion.

This resolves #458.

962a2979

Oct 14, 2016

Make dss operations lockless. · e2bcf037

Jason Evans authored Oct 13, 2016

Rather than protecting dss operations with a mutex, use atomic
operations.  This has negligible impact on synchronization overhead
during typical dss allocation, but is a substantial improvement for
chunk_in_dss() and the newly added chunk_dss_mergeable(), which can be
called multiple times during chunk deallocations.

This change also has the advantage of avoiding tsd in deallocation paths
associated with purging, which resolves potential deadlocks during
thread exit due to attempted tsd resurrection.

This resolves #425.

e2bcf037
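
To make the mutex-to-atomics idea in e2bcf037 concrete, here is a minimal sketch of a lock-free bump allocator over a fixed region, using C11 atomics. It is not the jemalloc dss code: the static array stands in for the sbrk area, all `demo_*` names are hypothetical, and alignment is ignored for brevity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_REGION_SIZE 4096

/* Hypothetical fixed region standing in for the dss (sbrk) area. */
static unsigned char demo_region[DEMO_REGION_SIZE];
/* Offset of the first unused byte; only ever changed by compare-and-swap. */
static _Atomic size_t demo_region_used = 0;

/* Lock-free bump allocation; returns NULL once the region is exhausted. */
void *
demo_bump_alloc(size_t size)
{
	size_t old = atomic_load(&demo_region_used);
	for (;;) {
		assert(old <= DEMO_REGION_SIZE);
		if (size > DEMO_REGION_SIZE - old)
			return NULL;
		/* On failure `old` is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&demo_region_used, &old,
		    old + size))
			return &demo_region[old];
	}
}

/* Lock-free membership test, loosely analogous to chunk_in_dss(). */
int
demo_in_region(const void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	uintptr_t base = (uintptr_t)demo_region;
	return p >= base && p < base + atomic_load(&demo_region_used);
}
```

The membership test needs only a single atomic load, which is the kind of query the commit message describes as benefiting most, since it can run several times per chunk deallocation.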

Oct 13, 2016

Add/use adaptive spinning. · 97376859

Jason Evans authored Oct 13, 2016

Add spin_t and spin_{init,adaptive}(), which provide a simple
abstraction for adaptive spinning.

Adaptively spin during busy waits in bootstrapping and rtree node
initialization.

97376859
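
One plausible shape for the spin abstraction in 97376859 is exponential backoff: each call busy-waits twice as long as the previous one. The sketch below uses hypothetical `demo_*` names and an arbitrary cap on the backoff; the real spin_t and spin_{init,adaptive}() may differ in both respects.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap so that a single call never spins more than 2^10 steps. */
#define DEMO_SPIN_MAX_SHIFT 10u

typedef struct {
	unsigned iteration; /* number of backoff doublings so far */
} demo_spin_t;

void
demo_spin_init(demo_spin_t *spin)
{
	spin->iteration = 0;
}

/*
 * Busy-wait for 2^iteration loop steps, then lengthen the next wait.
 * Returns the number of steps spun, which is handy for inspection.
 */
uint64_t
demo_spin_adaptive(demo_spin_t *spin)
{
	assert(spin->iteration <= DEMO_SPIN_MAX_SHIFT);
	uint64_t steps = UINT64_C(1) << spin->iteration;
	for (volatile uint64_t i = 0; i < steps; i++) {
		/* A real implementation would issue a CPU pause hint here. */
	}
	if (spin->iteration < DEMO_SPIN_MAX_SHIFT)
		spin->iteration++;
	return steps;
}
```

A caller would typically initialize a `demo_spin_t` before a wait loop and call `demo_spin_adaptive()` on each failed check of the awaited condition.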

Disallow 0x5a junk filling when running in Valgrind. · a2539fab

Jason Evans authored Oct 12, 2016

Explicitly disallow junk:true and junk:free runtime settings when
running in Valgrind, since deallocation-time junk filling and redzone
validation cause false positive Valgrind reports.

This resolves #470.

a2539fab

Oct 12, 2016

Fix and simplify decay-based purging. · d419bb09

Jason Evans authored Oct 11, 2016

Simplify decay-based purging attempts to only be triggered when the
epoch is advanced, rather than every time purgeable memory increases.
In a correctly functioning system (not previously the case; see below),
this only causes a behavior difference if during subsequent purge
attempts the least recently used (LRU) purgeable memory extent is
initially too large to be purged, but that memory is reused between
attempts and one or more of the next LRU purgeable memory extents are
small enough to be purged.  In practice this is an arbitrary behavior
change that is within the set of acceptable behaviors.

As for the purging fix, assure that arena->decay.ndirty is recorded
*after* the epoch advance and associated purging occurs.  Prior to this
fix, it was possible for purging during epoch advance to cause a
substantially underrepresentative (arena->ndirty - arena->decay.ndirty),
i.e. the number of dirty pages attributed to the current epoch was too
low, and a series of unintended purges could result.  This fix is also
relevant in the context of the simplification described above, but the
bug's impact would be limited to over-purging at epoch advances.

d419bb09

Fix decay tests to all adapt to nstime_monotonic(). · a14712b4
Jason Evans authored Oct 11, 2016

a14712b4

Oct 11, 2016
- Do not advance decay epoch when time goes backwards. · 45a5bf67
  Jason Evans authored Oct 10, 2016
```
Instead, move the epoch backward in time.  Additionally, add
nstime_monotonic() and use it in debug builds to assert that time only
goes backward if nstime_update() is using a non-monotonic time source.
```
  45a5bf67
- Refactor arena->decay_* into arena->decay.* (arena_decay_t). · 94e7ffa9
  Jason Evans authored Oct 10, 2016
  
  94e7ffa9
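
The behavior described in 45a5bf67 (move the epoch backward instead of advancing it when time goes backwards) can be sketched with plain nanosecond counters. The struct and function names here are hypothetical simplifications, not the arena_decay_t layout or the jemalloc API.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
	uint64_t epoch_ns;    /* start of the current epoch */
	uint64_t interval_ns; /* length of one epoch; must be nonzero */
} demo_decay_t;

/*
 * Hypothetical helper: reconcile the decay state with the time `now_ns`.
 * Returns the number of whole epochs advanced (0 if none).
 */
uint64_t
demo_decay_update(demo_decay_t *decay, uint64_t now_ns)
{
	assert(decay->interval_ns > 0);
	if (now_ns < decay->epoch_ns) {
		/* Time went backwards: move the epoch back, do not advance. */
		decay->epoch_ns = now_ns;
		return 0;
	}
	uint64_t nadvance = (now_ns - decay->epoch_ns) / decay->interval_ns;
	decay->epoch_ns += nadvance * decay->interval_ns;
	return nadvance;
}
```

With a monotonic time source the first branch should never be taken, which matches the debug-build assertion the commit message mentions.
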
Oct 10, 2016

Refine nstime_update(). · b732c395

Jason Evans authored Oct 07, 2016

Add missing #include <time.h>.  The critical time facilities appear to
have been transitively included via unistd.h and sys/time.h, but in
principle this omission was capable of having caused
clock_gettime(CLOCK_MONOTONIC, ...) to have been overlooked in favor of
gettimeofday(), which in turn could cause spurious non-monotonic time
updates.

Refactor nstime_get() out of nstime_update() and add configure tests for
all variants.

Add CLOCK_MONOTONIC_RAW support (Linux-specific) and
mach_absolute_time() support (OS X-specific).

Do not fall back to clock_gettime(CLOCK_REALTIME, ...).  This was a
fragile Linux-specific workaround, which we're unlikely to use at all
now that clock_gettime(CLOCK_MONOTONIC_RAW, ...) is supported, and if we
have no choice besides non-monotonic clocks, gettimeofday() is only
incrementally worse.

b732c395

Oct 07, 2016
- Simplify run quantization. · 5d8db15d
  Jason Evans authored Apr 08, 2016
  
  5d8db15d