- Mar 25, 2011
Jason Evans authored
Add inline assembly implementations of atomic_{add,sub}_uint{32,64}() for x86/x64, in order to support compilers that are missing the relevant gcc intrinsics.
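A minimal sketch of the lock xadd approach for one of these functions (the real implementations live in jemalloc's atomic.[ch] and differ in detail; this is illustrative only):
```c
#include <stdint.h>

/* Sketch: x86/x64 atomic add/sub built on lock xadd, returning the new value
 * (matching the gcc __sync_add_and_fetch()-style intrinsics the commit
 * mentions). Not the committed code. */
static inline uint32_t
atomic_add_uint32(uint32_t *p, uint32_t x)
{
	uint32_t t = x;

	/* xaddl atomically stores *p + t into *p and leaves the old *p in t. */
	__asm__ __volatile__ ("lock; xaddl %0, %1"
	    : "+r" (t), "+m" (*p)
	    :
	    : "memory");
	return (t + x);	/* new value of *p */
}

static inline uint32_t
atomic_sub_uint32(uint32_t *p, uint32_t x)
{
	return (atomic_add_uint32(p, (uint32_t)(-(int32_t)x)));
}
```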
-
- Mar 23, 2011
Jason Evans authored
This reverts commit adc675c8. The original commit added support for a non-standard libunwind API, so it was not of general utility.
-
- Mar 24, 2011
Jason Evans authored
arena_purge() may be called even when there are no dirty pages, so loosen an assertion accordingly.
-
je@facebook.com authored
Use libunwind's unw_tdep_trace() if it is available.
-
- Mar 23, 2011
Jason Evans authored
sa2u() returns 0 on overflow, but the profiling code was blindly calling sa2u() and allowing the error to silently propagate, ultimately ending in a later assertion failure. Refactor all ipalloc() callers to call sa2u(), check for overflow before calling ipalloc(), and pass usize rather than size. This allows ipalloc() to avoid calling sa2u() in the common case.
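A rough sketch of the caller-side pattern described here; sa2u() and ipalloc() are jemalloc internals, so the signatures below are approximations for illustration only:
```c
#include <stdbool.h>
#include <stddef.h>

/* Approximate internal signatures, shown only to illustrate the pattern. */
size_t	sa2u(size_t size, size_t alignment, size_t *run_size_p);
void	*ipalloc(size_t usize, size_t alignment, bool zero);

static void *
aligned_alloc_sketch(size_t size, size_t alignment)
{
	size_t usize;

	usize = sa2u(size, alignment, NULL);
	if (usize == 0)
		return (NULL);	/* size/alignment combination overflows. */
	/* Profiling bookkeeping can safely use usize from here on. */
	return (ipalloc(usize, alignment, false));
}
```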
-
Jason Evans authored
Add code to set *rsize even when profiling is enabled.
-
Jason Evans authored
Initialize arenas_tsd earlier, so that the non-TLS case works when profiling is enabled.
-
- Mar 22, 2011
Jason Evans authored
-
Jason Evans authored
Fix a regression due to commit 2a6f2af6 (Remove an arena_bin_run_size_calc() constraint). The removed constraint required that small run headers fit in one page, which indirectly limited runs such that they would not cause overflow in arena_run_regind(). Add an explicit constraint to arena_bin_run_size_calc() based on the largest number of regions that arena_run_regind() can handle (2^11 as currently configured).
-
- Mar 21, 2011
Jason Evans authored
Dynamically adjust tcache fill count (number of objects allocated per tcache refill) such that if GC has to flush inactive objects, the fill count gradually decreases. Conversely, if refills occur while the fill count is depressed, the fill count gradually increases back to its maximum value.
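An illustrative sketch of such an adjustment policy (the names and exact rules below are assumptions, not jemalloc's code): the effective fill count is the maximum shifted right by a per-bin divisor, so raising the divisor halves the next refill and lowering it grows back toward the maximum.
```c
/* Hypothetical per-bin state; names and policy are illustrative only. */
typedef struct {
	unsigned	ncached_max;	/* hard cap on cached objects */
	unsigned	lg_fill_div;	/* fill count = ncached_max >> lg_fill_div */
	unsigned	low_water;	/* fewest cached objects seen since last GC pass */
	unsigned	ncached;	/* objects currently cached */
} tbin_sketch_t;

static unsigned
tbin_fill_count(const tbin_sketch_t *tbin)
{
	return (tbin->ncached_max >> tbin->lg_fill_div);
}

/* GC pass: objects sat unused since the last pass, so fill less next time. */
static void
tbin_gc(tbin_sketch_t *tbin)
{
	if (tbin->low_water > 0 && tbin_fill_count(tbin) > 1)
		tbin->lg_fill_div++;
	tbin->low_water = tbin->ncached;
}

/* Refill while the fill count is depressed: demand returned, so creep back
 * toward the maximum. */
static void
tbin_refill(tbin_sketch_t *tbin)
{
	if (tbin->lg_fill_div > 0)
		tbin->lg_fill_div--;
	tbin->ncached += tbin_fill_count(tbin);
	if (tbin->ncached > tbin->ncached_max)
		tbin->ncached = tbin->ncached_max;
}
```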
-
- Mar 19, 2011
Jason Evans authored
pthread_mutex_lock() can call malloc() on OS X (!!!), which causes deadlock. Work around this by using spinlocks that are built from more primitive operations.
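A sketch of building a lock out of more primitive operations that cannot recurse into malloc(), here using GCC's atomic builtins; the commit's actual implementation may differ:
```c
/* Sketch: a test-and-test-and-set spinlock. Not the code from the commit. */
typedef volatile int malloc_spin_t;

static inline void
malloc_spin_lock(malloc_spin_t *lock)
{
	while (__sync_lock_test_and_set(lock, 1) != 0) {
		/* Spin on plain reads until the lock looks free. */
		while (*lock != 0)
			;
	}
}

static inline void
malloc_spin_unlock(malloc_spin_t *lock)
{
	__sync_lock_release(lock);
}
```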
-
Jason Evans authored
-
Jason Evans authored
Import updated pprof from google-perftools 1.7.
-
Jason Evans authored
Add atomic.[ch], which should have been part of the previous commit.
-
Jason Evans authored
Add the "stats.cactive" mallctl, which can be used to efficiently and repeatedly query approximately how much active memory the application is utilizing.
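A usage sketch: the mallctl returns a pointer to jemalloc's internal counter, so an application can fetch the pointer once and then read the (approximate) value as often as it likes without further mallctl calls. Error handling is minimal, and the public names may carry a prefix depending on how jemalloc was configured.
```c
#include <stdio.h>
#include <jemalloc/jemalloc.h>

int
main(void)
{
	size_t *cactive;
	size_t sz = sizeof(cactive);

	/* Fetch a pointer to the counter once... */
	if (mallctl("stats.cactive", &cactive, &sz, NULL, 0) != 0)
		return (1);
	/* ...then poll it cheaply and repeatedly. The value is approximate. */
	printf("active bytes: %zu\n", *cactive);
	return (0);
}
```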
-
- Mar 18, 2011
Jason Evans authored
Rather than blindly assigning threads to arenas in round-robin fashion, choose the lowest-numbered arena that currently has the smallest number of threads assigned to it. Add the "stats.arenas.<i>.nthreads" mallctl.
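A sketch of the selection policy (the data layout is invented for illustration): scan the arenas in order and keep the first one with the fewest assigned threads, so ties naturally go to the lowest index.
```c
/* nthreads[i] counts threads currently assigned to arena i. */
static unsigned
choose_arena_sketch(unsigned *nthreads, unsigned narenas)
{
	unsigned i, choose = 0;

	for (i = 1; i < narenas; i++) {
		/* Strictly fewer threads: ties keep the lower-numbered arena. */
		if (nthreads[i] < nthreads[choose])
			choose = i;
	}
	nthreads[choose]++;
	return (choose);
}
```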
-
Jason Evans authored
Refill the thread cache such that low regions get used first. This fixes a regression due to the recent transition to bitmap-based region management.
-
Jason Evans authored
The previous free list implementation, which embedded singly linked lists in available regions, had the unfortunate side effect of causing many cache misses during thread cache fills. Fix this in two places:
- arena_run_t: Use a new bitmap implementation to track which regions are available. Furthermore, revert to preferring the lowest available region (as jemalloc did with its old bitmap-based approach).
- tcache_t: Move read-only tcache_bin_t metadata into tcache_bin_info_t, and add a contiguous array of pointers to tcache_t in order to track cached objects. This substantially increases the size of tcache_t, but results in much higher data locality for common tcache operations. As a side benefit, it is again possible to efficiently flush the least recently used cached objects, so this change switches flushing from MRU to LRU.
The new bitmap implementation uses a multi-level summary approach to make finding the lowest available region very fast. In practice, bitmaps only have one or two levels, though the implementation is general enough to handle extremely large bitmaps, mainly so that large page sizes can still be entertained.
Fix tcache_bin_flush_large() to always flush statistics, in the same way that tcache_bin_flush_small() was recently fixed.
Use JEMALLOC_DEBUG rather than NDEBUG. Add dassert(), and use it for debug-only asserts.
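An illustrative two-level bitmap in the spirit of the multi-level summary described above (names, layout, and sizes are invented, not jemalloc's bitmap.[ch]): each bit of the summary word records whether the corresponding group word has any set bits, so finding and claiming the lowest available region costs two find-first-set operations.
```c
#include <limits.h>

#define GROUP_NBITS	(sizeof(unsigned long) * CHAR_BIT)

typedef struct {
	unsigned long	summary;		/* bit g set => groups[g] != 0 */
	unsigned long	groups[GROUP_NBITS];	/* one bit per region; 1 == available */
} bitmap2_t;

/* Find, claim, and return the index of the lowest available region, or -1. */
static long
bitmap2_claim_lowest(bitmap2_t *b)
{
	int g, bit;

	if (b->summary == 0)
		return (-1);
	g = __builtin_ffsl((long)b->summary) - 1;
	bit = __builtin_ffsl((long)b->groups[g]) - 1;
	b->groups[g] &= ~(1UL << bit);
	if (b->groups[g] == 0)
		b->summary &= ~(1UL << g);
	return ((long)g * (long)GROUP_NBITS + bit);
}
```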
-
- Mar 16, 2011
Jason Evans authored
Clean up configuration for backtracing when profiling is enabled, and document the configuration logic in INSTALL. Disable libgcc-based backtracing except on x64 (where it is known to work). Add the --disable-prof-gcc option.
-
Jason Evans authored
Fix a couple of problems related to the addition of arena_bin_info_t.
-
- Mar 15, 2011
Jason Evans authored
Add missing error checks for pthread_mutex_init() calls. In practice, mutex initialization never fails, so this is merely good hygiene.
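A minimal sketch of the hygiene being added; jemalloc's real initialization helper differs in details such as the mutex attributes it sets:
```c
#include <pthread.h>
#include <stdbool.h>

/* Returns true on error, following the internal convention of bool-valued
 * error returns. */
static bool
mutex_init_checked(pthread_mutex_t *mutex)
{
	pthread_mutexattr_t attr;

	if (pthread_mutexattr_init(&attr) != 0)
		return (true);
	if (pthread_mutex_init(mutex, &attr) != 0) {
		pthread_mutexattr_destroy(&attr);
		return (true);
	}
	pthread_mutexattr_destroy(&attr);
	return (false);
}
```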
-
Jason Evans authored
Move read-only fields from arena_bin_t into arena_bin_info_t, primarily in order to avoid false cacheline sharing.
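A sketch of the hot/cold split being described; the field names and types below are placeholders (pthread_mutex_t standing in for jemalloc's mutex type), not the actual definitions.
```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read-only, per-size-class data lives in its own global table so that writes
 * to the mutable per-arena bin state never dirty the cache lines holding it.
 */
typedef struct {
	size_t		reg_size;	/* object size served by this bin */
	size_t		run_size;	/* bytes per run */
	uint32_t	nregs;		/* regions per run */
} bin_info_sketch_t;

/* Mutable, frequently written per-arena state. */
typedef struct {
	pthread_mutex_t	lock;
	void		*runcur;	/* current run, if any */
	size_t		nmalloc;	/* allocation counter */
} bin_sketch_t;
```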
-
Jason Evans authored
Fix the automatic header dependency generation to handle the .pic.o suffix. This regression was due to commit af5d6987 (Build both PIC and no PIC static libraries).
-
Jason Evans authored
Convert all direct small_size2bin[...] accesses to SMALL_SIZE2BIN(...) macro calls, and use a couple of cheap math operations to allow compacting the table by 4X or 8X, on 32- and 64-bit systems, respectively.
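An illustrative version of the compaction trick (the constants below are assumptions about the minimum size class, not jemalloc's exact configuration): every request size in ((k-1)*2^LG_TINY_MIN, k*2^LG_TINY_MIN] maps to the same bin, so one table entry per 2^LG_TINY_MIN bytes of request size suffices, an 8X reduction with 8-byte granularity (64-bit) or 4X with 4-byte granularity (32-bit).
```c
#include <stdint.h>

/* Assumption: the smallest size class is 8 bytes on 64-bit systems and
 * 4 bytes on 32-bit systems. */
#if (SIZE_MAX > 0xffffffffU)
#  define LG_TINY_MIN	3	/* 8-byte granularity: table shrinks 8X */
#else
#  define LG_TINY_MIN	2	/* 4-byte granularity: table shrinks 4X */
#endif

extern const uint8_t	small_size2bin[];

/* Sizes that round up to the same multiple of 2^LG_TINY_MIN share an entry. */
#define SMALL_SIZE2BIN(s)	(small_size2bin[((s) - 1) >> LG_TINY_MIN])
```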
-
Jason Evans authored
-
Jason Evans authored
Compile with -fvisibility=hidden rather than -fvisibility=internal, in order to avoid PLT lookups for internal functions. Also fix a regression that caused the -fvisibility flag to be omitted, due to commit 2dbecf1f (Port to Mac OS X).
-
Jason Evans authored
-
- Mar 14, 2011
Jason Evans authored
When a thread cache flushes objects to their arenas due to an abundance of cached objects, it merges the allocation request count for the associated size class, and increments a flush counter. If none of the flushed objects came from the thread's assigned arena, then the merging wouldn't happen (though the counter would typically eventually be merged), nor would the flush counter be incremented (a hard bug). Fix this via extra conditional code just after the flush loop.
-
Jason Evans authored
Fix a variable reversal bug in mallctl("thread.arena", ...).
-
- Mar 07, 2011
Jason Evans authored
Fix a cpp logic error that was introduced by the recent commit: Fix "thread.{de,}allocatedp" mallctl.
-
- Mar 02, 2011
je authored
-
Arun Sharma authored
When jemalloc is linked into an executable (as opposed to a shared library), compiling with -fno-pic can have significant advantages, mainly because we don't have to go through the GOT (global offset table). Users who want to link jemalloc into a shared library that could be dlopened need to link with libjemalloc_pic.a or libjemalloc.so.
-
- Feb 14, 2011
Jason Evans authored
-
Jason Evans authored
For the non-TLS case (as on OS X), if the "thread.{de,}allocatedp" mallctl was called before any allocation occurred for that thread, the TSD was still NULL, thus putting the application at risk of dereferencing NULL. Fix this by refactoring the initialization code, and making it part of the conditional logic for all per thread allocation counter accesses.
-
- Feb 08, 2011
Jason Evans authored
-
- Feb 01, 2011
Jason Evans authored
-
Jason Evans authored
Fix huge_ralloc() to call huge_palloc() only if alignment requires it. This bug caused under-sized allocation for aligned huge reallocation (via rallocm()) if the requested alignment was less than the chunk size (4 MiB by default).
-
- Jan 26, 2011
Jason Evans authored
Fix ALLOCM_LG_ALIGN to take a parameter and use it. Apparently, an editing error left ALLOCM_LG_ALIGN with the same definition as ALLOCM_LG_ALIGN_MASK.
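A usage sketch of the experimental allocm() API this macro belongs to, assuming the unprefixed public names (a configured name prefix would change them); ALLOCM_LG_ALIGN(la) turns the base-2 log of the requested alignment into a flag value.
```c
#include <stdio.h>
#include <jemalloc/jemalloc.h>

int
main(void)
{
	void *p;
	size_t rsize;

	/* Request 4096 bytes aligned to 2^12 == 4096 bytes. */
	if (allocm(&p, &rsize, 4096, ALLOCM_LG_ALIGN(12)) != ALLOCM_SUCCESS) {
		fprintf(stderr, "allocm failed\n");
		return (1);
	}
	printf("got %zu usable bytes at %p\n", rsize, p);
	dallocm(p, 0);
	return (0);
}
```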
-
- Jan 15, 2011
Jason Evans authored
s/=/==/ in several assertions, and fix spelling errors.
-
Jason Evans authored
Restructure the ctx initialization code such that the ctx isn't locked across portions of the initialization code where allocation could occur. Instead artificially inflate the cnt_merged.curobjs field, just as is done elsewhere to avoid similar races to the one that would otherwise be created by the reduction in locking scope. This bug affected interval- and growth-triggered heap dumping, but not manual heap dumping.
-