- Sep 12, 2016
-
-
Jason Evans authored
-
- Jul 08, 2016
-
-
Mike Hommey authored
On OSX 10.12, malloc_default_zone returns a special zone that is not present in the list of registered zones. That zone uses a "lite zone" if one is present (apparently enabled when malloc stack logging is enabled), or the first registered zone otherwise. In practice this means unless malloc stack logging is enabled, the first registered zone is the default. So get the list of zones to get the first one, instead of relying on malloc_default_zone.
-
Mike Hommey authored
847ff223 added a call to malloc_default_zone() before the main loop in register_zone, effectively making malloc_default_zone() called twice without any different outcome expected in the returned result. It is also called once at the beginning, and a second time at the end of the loop block. Instead, call it only once per iteration.
-
Elliot Ronaghan authored
Cray is pretty warning-happy, so disable ones that aren't helpful. Each warning has a numeric value instead of having named flags to disable specific warnings. Disable warnings 128 and 1357. 128: Ignore unreachable code warning. Cray warns about `not_reached()` not being reachable in a couple of places because it detects that some loops will never terminate. 1357: Ignore warning about redefinition of malloc and friends With this patch, Cray 8.4.0 and 8.5.1 build cleanly and pass `make check`
-
- Jul 07, 2016
-
-
Elliot Ronaghan authored
Cray uses -herror_on_warning instead of -Werror. Use it everywhere -Werror is currently used for __attribute__ checks so configure actually detects they're not supported.
-
Elliot Ronaghan authored
Cray only supports `-M` for generating dependency files. It does not support `-MM` or `-MT`, so don't try to use them. I just reused the existing mechanism for turning auto-dependency generation off (`CC_MM=`), but it might be more principled to add a configure test to check if the compiler supports `-MM` and `-MT`, instead of manually tracking which compilers don't support those flags.
-
Elliot Ronaghan authored
Get jemalloc building and passing `make check_unit` with cray 8.4. An inlining bug in 8.4 results in internal errors while trying to build jemalloc. This has already been reported and fixed for the 8.5 release. In order to work around the inlining bug, disable gnu compatibility and limit ipa optimizations. I copied the msvc compiler check for cray, but note that we perform the test even if we think we're using gcc because cray pretends to be gcc if `-hgnu` (which is enabled by default) is used. I couldn't come up with a principled way to check for the inlining bug, so instead I just checked compiler versions. The build had lots of warnings I need to address and cray doesn't support -MM or -MT for dependency tracking, so I had to do `make CC_MM=`.
-
rustyx authored
-
Elliot Ronaghan authored
Add a configure check for __builtin_unreachable instead of basing its availability on the __GNUC__ version. On OS X using gcc (a real gcc, not the bundled version that's just a gcc front-end) leads to a linker assertion: https://github.com/jemalloc/jemalloc/issues/266 It turns out that this is caused by a gcc bug resulting from the use of __builtin_unreachable(): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 To work around this bug, check that __builtin_unreachable() actually works at configure time, and if it doesn't use abort() instead. The check is based on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438#c21. With this `make check` passes with a homebrew installed gcc-5 and gcc-6.
-
Elliot Ronaghan authored
The Cray compiler wrappers will often add `-lrt` to the base compiler with `-static` linking (the default at most sites.) However, `-lrt` isn't automatically added with `-dynamic`. This means that if jemalloc was built with `-static`, but then used in a program with `-dynamic` jemalloc won't have detected that librt is a dependency. The integration and stress tests use -dynamic, which is causing undefined references to clock_gettime(). This just adds an extra check for librt (ignoring the autoconf cache) with `-dynamic` thrown. It also stops filtering librt from the integration tests. With this `make check` passes for: - PrgEnv-gnu - PrgEnv-intel - PrgEnv-pgi PrgEnv-cray still needs more work (will be in a separate patch.)
-
Elliot Ronaghan authored
Cray systems come with compiler wrappers to simplify building parallel applications. CC is the C++ wrapper, and cc is the C wrapper. The wrappers call the base {Cray, Intel, PGI, or GNU} compiler with vendor specific flags. The "Programming Environment" (prgenv) that's currently loaded determines the base compiler. e.g. compiling with gnu looks something like: module load PrgEnv-gnu cc hello.c On most systems the wrappers defaults to `-static` mode, which causes them to only look for static libraries, and not for any dynamic ones (even if the dynamic version was explicitly listed.) The integration and stress tests expect to be using the .so, so we have to run the with -dynamic so that wrapper will find/use the .so.
-
Mike Hommey authored
-
- Jun 09, 2016
-
-
Mike Hommey authored
They are used on all platforms in prng.h.
-
Mike Hommey authored
-
Mike Hommey authored
This builds jemalloc and runs all checks with: - MSVC 2015 64-bits - MSVC 2015 32-bits - MINGW64 (from msys2) - MINGW32 (from msys2) Normally, AppVeyor configs are named appveyor.yml, but it is possible to configure the .yml file name in the AppVeyor project settings such that the file stays "hidden", like typical travis configs.
-
- Jun 08, 2016
-
-
Elliot Ronaghan authored
Some bug (either in the red-black tree code, or in the pgi compiler) seems to cause red-black trees to become unbalanced. This issue seems to go away if we don't use compact red-black trees. Since red-black trees don't seem to be used much anymore, I opted for what seems to be an easy fix here instead of digging in and trying to find the root cause of the bug. Some context in case it's helpful: I experienced a ton of segfaults while using pgi as Chapel's target compiler with jemalloc 4.0.4. The little bit of debugging I did pointed me somewhere deep in red-black tree manipulation, but I didn't get a chance to investigate further. It looks like 4.2.0 replaced most uses of red-black trees with pairing-heaps, which seems to avoid whatever bug I was hitting. However, `make check_unit` was still failing on the rb test, so I figured the core issue was just being masked. Here's the `make check_unit` failure: ```sh === test/unit/rb === test_rb_empty: pass tree_recurse:test/unit/rb.c:90: Failed assertion: (((_Bool) (((uintptr_t) (left_node)->link.rbn_right_red) & ((size_t)1)))) == (false) --> true != false: Node should be black test_rb_random:test/unit/rb.c:274: Failed assertion: (imbalances) == (0) --> 1 != 0: Tree is unbalanced tree_recurse:test/unit/rb.c:90: Failed assertion: (((_Bool) (((uintptr_t) (left_node)->link.rbn_right_red) & ((size_t)1)))) == (false) --> true != false: Node should be black test_rb_random:test/unit/rb.c:274: Failed assertion: (imbalances) == (0) --> 1 != 0: Tree is unbalanced node_remove:test/unit/rb.c:190: Failed assertion: (imbalances) == (0) --> 2 != 0: Tree is unbalanced <jemalloc>: test/unit/rb.c:43: Failed assertion: "pathp[-1].cmp < 0" test/test.sh: line 22: 12926 Aborted Test harness error ``` While starting to debug I saw the RB_COMPACT option and decided to check if turning that off resolved the bug. It seems to have fixed it (`make check_unit` passes and the segfaults under Chapel are gone) so it seems like on okay work-around. I'd imagine this has performance implications for red-black trees under pgi, but if they're not going to be used much anymore it's probably not a big deal.
-
Elliot Ronaghan authored
pgi fails to compile math.c, reporting that `-INFINITY` in `pt_norm_expected[]` is a "Non-constant" expression. A simplified version of this failure is: ```c #include <math.h> static double inf1, inf2 = INFINITY; // no complaints static double inf3 = INFINITY; // suddenly INFINITY is "Non-constant" int main() { } ``` ```sh PGC-S-0074-Non-constant expression in initializer (t.c: 4) ``` pgi errors on the declaration of inf3, and will compile fine if that line is removed. I've reported this bug to pgi, but in the meantime I just switched to using (DBL_MAX + DBL_MAX) to work around this bug.
-
Jason Evans authored
-
- Jun 07, 2016
-
-
Jason Evans authored
Revert 245ae603 (Support --with-lg-page values larger than actual page size.), because it could cause VM map fragmentation if the kernel grows mmap()ed memory downward. This resolves #391.
-
Elliot Ronaghan authored
Fix mixed decl in the gettimeofday() branch of nstime_update()
-
Jason Evans authored
This avoids bootstrapping issues for configurations that require allocation during tsd initialization. This resolves #390.
-
Jason Evans authored
As a side effect this causes the extent's 'committed' flag to be updated.
-
- Jun 06, 2016
-
-
Jason Evans authored
-
Jason Evans authored
-
Jason Evans authored
Fix a fundamental extent_split_wrapper() bug in an error path. Fix extent_recycle() to deregister unsplittable extents before leaking them. Relax xallocx() test assertions so that unsplittable extents don't cause test failures.
-
Jason Evans authored
Deregister extents before deallocation, so that subsequent reallocation/registration doesn't race with deregistration.
-
Jason Evans authored
Page-align the gap, if any, and add/use extent_dalloc_gap(), which registers the gap extent before deallocation.
-
Jason Evans authored
Now that extents are not multiples of chunksize, it's necessary to track pages rather than chunks.
-
Jason Evans authored
With the removal of subchunk size class infrastructure, there are no large size classes that are guaranteed to be re-expandable in place unless munmap() is disabled. Work around these legitimate failures with rallocx() fallback calls. If there were no test configuration for which the xallocx() calls succeeded, it would be important to override the extent hooks for testing purposes, but by default these tests don't use the rallocx() fallbacks on Linux, so test coverage is still sufficient.
-
Jason Evans authored
-
Jason Evans authored
-
Jason Evans authored
-
Jason Evans authored
This facilitates the application accessing its own extent allocator metadata during hook invocations. This resolves #259.
-
Jason Evans authored
rtree-based extent lookups remain more expensive than chunk-based run lookups, but with this optimization the fast path slowdown is ~3 CPU cycles per metadata lookup (on Intel Core i7-4980HQ), versus ~11 cycles prior. The path caching speedup tends to degrade gracefully unless allocated memory is spread far apart (as is the case when using a mixture of sbrk() and mmap()).
-
Jason Evans authored
-
Jason Evans authored
-
Jason Evans authored
-
Jason Evans authored
rallocx() for an alignment-constrained request may end up with a smaller-than-worst-case size if in-place reallocation succeeds due to serendipitous alignment. In such cases, sampling may not happen.
-
Jason Evans authored
In the case where prof_alloc_prep() is called with an over-estimate of allocation size, and sampling doesn't end up being triggered, the tctx must be discarded.
-
Jason Evans authored
-