  Nov 29, 2018
      mutex: fix trylock spin wait contention · b23336af
      Dave Watson authored
      If there are 3 or more threads spin-waiting on the same mutex,
      there will be excessive exclusive cacheline contention because
      pthread_trylock() immediately tries to CAS in a new value, instead
      of first checking if the lock is locked.
      
      This diff adds a 'locked' hint flag, and we will only spin-wait
      without trylock()ing while it is set.  I don't know of any other
      portable way to get the same behavior as pthread_mutex_lock().
      
      This is pretty easy to test via ttest, e.g.
      
      ./ttest1 500 3 10000 1 100
      
      Throughput is nearly 3x higher.
      
      This blames to the mutex profiling changes; however, we almost
      never have 3 or more threads contending in properly configured
      production workloads. It is still worth fixing.
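      A minimal C11 sketch of the idea (illustrative names, not
      jemalloc's actual code): waiters spin on a plain load of the
      'locked' hint and only attempt the CAS once the lock looks free,
      so the cacheline stays shared among readers instead of bouncing
      between exclusive owners.

      #include <stdatomic.h>
      #include <stdbool.h>

      typedef struct {
          atomic_bool locked; /* hint: true while the mutex is held */
      } spin_mutex_t;

      static bool
      spin_trylock(spin_mutex_t *m) {
          bool expected = false;
          /* The CAS takes the cacheline exclusively, so call it sparingly. */
          return atomic_compare_exchange_strong_explicit(&m->locked,
              &expected, true, memory_order_acquire, memory_order_relaxed);
      }

      static void
      spin_lock(spin_mutex_t *m) {
          while (!spin_trylock(m)) {
              /* Read-only spin: all waiters share the line until the
               * holder clears the hint. */
              while (atomic_load_explicit(&m->locked,
                  memory_order_relaxed)) {
                  /* a pause/yield hint would go here */
              }
          }
      }

      static void
      spin_unlock(spin_mutex_t *m) {
          atomic_store_explicit(&m->locked, false, memory_order_release);
      }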
  Nov 12, 2018
      Add a free() and sdallocx(where flags=0) fastpath · 794e29c0
      Dave Watson authored
      Add unsized and sized deallocation fastpaths.  Similar to the
      malloc() fastpath, this removes all frame manipulation for the
      majority of free() calls.  The performance advantage here is
      smaller than that of the malloc() fastpath, but prod tests still
      show roughly half a percent of improvement.
      
      Stats and sampling are both supported (sdallocx needs a sampling
      check, since for rtree lookups slab will only be set for unsampled
      objects).
      
      We don't support flush; any flush requests go to the slowpath.
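      A hedged sketch of the overall shape (every name below is
      illustrative, not the commit's actual symbols): free() and
      sdallocx(ptr, size, 0) can funnel into one inlineable fastpath
      that bails out to the full path whenever the cheap route does not
      apply.

      #include <stdbool.h>
      #include <stddef.h>
      #include <stdlib.h>

      /* Hypothetical predicate: true when deallocation was completed
       * the cheap way (size class cached, object unsampled, tcache bin
       * not full).  Stubbed to always defer here. */
      static inline bool
      dealloc_fastpath(void *ptr, size_t usize) {
          (void)ptr; (void)usize;
          return false;
      }

      /* Stand-in for the full path (stats, sampling, flush, ...). */
      static void
      dealloc_slowpath(void *ptr) {
          free(ptr);
      }

      void
      free_sketch(void *ptr) { /* unsized entry point */
          if (ptr != NULL && !dealloc_fastpath(ptr, 0)) {
              dealloc_slowpath(ptr);
          }
      }

      void
      sdallocx_sketch(void *ptr, size_t size, int flags) { /* sized */
          if (flags != 0 || !dealloc_fastpath(ptr, size)) {
              dealloc_slowpath(ptr); /* flush or other flags: slowpath */
          }
      }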
      refactor tcache_dalloc_small · e2ab2153
      Dave Watson authored
      Add a cache_bin_dalloc_easy (to match the alloc_easy function),
      and use it in tcache_dalloc_small.  It will also be used in the
      new free fastpath.
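      A simplified sketch of the "easy" dalloc helper (the function name
      matches the commit, but this cache_bin layout is invented for
      illustration; jemalloc's real struct differs): it succeeds only
      when the bin has room, leaving flushing to the caller's slow path,
      just as the alloc_easy counterpart succeeds only when the bin is
      non-empty.

      #include <stdbool.h>

      typedef struct {
          void **avail;         /* stack of cached pointers */
          unsigned ncached;     /* number currently cached */
          unsigned ncached_max; /* bin capacity */
      } cache_bin_sketch_t;

      static inline bool
      cache_bin_dalloc_easy_sketch(cache_bin_sketch_t *bin, void *ptr) {
          if (bin->ncached == bin->ncached_max) {
              return false; /* full: caller must flush via the slow path */
          }
          bin->avail[bin->ncached++] = ptr;
          return true;
      }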
      rtree: add rtree_szind_slab_read_fast · 5e795297
      Dave Watson authored
      For a free fastpath, we want something that will not make additional
      calls.  Assume most free() calls will hit the L1 cache, and use
      a custom rtree function for this.
      
      Additionally, roll the ptr=NULL check into the rtree cache check.
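      A sketch under assumed simplifications (direct-mapped cache,
      4 KiB pages, invented names; jemalloc's real rtree cache is laid
      out differently): the lookup makes no out-of-line calls on a hit,
      and because ptr == NULL maps to key 0, which the fill path never
      caches, the NULL check falls out of the ordinary miss branch.

      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      #define CACHE_SLOTS 16
      #define PAGE_SHIFT  12 /* assume 4 KiB pages */

      typedef struct {
          uintptr_t key;  /* page-aligned address; never 0 when valid */
          unsigned szind; /* cached size-class index */
          bool slab;      /* cached slab bit */
      } rtree_cache_slot_t;

      static __thread rtree_cache_slot_t cache[CACHE_SLOTS];

      static void
      cache_init(void) {
          for (size_t i = 0; i < CACHE_SLOTS; i++) {
              cache[i].key = (uintptr_t)-1; /* matches no real key */
          }
      }

      /* Hit: fill *szind/*slab, return true, no additional calls.
       * Miss (including ptr == NULL): return false; caller takes the
       * slow rtree walk. */
      static inline bool
      szind_slab_read_fast(const void *ptr, unsigned *szind, bool *slab) {
          uintptr_t key = (uintptr_t)ptr &
              ~(((uintptr_t)1 << PAGE_SHIFT) - 1);
          size_t slot = (key >> PAGE_SHIFT) % CACHE_SLOTS;
          if (cache[slot].key != key) {
              return false;
          }
          *szind = cache[slot].szind;
          *slab = cache[slot].slab;
          return true;
      }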