Add per thread allocation counters, and enhance heap sampling.
Add the "thread.allocated" and "thread.deallocated" mallctls, which can be used to query the total number of bytes ever allocated/deallocated by the calling thread. Add s2u() and sa2u(), which can be used to compute the usable size that will result from an allocation request of a particular size/alignment. Re-factor ipalloc() to use sa2u(). Enhance the heap profiler to trigger samples based on usable size, rather than request size. This has a subtle, but important, impact on the accuracy of heap sampling. For example, previous to this change, 16- and 17-byte objects were sampled at nearly the same rate, but 17-byte objects actually consume 32 bytes each. Therefore it was possible for the sample to be somewhat skewed compared to actual memory usage of the allocated objects.
Please register or sign in to comment