Hacking OpenJDK Part 2: Adding Custom Instrumentation to HotSpot’s Memory Pools
In last week’s piece, “Building Your First OpenJDK,” we explored how to set up a development environment for hacking on the JVM and complete a first build. Today, we’re going to take the next step: diving deep into the source and adding a feature.
Note that this adventure and the resulting patch are offered as an illustration of how HotSpot and the JDK’s memory management components fit together and how internal statistics can be floated up to JMX – not as a proposed patch to be integrated into the OpenJDK project itself.
Understanding a Heap’s Workload in Development – and Production
When developing software for high throughput and low response times, it’s important to consider, analyze, and profile one’s memory access patterns. Thorough local profiling combined with continuous, low-overhead runtime instrumentation is an important component of an operational strategy for high-performance software. It’s worth considering how much of this profiling could be made more efficient through smarter integration with the VM itself.
Metrics Under the Hood
HotSpot and the JDK expose many internal health metrics. Tools like Coda Hale’s Metrics library can even collect this data and send it to Graphite for easy visualization and display. But can we get better insight by peeking under the hood of the JVM? Absolutely.
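As a baseline, a surprising amount is already available through the stock `java.lang.management` API, no VM patches required. Everything in this snippet is standard:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class StockJmxMetrics {
    public static void main(String[] args) {
        // Heap usage as reported by the platform MemoryMXBean.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("Heap: " + memory.getHeapMemoryUsage());

        // Cumulative collection counts and times, one bean per collector
        // (e.g., ParNew and ConcurrentMarkSweep when running under CMS).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

These are cumulative counters, though – they tell you how often and how long, not how fast the program is allocating right now. For that, we need what the collector itself knows.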
The Concurrent Mark-Sweep collector has earned its title as a gold-standard garbage collector among managed VMs designed for object-oriented languages. We refer to the CMS collector as “ergonomic” in that it is an adaptive collector sympathetic to the allocation and deallocation workload of the heaps it is responsible for collecting. This means that CMS maintains statistics about memory allocation rates, enabling it to spend a minimal but safe amount of time performing collections in order to satisfy the forecasted allocation demands of the program accessing the heap.
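To make “maintains statistics about memory allocation rates” concrete: a common way to track such a rate is an exponentially weighted average of per-interval allocation, so recent behavior dominates the estimate without older samples being discarded outright. Here’s a minimal Java sketch of that general idea – an illustration only, not CMS’s actual code, which lives in C++ and differs in its details:

```java
// A minimal sketch of a decaying rate statistic, illustrating the general
// technique only; CMS's real bookkeeping is in C++ (see the header excerpt
// below) and differs in its details.
class AllocationRateEstimator {
    private final double alpha;    // weight given to the newest sample, in (0, 1]
    private double bytesPerSecond; // current smoothed estimate

    AllocationRateEstimator(double alpha) {
        this.alpha = alpha;
    }

    // Fold in one observation: bytes allocated over an interval in seconds.
    void sample(long bytesAllocated, double intervalSeconds) {
        double observed = bytesAllocated / intervalSeconds;
        bytesPerSecond = alpha * observed + (1.0 - alpha) * bytesPerSecond;
    }

    double bytesPerSecond() {
        return bytesPerSecond;
    }
}
```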
Concurrent Mark-Sweep’s Internal Metrics
Here are some of the statistics the CMS collector maintains internally to forecast and schedule collections [source]:
```cpp
// Abbreviated from concurrentMarkSweepGeneration.hpp:
// Basic statistics; units are seconds or bytes.
double gc0_period() const          { return _gc0_period; }
double gc0_duration() const        { return _gc0_duration; }
size_t gc0_promoted() const        { return _gc0_promoted; }
double cms_period() const          { return _cms_period; }
double cms_duration() const        { return _cms_duration; }
double cms_duration_per_mb() const { return _cms_duration_per_mb; }
size_t cms_allocated() const       { return _cms_allocated; }
size_t cms_used_at_gc0_end() const { return _cms_used_at_gc0_end; }

// Higher level statistics--
double cms_time_since_begin() const;
double cms_time_since_end() const;
double promotion_rate() const;
double cms_allocation_rate() const;
double cms_consumption_rate() const;
double time_until_cms_gen_full() const;
double time_until_cms_start() const;
unsigned int icms_duty_cycle() const { return _icms_duty_cycle; }
unsigned int icms_update_duty_cycle();
```
These statistics are key to helping CMS tune its collection intervals – but they would also be useful to export continuously to a graphing engine for a better understanding of a program’s runtime behavior. Since they’re already maintained by the collector, gathering them is essentially free. Beyond that, reporting should just be a matter of fetching this metadata from the VM via a public interface in JMX.
Let’s see what it might take to enable the VM to export a per-heap “allocation rate” metric, using CMS’s `cms_allocation_rate()` as a jumping-off point.
Put simply, here is our objective: To create a mechanism by which underlying heaps can report information about their allocation rates up to the JMX management interface.
Breakdown by Component
Despite comprising millions of lines, the HotSpot and JDK codebases are exceptionally well-structured, heavily commented, and approachable for the determined uninitiated. In order to accomplish our goal – percolating collector stats from deep within CMS’s C++ implementation all the way up to Java’s JMX management interface – we’ll need to interact with a few moving parts in HotSpot. Before we dive in, it’s important to understand what we’ll be touching. Here’s a “who’s-who” of the components we’ll be working with:
Classes in HotSpot:
concurrentMarkSweepGeneration.cpp/hpp: The definition and implementation of the CMS garbage collector itself. At a hair under 10,000 lines, the CMS collector can be daunting to approach, but its interfaces are well-defined and its functionality is well-encapsulated.
memoryService.cpp/hpp: HotSpot’s “MemoryService” is responsible for exposing monitoring and management support to the rest of the VM. It includes methods responsible for adding memory pools to each of the collectors (ParNew, CMS, G1, etc.), manages access to a list of active “Memory Managers,” and records the results of “gc_begin” and “gc_end” events emitted by the underlying collectors.
memoryManager.cpp/hpp: “Memory Managers” in HotSpot are in charge of underlying memory pools. Most production JVMs have multiple memory managers – often one for each collector in use (e.g., ParNew and CMS).
management.cpp/hpp: The primary VM management interface between HotSpot and the JDK. This class acts as an API to underlying VM statistics about the active garbage collectors, memory pools, thread information, VM creation time, and so on. Most VM metrics exposed in JMX are exposed via this class.
jmm.h: JMM provides an interface responsible for exposing the underlying C++ VM management methods to JNI via C. On the other side of the bridge, the JDK exposes these JNI methods to the Java language.
Classes in the JDK:
MemoryManagerMXBean.java: The “Memory Manager” MXBean that provides the JMX interface to VM memory management functions.
MemoryManagerImpl.java: The Java implementation of the MemoryManager MXBean interface. In our case, the change adds a stub ‘native’ method that calls down into JNI, where we’ve implemented our bridge to HotSpot’s “jmm” interface (described above); this pattern is sketched just after this list.
MemoryManagerImpl.c: The JNI implementation of the native method queried by the MemoryManager MBean interface, responsible for calling the internal HotSpot APIs exposed via JMM.
jmm.h and mapfile-vers: The JDK’s corresponding JMM API, which must align to the JMM interface specified in HotSpot.
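To make the shape of those last few pieces concrete, here’s a minimal Java sketch of the pattern. The inlined interface is a stand-in for `java.lang.management.MemoryManagerMXBean`, and the method names are this walkthrough’s own rather than anything mandated by the JDK – the real signatures are in the patch linked below:

```java
// A sketch of the JDK-side pattern only. The inlined interface is a
// stand-in for java.lang.management.MemoryManagerMXBean, and the method
// names are illustrative.
interface MemoryManagerMXBean {
    double getAllocationRate(); // the new attribute we want to expose
}

class MemoryManagerImpl implements MemoryManagerMXBean {
    // The public accessor that JMX clients see as an "AllocationRate"
    // attribute, per the MXBean getter naming convention.
    public double getAllocationRate() {
        return getAllocationRate0();
    }

    // Implemented in MemoryManagerImpl.c, which calls through the JMM
    // interface (jmm.h) into HotSpot's management.cpp.
    private native double getAllocationRate0();
}
```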
Exposing Per-Heap Allocation Rates via JMX: Strategy
Let’s return to our original goal: to create a mechanism by which underlying heaps can report information about their allocation rates up to the JMX management interface. Since we’d like to enable many memory pools to report this information, we’d do well to look for where all collectors report their allocation statistics. In this case, our answer is MemoryService.
How does HotSpot record GC Stats?
MemoryService defines a class called TraceMemoryManagerStats. The contract for recording GC stats in HotSpot is a unique one that makes creative use of stack allocation and destructors in C++. In short, reporting GC stats works like this [follow along]:
- A particular collector invokes a collection.
- The collector stack-allocates a `TraceMemoryManagerStats` object (or, in CMS’s case, a `TraceCMSMemoryManagerStats` object, which inherits from TMMS). TMMS’s constructor contains a call to `MemoryService::gc_begin()`, which records the start time and information about the collection.
- The collector performs its collection activity.
- Once the collection is complete, destructors for stack-allocated objects are called in the reverse of the order in which they were allocated. This means that the `TraceMemoryManagerStats` destructor is the last method called when the collection finishes.
- The `TraceMemoryManagerStats` destructor calls `MemoryService::gc_end()`, which marks the completion of the collection cycle and finishes recording stats about the collection that just occurred.
In this manner, `TraceMemoryManagerStats` acts something like an event or signal flare for each collector. Though the collectors do not hold explicit references to `MemoryService`, they report stats to it by initializing and destroying this object. It’s a creative use of C++ that yields a clean stats-reporting interface without requiring explicit begin/end calls at each invocation.
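If the destructor idiom is unfamiliar, the nearest Java analogue is try-with-resources: constructing the object plays the role of `gc_begin()`, and leaving the scope plays the role of `gc_end()`. Purely for intuition, the contract translates roughly like this:

```java
// The C++ idiom above, translated to try-with-resources purely for
// intuition; HotSpot itself relies on C++ stack unwinding, not Java.
class TraceStats implements AutoCloseable {
    TraceStats() {
        // Plays the role of MemoryService::gc_begin(): record the start
        // time and pre-collection state.
        System.out.println("gc_begin");
    }

    @Override
    public void close() {
        // Plays the role of MemoryService::gc_end(): record duration and
        // post-collection state.
        System.out.println("gc_end");
    }
}

class Collector {
    void collect() {
        try (TraceStats stats = new TraceStats()) {
            // ... actual collection work happens here ...
        } // close() runs here no matter how the block exits
    }
}
```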
Our Strategy
With this in mind, we can devise an approach for reporting per-heap allocation rates up to MemoryService:
- Modify the `TraceMemoryManagerStats` object to accept an optional `allocationRate` parameter.
- Initialize the `TraceCMSMemoryManagerStats` object in the CMS collector with the allocation rate supplied.
- Add appropriate getters and setters on the `MemoryManager` interface to set and retrieve this rate as reported following each collection.
- Expose an interface via JMM to bridge the gap from JNI to HotSpot’s internal management class, which already provides access to MemoryManagers for existing stats reporting.
- Add a `getAllocationRate()` method to the `MemoryManagerMXBean`, which provides access to per-MemoryManager stats via JMX.
Seems straightforward enough! Now, let’s have a look at the patch itself. Our patch is presented in two parts, as HotSpot and the JDK exist in separate Mercurial forest repositories. We’ll look at the patch to HotSpot first, then the patch to the JDK.
HotSpot Patch
I’ve broken the HotSpot patch out of this post for brevity. As you read it, keep in mind the strategy we’ve outlined above to help follow it. The patch is right here: https://gist.github.com/cscotta/ede22a2ccd5e5d797371
JDK Patch
And here’s the patch applied to the JDK, exposing these metrics to the Java programming language. This patch is very brief, illustrating how straightforward exposing existing functionality in HotSpot to the higher-level JDK can be. It’s located here: https://gist.github.com/cscotta/f97e59e70faba91aeaab
Taking it for a Spin
With these patches applied and our JDK + HotSpot build performed according to the steps in our first post, it’s time to take it for a spin! One quick way to verify that our new metric is exposed is to launch the `jconsole` we’ve built along with OpenJDK and point it at itself.

Here, we can see that the allocation rate for our running jconsole session is being reported by the CMS collector to the MemoryService, stored as metadata on CMS’s MemoryManager, and fetched on demand by jconsole through JMX, JNI, and JMM. Our custom build of the JVM and JDK is doing the right thing.
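jconsole isn’t the only consumer, of course: the same attribute can be read programmatically from the platform MBeanServer. In this sketch, the “AllocationRate” attribute name is an assumption based on the MXBean getter naming convention – check jconsole’s MBeans tab for what your build actually registers:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class AllocationRateProbe {
    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();

        // The stock ObjectName for the CMS collector's bean; run this on
        // the patched build with -XX:+UseConcMarkSweepGC.
        ObjectName cms = new ObjectName(
                "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep");

        // "AllocationRate" is the attribute our hypothetical
        // getAllocationRate() getter would surface via the MXBean
        // naming convention.
        Object rate = server.getAttribute(cms, "AllocationRate");
        System.out.println("CMS allocation rate: " + rate);
    }
}
```

Run against the patched build with CMS enabled, this prints the rate our new plumbing carries up from the collector.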
Up Next
In the next post, we’ll zoom out a bit and consider the project as a whole – why anyone would want to fork a VM, and why the JDK and HotSpot in particular – and investigate several other instrumentation features that could be added. As an exercise in the meantime, you may find it helpful to skim the classes described in the “who’s-who” section above – or at least their header files – to get a better sense of the separation of responsibility among memory management, instrumentation, and VM management functions.
Until then, stay tuned!
Thanks to Dietrich Featherston, Ben Linsay, and others for reading drafts and offering feedback on early versions of this article.