Psychosomatic, Lobotomy, Saw: Where is my safepoint?

My new job (at Azul Systems) leads me to look at JIT-compiler-generated assembly quite a bit. I enjoy it despite, or perhaps because of, the amount of time I spend scratching my increasingly balding cranium in search of meaning. On one of these exploratory rummages I found a nicely annotated line in the assembly generated by Zing (the Azul JVM):

gs:cmp4i [0x40 tls._please_self_suspend],0
jnz 0x500a0186

Zing is such a lady of a JVM, always minding her Ps and Qs! But why is self suspending a good thing?

Safepoints and Checkpoints

There are a few posts out there on what a safepoint is (here's a nice one going into when it happens, and here is a long quote from the Mechanical Sympathy mailing list on the topic). Here's the HotSpot glossary entry:

safepoint

A point during program execution at which all GC roots are known and all heap object contents are consistent. From a global point of view, all threads must block at a safepoint before the GC can run. (As a special case, threads running JNI code can continue to run, because they use only handles. During a safepoint they must block instead of loading the contents of the handle.) From a local point of view, a safepoint is a distinguished point in a block of code where the executing thread may block for the GC. Most call sites qualify as safepoints. There are strong invariants which hold true at every safepoint, which may be disregarded at non-safepoints.

To summarize, a safepoint is a known state of the JVM. Many operations the JVM needs to do happen only at safepoints. The OpenJDK safepoints are global, while Zing has a thread level safepoint called a checkpoint. The thing about them is that at a safepoint/checkpoint your code must volunteer to be suspended to allow the JVM to capitalize on this known state.

What will happen while you get suspended varies. Objects may move in memory, classes may get unloaded, code will be optimized or deoptimized, biased locks will unbias.... or maybe your JVM will just chill for a bit and catch its breath. At some point you'll get your CPU back and get on with whatever you were doing.

This will not happen often, but it can happen, which is why the JVM makes sure you are never too far from a safepoint and voluntary suspension. The above instruction from Zing's generated assembly of my code is simply that check. This is called safepoint polling.
Zing's safepoint polling mechanism compares a thread-local flag with 0. The comparison is harmless as long as the checkpoint flag is 0, but if the flag is set to 1 it will trigger a checkpoint call (the JNZ following the CMP4i will take us there) for that particular thread only. This is key to Zing's pause-less GC algorithm, as application threads are allowed to operate independently.
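To make the per-thread flag check concrete, here is a minimal Java sketch of the idea. It is not Zing's implementation: the `Worker` class, the `poll()` and `checkpoint()` methods and the `demo()` helper are all hypothetical names, and an `AtomicBoolean` per worker stands in for the thread-local `_please_self_suspend` slot.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: models a per-thread "please self suspend" flag, not Zing internals.
final class CheckpointPollSketch {

    /** Hypothetical stand-in for one application thread and its thread-local state. */
    static final class Worker implements Runnable {
        // Plays the role of tls._please_self_suspend: one flag per thread.
        final AtomicBoolean pleaseSelfSuspend = new AtomicBoolean(false);
        final AtomicLong checkpointsTaken = new AtomicLong();
        volatile boolean stop;

        /** The poll: compare the flag with "0", branch to the slow path if set. */
        void poll() {
            if (pleaseSelfSuspend.get()) {
                checkpoint();
            }
        }

        /** Slow path: count the checkpoint, then clear the request. */
        void checkpoint() {
            checkpointsTaken.incrementAndGet();
            pleaseSelfSuspend.set(false);
        }

        @Override
        public void run() {
            while (!stop) {
                poll(); // application work would be interleaved with polls
            }
        }
    }

    /** Asks only worker a to checkpoint; returns {a's count, b's count}. */
    static long[] demo() {
        Worker a = new Worker();
        Worker b = new Worker();
        Thread ta = new Thread(a);
        Thread tb = new Thread(b);
        ta.start();
        tb.start();
        a.pleaseSelfSuspend.set(true); // request a checkpoint from a alone
        while (a.checkpointsTaken.get() == 0) {
            Thread.yield();
        }
        a.stop = true;
        b.stop = true;
        try {
            ta.join();
            tb.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {a.checkpointsTaken.get(), b.checkpointsTaken.get()};
    }

    public static void main(String[] args) {
        long[] counts = demo();
        System.out.println("a=" + counts[0] + " b=" + counts[1]); // prints "a=1 b=0"
    }
}
```

The point of the sketch is that worker b never leaves its fast path: a request aimed at one thread costs the others nothing beyond the compare they were already doing.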

Reader Safepoint

Having happily grokked all of the above I went looking for the OpenJDK safepoint.

Oracle/OpenJDK Safepoints

I was hoping for something equally polite in the assembly output from Oracle, but no such luck. Beautifully annotated though the Oracle assembly output is when it comes to your code, it maintains some opaqueness where its internals are concerned. After some digging I found this:

test DWORD PTR [rip+0xa2b0966],eax # 0x00007fd7f7327000
; {poll}

No 'please', but still a safepoint poll. The OpenJDK mechanism for safepoint polling is to access a page that is protected when suspension at a safepoint is required, and unprotected otherwise. Accessing a protected page will cause a SEGV (think exception) which the JVM will handle (nice explanation here). To quote from the excellent Alexey Ragozin blog:

Safepoint status check itself is implemented in very cunning way. Normal memory variable check would require expensive memory barriers. Though, safepoint check is implemented as memory reads a barrier. Then safepoint is required, JVM unmaps page with that address provoking page fault on application thread (which is handled by JVM’s handler). This way, HotSpot maintains its JITed code CPU pipeline friendly, yet ensures correct memory semantic (page unmap is forcing memory barrier to processing cores).

The [rip+0xa2b0966] addressing is a way to save space when encoding the page address in the instruction. The address in the comment on the right is the actual page address, and is equal to rip (the x86-64 instruction pointer, which during execution holds the address of the next instruction) plus the given constant. This saves space because the 32-bit constant is much smaller than the full 64-bit address. I thank Mr. Tene for clearing that one up for me.
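The arithmetic can be checked with a few lines of Java. This is only a worked calculation using the two numbers from the listing above; the class and method names are made up for illustration, and the "next instruction" address is derived by subtraction rather than read from the original disassembly.

```java
// Sketch only: a worked RIP-relative address calculation, names are hypothetical.
final class RipRelativeSketch {

    /** Effective address = address of the next instruction + signed 32-bit displacement. */
    static long target(long nextInstructionAddress, int disp32) {
        return nextInstructionAddress + disp32;
    }

    public static void main(String[] args) {
        long pollPage = 0x00007fd7f7327000L; // page address from the disassembly comment
        int disp32 = 0xa2b0966;              // constant encoded in the instruction

        // Work backwards to the instruction pointer value the poll must have used.
        long nextInstruction = pollPage - disp32;
        System.out.printf("next instruction at 0x%x%n", nextInstruction);
        // prints "next instruction at 0x7fd7ed07669a"

        // Going forwards again lands on the polling page.
        System.out.printf("poll target at 0x%x%n", target(nextInstruction, disp32));
        // prints "poll target at 0x7fd7f7327000"

        // The encoded operand takes 4 bytes instead of the 8 an absolute address needs.
        System.out.println(Integer.BYTES + " bytes instead of " + Long.BYTES);
        // prints "4 bytes instead of 8"
    }
}
```

Every poll site in the process would encode a different constant, since each sits at a different distance from the page, yet all of them resolve to the same address.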

If we were to look at safepoint polls throughout the assembly of the same process they would all follow the above pattern of pointing at the same global magic address (via this local relative trick). Setting the magic page to protected will trigger the SEGV for ALL threads. Note that the Time To Safe Point (TTSP) is not reported as GC time and may prove a hidden performance killer for your application. The effective cost of this global safepoint approach goes up the more runnable (and scheduled) threads your application has (all threads must wait for a safepoint consensus before the operation to be carried out at the safepoint can start).
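The consensus cost can be illustrated with a toy model in Java. This is a sketch under loud assumptions, not how HotSpot implements safepoints: a shared `AtomicBoolean` stands in for the protected page, `timeToSafepointNanos` is a hypothetical helper, and one "straggler" thread is modelled as still being a fixed amount of work away from its next poll when the request lands.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: a toy model of a global safepoint, not HotSpot's mechanism.
final class GlobalSafepointSketch {

    /**
     * Requests a global safepoint and returns how long it took for ALL threads
     * to arrive. Worker 0 is a straggler that reaches its poll stragglerMillis
     * after the request is raised.
     */
    static long timeToSafepointNanos(int threads, long stragglerMillis) {
        AtomicBoolean requested = new AtomicBoolean(false); // stands in for the protected page
        CountDownLatch arrived = new CountDownLatch(threads);
        CountDownLatch resume = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final boolean straggler = (i == 0);
            workers[i] = new Thread(() -> {
                try {
                    while (!requested.get()) {
                        Thread.yield(); // normal running, polling as it goes
                    }
                    if (straggler) {
                        Thread.sleep(stragglerMillis); // work left before its next poll
                    }
                    arrived.countDown(); // this thread is now at the safepoint
                    resume.await();      // blocked until the operation is done
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        try {
            long start = System.nanoTime();
            requested.set(true); // one global switch, seen by every thread
            arrived.await();     // nothing can start until the last thread arrives
            long ttsp = System.nanoTime() - start;
            // ... the safepoint operation itself would run here ...
            resume.countDown();
            for (Thread t : workers) {
                t.join();
            }
            return ttsp;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long ttsp = timeToSafepointNanos(4, 50);
        System.out.println("TTSP ~ " + ttsp / 1_000_000 + " ms");
    }
}
```

In this model the prompt threads spend the whole TTSP parked and doing nothing, so a single late arrival stalls everyone, and none of that waiting would show up as time spent in the operation itself.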

Find The Safepoint Summary

In short, when looking for safepoints in Oracle/OpenJDK assembly search for poll. When looking at Zing assembly search for _please_self_suspend.

4 comments:

Unknown, 26 Mar 2014, 22:59:00
Another great blog post Nitsan.

It's worth mentioning that if you use "-XX:+PrintGCApplicationStoppedTime" you can get pause times including the TTSP in your GC logs. It's not just GC pause times though; it includes all HotSpot pauses, for example bulk lock inflation/deflation. On the other hand, if you care about pauses you probably don't care about just GC pauses.
Unknown, 29 Apr 2015, 17:34:00
Good explanation! But why is Zing using the cmp & jnz pair then? AFAICT, OpenJDK's safepoint instruction is both faster and shorter.


Wednesday 26 March 2014
