Internal structure of the Garbage Collector: Go ↔ Java
In this article, we examine how the garbage collector (GC) works in Go and Java and walk through its key internal mechanisms: concurrent mark & sweep, mutator vs collector, tricolor marking, GC pacing, root set scanning, and stack scanning. Java developers can use this to understand the Go approach, and Go developers can use it to dig into the details of JVM GC.
Concurrent Mark & Sweep
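Concurrent mark & sweep is a collection scheme in which reachable objects are marked and unreachable ones are reclaimed while the program keeps running, typically with only short stop-the-world pauses at phase boundaries.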
Go: concurrent mark & sweep
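package main

import "fmt"

func main() {
	// Build a slice of pointers. Each val escapes to the heap
	// because its address outlives the loop iteration.
	objs := make([]*int, 0)
	for i := 0; i < 1000000; i++ {
		val := i
		objs = append(objs, &val) // val is allocated on the heap
	}
	// objs is still reachable here, so the GC keeps every element alive;
	// only the outgrown backing arrays of the slice are garbage.
	fmt.Println(len(objs))
}
// In Go, the GC marks and sweeps concurrently with the program and stops
// the world only briefly at the start and end of the mark phase.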
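To see the collector working in the background, the following minimal sketch counts completed GC cycles while the program allocates short-lived buffers. The names `sink`, `churn`, and `gcCycles` are made up for this illustration; `runtime.ReadMemStats` and the `NumGC` field are part of the standard library. It assumes the default `GOGC` setting (with `GOGC=off` no automatic cycles would run).

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the most recent buffer reachable so the compiler
// cannot optimise the allocations away.
var sink []byte

// churn allocates n buffers of 1 KiB each. Every new buffer replaces
// the previous one in sink, turning the old buffer into garbage.
func churn(n int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, 1024)
	}
}

// gcCycles reports how many GC cycles the runtime has completed so far.
func gcCycles() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

func main() {
	before := gcCycles()
	churn(200_000) // roughly 200 MB allocated in total, almost all of it garbage
	after := gcCycles()
	// No explicit runtime.GC() call was made: the cycles counted here
	// were started by the runtime while churn was still running.
	fmt.Printf("GC cycles during churn: %d\n", after-before)
}
```

The exact count depends on the Go version, the machine, and the `GOGC` value, so the program prints it rather than assuming a number.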
Java: concurrent mark & sweep
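import java.util.ArrayList;

public class Main {
    public static void main(String[] args) {
        ArrayList<Integer> objs = new ArrayList<>();
        for (int i = 0; i < 1000000; i++) {
            objs.add(i); // autoboxing creates Integer objects on the heap (values -128..127 come from a cache)
        }
        System.out.println(objs.size());
    }
}
// CMS (Concurrent Mark Sweep) also ran concurrently with the mutator to keep pauses short,
// but it was deprecated in JDK 9 and removed in JDK 14. On current JVMs, G1 (the default since JDK 9),
// ZGC, and Shenandoah do most of their marking concurrently.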
Mutator vs Collector
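The mutator is the application code that creates and modifies objects. The collector is the GC, which reclaims memory. Separating the two roles lets the collector run concurrently with the program, provided the runtime tracks the pointer changes the mutator makes while marking is in progress (this is what write barriers are for).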
Go: mutator vs collector
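// mutator: the program allocates and modifies objects
// (a small slice that does not escape may stay on the stack)
a := make([]int, 1000)
a[0] = 42
// collector: normally runs in the background, triggered by heap growth
// runtime.GC (import "runtime") forces a collection and blocks until it
// completes; it is mainly useful in tests and benchmarks
runtime.GC()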
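The split between the two roles can be observed from inside a program. In the sketch below, the mutator allocates a large buffer and then drops its only reference; the collector reclaims it on the next cycle. The helper names `heapAlloc` and `measure` are hypothetical; `runtime.ReadMemStats`, `runtime.KeepAlive`, and `runtime.GC` are standard library functions. The forced `runtime.GC()` call is there only to make the effect visible at a known point.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapAlloc returns the number of bytes in allocated heap objects,
// as reported by the runtime.
func heapAlloc() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// measure plays the mutator: it allocates size bytes, then drops the only
// reference. It returns heap usage while the buffer is still reachable and
// again after the collector has run.
func measure(size int) (live, collected uint64) {
	buf := make([]byte, size) // mutator: allocate
	buf[0] = 1                // mutator: modify
	live = heapAlloc()
	runtime.KeepAlive(buf) // buf is guaranteed reachable up to this point
	buf = nil              // mutator: drop the last reference
	runtime.GC()           // collector: run a full cycle and wait for it
	collected = heapAlloc()
	return live, collected
}

func main() {
	live, collected := measure(64 << 20) // 64 MiB
	fmt.Printf("live: %d MiB, after GC: %d MiB\n", live>>20, collected>>20)
}
```

The mutator never frees memory itself. It only makes the buffer unreachable, and deciding that the memory can be reused is entirely the collector's job.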
Java: mutator vs collector
// mutator: the program creates objects
int[] arr = new int[1000];
// collector: the JVM reclaims memory automatically
System.gc(); // only a request: the JVM may ignore it (for example with -XX:+DisableExplicitGC)
Tricolor Marking
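Tricolor marking tracks the state of each object during the mark phase. White objects have not been reached yet; whatever is still white when marking ends is garbage. Gray objects have been reached, but the references they hold have not been scanned yet. Black objects have been reached and fully scanned. Marking is finished when no gray objects remain.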
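The color transitions described above can be sketched as a small simulation. This is a toy model on a hand-built object graph, not the Go runtime's implementation: the types `color` and `object` and the function `mark` are invented for the example, and it runs with the "mutator" stopped, so it needs no write barrier.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not reached yet
	gray               // reached, references not scanned yet
	black              // reached and fully scanned
)

func (c color) String() string {
	return [...]string{"white", "gray", "black"}[c]
}

type object struct {
	name string
	refs []*object
	col  color
}

// mark performs tricolor marking starting from the given roots.
func mark(roots []*object) {
	var work []*object // the gray set
	shade := func(o *object) {
		if o.col == white {
			o.col = gray
			work = append(work, o)
		}
	}
	for _, r := range roots {
		shade(r)
	}
	for len(work) > 0 {
		o := work[len(work)-1]
		work = work[:len(work)-1]
		for _, child := range o.refs {
			shade(child) // white children become gray
		}
		o.col = black // all references scanned
	}
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"}
	d := &object{name: "d"}
	a.refs = []*object{b}
	b.refs = []*object{c, a} // a cycle between a and b is handled correctly
	d.refs = []*object{c}    // d references c, but nothing references d

	mark([]*object{a}) // a is the only root
	for _, o := range []*object{a, b, c, d} {
		fmt.Printf("%s: %s\n", o.name, o.col)
	}
	// Output:
	// a: black
	// b: black
	// c: black
	// d: white
}
```

Object `d` stays white and would be reclaimed by the sweep phase. A real concurrent collector additionally has to cope with the mutator changing references while this loop runs.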
Go: tricolor marking
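// The Go runtime colors objects internally during the mark phase;
// the colors are not visible to application code.
type Object struct{ Value int }

objs := []*Object{}
for i := 0; i < 100; i++ {
	objs = append(objs, &Object{Value: i})
}
// white: not reached yet (garbage if still white when marking ends)
// gray: reached, references not scanned yet; black: reached and fully scanned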
Java: tricolor marking
class ObjectNode { int value; }

ArrayList<ObjectNode> objs = new ArrayList<>();
for (int i = 0; i < 100; i++) {
    objs.add(new ObjectNode());
}
// The JVM's concurrent collectors (for example G1, ZGC, Shenandoah) are commonly described in terms of tricolor marking
GC Pacing
GC pacing is how the runtime decides when to start a collection cycle and how much CPU to give it, so that marking finishes before the heap grows past its target and the application is not starved.
Go: GC pacing
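// GOMAXPROCS limits how many OS threads execute Go code at once.
// It is not a GC-thread setting, although the background mark workers
// run within that limit (by design, about a quarter of it).
runtime.GOMAXPROCS(4)
// Pacing itself is driven by GOGC (debug.SetGCPercent) and GOMEMLIMIT
// (debug.SetMemoryLimit): the runtime chooses when to start each cycle
// so that marking completes before the heap reaches its goal.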
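As a rough illustration of what the pacer aims for, the sketch below computes the heap goal using the simplified formula from the Go GC guide: target heap = live heap + (live heap + GC roots) * GOGC / 100. The function `heapGoal` is a hypothetical helper, not a runtime API, and it ignores the memory limit and the minimum heap size. `debug.SetGCPercent` is the real standard library call for changing GOGC at run time.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// heapGoal is a simplified model of the pacer's target heap size.
// liveHeap and roots are in bytes; gogc is the GOGC percentage and is
// assumed to be non-negative (a negative value disables the GC).
func heapGoal(liveHeap, roots uint64, gogc int) uint64 {
	return liveHeap + (liveHeap+roots)*uint64(gogc)/100
}

func main() {
	const MiB = 1 << 20

	// With 100 MiB of live data and GOGC=100, the next cycle
	// targets about twice the live heap.
	fmt.Println(heapGoal(100*MiB, 0, 100) / MiB) // 200

	// GOGC=50 trades more frequent collections for a smaller heap.
	fmt.Println(heapGoal(100*MiB, 0, 50) / MiB) // 150

	// The same knob at run time: SetGCPercent returns the previous value,
	// which is used here to restore the original setting.
	old := debug.SetGCPercent(50)
	debug.SetGCPercent(old)
}
```

Lower GOGC values mean more frequent cycles and more CPU spent on GC in exchange for a smaller peak heap; higher values do the opposite.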
Java: GC pacing
// The JVM exposes pacing goals as command-line flags
// Example: -XX:MaxGCPauseMillis=200 (a soft pause-time goal that G1 tries to meet, not a guarantee)
Root Set Scanning
The root set is the collection of references from which the traversal of reachable objects begins: global variables, thread stacks, and registers.
Go: root set scanning
// GC scans all global objects and goroutine stacks
// to determine which objects are reachable
Java: root set scanning
// JVM scans thread stacks and static fields for roots
Stack Scanning
The stack is scanned to search for local variables that reference objects in the heap to determine their reachability.
Go: stack scanning
// During GC, Go scans the stack of each goroutine
// to determine which objects should remain
Java: stack scanning
// The JVM scans the stack of each thread
// to find local variables that reference heap objects
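The effect of root set and stack scanning on reachability can be modelled in a few lines. This is a toy model, not how either runtime represents stacks: the types `object` and `frame` and the functions `roots` and `reachable` are invented for the example, and object names are assumed to be unique.

```go
package main

import "fmt"

type object struct {
	name string
	refs []*object
}

// frame models one stack frame: local variable slots that may point into the heap.
type frame struct{ locals []*object }

// roots collects the root set: global variables plus every non-nil
// local slot of every frame on every stack.
func roots(globals []*object, stacks [][]frame) []*object {
	rs := append([]*object(nil), globals...)
	for _, stack := range stacks {
		for _, f := range stack {
			for _, l := range f.locals {
				if l != nil {
					rs = append(rs, l)
				}
			}
		}
	}
	return rs
}

// reachable returns the names of all objects reachable from the given roots.
func reachable(rs []*object) map[string]bool {
	seen := make(map[string]bool)
	work := append([]*object(nil), rs...)
	for len(work) > 0 {
		o := work[len(work)-1]
		work = work[:len(work)-1]
		if seen[o.name] {
			continue
		}
		seen[o.name] = true
		work = append(work, o.refs...)
	}
	return seen
}

func main() {
	config := &object{name: "config"}
	session := &object{name: "session"}
	request := &object{name: "request", refs: []*object{session}}

	globals := []*object{config}
	// One stack with one frame whose local variable points at request.
	stacks := [][]frame{{{locals: []*object{request}}}}

	live := reachable(roots(globals, stacks))
	fmt.Println(live["config"], live["session"]) // true true

	// The function returns and its frame is popped: request and
	// session are no longer reachable from any root.
	stacks[0] = stacks[0][:0]
	live = reachable(roots(globals, stacks))
	fmt.Println(live["config"], live["session"]) // true false
}
```

Nothing in the heap changed between the two scans. Only the stack did, which is why a collector that skipped stack scanning would either leak `session` forever or free it while a local variable still pointed to it.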
Understanding GC internals matters for performance work. Go and Java take different routes to efficient memory management: Go keeps pauses short with a concurrent tricolor mark & sweep collector and few tuning knobs, while Java offers several collectors and flexible tuning through JVM parameters. In both runtimes, a high rate of short-lived allocations adds to the GC's workload, so it is worth controlling how many objects are created and using profiling to analyze GC behavior.
These mechanisms are widely used in high-load systems. In Go, concurrent mark & sweep and tricolor marking let web services, microservices, and real-time applications handle millions of objects, typically without long stop-the-world pauses. Root set and stack scanning are critical for correct memory release when many goroutines are running. In Java, G1 (and CMS on older JVMs) is used for server applications, large caches, and enterprise systems where throughput and latency have to be balanced.

Typical workloads include online stores that process millions of requests and sessions, analytical platforms that create temporary objects for calculations, and games that rely on the GC to manage object state.

Advantages of Go: short pauses, a fast GC, and flexible scheduling of goroutines across threads. Disadvantages: escape analysis and heap placement are hard to predict, which makes memory usage harder to reason about.

Advantages of Java: powerful collectors, pause tuning, and mature monitoring. Disadvantages: JVM parameter tuning is critical, and mistakes can lead to pauses of hundreds of milliseconds; optimizing for short-lived objects takes more effort.
| Term | Go | Java | Comment |
|---|---|---|---|
| Concurrent Mark & Sweep | GC marks and sweeps objects concurrently with the application | G1, ZGC, and Shenandoah mark concurrently; CMS did the same before its removal in JDK 14 | Go uses a single lightweight concurrent GC; Java offers several collectors that can be tuned to the task. |
| Mutator vs Collector | Mutator creates objects, collector frees memory concurrently | Same roles; how concurrent the collector is depends on the chosen GC | It is important to understand the interaction to prevent long pauses. In Go, pause minimization is built in; in Java, it depends on the chosen GC. |
| Tricolor Marking | Objects are white/gray/black for safe concurrent GC | Used to describe modern concurrent collectors (G1, ZGC, Shenandoah) | The tricolor scheme lets the collector mark reachable objects correctly while the mutator keeps changing references, so live objects are never freed. |
| GC Pacing | Automatic choice of when to start each cycle, adjustable through GOGC and GOMEMLIMIT | Tuned through JVM parameters, for example MaxGCPauseMillis | Allows balancing application throughput and pause latency. Go automates most of it; Java exposes more tuning options. |
| Root Set Scanning | Scanning global variables and goroutine stacks | Scanning thread stacks and static fields | Identifies objects reachable from roots. Without this, GC cannot correctly identify objects for deletion. |
| Stack Scanning | Scanning the stack of each goroutine | Scanning the stack of each thread | Local variables in the stack may reference objects in the heap; without scanning, they may be incorrectly deleted. |
Conclusion
The comparison shows that Go and Java rely on similar GC concepts, but Go emphasizes short pauses and automation, while Java offers flexibility in choosing and configuring collectors. Understanding concurrent mark & sweep, mutator vs collector, tricolor marking, GC pacing, and root set and stack scanning is critical for effective memory management. For a Java developer, this is a chance to understand Go's concurrent GC, and for a Go developer, the principles of the JVM. Practical advice: profile applications, keep an eye on short-lived objects, consider the load on the GC, and choose the collector that fits the workload.
ASCII diagram of GC internals:
Mutator (application)
│
▼
Root Set & Stack Scanning
│
▼
Tricolor Marking (White/Gray/Black)
│
▼
Heap (long-lived objects)
│
▼
Concurrent Collector (Mark & Sweep / GC)