This is a continuation of my first blog on CORE CLR CIL to Execution. In this blog, we will go through garbage collection and memory management in the CLR.
In the CLR, the Garbage Collector (hereafter GC) plays the role of a memory manager: it manages the allocation and release of memory for an application. Before we jump into GC and memory management, let's understand a term called Virtual Address Space.
What is Virtual Address Space?
This is the range of addresses that an application can use as memory. It is an abstraction layer over physical memory, so the application never has to deal with physical memory directly. Each process has its own isolated virtual address space. The operating system uses a structure called a page table to map virtual addresses to physical memory addresses. Each virtual address consists of a page number and an offset. The page number is used to look up the corresponding physical page in the page table, and the offset specifies the exact location within that page.
Cool, right? Now, let's jump into memory management by the CLR.
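To make the page number and offset idea concrete, here is a minimal Python sketch of the lookup. It is a toy model, not how any real operating system implements paging: the 4 KiB page size, the `page_table` contents, and the `translate` helper are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size (4 KiB), for illustration only

# Hypothetical page table: virtual page number -> physical page (frame) number.
page_table = {0: 7, 1: 3, 2: 12}


def translate(virtual_address: int) -> int:
    """Split a virtual address into (page number, offset) and map it."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    if page_number not in page_table:
        # A real OS would handle this as a page fault.
        raise LookupError(f"page {page_number} is not mapped")
    frame_number = page_table[page_number]
    # The offset is carried over unchanged into the physical page.
    return frame_number * PAGE_SIZE + offset


physical_address = translate(1 * PAGE_SIZE + 100)  # page 1, offset 100
```

Note that only the page number goes through the table; the offset is reused as-is, which is why it locates a byte inside the page rather than inside the page table.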
Allocation
When a process starts, the CLR reserves a contiguous region of virtual address space, known as the Managed Heap, and creates a pointer that keeps track of the next available address, called the Next Object Pointer or Allocation Context. Initially, this pointer points to the start of the reserved address space. Each time a new object is created, the CLR places it at the address the pointer holds and then advances the pointer by the object's size, so that it points to the next free address. The key advantage of a managed heap with such a pointer is allocation speed: allocating is little more than incrementing a pointer, whereas an unmanaged heap typically has to search for a free block that is large enough.
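The pointer-bump idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the CLR's allocator: the `ManagedHeap` class, its `allocate` method, and the sizes used are hypothetical names and values chosen for the example.

```python
class ManagedHeap:
    """Toy bump-pointer heap: hand out the current address, then advance."""

    def __init__(self, size: int):
        self.size = size
        # Starts at the beginning of the reserved region.
        self.next_object_pointer = 0

    def allocate(self, object_size: int) -> int:
        if self.next_object_pointer + object_size > self.size:
            # A real runtime would trigger a garbage collection at this point.
            raise MemoryError("heap exhausted")
        address = self.next_object_pointer
        self.next_object_pointer += object_size  # bump past the new object
        return address


heap = ManagedHeap(size=1024)
first = heap.allocate(24)   # placed at the start of the heap
second = heap.allocate(40)  # placed immediately after the first object
```

Allocation here never searches for a free block; it only compares and adds, which is where the speed comes from.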
Collection/Release
The Garbage Collector (GC) associated with the CLR is responsible for releasing the memory occupied by objects that our code no longer uses. The GC works out which objects are still in use by starting from the application's roots. Roots are the starting points from which the GC follows references; anything it cannot reach from a root is no longer in use. .NET uses several kinds of roots:
- Stack Roots: Local variables and method parameters on each thread's stack. For example, if a method has a local variable, that variable acts as a root while the method is running.
- Global & Static Variables: Static fields, which keep the objects they reference reachable for as long as the declaring type is loaded.
- Finalization Queue: Objects with a finalizer are tracked in a separate queue. Once such an object becomes otherwise unreachable, it is still treated as rooted until its finalizer has been called.
- GC Handles: Special references that the runtime uses to manage object life cycles (I will discuss these in detail later).
After gathering the roots, the GC builds a graph of the objects that are reachable from them. Objects that are not in the graph are unreachable, and their memory needs to be released. Now, let's see the GC in action.
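Building the graph from the roots can be sketched as a simple traversal. The Python below is an illustrative model only: the object names, the `references` dictionary, and the `reachable_from` helper are assumptions made for the example, not CLR data structures.

```python
def reachable_from(roots, references):
    """Return the set of object names reachable from the given roots."""
    marked = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in marked:
            continue  # already visited, also guards against cycles
        marked.add(obj)
        pending.extend(references.get(obj, []))
    return marked


# Hypothetical object graph: each object lists the objects it references.
references = {
    "A": ["B"],
    "B": ["C"],
    "C": [],
    "D": ["E"],  # D and E reference each other, but no root reaches them
    "E": ["D"],
}
roots = ["A"]

live = reachable_from(roots, references)
garbage = set(references) - live
```

In this model D and E end up in `garbage` even though they reference each other, because reachability is decided from the roots rather than by counting references.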
Step 1. Using the generated graph, GC will mark all the reachable objects.
Step 2. GC will reclaim the memory of unreachable objects.
Step 3. GC compacts the heap by moving the reachable objects into a contiguous block of memory: it copies them one by one to their new locations, updates the references that point to them, and then resets the next object pointer to just past the last moved object.
In the illustration, Obj is the unreachable object.
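The reclaim and compact steps can be modeled as sliding the surviving objects toward the start of the heap. This Python sketch makes several assumptions for illustration: the heap is a list of `(name, size)` pairs in address order, the `live` set is already known from the mark step, and `compact` is a hypothetical helper, not a CLR API.

```python
def compact(heap, live):
    """Slide live objects to the start of the heap.

    heap: list of (name, size) pairs in address order.
    Returns (new_addresses, next_object_pointer).
    """
    new_addresses = {}
    next_free = 0
    for name, size in heap:
        if name in live:
            new_addresses[name] = next_free  # object moves down to fill the gap
            next_free += size
        # Unreachable objects are skipped, so their space is reclaimed.
    return new_addresses, next_free


heap = [("A", 16), ("Obj", 32), ("B", 24), ("C", 8)]
live = {"A", "B", "C"}  # Obj was not reached from any root

addresses, next_object_pointer = compact(heap, live)
```

After compaction the free space is one contiguous region starting at `next_object_pointer`, so the next allocation can again be a simple pointer bump.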
Now let's understand a bit about the algorithm used by the GC. The GC always collects short-lived objects first. To optimize collection, the managed heap is divided into three generations, Gen 0, Gen 1, and Gen 2, which let the GC tell short-lived objects from long-lived ones. When a new object is created, it is stored in Gen 0, and the GC runs most frequently in this area of the heap. After collecting Gen 0, the GC compacts the memory of the reachable objects and promotes them to Gen 1. If the GC finds that Gen 0 is full while a new object is being created, it performs a Gen 0 collection. If that collection does not free the required memory, the GC collects Gen 1 as well and promotes the surviving objects to Gen 2.
There is an exception for large objects, those of 85,000 bytes or more. They are not stored in Gen 0 when created; instead, they are allocated directly on the Large Object Heap (LOH, sometimes referred to as Gen 3). The LOH is collected together with Gen 2, but by default it is not compacted, because copying large objects is time-consuming.
Note. The GC settings expose a property, GCSettings.LargeObjectHeapCompactionMode, that lets you request compaction of the LOH on demand.
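Promotion between generations can be illustrated with a small Python model. It is a deliberately simplified sketch: the `collect` helper, the dictionary of generations, and the object names are assumptions for the example, and the real GC decides when and how far to collect using budgets and heuristics that are not modeled here.

```python
def collect(generations, live, up_to):
    """Collect generations 0..up_to; survivors move up one generation (max Gen 2)."""
    # Walk from the oldest collected generation down, so an object that was
    # just promoted is not promoted a second time in the same collection.
    for gen in range(up_to, -1, -1):
        survivors = [obj for obj in generations[gen] if obj in live]
        generations[gen] = []
        generations[min(gen + 1, 2)].extend(survivors)
    return generations


generations = {0: ["a", "b", "c"], 1: [], 2: []}

# Gen 0 collection: "c" is unreachable, "a" and "b" are promoted to Gen 1.
collect(generations, live={"a", "b"}, up_to=0)
after_gen0 = {gen: list(objs) for gen, objs in generations.items()}

# A new object lands in Gen 0, then a Gen 1 collection runs:
# "b" is now unreachable, "a" moves to Gen 2, "d" moves to Gen 1.
generations[0].append("d")
collect(generations, live={"a", "d"}, up_to=1)
```

The longer an object survives, the higher the generation it sits in and the less often it is examined, which is the point of splitting the heap this way.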