In computer science, garbage collection (GC) is a form of automatic memory management. The garbage collector attempts to reclaim memory that was allocated by the program, but is no longer referenced; such memory is called garbage. Garbage collection was invented by American computer scientist John McCarthy around 1959 to simplify manual memory management in Lisp.
This series of articles discusses garbage collection in .NET.
This article describes the core concepts of garbage collection: how .NET manages memory for managed code. Part 2 of this series will discuss how to handle unmanaged resources in .NET with the Dispose pattern.
A - Introduction
Managed code is code whose execution is managed by the CLR (Common Language Runtime) in the .NET Framework, whereas unmanaged code is executed directly by the operating system.
In C++, developers handle both memory allocation and release manually. In the CLR, the garbage collector (GC) does this job as an automatic memory manager.
For developers working with managed code, this means that you don't have to write code to perform memory management tasks. Automatic memory management can eliminate common problems, such as forgetting to free an object and causing a memory leak or attempting to access memory for an object that's already been freed.
The content of this article:
- A - Introduction
- B - Management and Garbage Collection
- C - Collecting the Garbage
- D - Conclusion
B - Management and Garbage Collection
Garbage collection: the automatic memory management scheme of .NET, i.e., a background mechanism that cleans up unreferenced heap memory.
- When memory must be allocated for a new object on the heap and there is not enough free memory, the .NET Framework starts the garbage collection process:
  - It visits the objects in the heap and marks those that are reachable from a variable.
  - It releases the unmarked objects (those not reachable from any variable).
- The process is nondeterministic: the program does not control when a collection runs.
- A collection can be forced by calling GC.Collect(), for example in Dispose().
Garbage collection occurs when one of the following conditions is true:
- The system has low physical memory.
- The memory that is used by allocated objects on the managed heap surpasses an acceptable threshold. This threshold is continuously adjusted as the process runs.
- The GC.Collect method is called.
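The third condition is the easiest one to observe from code. The following is a minimal sketch, assuming a console application; the class and method names (ForcedCollectionDemo, CollectionsCausedByForcing) are made up for this illustration. It creates some short-lived garbage, forces a collection with GC.Collect(), and uses GC.CollectionCount to see how many generation 0 collections happened in the meantime.

```csharp
using System;

public static class ForcedCollectionDemo
{
    // Allocates short-lived garbage, forces a collection, and returns how many
    // generation 0 collections were counted during this call.
    public static int CollectionsCausedByForcing()
    {
        int before = GC.CollectionCount(0);

        // Each array becomes unreachable as soon as the loop moves on.
        for (int i = 0; i < 1000; i++)
        {
            byte[] garbage = new byte[1024];
            garbage[0] = 1;
        }

        GC.Collect();                   // request a collection of all generations
        GC.WaitForPendingFinalizers();  // wait for any finalizers that were queued

        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Managed heap before: {0} bytes", GC.GetTotalMemory(false));
        int collections = CollectionsCausedByForcing();
        Console.WriteLine("Generation 0 collections observed: {0}", collections);
        Console.WriteLine("Managed heap after:  {0} bytes", GC.GetTotalMemory(false));
    }
}
```

The exact byte counts vary from run to run, which is the nondeterminism mentioned above. Forcing collections like this is meant for experiments; in normal code it is usually better to let the GC decide when to run.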
Generations (MSDN):
The heap is organized into generations so it can handle long-lived and short-lived objects. Garbage collection primarily occurs with the reclamation of short-lived objects that typically occupy only a small part of the heap. There are three generations of objects on the heap:
- Generation 0: the youngest generation, which contains short-lived objects. Garbage collection occurs most frequently in this generation.
  - Newly allocated objects are implicitly in generation 0, unless they are large objects, in which case they go on the large object heap, which is collected together with generation 2.
  - Most objects are reclaimed by garbage collection in generation 0 and do not survive to the next generation.
- Generation 1: contains short-lived objects and serves as a buffer between short-lived objects and long-lived objects.
- Generation 2: contains long-lived objects. An example of a long-lived object is an object in a server application that contains static data that is live for the duration of the process.
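The promotion of a surviving object through the generations can be watched with GC.GetGeneration. The sketch below is only an illustration under assumptions: the names GenerationDemo, ObserveGenerations and GenerationOfLargeObject are invented here, and the 100,000-byte array relies on the default large object threshold of 85,000 bytes.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one surviving object: right after allocation,
    // after one forced collection, and after a second forced collection.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] observed = new int[3];

        observed[0] = GC.GetGeneration(survivor);
        GC.Collect();
        observed[1] = GC.GetGeneration(survivor);
        GC.Collect();
        observed[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // keep the object reachable until this point
        return observed;
    }

    // Large objects are allocated on the large object heap, which the runtime
    // reports as the highest generation.
    public static int GenerationOfLargeObject()
    {
        byte[] large = new byte[100000]; // above the default 85,000-byte threshold
        return GC.GetGeneration(large);
    }

    public static void Main()
    {
        Console.WriteLine("Highest generation: {0}", GC.MaxGeneration);
        Console.WriteLine("Survivor generations: {0}", string.Join(" -> ", ObserveGenerations()));
        Console.WriteLine("Large object generation: {0}", GenerationOfLargeObject());
    }
}
```

A new small object starts in generation 0, and an object that stays reachable is normally promoted by one generation each time it survives a collection, up to GC.MaxGeneration.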
Figure 1. Simplified model of the managed heap (this is not what is actually implemented).
The rules for this simplified model are:
- All garbage-collectible objects are allocated from one contiguous range of address space.
- The heap is divided into generations so that it is possible to eliminate most of the garbage by looking at only a small fraction of the heap.
- Objects within a generation are all roughly the same age.
- Higher-numbered generations indicate areas of the heap with older objects—those objects are much more likely to be stable.
- The oldest objects are at the lowest addresses, while new objects are created at increasing addresses. (Addresses are increasing going down in Figure 1 above.)
- The allocation pointer for new objects marks the boundary between the used (allocated) and unused (free) areas of memory.
- Periodically the heap is compacted by removing dead objects and sliding the live objects up toward the low-address end of the heap. This expands the unused area at the bottom of the diagram in which new objects are created.
- The order of objects in memory remains the order in which they were created.
- There are never any gaps between objects in the heap.
- Only some of the free space is committed. When necessary, more memory is acquired from the operating system in the reserved address range.
C - Collecting the Garbage
The easiest kind of collection to understand is the fully compacting garbage collection. The following figure demonstrates the process:
Figure 2. Illustrated garbage collection process.
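The same process can be sketched in code against the simplified heap model described earlier. This is a toy simulation, not how the CLR is implemented: ToyObject and ToyHeap are hypothetical types written for this article, addresses are plain integers, and the roots are just a list. It marks everything reachable from the roots, removes the rest, and slides the survivors toward the low-address end while keeping their creation order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy types for illustration only; not part of any .NET API.
public sealed class ToyObject
{
    public string Name;
    public int Size;
    public int Address;
    public bool Marked;
    public List<ToyObject> References = new List<ToyObject>();
}

public sealed class ToyHeap
{
    private readonly List<ToyObject> objects = new List<ToyObject>(); // kept in address order
    public readonly List<ToyObject> Roots = new List<ToyObject>();

    // Boundary between the used and the unused part of the heap.
    public int AllocationPointer { get; private set; }

    public ToyObject Allocate(string name, int size)
    {
        var obj = new ToyObject { Name = name, Size = size, Address = AllocationPointer };
        objects.Add(obj);
        AllocationPointer += size; // new objects are created at increasing addresses
        return obj;
    }

    public void Collect()
    {
        // Mark phase: everything reachable from a root is live.
        foreach (var obj in objects) obj.Marked = false;
        var pending = new Stack<ToyObject>(Roots);
        while (pending.Count > 0)
        {
            var current = pending.Pop();
            if (current.Marked) continue;
            current.Marked = true;
            foreach (var child in current.References) pending.Push(child);
        }

        // Compact phase: drop dead objects and slide live ones toward address 0,
        // preserving creation order and leaving no gaps.
        objects.RemoveAll(o => !o.Marked);
        int next = 0;
        foreach (var obj in objects)
        {
            obj.Address = next;
            next += obj.Size;
        }
        AllocationPointer = next;
    }

    public string Layout()
    {
        var parts = new List<string>();
        foreach (var obj in objects) parts.Add(obj.Name + "@" + obj.Address);
        return string.Join(" ", parts);
    }
}

public static class ToyHeapDemo
{
    public static void Main()
    {
        var heap = new ToyHeap();
        var a = heap.Allocate("A", 16);
        heap.Allocate("B", 32);   // never referenced: garbage
        heap.Allocate("C", 8);    // never referenced: garbage
        var d = heap.Allocate("D", 24);
        heap.Roots.Add(a);        // A is referenced by a variable
        a.References.Add(d);      // D is reachable through A

        Console.WriteLine(heap.Layout()); // A@0 B@16 C@48 D@56
        heap.Collect();
        Console.WriteLine(heap.Layout()); // A@0 D@16
    }
}
```

After the collection the allocation pointer moves back from 80 to 40, so the space once used by B and C is available again for new objects.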
D - Conclusion
This article described the core concepts of garbage collection. Part 2 of this series will discuss how to handle unmanaged resources in .NET with the Dispose pattern.
Related:
1. What is a weak reference?
WeakReference Class (MSDN):
Represents a weak reference, which references an object while still allowing that object to be reclaimed by garbage collection.
A weak reference is useful for a large object that is expensive to recreate. After we finish using the object through strong references, we release the strong references but keep the weak reference in case we need the object again. The GC is then free to reclaim the object at any collection. Until that happens, the object can still be recovered through the weak reference; afterwards, it has to be recreated.
Also see: C# Tutorial - Weak References
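The usage pattern described above might look like the following sketch. It uses the non-generic WeakReference class with its Target and IsAlive members; the names WeakReferenceDemo, LoadLargeData and GetOrReload are assumptions made for this example, and LoadLargeData merely stands in for whatever expensive work produces the large object.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class WeakReferenceDemo
{
    // Hypothetical stand-in for an expensive operation that builds a large object.
    public static byte[] LoadLargeData()
    {
        return new byte[1000000];
    }

    // Returns the cached data if the GC has not reclaimed it yet;
    // otherwise rebuilds it and stores it in the weak reference again.
    public static byte[] GetOrReload(WeakReference cache)
    {
        byte[] data = cache.Target as byte[]; // becomes a strong reference if still alive
        if (data == null)
        {
            data = LoadLargeData();
            cache.Target = data;
        }
        return data;
    }

    // Kept in a separate, non-inlined method so that no strong reference
    // to the array remains in Main after this call returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference CreateCache()
    {
        return new WeakReference(LoadLargeData());
    }

    public static void Main()
    {
        WeakReference cache = CreateCache();
        Console.WriteLine("Alive before collection: {0}", cache.IsAlive);

        GC.Collect(); // the array has no strong references, so it may be reclaimed here

        Console.WriteLine("Alive after collection:  {0}", cache.IsAlive);

        byte[] data = GetOrReload(cache);
        Console.WriteLine("Usable length: {0}", data.Length);
    }
}
```

Whether the second line reports the object as alive depends on the runtime and build settings, so the code always goes through GetOrReload rather than trusting IsAlive. Reading Target into a local variable first matters: it creates a strong reference, so the object cannot disappear between the null check and its use.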
2. How does .NET managed memory handle value types inside objects? [ref]
Value-type values have to live together with the object instance in the managed heap. The thread's stack for a method only lives for the duration of a method; how can the value persist if it only exists within that stack?
A class' object size in the managed heap is the sum of its value-type fields, reference-type pointers, and additional CLR overhead variables like the Sync block index. When one assigns a value to an object's value-type field, the CLR copies the value to the space allocated within the object for that particular field.
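A small sketch makes the copy visible. The types Point and Shape and the method MakeShape are invented for this illustration: a struct value is created as a local variable, assigned to a field of a class instance, and then the local is changed to show that the object holds its own copy.

```csharp
using System;

// Value type: its data is stored wherever the variable or field lives.
public struct Point
{
    public int X;
    public int Y;
}

// Reference type: instances live on the managed heap.
public sealed class Shape
{
    public Point Origin; // stored inline, inside the Shape instance
    public string Name;  // only a reference is stored inside the Shape instance
}

public static class ValueTypeFieldDemo
{
    public static Shape MakeShape()
    {
        // A local value that exists only for the duration of this call.
        var local = new Point { X = 3, Y = 4 };

        var shape = new Shape { Name = "demo" };
        shape.Origin = local; // the value is copied into the object's field

        local.X = 99;         // changes the local only, not the copy in the object
        return shape;
    }

    public static void Main()
    {
        Shape shape = MakeShape();
        // The field kept its value after MakeShape returned.
        Console.WriteLine("Origin = ({0}, {1})", shape.Origin.X, shape.Origin.Y); // Origin = (3, 4)
    }
}
```

Because the value was copied into the space reserved for the field, it persists for as long as the Shape object does, regardless of what happens to the method's local variable.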
References: