Introduction
We all use arrays in C# and other programming languages. Arrays have some limitations on design. First, arrays are homogeneous, in other words you can store only one type of element. Secondly, when using arrays you must specifically allocate a certain number of elements. Often developers want something more flexible, especially for uncertainty in the size of the collection. The .NET Framework Base Class Library provides such a data structure called ArrayList located in the System.Collections namespace.
ArrayList is nothing but a Heterogeneous and Self Re-Dimensioning Array
Elements of various types can be added to the ArrayList. Further, we do not need to concern ourselves with redimensioning the ArrayList. All of this is handled automatically for us. An example of the ArrayList in action can be seen in the code snippet below.
Behind The Scenes
Behind the scenes the ArrayList uses a System.Array of type Object. An object array can hold elements of any type since all types are derived from Object (either directly or indirectly). By default the size of this array is 16, although it can be defined in a constructor or by setting the capacity property. Elements can be added to ArrayList using the Add() Method. Behind the scenes, the Add() method first compares the number of elements in array with its capacity. If adding the new element causes the count to exceed the capacity, the array is redimensioned and the capacity is automatically doubled.
Performance
ArrayList provides some additional flexibility compared to arrays but this flexibility comes at the cost of performance, especially if you store value types. The ArrayList's internal array is of object type, so every value type is boxed and stored on the heap and each ArrayList element is a reference to a boxed value type. When you access a value type element it is unboxed before you can use it.
The boxing and unboxing, along with the extra level of indirection that comes with using value types in an ArrayList, can hamper the performance of your application when using large ArrayLists with many reads and writes.
The preceding diagram shows the memory allocation for ArrayList.
The self-redimensioning of ArrayList should not cause a performance degradation compared to an array. Because you can turn off self-redimensioning by specifying the initial capacity in the constructor. If you don't know the exact size, you may need to re-size even with the array also when the number of elements inserted increases the size of array.
Memory Allocation on Redimensioning
Why does the size of an ArrayList double when it is redimensioned? Its a classic computer science problem to determine how much extra memory should be allocated when running out of space in some buffer.
One option is to allocate just one more element in the array when redimensioning. In other words, if the initial size of an array was 10 and when adding the 11th element, resize the array to 11. This approach conserves most of the memory but becomes very costly since redimensioning is required upon insertion of every additional element.
The second option is to redimension the array 100 or 200 times larger than the current size. In other words, if the array is initially allocated to have 10 elements then before inserting the 11th element resize it to 1000 elements. This approach greatly reduces the redimensioning overhead, but if only a few more elements needs to be added, the extra allocated space is wasted.
So after trying various options, the true compromise is to just double the size of the array when it becomes exhausted. This is the precise approach that ArrayList takes and its all done automatically for us.
Summary
- ArrayList internally uses an array of object type.
- ArrayList provides more flexibility than a simple array.
- The precise size of an ArrayList can be set in the constructor or by the capacity property. By default it's 16.
- When adding, if the number of elements in the array exceeds its capacity, the array is redimensioned to double of its current size.
- Boxing and Unboxing of value types degrades the performance of the arraylist.