Most developers know that strings in .NET are immutable. However, many do not understand the reason for that behavior. I will try to explain it in this article.
Before diving into the reason, let me first explain what we mean by immutable.
What does "Immutable strings" mean?
The dictionary meaning of immutable is “unchanging over time or unable to be changed”. This means once a value is assigned to a String object, it can never be changed. Yes, you read that correctly. Consider the following code:
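A minimal sketch of such code, assuming a console program written with C# top-level statements; the variable name myString and the step numbering follow the discussion below, while the intermediate “ghi” and “jkl” appends are an assumption chosen to match the output and the four objects described:

```csharp
using System;

string myString = "abc";       // Step 1: myString refers to a string object "abc"
myString += "def";             // Step 2: a NEW object "abcdef" is created
Console.WriteLine(myString);   // prints abcdef

myString += "ghi";             // Step 3 (assumed): a new object "abcdefghi"
myString += "jkl";             // Step 4 (assumed): a new object "abcdefghijkl"
Console.WriteLine(myString);   // prints abcdefghijkl
```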
The following is the output of this code.
abcdef
abcdefghijkl
It seems as if we just changed the value of myString from “abc” to “abcdef” and then to “abcdefghijkl”, but we really didn't! Let's try to understand it. In the first step, a new string object is allocated on the heap with the value “abc”, and myString points to this memory location. At Step 2 (myString += “def”;), a new string object is allocated on the heap with the value “abcdef”, and myString now points to this new memory location. But the string “abc” still exists on the heap. So we actually have two string objects on the heap, even though we’re only referencing one of them. Continuing in the same way, at the end of this code we will have four string objects, with only one object referenced and the other three unused. The following memory allocation diagram of the preceding code will make things clearer.
Now let's move on to the question of why.
Why are strings immutable in .NET?
The designers of .NET decided to make text strings immutable, and they had several reasons for this design. First, if a program has multiple string variables with the same value, the runtime can avoid allocating memory for that value more than once: memory is allocated for the string once and all the variables point to the same memory block. The CLR does this automatically for identical string literals, a technique known as string interning. Consider the following block of code.
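A small sketch of such a block, assuming three variables named str1, str2 and str3 as in the discussion that follows; the value “hello” is purely illustrative. It relies on identical literals in one assembly being shared, which is the usual behavior in practice:

```csharp
using System;

string str1 = "hello";
string str2 = "hello";
string str3 = "hello";

// All three variables refer to the same string object.
bool sameObject = object.ReferenceEquals(str1, str2)
               && object.ReferenceEquals(str2, str3);
Console.WriteLine(sameObject);   // prints True

// "Changing" str1 creates a new object and repoints str1 only;
// str2 and str3 still see the original text.
str1 += " world";
Console.WriteLine(str2);         // prints hello
Console.WriteLine(str3);         // prints hello
```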
The memory allocation for this code will look like this.
If strings were mutable, changing the value of str1 would also change the values of str2 and str3, which is unwanted.
Second, immutable strings remove a whole class of race conditions in multi-threaded applications. Any modification to the text creates a new string object rather than altering the existing one, so no lock is needed to protect a string's contents while multiple threads read it simultaneously. In some cases, such race conditions could be used to mount security attacks. For example, you could satisfy a FileIOPermission demand with a string pointing to a publicly accessible section of the file system and then use another thread to quickly change the string to point to a sensitive file before the underlying CreateFile call occurs.
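The thread-safety point can be illustrated with a short sketch. The path and the variable names here are hypothetical; several tasks read the same string at once and each derives its own new string, without any lock:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

string shared = @"C:\public\readme.txt";   // hypothetical path, for illustration only

// Each task reads 'shared' concurrently and builds its own new string.
// No task can alter the characters of the object that 'shared' refers to.
Task<string>[] tasks = Enumerable.Range(0, 4)
    .Select(i => Task.Run(() => shared.ToUpperInvariant() + "#" + i))
    .ToArray();

Task.WaitAll(tasks);
string[] results = tasks.Select(t => t.Result).ToArray();

Console.WriteLine(shared);       // the original text is untouched
Console.WriteLine(results[2]);   // a separate string produced by the third task
```

Note that this only covers the contents of a string object. Reassigning a variable that several threads share is still an ordinary write and may need synchronization of its own.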
Another reason for string immutability is that it makes strings well suited for use as keys in hash tables. The objects on which hash values are computed must be immutable to ensure that the hash values stay constant over time.
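As a rough illustration, the following sketch uses a Dictionary<string, int> with hypothetical names and values. Appending to the variable that held the key cannot disturb the entry already stored, because the key object itself never changes:

```csharp
using System;
using System.Collections.Generic;

var ages = new Dictionary<string, int>();
string key = "alice";
ages[key] = 30;

// This creates a new string and repoints 'key';
// the key object stored inside the dictionary is untouched.
key += "-renamed";

bool originalFound = ages.ContainsKey("alice");   // true
bool renamedFound = ages.ContainsKey(key);        // false

// Equal text gives an equal hash code within one run of the program,
// even for a separately allocated string object.
string rebuilt = new string(new[] { 'a', 'l', 'i', 'c', 'e' });
bool sameHash = rebuilt.GetHashCode() == "alice".GetHashCode();

Console.WriteLine($"{originalFound} {renamedFound} {sameHash}");   // prints True False True
```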
Another nice consequence of string immutability is that even though System.String is a class, string objects are compared by value, as value types are. This is possible because the identity of an immutable object can be considered to be its state. Consider the following piece of code.
Even though str1 and str2 reference two different objects, the preceding code returns true.
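A minimal sketch of such a comparison, assuming the variable names str1 and str2 used below; the second string is built from a character array so that it is a separate object on the heap:

```csharp
using System;

string str1 = "abc";
string str2 = new string(new[] { 'a', 'b', 'c' });   // a distinct object with the same text

// Reference comparison: the two variables point to different objects.
bool sameObject = object.ReferenceEquals(str1, str2);   // false

// The == operator on strings compares the characters, not the references.
Console.WriteLine(str1 == str2);   // prints True
```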
StringBuilder: An alternative to avoid the creation of unused strings
As we saw in the figure “Memory Allocation of immutable strings”, unused strings remain allocated in memory because of the way string behaves. If code performs thousands of operations on a string, the heap will accumulate thousands of unused string objects, wasting memory. Fortunately, we can avoid this by using the StringBuilder class. In my next article, I will explain this class.
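As a brief preview, here is a minimal sketch of the idea; the loop and the names are illustrative only. StringBuilder appends into an internal buffer, so the loop does not leave a discarded string object behind on every iteration:

```csharp
using System;
using System.Text;

var builder = new StringBuilder();

// A thousand appends into one growing buffer,
// instead of a thousand intermediate string objects.
for (int i = 0; i < 1000; i++)
{
    builder.Append(i).Append(',');
}

// A single immutable string is created only at the end.
string result = builder.ToString();
Console.WriteLine(result.Length);
```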