Small changes in our code can make a huge difference to performance. Of the many tips and tricks available, the one I will discuss here is String versus StringBuilder. Strings need careful handling because they can have a large impact on memory use. I know there are many articles on the internet about String and StringBuilder, but I still want to show the difference using some statistics.
Here I am using a .NET Framework 4.0 C# console application with a few static methods to show my analysis.
String class
What I am doing here is simple: I have a String variable named outputString, and I loop 1000 times, concatenating the loop counter to outputString on each pass.
    using System;

    namespace XXX
    {
        class Program
        {
            static void Main(string[] args)
            {
                GetConcatedString();
            }

            // Builds "0123...999" by repeated concatenation.
            private static string GetConcatedString()
            {
                string outputString = String.Empty;
                for (int i = 0; i < 1000; i++)
                {
                    // Each += allocates a new string and copies the old contents into it.
                    outputString += i;
                }
                return outputString;
            }
        }
    }
Please note that the concatenation is done using the "+" operator (here in its "+=" form). So what happens internally? Strings in .NET are immutable, so each concatenation creates a new String object, copies the old contents plus the new value into it and assigns it to the variable outputString, leaving the previous string behind as garbage. Here I am looping 1000 times, so the loop creates at least 1000 String objects, each one longer than the last. That is why repeated concatenation with the "+" operator inside a loop degrades application performance.
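To make the "new object every time" point concrete, here is a minimal sketch. The class and method names are my own and exist only for this illustration. It keeps a second reference to a string, concatenates onto the first, and then checks that the two variables no longer point at the same object.

```csharp
using System;

static class ConcatDemo
{
    // Returns true when "+=" produced a new object and left the old string unchanged.
    public static bool ConcatLeavesOriginalUntouched()
    {
        string original = "abc";
        string before = original;   // second reference to the same string object
        original += "d";            // allocates a new string; "abc" itself is not modified

        return before == "abc"
            && original == "abcd"
            && !object.ReferenceEquals(before, original);
    }

    static void Main()
    {
        Console.WriteLine(ConcatLeavesOriginalUntouched()); // True
    }
}
```

The old string ("abc") is still alive only because `before` refers to it. In the loop above nothing keeps the intermediate strings alive, so each one becomes garbage as soon as the next is created.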
Well, I guess this much theory is enough. Let's move on to the statistics.
Here I am using the CLR Profiler, a really good tool for analysing code performance. It tells us how much memory is consumed, how the Garbage Collector is behaving and how many objects end up in the Gen 0, Gen 1 and Gen 2 buckets. The statistics it provides are also very easy to understand.
Next I ran the CLR Profiler on the code above and got the following statistics. I am not going to cover GC generations in detail here, but I would like to touch on them briefly. New objects (very large ones aside) start out in the G0 bucket. Objects that survive a G0 collection are promoted to the G1 bucket, and objects that survive a G1 collection are promoted to the G2 bucket. The .NET GC collects G1 and G2 much less often than G0. In other words, the GC visits G0 frequently, so short-lived objects are released quickly, whereas anything promoted to G1 or G2 stays in memory longer. So, if your application creates objects in such a way that many of them reach G1 and G2, it is not a good sign.
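The promotion between buckets can be observed directly with GC.GetGeneration. The following is only a rough sketch with names of my own choosing: it forces collections with GC.Collect, which you would not normally do in production code, and the exact generation numbers printed depend on the runtime and GC mode, so I am not listing an expected output.

```csharp
using System;

static class GenerationDemo
{
    // Records the generation of one long-lived object before and after two forced collections.
    public static int[] TrackGenerations()
    {
        object survivor = new object();
        int[] generations = new int[3];

        generations[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect();                                // survivor is still referenced, so it survives
        generations[1] = GC.GetGeneration(survivor);
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);                      // keep the object reachable up to this point
        return generations;
    }

    static void Main()
    {
        int[] g = TrackGenerations();
        Console.WriteLine("After allocation:        gen {0}", g[0]);
        Console.WriteLine("After first collection:  gen {0}", g[1]);
        Console.WriteLine("After second collection: gen {0}", g[2]);
    }
}
```

An object that is no longer referenced when the first collection runs never makes this journey, which is exactly what we want for temporary strings.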
Now let's quickly return to our example:
Analysis of the preceding results
Here we see that heap bytes are present in all three generations, Gen 0, Gen 1 and Gen 2, and the memory figure runs to seven digits (2,894,353).
Relocated bytes are bytes that the GC had to move during a collection, so they belong to objects that survived it and were promoted out of G0. I am not going to analyse every figure here, but we can already see some negative signs, because a few of the objects have ended up in the G1 and G2 buckets.
Now, before commenting on it, let's look at the StringBuilder's data.
StringBuilder class
In this example, I create a StringBuilder instance named sb and do the same thing as before, but with the StringBuilder class instead of the String class. When a value is appended to a StringBuilder, it does not build a new copy of the whole string. Instead it writes the new characters into the builder's internal buffer, which grows only when it runs out of capacity. So, internally, the accumulated text is not recreated for every concatenation. This is the real benefit of StringBuilder compared to the String object. Let's have a look at the code:
    using System;
    using System.Text;

    class Program
    {
        static void Main(string[] args)
        {
            GetConcatedStringBuilderString();
        }

        // Builds "0123...999" by appending to a single StringBuilder.
        private static string GetConcatedStringBuilderString()
        {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++)
            {
                // Append writes into the builder's buffer instead of copying the whole string.
                sb.Append(i);
            }
            return sb.ToString();
        }
    }
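One way to see how rarely the builder has to grow is to watch its Capacity property while appending. This is just a sketch with hypothetical names of my own. It counts how many times Capacity changes during the same 1000 appends. The exact count depends on the runtime's growth strategy, so I am not quoting a number for it.

```csharp
using System;
using System.Text;

static class BuilderCapacityDemo
{
    // Appends 0..count-1 and counts how often the builder's capacity had to change.
    public static int CountCapacityChanges(int count, out int finalLength)
    {
        StringBuilder sb = new StringBuilder();
        int changes = 0;
        int lastCapacity = sb.Capacity;

        for (int i = 0; i < count; i++)
        {
            sb.Append(i);
            if (sb.Capacity != lastCapacity)
            {
                changes++;                  // the builder ran out of room and grew
                lastCapacity = sb.Capacity;
            }
        }

        finalLength = sb.Length;
        return changes;
    }

    static void Main()
    {
        int length;
        int changes = CountCapacityChanges(1000, out length);
        Console.WriteLine("Final length: {0}", length);   // 2890 characters
        Console.WriteLine("Capacity changes: {0}", changes);
    }
}
```

If you know roughly how long the result will be, passing an initial capacity to the constructor, for example new StringBuilder(3000), should avoid most of that growth as well.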
Although we are looping 1000 times, we are not creating 1000 ever-growing copies of the result string. That's how we control memory usage and the creation of new objects. We will now run the profiler and analyze the results. Let's check what the statistics say:
Analysis of the preceding results
Here we see that the memory bytes are reduced to five digits (92,332) and there are no relocated bytes. The heap bytes are shown as unknown (0) for all of G0, G1 and G2. In other words, no objects were moved to G1 or G2. All the objects were created in G0 and released from G0 itself.
So there is a significant difference in both memory usage and the GC's bucket movements. Hence, we can conclude that we should prefer StringBuilder to String when we are concatenating repeatedly, as in a loop.
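If you do not have the CLR Profiler at hand, a rough comparison can be made with Stopwatch and GC.CollectionCount. The sketch below is my own and is not how the figures above were produced. The class name, the Measure helper and the iteration count of 20000 are assumptions (1000 iterations is too small to time reliably this way). Results vary by machine and runtime, so no expected output is shown.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class ConcatComparison
{
    public static string WithString(int count)
    {
        string s = String.Empty;
        for (int i = 0; i < count; i++)
        {
            s += i;                 // new, longer string on every pass
        }
        return s;
    }

    public static string WithStringBuilder(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(i);           // appends into the existing buffer
        }
        return sb.ToString();
    }

    // Runs one approach and reports elapsed time and Gen 0 collections it triggered.
    static void Measure(string label, Func<int, string> build, int count)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();               // start each run from a clean heap

        int gen0Before = GC.CollectionCount(0);
        Stopwatch watch = Stopwatch.StartNew();
        string result = build(count);
        watch.Stop();
        int gen0After = GC.CollectionCount(0);

        Console.WriteLine("{0}: {1} chars, {2} ms, {3} Gen 0 collections",
            label, result.Length, watch.ElapsedMilliseconds, gen0After - gen0Before);
    }

    static void Main()
    {
        const int count = 20000;    // assumed value, larger than the article's 1000
        Measure("String +=", WithString, count);
        Measure("StringBuilder", WithStringBuilder, count);
    }
}
```

Both methods return the same text, so any difference in the printed numbers comes from how the string was built rather than from what was built.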