StringBuilder over String

Shovon was working on a task where he needed to traverse a data set of 1 lakh (100,000) rows and build a string by appending some values from each row. It took 4-5 minutes just to build one string, called “description”, by iterating over DataSource.Rows with 1 lakh+ rows in it. Simply iterating over that many rows should not take anywhere near that long on a modern processor. So we (Ananta and I) decided to look into the code.

Shovon had implemented it in the following manner:

string description = string.Empty;
foreach (DataRow row in DataSource.Rows) {
    // Each += allocates a new string and copies everything built so far.
    description += string.Format("{0} {1} {2}", row[propA], row[propB], row[propC]);
}

So, what do you think? We are just iterating over 1 lakh rows and appending some values to a string. Yet the time grows much faster than the number of rows: roughly quadratically, not linearly.

Should this really take that much time? My answer was “No”, it should not. Something was going wrong here to make it so slow. Ananta then suggested using StringBuilder instead of string for the description field. So we tried it in the following way:

StringBuilder description = new StringBuilder();
foreach (DataRow row in DataSource.Rows) {
    description.Append(string.Format("{0} {1} {2}", row[propA], row[propB], row[propC]));
}
// use as description.ToString()

Basically, instead of concatenating strings, we use the Append method of StringBuilder here. Surprisingly, this time it took no more than 2-3 seconds to construct the description field while iterating over 1 lakh rows. And the time taken now grows only roughly in proportion to the number of rows instead of blowing up.
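If you want to see the difference on your own machine, here is a rough sketch of the comparison. It is not Shovon's actual code: the class and method names, the made-up row values in FormatRow, and the row count of 20,000 are all assumptions chosen so the slow version still finishes quickly. Actual timings will vary from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatBenchmark
{
    // Hypothetical stand-in for the values taken from one DataRow.
    private static string FormatRow(int i)
    {
        return string.Format("{0} {1} {2}", "A" + i, "B" + i, "C" + i);
    }

    public static string BuildWithConcat(int rows)
    {
        string description = string.Empty;
        for (int i = 0; i < rows; i++)
        {
            description += FormatRow(i);        // allocates a new string every time
        }
        return description;
    }

    public static string BuildWithBuilder(int rows)
    {
        StringBuilder description = new StringBuilder();
        for (int i = 0; i < rows; i++)
        {
            description.Append(FormatRow(i));   // appends into the same builder
        }
        return description.ToString();
    }

    public static void Main()
    {
        const int rows = 20000;   // assumed size; raise it to watch the gap widen

        Stopwatch watch = Stopwatch.StartNew();
        string a = BuildWithConcat(rows);
        watch.Stop();
        Console.WriteLine("string +=     : {0} ms", watch.ElapsedMilliseconds);

        watch.Restart();
        string b = BuildWithBuilder(rows);
        watch.Stop();
        Console.WriteLine("StringBuilder : {0} ms", watch.ElapsedMilliseconds);

        Console.WriteLine("Same result   : {0}", a == b);
    }
}
```

Both methods produce exactly the same string; only the way it is built differs.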

Explanation:
Let's concatenate some strings and see how it works.

string s = "01";
s += "23";
s += "45";
s += "67";

On the first concatenation, it makes a new string of length four and copies “01” and “23” into it, so four characters are copied. On the second concatenation, it makes a new string of length six and copies “0123” and “45” into it, so six characters are copied. On the third concatenation, it makes a string of length eight and copies “012345” and “67” into it, so eight characters are copied. So far, a total of 4 + 6 + 8 = 18 characters have been copied for this eight-character string. Keep going. Keep going until 1 lakh rows have been appended. Can you imagine how much data has been copied?
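To put a number on it, here is a small sketch that simply repeats the counting above in a loop. The class and method names are made up for illustration, and the 20 characters per row in the second call is only an assumed average, not a figure from Shovon's data.

```csharp
using System;

public static class CopyCost
{
    // Characters copied when `pieces` strings of `pieceLength` characters each
    // are joined one by one with +=, using the same counting as above:
    // every concatenation copies the whole newly created string.
    public static long CharsCopiedByConcatenation(int pieces, int pieceLength)
    {
        long total = 0;
        long currentLength = pieceLength;   // the initial string, e.g. "01"
        for (int i = 1; i < pieces; i++)
        {
            currentLength += pieceLength;   // length of the newly allocated string
            total += currentLength;         // all of it gets copied
        }
        return total;
    }

    public static void Main()
    {
        // The example above: "01" + "23" + "45" + "67" gives 4 + 6 + 8.
        Console.WriteLine(CharsCopiedByConcatenation(4, 2));        // 18

        // 1 lakh pieces of an assumed 20 characters each.
        Console.WriteLine(CharsCopiedByConcatenation(100000, 20));
    }
}
```

With those assumed numbers, the second call comes out at roughly 100 billion characters copied to build a string of only 2 million characters. Doubling the number of rows roughly quadruples the copying, which is why the slowdown feels so dramatic.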

On the other hand, StringBuilder doesn’t copy the whole string on every append. It keeps the characters in a mutable internal buffer. In recent versions of .NET, as far as we know, this buffer is a linked list of character-array chunks rather than a single array that gets doubled in size and copied. It allocates more space only when it is really necessary. The implementation of StringBuilder has changed between versions and keeps improving.
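The difference between “make a new string” and “change the existing buffer” can be seen with a tiny check. This is only an illustrative sketch with made-up class and method names. It relies on two standard behaviours: += on a string produces a new string object, and StringBuilder.Append returns the same builder it was called on.

```csharp
using System;
using System.Text;

public static class MutabilityDemo
{
    // True if += left the variable pointing at the original string object.
    public static bool ConcatKeepsSameObject()
    {
        string s = "01";
        string before = s;
        s += "23";                                  // builds a brand-new string "0123"
        return object.ReferenceEquals(before, s);
    }

    // True if Append handed back the very same builder it was called on.
    public static bool AppendKeepsSameObject()
    {
        StringBuilder sb = new StringBuilder("01");
        StringBuilder returned = sb.Append("23");   // grows sb in place
        return object.ReferenceEquals(sb, returned) && sb.ToString() == "0123";
    }

    public static void Main()
    {
        Console.WriteLine(ConcatKeepsSameObject());   // False
        Console.WriteLine(AppendKeepsSameObject());   // True
    }
}
```

The string variable ends up pointing at a different object after every +=, while the builder stays the same object however many times you append to it.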