February 13th, 2009

Manipulating Strings in C# - Replacing part of a string / Replacing all occurrences of a sub-string - 2

Very often you need to change part of a string, maybe just once, or many times over. Strings in .NET/C# are immutable, so we cannot actually change a string in place, but we can work on copies. The code example below attaches two new methods to the C# string class.

  • The ReplaceFirst method finds the first occurrence of “needle” in a string and replaces it with “replacement”.
  • The ReplaceAll method is similar: it steps through the string and replaces every occurrence of “needle” it finds. To avoid a possible infinite loop it first checks whether “needle” is equal to “replacement”.

using System;
using System.Collections;

namespace StringItems
{
        static class StringExt
        {
                public static string ReplaceFirst(this string haystack, string needle, string replacement)
                {
                        int pos = haystack.IndexOf(needle);
                        if (pos < 0) return haystack;

                        return haystack.Substring(0,pos) + replacement + haystack.Substring(pos+needle.Length);
                }

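                public static string ReplaceAll(this string haystack, string needle, string replacement)
                {
                        // Avoid a possible infinite loop: an empty needle matches at every position
                        if (string.IsNullOrEmpty(needle) || needle == replacement) return haystack;
                        int pos = 0;
                        // A match at index 0 counts too (>= 0). Each search resumes after the
                        // inserted text, so a replacement containing the needle cannot loop forever.
                        while ((pos = haystack.IndexOf(needle, pos)) >= 0)
                        {
                                haystack = haystack.Substring(0,pos) + replacement + haystack.Substring(pos+needle.Length);
                                pos += (replacement ?? "").Length;
                        }
                        return haystack;
                }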

        }
}

Both methods are implemented as extension methods (for more on creating extension methods, see also Finding all occurrences of a string within another string). After you include these methods in your project you can call them directly on any string instance:

string myString = "Hello World";
string myModifiedString = myString.ReplaceFirst("World","People");
Console.WriteLine("{0}",myModifiedString); // Writes: "Hello People"
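Because strings are immutable, the call above leaves myString itself untouched and hands back a new string. The following is a minimal, self-contained sketch of that behaviour; the class name ImmutabilityDemo is only illustrative, and ReplaceFirst is copied in so the sketch compiles on its own:

```csharp
using System;

static class ImmutabilityDemo
{
    // Copy of the ReplaceFirst method shown earlier, so this sketch stands alone
    public static string ReplaceFirst(this string haystack, string needle, string replacement)
    {
        int pos = haystack.IndexOf(needle);
        if (pos < 0) return haystack;
        return haystack.Substring(0, pos) + replacement + haystack.Substring(pos + needle.Length);
    }

    static void Main()
    {
        string myString = "Hello World";
        string myModifiedString = myString.ReplaceFirst("World", "People");

        Console.WriteLine(myString);         // Writes: Hello World (the original is unchanged)
        Console.WriteLine(myModifiedString); // Writes: Hello People
    }
}
```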

An example use of the ReplaceAll method:

string myString = "boo foo is not foo boo or foo boo foo";
string myModifiedString = myString.ReplaceAll("boo","goo");
Console.WriteLine("{0}",myModifiedString); // Writes: "goo foo is not foo goo or foo goo foo"
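For plain sub-string replacement, the built-in String.Replace(string, string) method also replaces every occurrence, so it can serve as a quick cross-check for the output above. A small sketch; the class and method names here are made up for illustration:

```csharp
using System;

static class ReplaceCheck
{
    // Built-in replacement of every occurrence; returns a new string
    public static string BuiltInReplaceAll(string haystack, string needle, string replacement)
    {
        return haystack.Replace(needle, replacement);
    }

    static void Main()
    {
        string myString = "boo foo is not foo boo or foo boo foo";
        Console.WriteLine(BuiltInReplaceAll(myString, "boo", "goo"));
        // Writes: goo foo is not foo goo or foo goo foo
    }
}
```

Note that String.Replace throws an ArgumentException when the string to search for is empty, so an empty needle has to be handled by the caller.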

Why not just use a regular expression?

If you are familiar with the Regex class in C# you can easily write a regular expression to achieve the same string replacement result:

using System.Text.RegularExpressions;
Regex regex = new Regex("boo");
string result = regex.Replace("boo foo is not foo boo or foo boo foo", "goo");

Regular expressions are flexible, and if you do anything more complex than a basic string replacement they are often your only practical choice. But they come at a hefty performance price: before a regular expression can be executed, its pattern has to be parsed and converted into an internal form. .NET limits that cost (a Regex instance processes its pattern once, and the static Regex methods keep a cache of recently used patterns), but using a regular expression for plain string replacement is still much slower.
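As an illustration of a replacement that plain sub-string search cannot express, the sketch below replaces a word only where it stands on its own. The helper name ReplaceWholeWord is hypothetical; Regex.Escape and the static Regex.Replace are standard .NET calls:

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexDemo
{
    // Replaces `word` only where it appears as a whole word
    public static string ReplaceWholeWord(string input, string word, string replacement)
    {
        // Regex.Escape ensures the word is treated literally inside the pattern;
        // \b matches a boundary between a word character and a non-word character
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Replace(input, pattern, replacement);
    }

    static void Main()
    {
        Console.WriteLine(ReplaceWholeWord("foo food foo", "foo", "bar"));
        // Writes: bar food bar
    }
}
```

The replacement argument of Regex.Replace is itself a pattern in which $ has a special meaning, so a replacement containing a literal $ would need escaping as $$.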

How much slower are regular expressions for string replacement?

In an earlier post I described the Stopwatch class in System.Diagnostics. It is ideal for a little benchmark testing, so let's compare my string replacement methods with the built-in regular expression library:

string haystack = "boo foo is not foo boo or foo boo foo";
string result;
Regex regex = new Regex("boo"); // same pattern as in the example above
Stopwatch sw = Stopwatch.StartNew();
for (int Lp = 0; Lp < 100000; Lp++)
    result = regex.Replace(haystack, "goo");
sw.Stop();
Console.WriteLine("Time used (float): {0} ms",sw.Elapsed.TotalMilliseconds);

And the same for the string replacement functions:

string haystack = "boo foo is not foo boo or foo boo foo";
string result;
Stopwatch sw = Stopwatch.StartNew();
for (int Lp = 0; Lp < 100000; Lp++)
    result = haystack.ReplaceAll("boo","goo");
sw.Stop();
Console.WriteLine("Time used (float): {0} ms",sw.Elapsed.TotalMilliseconds);

The regular expression code needed 1100 ms, whereas the string replacement code needed just 27 ms. So for this particular example, the string replacement was about 40 times faster than the regular expression.
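To repeat the comparison on your own machine, the two snippets can be combined into one program. This is only a sketch under a few assumptions: the class name Bench and the helper TimeMs are made up, a compact ordinal variant of ReplaceAll is included so the program compiles on its own, and one warm-up call is made before timing so JIT compilation is not measured. Timings vary with hardware and runtime version, so no expected figures are given.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class Bench
{
    // Compact stand-in for the ReplaceAll extension method (ordinal comparison)
    public static string ReplaceAll(this string haystack, string needle, string replacement)
    {
        if (string.IsNullOrEmpty(needle)) return haystack;
        int pos = 0;
        while ((pos = haystack.IndexOf(needle, pos, StringComparison.Ordinal)) >= 0)
        {
            haystack = haystack.Substring(0, pos) + replacement + haystack.Substring(pos + needle.Length);
            pos += replacement.Length; // continue searching after the inserted text
        }
        return haystack;
    }

    // Runs the action `iterations` times and returns the elapsed milliseconds
    public static double TimeMs(int iterations, Action action)
    {
        action(); // warm-up call, not included in the measurement
        Stopwatch sw = Stopwatch.StartNew();
        for (int lp = 0; lp < iterations; lp++)
            action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        string haystack = "boo foo is not foo boo or foo boo foo";
        Regex regex = new Regex("boo");
        string result = null;

        double regexMs = TimeMs(100000, () => result = regex.Replace(haystack, "goo"));
        double stringMs = TimeMs(100000, () => result = haystack.ReplaceAll("boo", "goo"));

        Console.WriteLine("Regex:      {0} ms", regexMs);
        Console.WriteLine("ReplaceAll: {0} ms", stringMs);
        Console.WriteLine("Last result: {0}", result);
    }
}
```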


2 Responses to “Manipulating Strings in C# - Replacing part of a string / Replacing all occurrences of a sub-string”

  1. Mike Says:

    Just wanted to let you know I came across your blog a few weeks ago, and have found your articles very informative. I never knew that the two methods above would be so far apart performance wise.

  2. Martijn Says:

    @Mike

    Thanks for the comment. Regular expressions are notoriously slow because of their complexity but they usually offer very good value as you can work miracles with just a few lines.

