February 11th, 2009

Manipulating Strings in C# – Finding all occurrences of a string within another string

A common programming problem is finding the position of every occurrence of a string within another string. For the first occurrence, the C# string method IndexOf does the job, much like PHP's strpos() function: it returns the position of the first occurrence of a string in another string, or -1 if there is none. But what if you would like to find the positions of all occurrences of the substring? The following “IndexOfAll” method does just that. It returns an IEnumerable containing the offset of each occurrence of the substring in the main string.

Because you might want to use this code throughout your project, it is implemented as an extension method. Simply put: the IndexOfAll method is attached to the String class, so to call it we can just write haystack.IndexOfAll(needle).

To define an extension method we need to create a static method and put it into a static class. The first parameter of the method identifies the type the method should be associated with, in our case string. We do this by declaring that parameter as “this string”.

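using System;
using System.Collections;

namespace StringItems
{
    static class StringExt
    {
        // Yields the zero-based offset of each occurrence of needle in haystack.
        public static IEnumerable IndexOfAll(this string haystack, string needle)
        {
            // An empty needle matches at position 0 forever, so bail out early.
            if (string.IsNullOrEmpty(needle))
                yield break;

            int pos, offset = 0;
            // >= 0, not > 0: IndexOf returns 0 for a match at the very start.
            while ((pos = haystack.IndexOf(needle)) >= 0)
            {
                // pos is relative to the shortened haystack; offset is where
                // that shortened haystack begins in the original string.
                yield return offset + pos;
                offset += pos + needle.Length;
                haystack = haystack.Substring(pos + needle.Length);
            }
        }
    }

    class MainClass
    {
        public static void Main(string[] args)
        {
            string needle = "x";
            string haystack = "3 x 4 = 2 x 6 = 1 x 12";
            // Prints the offsets 2, 10 and 18.
            foreach (int pos in haystack.IndexOfAll(needle))
                Console.WriteLine("Offset: {0}", pos);
        }
    }
}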
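
If you prefer typed results, or need to count overlapping matches, a variation along the following lines may help. This is only a sketch under a few assumptions: the namespace StringItemsSketch, the class StringSearchExt, the method name IndexOfAllTyped and the allowOverlap parameter are made up for illustration and are not part of the code above. It searches from a moving start index rather than shortening the string, returns IEnumerable<int>, and uses an ordinal (character-by-character) comparison.

```csharp
using System;
using System.Collections.Generic;

namespace StringItemsSketch // hypothetical namespace, chosen to avoid clashing with the listing above
{
    static class StringSearchExt // hypothetical class name
    {
        // Yields the zero-based offset of every occurrence of needle in haystack.
        // allowOverlap is an assumed extra parameter: when true, "aa" is found
        // at offsets 0, 1 and 2 in "aaaa"; when false, only at 0 and 2.
        public static IEnumerable<int> IndexOfAllTyped(
            this string haystack, string needle, bool allowOverlap = false)
        {
            // Nothing to search, or an empty needle that would never advance.
            if (string.IsNullOrEmpty(haystack) || string.IsNullOrEmpty(needle))
                yield break;

            int step = allowOverlap ? 1 : needle.Length;
            int start = 0;
            int pos;
            // Search from 'start' in the original string; no Substring copies.
            while ((pos = haystack.IndexOf(needle, start, StringComparison.Ordinal)) >= 0)
            {
                yield return pos;
                start = pos + step;
            }
        }
    }

    class Demo
    {
        public static void Main(string[] args)
        {
            string haystack = "3 x 4 = 2 x 6 = 1 x 12";
            Console.WriteLine(string.Join(", ", haystack.IndexOfAllTyped("x")));      // 2, 10, 18
            Console.WriteLine(string.Join(", ", "aaaa".IndexOfAllTyped("aa")));       // 0, 2
            Console.WriteLine(string.Join(", ", "aaaa".IndexOfAllTyped("aa", true))); // 0, 1, 2
        }
    }
}
```

Because the result is IEnumerable<int>, callers can loop with foreach without a cast, or pass it straight to string.Join or a List<int> constructor. Whether overlapping matches should count depends on what you are searching for, which is why it is left as a switch here.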

4 Responses to “Manipulating Strings in C# – Finding all occurrences of a string within another string”

  1. kotelni Says:

    I found the IndexOfAll method very useful. I like the idea of using yield return.
    This is a slightly better version of IndexOfAll that I thought was worth sharing.
    It does not call Substring in a loop, which can give a big performance improvement for large strings, and (I think) there was a bug in the original version: the comparison in the ‘while’ must be >= 0, not just > 0.

    protected static IEnumerable IndexOfAll(string haystack, string needle)
    {
        int pos;
        int offset = 0;
        int length = needle.Length;
        while ((pos = haystack.IndexOf(needle, offset)) != -1)
        {
            yield return pos;
            offset = pos + length;
        }
    }

  2. Guy Ellis Says:

    Yes – I can confirm that the comparison must be >= and not > otherwise this will fail in cases where the occurrence is at the start of the string.

  3. Lasse Espeholt Says:

    I don’t think these algorithms are very efficient. Instead, I think you should look at some of the algorithms at http://en.wikipedia.org/wiki/List_of_algorithms#Substrings which are designed to find multiple substrings, instead of using a naive IndexOf loop.

    Best regards, Lasse Espeholt

  4. Hoyb Says:

    Wow!

    Thanks a ton to Kotelni for finding the bug! I just spent 25 minutes trying to figure out why the code above wouldn’t work right (thanks for the effort though, original poster), but I just pasted Kotelni’s code in, and boom, working indexofall function. Thanks again!! You rock!

