TIP/Trick: How to retrieve a sub-string between two delimiters using C++

The std::string class exists as an alternative to the null-terminated char array commonly used in C. It provides methods for simple string management tasks that, with C strings, are daunting and have to be done with dirty dynamic-memory tricks and leak-prone functions. Unless, of course, you implement your own.

Last week I found myself wanting to retrieve a sub-string between two delimiters. People who write programs that parse HTML, XML or plain text files might find this a common problem, one for which they may already use an existing solution from a library like Boost.

Yet, since I wanted to learn how everything works, I wrote my own.

The Code

The code to achieve this is fairly straightforward:

#include <string>

const std::string emptyString = "";
std::string ExtractString( const std::string& source, const std::string& start, const std::string& end )
{
     std::size_t startIndex = source.find( start );

     // If the starting delimiter is not found in the string
     // stop the process, you're done!
     //
     if( startIndex == std::string::npos )
     {
        return emptyString;
     }

     // Adding the length of the delimiter to our starting index
     // moves us to the beginning of our sub-string.
     //
     startIndex += start.length();

     // Looking for the end delimiter, searching only after
     // the start delimiter.
     //
     std::size_t endIndex = source.find( end, startIndex );

     // If the end delimiter is not found, return an empty string.
     // Without this check substr would return everything from
     // startIndex to the end of the source.
     //
     if( endIndex == std::string::npos )
     {
        return emptyString;
     }

     // Returning the substring between the start index and
     // the end index.
     return source.substr( startIndex, endIndex - startIndex );
}
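
To make the index arithmetic easier to follow, here is a small sketch that only reports the two positions the extraction relies on. The names DelimiterIndices and LocateDelimiters are hypothetical and exist purely for illustration; only std::string::find and std::string::npos come from the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type for illustration only.
struct DelimiterIndices
{
    std::size_t contentStart; // first character after the start delimiter, or npos
    std::size_t contentEnd;   // position of the end delimiter, or npos
};

// Hypothetical helper: performs the same two searches as the
// extraction, but returns the indices instead of the sub-string.
DelimiterIndices LocateDelimiters( const std::string& source,
                                   const std::string& start,
                                   const std::string& end )
{
    const std::size_t startIndex = source.find( start );
    if( startIndex == std::string::npos )
    {
        // Start delimiter missing: nothing to locate.
        return { std::string::npos, std::string::npos };
    }

    // Skip over the start delimiter itself.
    const std::size_t contentStart = startIndex + start.length();

    // Search for the end delimiter only after the start delimiter.
    return { contentStart, source.find( end, contentStart ) };
}
```

For "<div>hello world</div>" with "<div>" and "</div>" this gives 5 and 16, so the sub-string is 16 - 5 = 11 characters long. For "<div>hello" the second value is std::string::npos; note that substr clamps an oversized count to the end of the string, so substr( 5, npos - 5 ) on that input yields "hello" rather than an empty string.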

Client code call

    // Returns hello world
    //
    std::string str = "<div>hello world</div>";
    std::string extracted = ExtractString( str, "<div>", "</div>" );

    // Returns 12
    //
    std::string str2 = "(12)";
    std::string extracted2 = ExtractString( str2, "(", ")" );
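
One limitation of returning an empty string is that the caller cannot tell "delimiters not found" apart from "nothing between the delimiters", as in "()". A possible variant, sketched here under the assumption that a C++17 compiler is available, returns std::optional instead. TryExtractString is a hypothetical name and is not part of the code above.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical variant for illustration (requires C++17):
// returns std::nullopt when either delimiter is missing, and the
// (possibly empty) sub-string when both are found.
std::optional<std::string> TryExtractString( const std::string& source,
                                             const std::string& start,
                                             const std::string& end )
{
    std::size_t startIndex = source.find( start );
    if( startIndex == std::string::npos )
    {
        return std::nullopt; // start delimiter not found
    }

    // Move past the start delimiter.
    startIndex += start.length();

    const std::size_t endIndex = source.find( end, startIndex );
    if( endIndex == std::string::npos )
    {
        return std::nullopt; // end delimiter not found
    }

    return source.substr( startIndex, endIndex - startIndex );
}
```

With this version, TryExtractString( "()", "(", ")" ) holds an empty string, while TryExtractString( "(12", "(", ")" ) holds no value at all, so the two cases can be handled separately.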

I hope you find this useful.