The C++ Boost Libraries Part 3 - string algorithms

Andrew Stephens, Monday the 19th of January, 2009

One of the many, many legitimate criticisms that could be leveled at C++ is that string handling is abysmal. Sure, std::string can hold some chars for you, but there is a distinct lack of utility functions to actually do anything with those characters. Enter the Boost String Algorithm library, or string_algo to its many friends.

String_algo contains all those fiddly little functions that you write yourself to process text, as well as some clever extensions. The simple stuff is very simple:

string mixedCase = "Hello there";

string allLower = to_lower_copy(mixedCase); // returns a converted copy
to_upper(mixedCase);                        // converts in place
// mixedCase == "HELLO THERE"
// allLower  == "hello there"

string weirdFormat = "   .;text;;;;..   ";

// trim punctuation and whitespace from both ends with a combined predicate
trim_if(weirdFormat, is_punct() || is_space());
// weirdFormat == "text"

The is_punct() function returns a predicate - string_algo defines a bunch of useful predicates, and they can be combined with ||, && and !, so you will rarely have to roll your own. Not shown above is the ability to pass a std::locale to handle strings in other languages.

There is the full complement of find, replace, and erase substring options:

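string statement = "I like cheese, cheese, and cheese";
replace_nth( statement, "cheese", -2, "bacon" );  // "I like cheese, bacon, and cheese"
replace_last( statement, "cheese", "eggs" );      // "I like cheese, bacon, and eggs"
erase_first( statement, "cheese" );               // "I like , bacon, and eggs"
erase_all( statement, "," );                      // "I like  bacon and eggs"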

(The -2 parameter for replace_nth means "replace the 2nd from last occurrence" - very useful.)

And check this out:

vector<string> words;
string anthem = "God of nations! at Thy feet\n"
                "In the bonds of love we meet,";
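split( words, anthem, is_any_of(" \n!,") );
// words now holds "God", "of", "nations", "", "at", "Thy", "feet", ...
// (adjacent separators such as "! " produce an empty string by default)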

I haven't even mentioned the ability to do fancy things with regular expressions, or the concept of find iterators that advance over the substrings that match. Also, the library is generic enough to work with any container that looks vaguely like a sequence of characters, not just std::string.

String_algo is replacing vast numbers of little hacked-together functions in my code. If nothing else it will reduce the temptation to use the God-forsaken CString classes.