StringTokenizer Class Reference

#include <StringTokenizer.h>


Detailed Description

A class similar to the StringTokenizer from Java. It splits a string at a given string or character (or at one of the special dividers NEWLINE or WHITECHARS) and lets you iterate over the resulting substrings.

The normal usage is like this:

 StringTokenizer st(std::string("This is a line"), ' ');
 while (st.hasNext())
    std::cout << st.next() << std::endl;
 
This would generate the output:
 This
 is
 a
 line
 

One point about the behaviour is worth knowing: with WHITECHARS, a run of consecutive whitespace characters in the string to split is treated as a single divider. With every other divider, each occurrence counts on its own, so consecutive dividers produce zero-length substrings between them.
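The difference between the two divider styles can be illustrated with the standard library alone. The following is only an analogy, not the class's implementation: the helper names `splitAtWhitespaceRuns` and `splitAtEachDivider` are hypothetical, `operator>>` uses the stream locale's notion of whitespace rather than the class's own rule, and `std::getline` does not report an empty substring after a trailing divider.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Analogy for WHITECHARS: operator>> skips any run of whitespace,
// so no empty substrings are produced.
std::vector<std::string> splitAtWhitespaceRuns(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> parts;
    std::string part;
    while (in >> part) {
        parts.push_back(part);
    }
    return parts;
}

// Analogy for any other divider: every single occurrence ends a field,
// so two dividers in a row enclose an empty substring.
std::vector<std::string> splitAtEachDivider(const std::string& text, char divider) {
    std::istringstream in(text);
    std::vector<std::string> parts;
    std::string part;
    while (std::getline(in, part, divider)) {
        parts.push_back(part);
    }
    return parts;
}
```

For the input "a  b" (two spaces), the first helper yields two substrings and the second yields three, the middle one being empty.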

Definition at line 67 of file StringTokenizer.h.


Public Member Functions

std::string front ()
std::string get (size_t pos) const
std::vector< std::string > getVector ()
bool hasNext ()
std::string next ()
void reinit ()
size_t size () const
 StringTokenizer (std::string tosplit, int special)
 constructor. When StringTokenizer::NEWLINE is passed as the second parameter, the string is split at every occurrence of a newline character (0x0d / 0x0a). When StringTokenizer::WHITECHARS is passed as the second parameter, the string is split at all characters with a code of 0x20 (space) or below. Any other int passed as the second parameter is cast to a char, and the string is split at that char.
 StringTokenizer (std::string tosplit, std::string token, bool splitAtAllChars=false)
 constructor. The first string is split at each occurrence of the second string. If the optional third parameter is true, the string is instead split wherever any char of the second string occurs. If the string between two split positions is empty, it is returned nevertheless.
 StringTokenizer (std::string tosplit)
 constructor. Same as StringTokenizer(tosplit, StringTokenizer::WHITECHARS); tosplit is the string to split into substrings. If the string between two split positions is empty, it is not returned.
 StringTokenizer ()
 ~StringTokenizer ()

Static Public Attributes

static const int NEWLINE = -256
static const int WHITECHARS = -257

Private Types

typedef std::vector< size_t > SizeVector

Private Member Functions

void prepare (const std::string &tosplit, const std::string &token, bool splitAtAllChars)
void prepareWhitechar (const std::string &tosplit)

Private Attributes

SizeVector myLengths
size_t myPos
SizeVector myStarts
std::string myTosplit

Member Typedef Documentation

typedef std::vector<size_t> StringTokenizer::SizeVector [private]

a list of positions/lengths

Definition at line 137 of file StringTokenizer.h.


Constructor & Destructor Documentation

StringTokenizer::StringTokenizer (  )  [inline]

default constructor

Definition at line 78 of file StringTokenizer.h.

00078 { }

StringTokenizer::StringTokenizer ( std::string  tosplit  ) 

constructor. Same as StringTokenizer(tosplit, StringTokenizer::WHITECHARS); tosplit is the string to split into substrings. If the string between two split positions is empty, it is not returned.

Definition at line 51 of file StringTokenizer.cpp.

References prepareWhitechar().

00052         : myTosplit(tosplit), myPos(0) {
00053     prepareWhitechar(tosplit);
00054 }

StringTokenizer::StringTokenizer ( std::string  tosplit,
std::string  token,
bool  splitAtAllChars = false 
)

constructor. The first string is split at each occurrence of the second string. If the optional third parameter is true, the string is instead split wherever any char of the second string occurs. If the string between two split positions is empty, it is returned nevertheless.

Definition at line 57 of file StringTokenizer.cpp.

References prepare().

00058         : myTosplit(tosplit), myPos(0) {
00059     prepare(tosplit, token, splitAtAllChars);
00060 }

StringTokenizer::StringTokenizer ( std::string  tosplit,
int  special 
)

constructor. When StringTokenizer::NEWLINE is passed as the second parameter, the string is split at every occurrence of a newline character (0x0d / 0x0a). When StringTokenizer::WHITECHARS is passed as the second parameter, the string is split at all characters with a code of 0x20 (space) or below. Any other int passed as the second parameter is cast to a char, and the string is split at that char.

Definition at line 63 of file StringTokenizer.cpp.

References NEWLINE, prepare(), prepareWhitechar(), and WHITECHARS.

00064         : myTosplit(tosplit), myPos(0) {
00065     switch (special) {
00066     case NEWLINE:
00067         prepare(tosplit, "\r\n", true);
00068         break;
00069     case WHITECHARS:
00070         prepareWhitechar(tosplit);
00071         break;
00072     default:
00073         char *buf = new char[2];
00074         buf[0] = (char) special;
00075         buf[1] = 0;
00076         prepare(tosplit, buf, false);
00077         delete[] buf;
00078         break;
00079     }
00080 }
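Because the NEWLINE case forwards to prepare(tosplit, "\r\n", true), carriage return and line feed each act as a divider of their own. A Windows-style "\r\n" line ending should therefore produce an empty substring between two lines. The sketch below reproduces the find_first_of loop of prepare() in a free function; the name `splitAtAnyOf` is hypothetical and the function returns the substrings directly instead of storing starts and lengths.

```cpp
#include <string>
#include <vector>

// Hypothetical free function mirroring prepare(tosplit, chars, true):
// the text is split wherever any character from `chars` occurs.
std::vector<std::string> splitAtAnyOf(const std::string& text, const std::string& chars) {
    std::vector<std::string> parts;
    std::string::size_type beg = 0;
    while (beg < text.length()) {
        std::string::size_type end = text.find_first_of(chars, beg);
        if (end == std::string::npos) {
            end = text.length();
        }
        parts.push_back(text.substr(beg, end - beg));
        beg = end + 1;  // each matching character is one divider
        if (beg == text.length()) {
            parts.push_back("");  // the text ended with a divider
        }
    }
    return parts;
}
```

With this loop, splitAtAnyOf("a\r\nb", "\r\n") gives three substrings ("a", an empty one, "b"), while splitAtAnyOf("a\nb", "\r\n") gives two. Callers that read files with "\r\n" line endings may want to skip the empty entries.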

StringTokenizer::~StringTokenizer (  ) 

destructor

Definition at line 83 of file StringTokenizer.cpp.

00083 {}


Member Function Documentation

std::string StringTokenizer::front (  ) 

returns the first substring without moving the iterator

Definition at line 106 of file StringTokenizer.cpp.

References myLengths, myStarts, and myTosplit.

Referenced by TEST().

00106                                  {
00107     if (myStarts.size()==0) {
00108         throw OutOfBoundsException();
00109     }
00110     if (myLengths[0]==0) {
00111         return "";
00112     }
00113     return myTosplit.substr(myStarts[0],myLengths[0]);
00114 }

std::string StringTokenizer::get ( size_t  pos  )  const

returns the item at the given position

Definition at line 116 of file StringTokenizer.cpp.

References myLengths, myStarts, and myTosplit.

Referenced by NamedColumnsParser::get(), and TEST().

00116                                                {
00117     if (pos>=myStarts.size()) {
00118         throw OutOfBoundsException();
00119     }
00120     if (myLengths[pos]==0) {
00121         return "";
00122     }
00123     size_t start = myStarts[pos];
00124     size_t length = myLengths[pos];
00125     return myTosplit.substr(start, length);
00126 }

std::vector< std::string > StringTokenizer::getVector (  ) 

Definition at line 180 of file StringTokenizer.cpp.

References hasNext(), next(), reinit(), and size().

Referenced by OptionsCont::getStringVector(), PCLoaderDlrNavteq::loadPolyFile(), MSRouteHandler::myStartElement(), MSEdge::parseEdgesList(), NIImporter_DlrNavteq::TrafficlightsHandler::report(), NIImporter_DlrNavteq::EdgesHandler::report(), and TEST().

00180                            {
00181     std::vector<std::string> ret;
00182     ret.reserve(size());
00183     while (hasNext()) {
00184         ret.push_back(next());
00185     }
00186     reinit();
00187     return ret;
00188 }
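As the listing shows, getVector() collects substrings starting at the current iterator position and then calls reinit(). Substrings already consumed by next() are therefore left out of that result, while a second call returns the full list. The sketch below illustrates this with a hypothetical `TokenCursor` class over a ready-made list of tokens. The bodies of hasNext() and next() are not shown in this reference, so the versions here are assumptions, and std::out_of_range stands in for the class's own exception type.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the iterator part of StringTokenizer.
class TokenCursor {
public:
    explicit TokenCursor(const std::vector<std::string>& tokens)
        : myTokens(tokens), myPos(0) {}

    // Assumed behaviour: true while unread tokens remain.
    bool hasNext() const { return myPos < myTokens.size(); }

    // Assumed behaviour: return the current token, then advance.
    std::string next() {
        if (!hasNext()) {
            throw std::out_of_range("no more tokens");
        }
        return myTokens[myPos++];
    }

    void reinit() { myPos = 0; }

    // Same steps as the getVector() listing: drain the rest, then rewind.
    std::vector<std::string> getVector() {
        std::vector<std::string> ret;
        ret.reserve(myTokens.size());
        while (hasNext()) {
            ret.push_back(next());
        }
        reinit();
        return ret;
    }

private:
    std::vector<std::string> myTokens;
    std::size_t myPos;
};
```

Under these assumptions, calling reinit() before getVector() is a simple way to make sure the complete list is returned.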

bool StringTokenizer::hasNext (  ) 

std::string StringTokenizer::next (  ) 

void StringTokenizer::prepare ( const std::string &  tosplit,
const std::string &  token,
bool  splitAtAllChars 
) [private]

splits the first string at all occurrences of the second. If the third parameter is true, splits at every char contained in the second instead

Definition at line 133 of file StringTokenizer.cpp.

References myLengths, and myStarts.

Referenced by StringTokenizer().

00133                                                                                                     {
00134     size_t beg = 0;
00135     size_t len = token.length();
00136     if (splitAtAllChars) {
00137         len = 1;
00138     }
00139     while (beg<tosplit.length()) {
00140         size_t end;
00141         if (splitAtAllChars) {
00142             end = tosplit.find_first_of(token, beg);
00143         } else {
00144             end = tosplit.find(token, beg);
00145         }
00146         if (end == std::string::npos) {
00147             end = tosplit.length();
00148         }
00149         myStarts.push_back(beg);
00150         myLengths.push_back(end-beg);
00151         beg = end + len;
00152         if (beg==tosplit.length()) {
00153             myStarts.push_back(beg-1);
00154             myLengths.push_back(0);
00155         }
00156     }
00157 }
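Two consequences of this loop are easy to miss. A string that ends with the token gets one extra, empty substring appended. And when splitAtAllChars is false and the token is empty, find() returns beg itself and len is 0, so beg never advances. The sketch below mirrors the whole-token branch in a hypothetical free function `splitAtToken`; the std::invalid_argument guard against an empty token is an addition of this sketch, not something the listing does.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical free function mirroring prepare(tosplit, token, false).
std::vector<std::string> splitAtToken(const std::string& text, const std::string& token) {
    // Guard added in this sketch: with an empty token the loop below
    // would never move past its starting position.
    if (token.empty()) {
        throw std::invalid_argument("token must not be empty");
    }
    std::vector<std::string> parts;
    std::string::size_type beg = 0;
    while (beg < text.length()) {
        std::string::size_type end = text.find(token, beg);
        if (end == std::string::npos) {
            end = text.length();
        }
        parts.push_back(text.substr(beg, end - beg));
        beg = end + token.length();  // skip the whole token
        if (beg == text.length()) {
            parts.push_back("");  // the text ended exactly with the token
        }
    }
    return parts;
}
```

For example, splitAtToken("a::b::", "::") yields "a", "b" and a trailing empty substring, whereas splitAtToken("a:b", "::") yields the single substring "a:b" because the two-character token never occurs.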

void StringTokenizer::prepareWhitechar ( const std::string &  tosplit  )  [private]

splits the first string at all occurrences of whitechars

Definition at line 159 of file StringTokenizer.cpp.

References myLengths, and myStarts.

Referenced by StringTokenizer().

00159                                                                {
00160     size_t len = tosplit.length();
00161     size_t beg = 0;
00162     while (beg<len&&tosplit[beg]<=32) {
00163         beg++;
00164     }
00165     while (beg!=std::string::npos&&beg<len) {
00166         size_t end = beg;
00167         while (end<len&&tosplit[end]>32) {
00168             end++;
00169         }
00170         myStarts.push_back(beg);
00171         myLengths.push_back(end-beg);
00172         beg = end;
00173         while (beg<len&&tosplit[beg]<=32) {
00174             beg++;
00175         }
00176     }
00177 }
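The comparisons against 32 in this listing operate on plain char, whose signedness is implementation-defined. On platforms where char is signed, bytes of 0x80 and above (as found in UTF-8 text) are negative and would therefore also be treated as whitechars. The following is a minimal sketch of a variant that compares the value as unsigned char; `isWhitechar` and `splitAtWhitechars` are hypothetical names, and the function returns the substrings directly rather than filling myStarts and myLengths.

```cpp
#include <string>
#include <vector>

// True for space and control characters only. The cast keeps bytes >= 0x80
// from comparing as negative values where char is signed.
static bool isWhitechar(char c) {
    return static_cast<unsigned char>(c) <= 32;
}

// Hypothetical variant of prepareWhitechar() returning the substrings.
std::vector<std::string> splitAtWhitechars(const std::string& text) {
    std::vector<std::string> parts;
    const std::string::size_type len = text.length();
    std::string::size_type beg = 0;
    while (beg < len && isWhitechar(text[beg])) {
        ++beg;  // skip leading whitechars
    }
    while (beg < len) {
        std::string::size_type end = beg;
        while (end < len && !isWhitechar(text[end])) {
            ++end;  // extend the substring up to the next whitechar
        }
        parts.push_back(text.substr(beg, end - beg));
        beg = end;
        while (beg < len && isWhitechar(text[beg])) {
            ++beg;  // a whole run of whitechars is one divider
        }
    }
    return parts;
}
```

With the cast in place, a UTF-8 encoded word such as "caf\xC3\xA9" stays in one piece regardless of the platform's char signedness.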

void StringTokenizer::reinit (  ) 

reinitialises the internal iterator

Definition at line 85 of file StringTokenizer.cpp.

References myPos.

Referenced by getVector(), and TEST().

00085                              {
00086     myPos = 0;
00087 }

size_t StringTokenizer::size (  )  const

returns the number of existing substrings

Definition at line 129 of file StringTokenizer.cpp.

References myStarts.

Referenced by NamedColumnsParser::get(), getVector(), NamedColumnsParser::know(), GeomConvHelper::parseBoundaryReporting(), RGBColor::parseColor(), NIXMLConnectionsHandler::parseLaneBound(), parseTimeLine(), readO(), and TEST().

00129                                    {
00130     return myStarts.size();
00131 }


Field Documentation

SizeVector StringTokenizer::myLengths [private]

the list of substring lengths

Definition at line 149 of file StringTokenizer.h.

Referenced by front(), get(), next(), prepare(), and prepareWhitechar().

size_t StringTokenizer::myPos [private]

the current position in the list of substrings

Definition at line 143 of file StringTokenizer.h.

Referenced by hasNext(), next(), and reinit().

SizeVector StringTokenizer::myStarts [private]

the list of substring starts

Definition at line 146 of file StringTokenizer.h.

Referenced by front(), get(), hasNext(), next(), prepare(), prepareWhitechar(), and size().

std::string StringTokenizer::myTosplit [private]

the string to split

Definition at line 140 of file StringTokenizer.h.

Referenced by front(), get(), and next().

const int StringTokenizer::NEWLINE = -256 [static]

identifier for splitting the given string at all newline characters

Definition at line 70 of file StringTokenizer.h.

Referenced by StringTokenizer(), and TEST().

const int StringTokenizer::WHITECHARS = -257 [static]

identifier for splitting the given string at all whitespace characters

Definition at line 74 of file StringTokenizer.h.

Referenced by ROJTRTurnDefLoader::myStartElement(), readO(), readTime(), readV(), NIImporter_DlrNavteq::TrafficlightsHandler::report(), NIImporter_DlrNavteq::EdgesHandler::report(), StringTokenizer(), and TEST().


The documentation for this class was generated from the following files:

StringTokenizer.h
StringTokenizer.cpp

Generated on Wed May 5 00:07:00 2010 for Sumo - Simulation of Urban MObility by  doxygen 1.5.6