org.pentaho.di.core
Class SimpleTokenizer

java.lang.Object
  extended by org.pentaho.di.core.SimpleTokenizer

public class SimpleTokenizer
extends Object

The SimpleTokenizer class is used to break a string into tokens.

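The delimiter can be used in one of two ways, depending on how the singleDelimiter flag is set:

true - the whole delimiter string is used as a single delimiter.
false - each character in the delimiter string is treated as a delimiter.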

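The total number of tokens in the text is equal to the number of delimiters found plus one. An empty token is therefore returned when the text starts with a delimiter, when the text ends with a delimiter, or when two delimiters are adjacent.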
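
The counting rule above can be illustrated without SimpleTokenizer itself. The sketch below is a stand-in built on String.split with a negative limit, which also keeps empty tokens; the class name TokenCountSketch and the helper splitKeepingEmpties are hypothetical and are not part of org.pentaho.di.core. It assumes a non-empty delimiter that is matched literally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class TokenCountSketch {
    // Hypothetical helper (not part of SimpleTokenizer): splits on a literal
    // delimiter and keeps every empty token.
    static List<String> splitKeepingEmpties(String text, String delimiter) {
        // Pattern.quote makes the delimiter literal; the negative limit tells
        // String.split to keep trailing empty strings instead of dropping them.
        return Arrays.asList(text.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        // Three delimiters, so four tokens: "", "a", "", "b".
        List<String> tokens = splitKeepingEmpties(",a,,b", ",");
        System.out.println(tokens.size() + " tokens: " + tokens);
    }
}
```

Text with no delimiter at all yields a single token under the same rule (zero delimiters plus one).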

You can use the tokenizer like the StringTokenizer:

     SimpleTokenizer st = new SimpleTokenizer("this is a test", " ");
     while (st.hasMoreTokens())
         System.out.println(st.nextToken());
 

Or, you can use the tokenizer like the String.split(...) method:

     SimpleTokenizer st = new SimpleTokenizer("this is a test", " ");
     List<String> list = st.getAllTokens();
     for (String token : list)
         System.out.println(token);
 

See Also:
StringTokenizer, String.split(java.lang.String, int)

Constructor Summary
SimpleTokenizer(String text, String delimiter)
          Constructs a tokenizer for the specified string.
SimpleTokenizer(String text, String delimiter, boolean singleDelimiter)
          Constructs a tokenizer for the specified string.
 
Method Summary
 List<String> getAllTokens()
          Tokenize the remaining text and return all the tokens.
 String getRemainder()
          Get the text that has not yet been tokenized.
 boolean hasMoreTokens()
          Tests if there are more tokens available from this tokenizer.
 String nextToken()
          Returns the next token from this tokenizer.
 String nextToken(int tokenCount)
          Returns the nth token from the current position of this tokenizer.
 void setDelimiter(String delimiter)
          Set the delimiter(s) used to parse the text.
 void setText(String text)
          Set the text to be tokenized.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleTokenizer

public SimpleTokenizer(String text,
                       String delimiter)
Constructs a tokenizer for the specified string. Each character in the delimiter string is treated as a delimiter.

Parameters:
text - a string to be tokenized.
delimiter - the delimiter.

SimpleTokenizer

public SimpleTokenizer(String text,
                       String delimiter,
                       boolean singleDelimiter)
Constructs a tokenizer for the specified string.

If the singleDelimiter flag is true, then the delimiter string is used as a single delimiter. If the flag is false, then each character in the delimiter is treated as a delimiter.

Parameters:
text - a string to be tokenized.
delimiter - the delimiter(s).
singleDelimiter - if true, treat the delimiter string as a single delimiter; if false, treat each character as a delimiter.
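
The difference between the two flag settings can be sketched with the standard library alone. This is an illustration of the documented semantics, not the class's own code: DelimiterModeSketch and its split helper are hypothetical names, and the sketch assumes a non-empty delimiter made of ordinary (non-surrogate) characters.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DelimiterModeSketch {
    // Hypothetical helper mirroring the documented singleDelimiter flag.
    static List<String> split(String text, String delimiter, boolean singleDelimiter) {
        String regex;
        if (singleDelimiter) {
            // The whole delimiter string is one literal delimiter.
            regex = Pattern.quote(delimiter);
        } else {
            // Each character is a delimiter: build an alternation of quoted characters.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < delimiter.length(); i++) {
                if (i > 0) {
                    sb.append('|');
                }
                sb.append(Pattern.quote(String.valueOf(delimiter.charAt(i))));
            }
            regex = sb.toString();
        }
        // The negative limit keeps empty tokens, including trailing ones.
        return Arrays.asList(text.split(regex, -1));
    }

    public static void main(String[] args) {
        String text = "a, b,c d";
        // One occurrence of ", " gives two tokens: "a" and "b,c d".
        System.out.println(split(text, ", ", true));
        // Four single-character delimiters give five tokens: "a", "", "b", "c", "d".
        System.out.println(split(text, ", ", false));
    }
}
```

In the second call the comma and the space that follow "a" are two separate delimiters, so an empty token appears between them.
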
Method Detail

setText

public void setText(String text)
Set the text to be tokenized.

Parameters:
text - a string to be tokenized.

setDelimiter

public void setDelimiter(String delimiter)
Set the delimiter(s) used to parse the text. The delimiter can be reset before retrieving the next token.

Parameters:
delimiter - the delimiter.

hasMoreTokens

public boolean hasMoreTokens()
Tests if there are more tokens available from this tokenizer. If this method returns true, a subsequent call to nextToken will return a token.

Returns:
true when there is at least one token remaining; false otherwise.

nextToken

public String nextToken()
Returns the next token from this tokenizer.

Returns:
the next token from this tokenizer.
Throws:
NoSuchElementException - if there are no more tokens

nextToken

public String nextToken(int tokenCount)
Returns the nth token from the current position of this tokenizer, where n is tokenCount. This is equivalent to advancing n-1 tokens and returning the nth token.

Parameters:
tokenCount - the relative position of the token requested
Returns:
the token found at the requested relative position.
Throws:
NoSuchElementException - if there are no more tokens
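
The "advance n-1 tokens, return the nth" behavior can be sketched with java.util.StringTokenizer as a stand-in. NthTokenSketch and nthToken are hypothetical names, the sketch assumes tokenCount is at least 1, and StringTokenizer skips empty tokens, unlike the tokenizer described here.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class NthTokenSketch {
    // Hypothetical helper: discards tokenCount - 1 tokens, then returns the next one.
    // StringTokenizer.nextToken() throws NoSuchElementException when it runs out.
    static String nthToken(StringTokenizer st, int tokenCount) {
        for (int i = 1; i < tokenCount; i++) {
            st.nextToken();
        }
        return st.nextToken();
    }

    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("this is a test", " ");
        // Skips "this" and "is", then returns the third token.
        System.out.println(nthToken(st, 3));
        // The position has advanced, so the next token is the fourth one.
        System.out.println(st.nextToken());
        try {
            nthToken(st, 2);
        } catch (NoSuchElementException e) {
            System.out.println("no more tokens");
        }
    }
}
```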

getRemainder

public String getRemainder()
Get the text that has not yet been tokenized.

Returns:
the remainder of the text to be tokenized

getAllTokens

public List<String> getAllTokens()
Tokenize the remaining text and return all the tokens.

Returns:
a List containing all of the individual tokens