Class CSVTokenizer
java.lang.Object
org.pentaho.reporting.libraries.base.util.CSVTokenizer
- All Implemented Interfaces:
Enumeration
The CSVTokenizer class allows an application to break a string in Comma Separated Value format into tokens. The tokenization method is much simpler than the one used by the StringTokenizer class. The CSVTokenizer methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.
The set of separators (the characters that separate tokens) may be specified either at creation time or on a per-token basis.
An instance of CSVTokenizer behaves in one of two ways, depending on whether it was created with the returnSeparators flag having the value true or false:
- If the flag is false, separator characters serve to separate tokens. A token is a maximal sequence of consecutive characters that are not separators.
- If the flag is true, separator characters are themselves considered to be tokens. A token is thus either one separator character, or a maximal sequence of consecutive characters that are not separators.
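The two modes resemble the returnDelims flag of java.util.StringTokenizer. As a rough illustration only, the sketch below uses StringTokenizer as a stand-in; it is an assumption that CSVTokenizer behaves the same way, and the class name ReturnSeparatorsSketch and its tokens helper are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ReturnSeparatorsSketch {
    // Collects every token produced by a StringTokenizer.
    // StringTokenizer is only a stand-in here, not CSVTokenizer itself.
    static List<String> tokens(String input, String separators, boolean returnSeparators) {
        StringTokenizer st = new StringTokenizer(input, separators, returnSeparators);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Flag false: separators only split the input.
        System.out.println(tokens("this,is,a,test", ",", false)); // [this, is, a, test]
        // Flag true: each separator character is returned as its own token.
        System.out.println(tokens("a,b", ",", true).size()); // 3
    }
}
```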
A CSVTokenizer object internally maintains a current position within the string to be tokenized. Some operations advance this current position past the characters processed.
A token is returned by taking a substring of the string that was used to create the CSVTokenizer object.
The following is one example of the use of the tokenizer. The code:

    CSVTokenizer csvt = new CSVTokenizer("this,is,a,test");
    while (csvt.hasMoreTokens()) {
        System.out.println(csvt.nextToken());
    }

prints the following output:

    this
    is
    a
    test
- Author:
- abupon
-
Field Summary
- static final String DOUBLE_QUATE: A possible quote character constant.
- static final String SEPARATOR_COMMA: A possible separator constant.
- static final String SEPARATOR_SPACE: A possible separator constant.
- static final String SEPARATOR_TAB: A possible separator constant.
- static final String SINGLE_QUATE: A possible quote character constant.
-
Constructor Summary
- CSVTokenizer(String aString): Constructs a string tokenizer for the specified string.
- CSVTokenizer(String aString, boolean trim): Constructs a string tokenizer for the specified string.
- CSVTokenizer(String aString, String theSeparator): Constructs a csv tokenizer for the specified string.
- CSVTokenizer(String aString, String theSeparator, String theQuate): Constructs a csv tokenizer for the specified string.
- CSVTokenizer(String aString, String theSeparator, String theQuate, boolean trim): Constructs a csv tokenizer for the specified string.
-
Method Summary
- int countTokens(): Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception.
- getQuate(): Returns the quote.
- boolean hasMoreElements(): Returns the same value as the hasMoreTokens method.
- boolean hasMoreTokens(): Tests if there are more tokens available from this tokenizer's string.
- Object nextElement(): Returns the same value as the nextToken method, except that its declared return value is Object rather than String.
- String nextToken(): Returns the next token from this string tokenizer.
- String nextToken(String theSeparator): Returns the next token in this string tokenizer's string.
- void setQuate(quate): Sets the quote.
- String[] toArray()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.util.Enumeration
asIterator
-
Field Details
-
SEPARATOR_COMMA
A possible separator constant.
-
SEPARATOR_TAB
A possible separator constant.
-
SEPARATOR_SPACE
A possible separator constant.
-
DOUBLE_QUATE
A possible quote character constant.
-
SINGLE_QUATE
A possible quote character constant.
-
-
Constructor Details
-
CSVTokenizer
Constructs a csv tokenizer for the specified string. The theSeparator argument is the separator for separating tokens. If the returnSeparators flag is true, then the separator is also returned as a token, in the form of a string. If the flag is false, the separator is skipped and only serves as a separator between tokens.
- Parameters:
aString - a string to be parsed.
theSeparator - the separator (CSVTokenizer.SEPARATOR_COMMA, CSVTokenizer.SEPARATOR_TAB, CSVTokenizer.SEPARATOR_SPACE, etc.).
theQuate - the quote (CSVTokenizer.SINGLE_QUATE, CSVTokenizer.DOUBLE_QUATE, etc.).
-
CSVTokenizer
Constructs a csv tokenizer for the specified string. The theSeparator argument is the separator for separating tokens. If the returnSeparators flag is true, then the separator is also returned as a token, in the form of a string. If the flag is false, the separator is skipped and only serves as a separator between tokens.
- Parameters:
aString - a string to be parsed.
theSeparator - the separator (CSVTokenizer.SEPARATOR_COMMA, CSVTokenizer.SEPARATOR_TAB, CSVTokenizer.SEPARATOR_SPACE, etc.).
theQuate - the quote (CSVTokenizer.SINGLE_QUATE, CSVTokenizer.DOUBLE_QUATE, etc.).
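This reference does not spell out how the quote argument affects tokenization. A common CSV convention, assumed here, is that a separator inside a quoted region does not split the token. The following self-contained sketch illustrates that convention only; QuoteAwareSplitSketch and its split method are hypothetical names, not part of CSVTokenizer, and the real class may treat quotes differently.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteAwareSplitSketch {
    // Splits a line on the separator, ignoring separators that appear between quote characters.
    // Assumption: quote characters toggle a "quoted" state and are dropped from the output.
    static List<String> split(String line, char separator, char quote) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                inQuotes = !inQuotes;
            } else if (c == separator && !inQuotes) {
                tokens.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString()); // the last token has no trailing separator
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("a,\"b,c\",d", ',', '"')); // [a, b,c, d]
    }
}
```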
-
CSVTokenizer
Constructs a csv tokenizer for the specified string. The characters in the theSeparator argument are the separators for separating tokens. Separator strings themselves will not be treated as tokens.
- Parameters:
aString - a string to be parsed.
theSeparator - the separator (CSVTokenizer.SEPARATOR_COMMA, CSVTokenizer.SEPARATOR_TAB, CSVTokenizer.SEPARATOR_SPACE, etc.).
-
CSVTokenizer
Constructs a string tokenizer for the specified string. The tokenizer uses the default separator set, which is CSVTokenizer.SEPARATOR_COMMA. Separator strings themselves will not be treated as tokens.
- Parameters:
aString - a string to be parsed.
-
CSVTokenizer
Constructs a string tokenizer for the specified string. The tokenizer uses the default separator set, which is CSVTokenizer.SEPARATOR_COMMA. Separator strings themselves will not be treated as tokens.
- Parameters:
aString - a string to be parsed.
-
-
Method Details
-
hasMoreTokens
public boolean hasMoreTokens()
Tests if there are more tokens available from this tokenizer's string. If this method returns true, then a subsequent call to nextToken with no argument will successfully return a token.
- Returns:
true if and only if there is at least one token in the string after the current position; false otherwise.
-
nextToken
Returns the next token from this string tokenizer.
- Returns:
- the next token from this string tokenizer.
- Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
IllegalArgumentException - if the format of the given string is invalid.
-
nextToken
Returns the next token in this string tokenizer's string. First, the set of characters considered to be separators by this CSVTokenizer object is changed to be the characters in the string theSeparator. Then the next token in the string after the current position is returned. The current position is advanced beyond the recognized token. The new separator set remains the default after this call.
- Parameters:
theSeparator - the new separator.
- Returns:
- the next token, after switching to the new separator set.
- Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
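The "remains the default" rule can be seen with java.util.StringTokenizer, whose nextToken(String) follows the same contract. The sketch below is a stand-in under that assumption; it does not call CSVTokenizer, and SwitchSeparatorSketch and readThree are names invented for the example.

```java
import java.util.StringTokenizer;

public class SwitchSeparatorSketch {
    // Reads one token with the initial separators, switches to a new set,
    // then reads once more without arguments to show the new set is still active.
    static String[] readThree(String input, String initial, String replacement) {
        StringTokenizer st = new StringTokenizer(input, initial);
        String first = st.nextToken();             // split on the initial separators
        String second = st.nextToken(replacement); // switch to the new separator set
        String third = st.nextToken();             // new set remains in effect
        return new String[] { first, second, third };
    }

    public static void main(String[] args) {
        // "," is the initial separator; ",;" replaces it for all later calls.
        String[] t = readThree("a,b;c;d", ",", ",;");
        System.out.println(String.join("|", t)); // a|b|c
    }
}
```

Here the replacement set still contains the comma, so the separator that ended the first token is skipped before the second token is read.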
-
hasMoreElements
public boolean hasMoreElements()
Returns the same value as the hasMoreTokens method. It exists so that this class can implement the Enumeration interface.
- Specified by:
hasMoreElements in interface Enumeration
- Returns:
true if there are more tokens; false otherwise.
-
nextElement
Returns the same value as the nextToken method, except that its declared return value is Object rather than String. It exists so that this class can implement the Enumeration interface.
- Specified by:
nextElement in interface Enumeration
- Returns:
- the next token in the string.
- Throws:
NoSuchElementException - if there are no more tokens in this tokenizer's string.
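Because the tokenizer is an Enumeration, it can be handed to any code written against that interface. A minimal sketch, using java.util.StringTokenizer as a stand-in for CSVTokenizer (EnumerationSketch and drain are hypothetical names; passing a raw Enumeration such as CSVTokenizer would likely produce an unchecked warning):

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.StringTokenizer;

public class EnumerationSketch {
    // Drains any Enumeration into a list by calling hasMoreElements/nextElement.
    static List<Object> drain(Enumeration<Object> tokens) {
        return Collections.list(tokens);
    }

    public static void main(String[] args) {
        // StringTokenizer implements Enumeration<Object>, like the class documented here.
        Enumeration<Object> tokens = new StringTokenizer("this,is,a,test", ",");
        System.out.println(drain(tokens)); // [this, is, a, test]
    }
}
```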
-
countTokens
public int countTokens()
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception. The current position is not advanced.
- Returns:
- the number of tokens remaining in the string using the current delimiter set.
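Since counting does not advance the current position, the count can be used to size a buffer before reading. The sketch below shows the pattern with java.util.StringTokenizer, whose countTokens has the same contract; treating CSVTokenizer as equivalent is an assumption, and CountTokensSketch and remaining are illustrative names.

```java
import java.util.StringTokenizer;

public class CountTokensSketch {
    // Copies the tokens that are still unread into an array sized by countTokens().
    static String[] remaining(StringTokenizer st) {
        String[] out = new String[st.countTokens()]; // counting consumes nothing
        for (int i = 0; i < out.length; i++) {
            out[i] = st.nextToken();
        }
        return out;
    }

    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("this,is,a,test", ",");
        st.nextToken(); // consume "this"
        System.out.println(remaining(st).length); // 3
    }
}
```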
-
getQuate
Returns the quote.
- Returns:
- the quote character.
-
setQuate
Sets the quote.
- Parameters:
quate - the quote to set.
-
toArray
-