Class CSVTokenizer

  • All Implemented Interfaces:
    Enumeration

    public class CSVTokenizer
    extends Object
    implements Enumeration
    The CSVTokenizer class allows an application to break a comma-separated value (CSV) string into tokens. The tokenization method is much simpler than the one used by the StringTokenizer class. The CSVTokenizer methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.

    The set of separators (the characters that separate tokens) may be specified either at creation time or on a per-token basis.

    An instance of CSVTokenizer behaves in one of two ways, depending on whether it was created with the returnSeparators flag having the value true or false:

    • If the flag is false, separator characters serve only to separate tokens. A token is a maximal sequence of consecutive characters that are not separators.
    • If the flag is true, separator characters are themselves considered to be tokens. A token is thus either one separator character, or a maximal sequence of consecutive characters that are not separators.
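
    The two modes above can be illustrated with java.util.StringTokenizer, whose documentation describes the same flag (there named returnDelims). This is only a sketch with a stand-in class: the constructors listed on this page do not take such a flag, and it is an assumption that CSVTokenizer would behave the same way. The class name SeparatorModesDemo and its tokens helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class SeparatorModesDemo {
    // Collects every token produced by java.util.StringTokenizer.
    // StringTokenizer is a stand-in here, not CSVTokenizer itself.
    static List<String> tokens(String input, String separators, boolean returnSeparators) {
        StringTokenizer st = new StringTokenizer(input, separators, returnSeparators);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Flag false: separators are skipped, so 4 tokens are returned.
        System.out.println(tokens("this,is,a,test", ",", false).size()); // 4
        // Flag true: each "," is returned as its own token, so 7 tokens are returned.
        System.out.println(tokens("this,is,a,test", ",", true).size());  // 7
    }
}
```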

    A CSVTokenizer object internally maintains a current position within the string to be tokenized. Some operations advance this current position past the characters processed.

    A token is returned by taking a substring of the string that was used to create the CSVTokenizer object.

    The following is one example of the use of the tokenizer. The code:

         CSVTokenizer csvt = new CSVTokenizer("this,is,a,test");
         while (csvt.hasMoreTokens()) {
             System.out.println(csvt.nextToken());
         }
     

    prints the following output:

         this
         is
         a
         test
     
    Author:
    abupon
    • Constructor Detail

      • CSVTokenizer

        public CSVTokenizer(String aString,
                            String theSeparator,
                            String theQuate)
        Constructs a CSV tokenizer for the specified string. The theSeparator argument is the separator between tokens.

        If the returnSeparators flag is true, the separator is also returned as a token, in the form of a string. If the flag is false, the separator is skipped and serves only to separate tokens.

        Parameters:
        aString - a string to be parsed.
        theSeparator - the separator (CSVTokenizer.SEPARATOR_COMMA, CSVTokenizer.TAB, CSVTokenizer.SPACE, etc.).
        theQuate - the quote string (CSVTokenizer.SINGLE_QUATE, CSVTokenizer.DOUBLE_QUATE, etc.).
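
        This page does not say how the quote argument affects tokenization. A common convention for CSV data is that a separator inside a quoted section does not end the field. The following is a minimal sketch of that convention only, under that assumption; it is not CSVTokenizer's implementation, and the class QuoteAwareSplitSketch and its split method are hypothetical names. It does not handle escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

final class QuoteAwareSplitSketch {
    // Hypothetical helper: splits on `separator`, except inside a pair of `quote` characters.
    // The quote characters themselves are dropped from the returned fields.
    static List<String> split(String input, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == quote) {
                inQuotes = !inQuotes;             // toggle quoted state, drop the quote itself
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());   // a separator outside quotes ends the field
                current.setLength(0);
            } else {
                current.append(c);                // ordinary character, or separator inside quotes
            }
        }
        fields.add(current.toString());           // the last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        // The comma inside the quoted section stays part of the second field.
        System.out.println(split("a,\"b,c\",d", ',', '"')); // [a, b,c, d]
    }
}
```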
      • CSVTokenizer

        public CSVTokenizer(String aString,
                            String theSeparator,
                            String theQuate,
                            boolean trim)
        Constructs a CSV tokenizer for the specified string. The theSeparator argument is the separator between tokens.

        If the returnSeparators flag is true, the separator is also returned as a token, in the form of a string. If the flag is false, the separator is skipped and serves only to separate tokens.

        Parameters:
        aString - a string to be parsed.
        theSeparator - the separator (CSVTokenizer.SEPARATOR_COMMA, CSVTokenizer.TAB, CSVTokenizer.SPACE, etc.).
        theQuate - the quote string (CSVTokenizer.SINGLE_QUATE, CSVTokenizer.DOUBLE_QUATE, etc.).
      • CSVTokenizer

        public CSVTokenizer(String aString,
                            String theSeparator)
        Constructs a CSV tokenizer for the specified string. The characters in the theSeparator argument are the separators between tokens. Separators themselves are not treated as tokens.
        Parameters:
        aString - a string to be parsed.
        theSeparator - the separator (CSVTokenizer.SEPARATOR_COMMA, CSVTokenizer.TAB, CSVTokenizer.SPACE, etc.).
      • CSVTokenizer

        public CSVTokenizer(String aString)
        Constructs a CSV tokenizer for the specified string. The tokenizer uses the default separator, which is CSVTokenizer.SEPARATOR_COMMA. Separators themselves are not treated as tokens.
        Parameters:
        aString - a string to be parsed.
      • CSVTokenizer

        public CSVTokenizer(String aString,
                            boolean trim)
        Constructs a CSV tokenizer for the specified string. The tokenizer uses the default separator, which is CSVTokenizer.SEPARATOR_COMMA. Separators themselves are not treated as tokens.
        Parameters:
        aString - a string to be parsed.
    • Method Detail

      • hasMoreTokens

        public boolean hasMoreTokens()
        Tests if there are more tokens available from this tokenizer's string. If this method returns true, then a subsequent call to nextToken with no argument will successfully return a token.
        Returns:
        true if and only if there is at least one token in the string after the current position; false otherwise.
      • nextToken

        public String nextToken(String theSeparator)
        Returns the next token in this tokenizer's string. First, the set of characters considered to be separators by this CSVTokenizer object is changed to the characters in the string theSeparator. Then the next token in the string after the current position is returned. The current position is advanced beyond the recognized token. The new separator set remains the default after this call.
        Parameters:
        theSeparator - the new separator.
        Returns:
        the next token, after switching to the new separator set.
        Throws:
        NoSuchElementException - if there are no more tokens in this tokenizer's string.
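
        The per-token separator switch described above can be sketched with java.util.StringTokenizer, whose nextToken(String) is documented with the same contract. StringTokenizer is used as a stand-in; that CSVTokenizer matches it character for character is an assumption, and SwitchSeparatorDemo is a hypothetical name.

```java
import java.util.StringTokenizer;

final class SwitchSeparatorDemo {
    // Reads four tokens, switching the separator set on the second call.
    static String[] run() {
        StringTokenizer st = new StringTokenizer("a,b;c,d", ",");
        String first = st.nextToken();      // "a": separator set is ","
        String second = st.nextToken(",;"); // "b": set switched to "," and ";"
        String third = st.nextToken();      // "c": the new set is still in effect
        String fourth = st.nextToken();     // "d"
        return new String[] { first, second, third, fourth };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", run())); // a b c d
    }
}
```

        Without the switch, the second token would have been "b;c", because ";" is not in the original separator set.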
      • hasMoreElements

        public boolean hasMoreElements()
        Returns the same value as the hasMoreTokens method. It exists so that this class can implement the Enumeration interface.
        Specified by:
        hasMoreElements in interface Enumeration
        Returns:
        true if there are more tokens; false otherwise.
        See Also:
        Enumeration, hasMoreTokens()
      • nextElement

        public Object nextElement()
        Returns the same value as the nextToken method, except that its declared return value is Object rather than String. It exists so that this class can implement the Enumeration interface.
        Specified by:
        nextElement in interface Enumeration
        Returns:
        the next token in the string.
        Throws:
        NoSuchElementException - if there are no more tokens in this tokenizer's string.
        See Also:
        Enumeration, nextToken()
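
        Because the tokenizer implements Enumeration, it can be passed to any API that consumes one. The sketch below shows the idea with java.util.StringTokenizer as a stand-in (an assumption, since CSVTokenizer is not part of the JDK); EnumerationDemo and drain are hypothetical names.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.StringTokenizer;

final class EnumerationDemo {
    // Drains any Enumeration into a list via hasMoreElements()/nextElement().
    static List<Object> drain(Enumeration<Object> tokens) {
        return Collections.list(tokens);
    }

    public static void main(String[] args) {
        // The elements are declared as Object, although each one is a String token.
        List<Object> all = drain(new StringTokenizer("this,is,a,test", ","));
        System.out.println(all); // [this, is, a, test]
    }
}
```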
      • countTokens

        public int countTokens()
        Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception. The current position is not advanced.
        Returns:
        the number of tokens remaining in the string using the current separator set.
        See Also:
        nextToken()
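
        The following sketch shows that counting does not consume tokens, again using java.util.StringTokenizer as a stand-in with the same documented contract. That CSVTokenizer reports identical counts is an assumption, and CountTokensDemo is a hypothetical name.

```java
import java.util.StringTokenizer;

final class CountTokensDemo {
    // Returns the counts observed before and after consuming one token.
    static int[] counts() {
        StringTokenizer st = new StringTokenizer("this,is,a,test", ",");
        int before = st.countTokens(); // 4 tokens remain
        int again = st.countTokens();  // still 4: counting does not advance the position
        st.nextToken();                // consumes "this"
        int after = st.countTokens();  // 3 tokens remain
        return new int[] { before, again, after };
    }

    public static void main(String[] args) {
        int[] c = counts();
        System.out.println(c[0] + " " + c[1] + " " + c[2]); // 4 4 3
    }
}
```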
      • getQuate

        public String getQuate()
        Returns the quote string.
        Returns:
        the current quote string
      • setQuate

        public void setQuate(String quate)
        Sets the quote string.
        Parameters:
        quate - the quote string to set
      • toArray

        public String[] toArray()