Class ArffOutputData

java.lang.Object
org.pentaho.di.trans.step.BaseStepData
org.pentaho.di.arff.ArffOutputData
All Implemented Interfaces:
org.pentaho.di.trans.step.StepDataInterface

public class ArffOutputData extends org.pentaho.di.trans.step.BaseStepData implements org.pentaho.di.trans.step.StepDataInterface
Holds temporary data and has routines for writing the ARFF file. This class writes rows to a temporary file while collecting the values of nominal attributes in an array of Maps. The ARFF header declares the values of each nominal attribute, so it cannot be completed until every row has been seen. Once the last row has been processed, the ARFF header is written and the temporary file is appended to it.
Version:
1.0
Author:
Mark Hall (mhall{[at]}pentaho.org)
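
The two-pass approach described above can be sketched with plain java.io/java.nio calls. This is a minimal illustration under stated assumptions, not the code of ArffOutputData: the class name TwoPassArffSketch and its write method are hypothetical, every attribute is treated as nominal, and value quoting, encodings and missing values are left out.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the Pentaho API.
final class TwoPassArffSketch {

  /** Writes rows of nominal values to an ARFF file using a temporary data file. */
  static void write(Path target, String relation, String[] attNames,
      List<String[]> rows) throws IOException {
    // One set of observed values per attribute, kept in first-seen order.
    List<Set<String>> seen = new ArrayList<>();
    for (int i = 0; i < attNames.length; i++) {
      seen.add(new LinkedHashSet<>());
    }

    Path temp = Files.createTempFile("arff-data", ".tmp");
    try {
      // Pass 1: stream rows to the temporary file while collecting nominal values.
      try (BufferedWriter data = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
        for (String[] row : rows) {
          for (int i = 0; i < row.length; i++) {
            seen.get(i).add(row[i]);
          }
          data.write(String.join(",", row));
          data.newLine();
        }
      }

      // Pass 2: the header can now list every nominal value; then append the data.
      try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
        out.write("@relation " + relation);
        out.newLine();
        for (int i = 0; i < attNames.length; i++) {
          out.write("@attribute " + attNames[i] + " {" + String.join(",", seen.get(i)) + "}");
          out.newLine();
        }
        out.write("@data");
        out.newLine();
        for (String line : Files.readAllLines(temp, StandardCharsets.UTF_8)) {
          out.write(line);
          out.newLine();
        }
      }
    } finally {
      // The temporary file is no longer needed once the target is complete.
      Files.deleteIfExists(temp);
    }
  }
}
```

Buffering rows on disk rather than in memory keeps the memory footprint bounded by the number of distinct nominal values, not by the number of rows.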
  • Field Details

    • m_outputRowMeta

      protected org.pentaho.di.core.row.RowMetaInterface m_outputRowMeta
    • m_outputFieldIndexes

      protected int[] m_outputFieldIndexes
    • m_outputSparseInstances

      protected boolean m_outputSparseInstances
      True if sparse data is to be output
    • m_weightFieldIndex

      protected int m_weightFieldIndex
      Index of the field used to set the weight for each instance (-1 means equal weights)
    • m_arffMeta

      protected org.pentaho.dm.commons.ArffMeta[] m_arffMeta
    • m_nominalVals

      protected Map<String,String>[] m_nominalVals
    • m_tempFile

      protected File m_tempFile
    • m_headerFile

      protected File m_headerFile
    • m_dataOut

      protected OutputStream m_dataOut
    • m_headerOut

      protected OutputStream m_headerOut
    • m_separator

      protected byte[] m_separator
    • m_newLine

      protected byte[] m_newLine
    • m_missing

      protected byte[] m_missing
    • m_leftCurly

      protected byte[] m_leftCurly
    • m_spaceLeftCurly

      protected byte[] m_spaceLeftCurly
    • m_rightCurly

      protected byte[] m_rightCurly
    • m_hasEncoding

      protected boolean m_hasEncoding
  • Constructor Details

    • ArffOutputData

      public ArffOutputData()
  • Method Details

    • getOutputRowMeta

      public org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
      Get the metadata for the output format
      Returns:
      a RowMetaInterface value
    • setOutputRowMeta

      public void setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
      Set the metadata for the output format
      Parameters:
      rmi - a RowMetaInterface value
    • setHasEncoding

      public void setHasEncoding(boolean e)
      Set whether an encoding is in use.
      Parameters:
      e - true if an encoding is in use
    • getHasEncoding

      public boolean getHasEncoding()
      Returns true if a specific character encoding is in use.
      Returns:
      true if an encoding other than the default encoding is in use.
    • setBinaryNewLine

      public void setBinaryNewLine(byte[] nl)
      Set the binary line terminator to use
      Parameters:
      nl - the line terminator
    • setBinarySeparator

      public void setBinarySeparator(byte[] s)
      Set the binary separator to use
      Parameters:
      s - binary field separator
    • setBinaryMissing

      public void setBinaryMissing(byte[] m)
      Set the binary missing value to use
      Parameters:
      m - binary missing value
    • setOutputFieldIndexes

      public void setOutputFieldIndexes(int[] outputFieldIndexes, org.pentaho.dm.commons.ArffMeta[] arffMeta)
      Set the indexes of the fields to output to the ARFF file
      Parameters:
      outputFieldIndexes - array of indexes
      arffMeta - array of ArffMeta objects
    • setWeightFieldIndex

      public void setWeightFieldIndex(int index)
      Set the index of the field whose values will be used to set the weight for each instance.
      Parameters:
      index - the index of the field to use to set instance-level weights.
    • setOutputSparseInstances

      public void setOutputSparseInstances(boolean s)
      Set whether to output instances in sparse format
      Parameters:
      s - true if instances are to be output in sparse format
    • openFiles

      public void openFiles(String filename) throws IOException
      Open files ready to write to
      Parameters:
      filename - the name of the ARFF file to write to
      Throws:
      IOException - if an error occurs
    • writeRow

      public void writeRow(Object[] r, String encoding) throws IOException, org.pentaho.di.core.exception.KettleStepException
      Convert and write a row of data to the ARFF file.
      Parameters:
      r - the Kettle row
      encoding - an (optional) character encoding to use
      Throws:
      IOException - if an error occurs
      org.pentaho.di.core.exception.KettleStepException - if an error occurs
    • finishOutput

      public void finishOutput(String relationName, String encoding) throws org.pentaho.di.core.exception.KettleStepException
      Writes the ARFF header and appends the temporary file
      Parameters:
      relationName - the ARFF relation name
      encoding - an (optional) character encoding
      Throws:
      org.pentaho.di.core.exception.KettleStepException - if an error occurs
    • closeFiles

      public void closeFiles() throws IOException
      Flush and close all files
      Throws:
      IOException - if an error occurs
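
For readers unfamiliar with the options set by setOutputSparseInstances and setWeightFieldIndex, the sketch below shows what sparse and weighted data rows look like in the ARFF format as documented by Weka. It does not reproduce how ArffOutputData builds its rows: the class SparseRowSketch and its methods are hypothetical, only numeric values are handled, and missing values are not covered.

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of the Pentaho API.
final class SparseRowSketch {

  /** Dense row: every value is written, separated by commas. */
  static String dense(double[] values) {
    StringJoiner joiner = new StringJoiner(",");
    for (double v : values) {
      joiner.add(fmt(v));
    }
    return joiner.toString();
  }

  /** Sparse row: only non-zero values, each as "index value" (0-based), inside curly braces. */
  static String sparse(double[] values) {
    StringJoiner joiner = new StringJoiner(",", "{", "}");
    for (int i = 0; i < values.length; i++) {
      if (values[i] != 0.0) {
        joiner.add(i + " " + fmt(values[i]));
      }
    }
    return joiner.toString();
  }

  /** An instance weight is appended to the row in its own pair of curly braces. */
  static String withWeight(String row, double weight) {
    return row + ",{" + fmt(weight) + "}";
  }

  // Prints whole numbers without a trailing ".0".
  private static String fmt(double v) {
    return v == Math.rint(v) ? String.valueOf((long) v) : String.valueOf(v);
  }
}
```

For the values 0, 3, 0, 1.5 the dense form is `0,3,0,1.5` and the sparse form is `{1 3,3 1.5}`. With a weight of 2 the sparse row becomes `{1 3,3 1.5},{2}`.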