Class BaseStep

java.lang.Object
org.pentaho.di.trans.step.BaseStep
All Implemented Interfaces:
org.pentaho.di.core.ExtensionDataInterface, HasLogChannelInterface, org.pentaho.di.core.logging.LoggingObjectInterface, org.pentaho.di.core.logging.LoggingObjectLifecycleInterface, org.pentaho.di.core.variables.VariableSpace, StepInterface
Direct Known Subclasses:
AbstractStep, BaseDatabaseStep, BaseFileInputStep, BaseStreamStep, Calculator, Constant, CsvInput, DatabaseLookup, DataGrid, Denormaliser, DetectEmptyStream, DetectLastRow, DummyTrans, ExecProcess, ExecSQL, FieldsChangeSequence, FieldSplitter, FileExists, FileLocked, FilesFromResult, FilesToResult, FilterRows, FixedInput, Flattener, Formula, FuzzyMatch, GetFileNames, GetFilesRowsCount, GetSlaveSequence, GetSubFolders, GetVariable, GroupBy, HTTP, HTTPPOST, IfNull, Injector, Janino, JavaFilter, JobExecutor, JoinRows, LDIFInput, LoadFileInput, Mapping, MappingInput, MappingOutput, MemoryGroupBy, MergeJoin, MergeRows, MultiMergeJoin, Normaliser, NullIf, NumberRange, OlapInput, ParGzipCsvInput, PGPDecryptStream, PGPEncryptStream, PrioritizeStreams, ProcessFiles, PropertyInput, PropertyOutput, RandomValue, RegexEval, ReplaceString, ReservoirSampling, RowGenerator, RowsFromResult, RowsToResult, SampleRows, SasInput, Script, ScriptValuesMod, SecretKeyGenerator, SelectValues, SetValueConstant, SetValueField, SetVariable, SimpleMapping, SingleThreader, SocketReader, SocketWriter, SortedMerge, SortRows, SplitFieldToRows, SQLFileOutput, SSH, StepMetastructure, StepsMetrics, StreamLookup, StringCut, StringOperations, SwitchCase, SymmetricCryptoTrans, SyslogMessage, SystemData, TableCompare, TextFileInput, TextFileOutput, TransExecutor, UniqueRows, UniqueRowsByHashSet, UnivariateStats, UserDefinedJavaClass, Validator, ValueMapper, WebService, WebServiceAvailable, WriteToLog, XBaseInput, ZipFile

public class BaseStep extends Object implements org.pentaho.di.core.variables.VariableSpace, StepInterface, org.pentaho.di.core.logging.LoggingObjectInterface, org.pentaho.di.core.ExtensionDataInterface
This class can be extended for the actual row processing of the implemented step.

The implementing class can rely mostly on the base class, and has only three important methods it implements itself. The three methods implement the step lifecycle during transformation execution: initialization, row processing, and clean-up.

  • Step Initialization
    The init() method is called when a transformation is preparing to start execution.

     public boolean init(...)
     

    Every step is given the opportunity to do one-time initialization tasks like opening files or establishing database connections. For any steps derived from BaseStep it is mandatory that super.init() is called to ensure correct behavior. The method must return true in case the step initialized correctly, it must returned false if there was an initialization error. PDI will abort the execution of a transformation in case any step returns false upon initialization.

  • Row Processing
    Once the transformation starts execution it enters a tight loop calling processRow() on each step until the method returns false. Each step typically reads a single row from the input stream, alters the row structure and fields and passes the row on to next steps.

     public boolean processRow(...)
     

    A typical implementation queries for incoming input rows by calling getRow(), which blocks and returns a row object or null in case there is no more input. If there was an input row, the step does the necessary row processing and calls putRow() to pass the row on to the next step. If there are no more rows, the step must call setOutputDone() and return false.

    Formally the method must conform to the following rules:

    • If the step is done processing all rows, the method must call setOutputDone() and return false
    • If the step is not done processing all rows, the method must return true. PDI will call processRow() again in this case.
  • Step Clean-Up
    Once the transformation is complete, PDI calls dispose() on all steps.

     public void dispose(...)
     

    Steps are required to deallocate resources allocated during init() or subsequent row processing. This typically means to clear all fields of the StepDataInterface object, and to ensure that all open files or connections are properly closed. For any steps derived from BaseStep it is mandatory that super.dispose() is called to ensure correct deallocation.

  • Field Details

  • Constructor Details

    • BaseStep

      public BaseStep(StepMeta stepMeta, StepDataInterface stepDataInterface, int copyNr, TransMeta transMeta, Trans trans)
      This is the base step that forms that basis for all steps. You can derive from this class to implement your own steps.
      Parameters:
      stepMeta - The StepMeta object to run.
      stepDataInterface - the data object to store temporary data, database connections, caches, result sets, hashtables etc.
      copyNr - The copynumber for this step.
      transMeta - The TransInfo of which the step stepMeta is part of.
      trans - The (running) transformation to obtain information shared among the steps.
  • Method Details

    • init

      public boolean init(StepMetaInterface smi, StepDataInterface sdi)
      Description copied from interface: StepInterface
      Initialize and do work where other steps need to wait for...
      Specified by:
      init in interface StepInterface
      Parameters:
      smi - The metadata to work with
      sdi - The data to initialize
    • dispose

      public void dispose(StepMetaInterface smi, StepDataInterface sdi)
      Description copied from interface: StepInterface
      Dispose of this step: close files, empty logs, etc.
      Specified by:
      dispose in interface StepInterface
      Parameters:
      smi - The metadata to work with
      sdi - The data to dispose of
    • cleanup

      public void cleanup()
      Description copied from interface: StepInterface
      Call this method typically, after ALL the slave transformations in a clustered run have finished.
      Specified by:
      cleanup in interface StepInterface
    • getProcessed

      public long getProcessed()
      Specified by:
      getProcessed in interface StepInterface
      Returns:
      The number of "processed" lines of a step. Well, a representable metric for that anyway.
    • setCopy

      public void setCopy(int cop)
      Sets the copy.
      Parameters:
      cop - the new copy
    • getCopy

      public int getCopy()
      Specified by:
      getCopy in interface StepInterface
      Returns:
      The steps copy number (default 0)
    • getErrors

      public long getErrors()
      Description copied from interface: StepInterface
      Get the number of errors
      Specified by:
      getErrors in interface StepInterface
      Returns:
      the number of errors
    • setErrors

      public void setErrors(long e)
      Description copied from interface: StepInterface
      Sets the number of errors
      Specified by:
      setErrors in interface StepInterface
      Parameters:
      e - the number of errors to set
    • getLinesRead

      public long getLinesRead()
      Specified by:
      getLinesRead in interface StepInterface
      Returns:
      Returns the number of lines read from previous steps
    • incrementLinesRead

      public long incrementLinesRead()
      Increments the number of lines read from previous steps by one
      Returns:
      Returns the new value
    • decrementLinesRead

      public long decrementLinesRead()
      Decrements the number of lines read from previous steps by one
      Returns:
      Returns the new value
    • setLinesRead

      public void setLinesRead(long newLinesReadValue)
      Parameters:
      newLinesReadValue - the new number of lines read from previous steps
    • getLinesInput

      public long getLinesInput()
      Specified by:
      getLinesInput in interface StepInterface
      Returns:
      Returns the number of lines read from an input source: database, file, socket, etc.
    • incrementLinesInput

      public long incrementLinesInput()
      Increments the number of lines read from an input source: database, file, socket, etc.
      Returns:
      the new incremented value
    • setLinesInput

      public void setLinesInput(long newLinesInputValue)
      Parameters:
      newLinesInputValue - the new number of lines read from an input source: database, file, socket, etc.
    • getLinesOutput

      public long getLinesOutput()
      Specified by:
      getLinesOutput in interface StepInterface
      Returns:
      Returns the number of lines written to an output target: database, file, socket, etc.
    • incrementLinesOutput

      public long incrementLinesOutput()
      Increments the number of lines written to an output target: database, file, socket, etc.
      Returns:
      the new incremented value
    • setLinesOutput

      public void setLinesOutput(long newLinesOutputValue)
      Parameters:
      newLinesOutputValue - the new number of lines written to an output target: database, file, socket, etc.
    • getLinesWritten

      public long getLinesWritten()
      Specified by:
      getLinesWritten in interface StepInterface
      Returns:
      Returns the linesWritten.
    • incrementLinesWritten

      public long incrementLinesWritten()
      Increments the number of lines written to next steps by one
      Returns:
      Returns the new value
    • decrementLinesWritten

      public long decrementLinesWritten()
      Decrements the number of lines written to next steps by one
      Returns:
      Returns the new value
    • setLinesWritten

      public void setLinesWritten(long newLinesWrittenValue)
      Parameters:
      newLinesWrittenValue - the new number of lines written to next steps
    • getLinesUpdated

      public long getLinesUpdated()
      Specified by:
      getLinesUpdated in interface StepInterface
      Returns:
      Returns the number of lines updated in an output target: database, file, socket, etc.
    • incrementLinesUpdated

      public long incrementLinesUpdated()
      Increments the number of lines updated in an output target: database, file, socket, etc.
      Returns:
      the new incremented value
    • setLinesUpdated

      public void setLinesUpdated(long newLinesUpdatedValue)
      Parameters:
      newLinesUpdatedValue - the new number of lines updated in an output target: database, file, socket, etc.
    • getLinesRejected

      public long getLinesRejected()
      Specified by:
      getLinesRejected in interface StepInterface
      Returns:
      the number of lines rejected to an error handling step
    • incrementLinesRejected

      public long incrementLinesRejected()
      Increments the number of lines rejected to an error handling step
      Returns:
      the new incremented value
    • setLinesRejected

      public void setLinesRejected(long newLinesRejectedValue)
      Specified by:
      setLinesRejected in interface StepInterface
      Parameters:
      newLinesRejectedValue - lines number of lines rejected to an error handling step
    • getLinesSkipped

      public long getLinesSkipped()
      Returns:
      the number of lines skipped
    • incrementLinesSkipped

      public long incrementLinesSkipped()
      Increments the number of lines skipped
      Returns:
      the new incremented value
    • setLinesSkipped

      public void setLinesSkipped(long newLinesSkippedValue)
      Parameters:
      newLinesSkippedValue - lines number of lines skipped
    • getStepname

      public String getStepname()
      Description copied from interface: StepInterface
      Get the name of the step.
      Specified by:
      getStepname in interface StepInterface
      Returns:
      the name of the step
    • setStepname

      public void setStepname(String stepname)
      Sets the stepname.
      Parameters:
      stepname - the new stepname
    • getDispatcher

      public Trans getDispatcher()
      Gets the dispatcher.
      Returns:
      the dispatcher
    • getStatusDescription

      public String getStatusDescription()
      Gets the status description.
      Returns:
      the status description
    • getStepMetaInterface

      public StepMetaInterface getStepMetaInterface()
      Returns:
      Returns the stepMetaInterface.
    • setStepMetaInterface

      public void setStepMetaInterface(StepMetaInterface stepMetaInterface)
      Parameters:
      stepMetaInterface - The stepMetaInterface to set.
    • getStepDataInterface

      public StepDataInterface getStepDataInterface()
      Returns:
      Returns the stepDataInterface.
    • setStepDataInterface

      public void setStepDataInterface(StepDataInterface stepDataInterface)
      Parameters:
      stepDataInterface - The stepDataInterface to set.
    • getStepMeta

      public StepMeta getStepMeta()
      Specified by:
      getStepMeta in interface StepInterface
      Returns:
      Returns the stepMeta.
    • setStepMeta

      public void setStepMeta(StepMeta stepMeta)
      Parameters:
      stepMeta - The stepMeta to set.
    • getTransMeta

      public TransMeta getTransMeta()
      Returns:
      Returns the transMeta.
    • setTransMeta

      public void setTransMeta(TransMeta transMeta)
      Parameters:
      transMeta - The transMeta to set.
    • getTrans

      public Trans getTrans()
      Specified by:
      getTrans in interface StepInterface
      Returns:
      Returns the trans.
    • putRow

      public void putRow(org.pentaho.di.core.row.RowMetaInterface rowMeta, Object[] row) throws org.pentaho.di.core.exception.KettleStepException
      putRow is used to copy a row, to the alternate rowset(s) This should get priority over everything else! (synchronized) If distribute is true, a row is copied only once to the output rowsets, otherwise copies are sent to each rowset!
      Specified by:
      putRow in interface StepInterface
      Parameters:
      row - The row to put to the destination rowset(s).
      rowMeta - The row to send to the destinations steps
      Throws:
      org.pentaho.di.core.exception.KettleStepException
    • putRowTo

      public void putRowTo(org.pentaho.di.core.row.RowMetaInterface rowMeta, Object[] row, org.pentaho.di.core.RowSet rowSet) throws org.pentaho.di.core.exception.KettleStepException
      putRowTo is used to put a row in a certain specific RowSet.
      Parameters:
      rowMeta - The row meta-data to put to the destination RowSet.
      row - the data to put in the RowSet
      rowSet - the RoWset to put the row into.
      Throws:
      org.pentaho.di.core.exception.KettleStepException - In case something unexpected goes wrong
    • handlePutRowTo

      public void handlePutRowTo(org.pentaho.di.core.row.RowMetaInterface rowMeta, Object[] row, org.pentaho.di.core.RowSet rowSet) throws org.pentaho.di.core.exception.KettleStepException
      Throws:
      org.pentaho.di.core.exception.KettleStepException
    • putError

      public void putError(org.pentaho.di.core.row.RowMetaInterface rowMeta, Object[] row, long nrErrors, String errorDescriptions, String fieldNames, String errorCodes) throws org.pentaho.di.core.exception.KettleStepException
      Put error.
      Parameters:
      rowMeta - the row meta
      row - the row
      nrErrors - the nr errors
      errorDescriptions - the error descriptions
      fieldNames - the field names
      errorCodes - the error codes
      Throws:
      org.pentaho.di.core.exception.KettleStepException - the kettle step exception
    • waitUntilTransformationIsStarted

      protected void waitUntilTransformationIsStarted()
      Wait until the transformation is completely running and all threads have been started.
    • getRow

      public Object[] getRow() throws org.pentaho.di.core.exception.KettleException
      In case of getRow, we receive data from previous steps through the input rowset. In case we split the stream, we have to copy the data to the alternate splits: rowsets 1 through n.
      Specified by:
      getRow in interface StepInterface
      Returns:
      a row from the source step(s).
      Throws:
      org.pentaho.di.core.exception.KettleException
    • setRowHandler

      public void setRowHandler(RowHandler rowHandler)
      RowHandler controls how getRow/putRow are handled. The default RowHandler will simply call handleGetRow() and handlePutRow(RowMetaInterface, Object[])
    • getRowHandler

      public RowHandler getRowHandler()
    • openRemoteInputStepSocketsOnce

      protected void openRemoteInputStepSocketsOnce() throws org.pentaho.di.core.exception.KettleStepException
      Opens socket connections to the remote input steps of this step.
      This method should be used by steps that don't call getRow() first in which it is executed automatically.
      This method should be called before any data is read from previous steps.
      This action is executed only once.
      Throws:
      org.pentaho.di.core.exception.KettleStepException
    • openRemoteOutputStepSocketsOnce

      protected void openRemoteOutputStepSocketsOnce() throws org.pentaho.di.core.exception.KettleStepException
      Opens socket connections to the remote output steps of this step.
      This method is called in method initBeforeStart() because it needs to connect to the server sockets (remote steps) as soon as possible to avoid time-out situations.
      This action is executed only once.
      Throws:
      org.pentaho.di.core.exception.KettleStepException
    • safeModeChecking

      protected void safeModeChecking(org.pentaho.di.core.row.RowMetaInterface row) throws org.pentaho.di.core.exception.KettleRowException
      Safe mode checking.
      Parameters:
      row - the row
      Throws:
      org.pentaho.di.core.exception.KettleRowException - the kettle row exception
    • identifyErrorOutput

      public void identifyErrorOutput()
      Description copied from interface: StepInterface
      To be used to flag an error output channel of a step prior to execution for performance reasons.
      Specified by:
      identifyErrorOutput in interface StepInterface
    • safeModeChecking

      public static void safeModeChecking(org.pentaho.di.core.row.RowMetaInterface referenceRowMeta, org.pentaho.di.core.row.RowMetaInterface rowMeta) throws org.pentaho.di.core.exception.KettleRowException
      Safe mode checking.
      Parameters:
      referenceRowMeta - the reference row meta
      rowMeta - the row meta
      Throws:
      org.pentaho.di.core.exception.KettleRowException - the kettle row exception
    • getRowFrom

      public Object[] getRowFrom(org.pentaho.di.core.RowSet rowSet) throws org.pentaho.di.core.exception.KettleStepException
      Gets the row from.
      Parameters:
      rowSet - the row set
      Returns:
      the row from
      Throws:
      org.pentaho.di.core.exception.KettleStepException - the kettle step exception
    • handleGetRowFrom

      public Object[] handleGetRowFrom(org.pentaho.di.core.RowSet rowSet) throws org.pentaho.di.core.exception.KettleStepException
      Throws:
      org.pentaho.di.core.exception.KettleStepException
    • verifyInputDeadLock

      protected void verifyInputDeadLock() throws org.pentaho.di.core.exception.KettleStepException
      - A step sees that it can't get a new row from input in the step. - Then it verifies that there is more than one input row set and that at least one is full and at least one is empty. - Then it finds a step in the transformation (situated before the reader step) which has at least one full and one empty output row set. - If this situation presents itself and if it happens twice with the same rows read count (meaning: stalled reading step) we throw an exception. For the attached example that exception is:
      Throws:
      org.pentaho.di.core.exception.KettleStepException
    • findInputRowSet

      public org.pentaho.di.core.RowSet findInputRowSet(String sourceStep) throws org.pentaho.di.core.exception.KettleStepException
      Find input row set.
      Parameters:
      sourceStep - the source step
      Returns:
      the row set
      Throws:
      org.pentaho.di.core.exception.KettleStepException - the kettle step exception
    • findInputRowSet

      public org.pentaho.di.core.RowSet findInputRowSet(String from, int fromcopy, String to, int tocopy)
      Find input row set.
      Parameters:
      from - the from
      fromcopy - the fromcopy
      to - the to
      tocopy - the tocopy
      Returns:
      the row set
    • findOutputRowSet

      public org.pentaho.di.core.RowSet findOutputRowSet(String targetStep) throws org.pentaho.di.core.exception.KettleStepException
      Find output row set.
      Parameters:
      targetStep - the target step
      Returns:
      the row set
      Throws:
      org.pentaho.di.core.exception.KettleStepException - the kettle step exception
    • findOutputRowSet

      public org.pentaho.di.core.RowSet findOutputRowSet(String from, int fromcopy, String to, int tocopy)
      Find an output rowset in a running transformation. It will also look at the "to" step to see if this is a mapping. If it is, it will find the appropriate rowset in that transformation.
      Parameters:
      from -
      fromcopy -
      to -
      tocopy -
      Returns:
      The rowset or null if none is found.
    • setOutputDone

      public void setOutputDone()
      Description copied from interface: StepInterface
      Signal output done to destination steps
      Specified by:
      setOutputDone in interface StepInterface
    • dispatch

      public void dispatch()
      This method finds the surrounding steps and rowsets for this base step. This steps keeps it's own list of rowsets (etc.) to prevent it from having to search every time.

      Note that all rowsets input and output is already created by transformation itself. So in this place we will look and choose which rowsets will be used by this particular step.

      We will collect all input rowsets and output rowsets so step will be able to read input data, and write to the output.

      Steps can run in multiple copies, on in partitioned fashion. For this case we should take in account that in different cases we should take in account one to one, one to many and other cases properly.

    • isBasic

      public boolean isBasic()
      Checks if is basic.
      Returns:
      true, if is basic
    • isDetailed

      public boolean isDetailed()
      Checks if is detailed.
      Returns:
      true, if is detailed
    • isDebug

      public boolean isDebug()
      Checks if is debug.
      Returns:
      true, if is debug
    • isRowLevel

      public boolean isRowLevel()
      Checks if is row level.
      Returns:
      true, if is row level
    • logMinimal

      public void logMinimal(String message)
      Log minimal.
      Parameters:
      message - the message
    • logMinimal

      public void logMinimal(String message, Object... arguments)
      Log minimal.
      Parameters:
      message - the message
      arguments - the arguments
    • logBasic

      public void logBasic(String message)
      Log basic.
      Parameters:
      message - the message
    • logBasic

      public void logBasic(String message, Object... arguments)
      Log basic.
      Parameters:
      message - the message
      arguments - the arguments
    • logDetailed

      public void logDetailed(String message)
      Log detailed.
      Parameters:
      message - the message
    • logDetailed

      public void logDetailed(String message, Object... arguments)
      Log detailed.
      Parameters:
      message - the message
      arguments - the arguments
    • logDebug

      public void logDebug(String message)
      Log debug.
      Parameters:
      message - the message
    • logDebug

      public void logDebug(String message, Object... arguments)
      Log debug.
      Parameters:
      message - the message
      arguments - the arguments
    • logRowlevel

      public void logRowlevel(String message)
      Log rowlevel.
      Parameters:
      message - the message
    • logRowlevel

      public void logRowlevel(String message, Object... arguments)
      Log rowlevel.
      Parameters:
      message - the message
      arguments - the arguments
    • logError

      public void logError(String message)
      Log error.
      Parameters:
      message - the message
    • logError

      public void logError(String message, Throwable e)
      Log error.
      Parameters:
      message - the message
      e - the e
    • logError

      public void logError(String message, Object... arguments)
      Log error.
      Parameters:
      message - the message
      arguments - the arguments
    • getNextClassNr

      public int getNextClassNr()
      Gets the next class nr.
      Returns:
      the next class nr
    • outputIsDone

      public boolean outputIsDone()
      Output is done.
      Returns:
      true, if successful
    • stopAll

      public void stopAll()
      Description copied from interface: StepInterface
      Flags all rowsets as stopped/completed/finished.
      Specified by:
      stopAll in interface StepInterface
    • isStopped

      public boolean isStopped()
      Specified by:
      isStopped in interface StepInterface
      Returns:
      True if the step is marked as stopped. Execution should stop immediate.
    • isRunning

      public boolean isRunning()
      Specified by:
      isRunning in interface StepInterface
      Returns:
      true if the step is running after having been initialized
    • isPaused

      public boolean isPaused()
      Specified by:
      isPaused in interface StepInterface
      Returns:
      True if the step is paused
    • setStopped

      public void setStopped(boolean stopped)
      Specified by:
      setStopped in interface StepInterface
      Parameters:
      stopped - true if the step needs to be stopped
    • setSafeStopped

      public void setSafeStopped(boolean stopped)
      Specified by:
      setSafeStopped in interface StepInterface
      Parameters:
      stopped - true if the step needs to be safe stopped
    • isSafeStopped

      public boolean isSafeStopped()
      Specified by:
      isSafeStopped in interface StepInterface
      Returns:
      true if step is safe stopped.
    • setRunning

      public void setRunning(boolean running)
      Description copied from interface: StepInterface
      Flag the step as running or not
      Specified by:
      setRunning in interface StepInterface
      Parameters:
      running - the running flag to set
    • pauseRunning

      public void pauseRunning()
      Description copied from interface: StepInterface
      Pause a running step
      Specified by:
      pauseRunning in interface StepInterface
    • resumeRunning

      public void resumeRunning()
      Description copied from interface: StepInterface
      Resume a running step
      Specified by:
      resumeRunning in interface StepInterface
    • setPaused

      public void setPaused(boolean paused)
      Sets the paused.
      Parameters:
      paused - the new paused
    • setPaused

      public void setPaused(AtomicBoolean paused)
      Sets the paused.
      Parameters:
      paused - the new paused
    • isInitialising

      public boolean isInitialising()
      Checks if is initialising.
      Returns:
      true, if is initialising
    • markStart

      public void markStart()
      Description copied from interface: StepInterface
      Mark the start time of the step.
      Specified by:
      markStart in interface StepInterface
    • setInternalVariables

      public void setInternalVariables()
      Sets the internal variables.
    • markStop

      public void markStop()
      Description copied from interface: StepInterface
      Mark the end time of the step.
      Specified by:
      markStop in interface StepInterface
    • getRuntime

      public long getRuntime()
      Specified by:
      getRuntime in interface StepInterface
      Returns:
      The number of ms that this step has been running
    • buildLog

      public org.pentaho.di.core.RowMetaAndData buildLog(String sname, int copynr, long lines_read, long lines_written, long lines_updated, long lines_skipped, long errors, Date start_date, Date end_date)
      Builds the log.
      Parameters:
      sname - the sname
      copynr - the copynr
      lines_read - the lines_read
      lines_written - the lines_written
      lines_updated - the lines_updated
      lines_skipped - the lines_skipped
      errors - the errors
      start_date - the start_date
      end_date - the end_date
      Returns:
      the row meta and data
    • getLogFields

      public static final org.pentaho.di.core.row.RowMetaInterface getLogFields(String comm)
      Gets the log fields.
      Parameters:
      comm - the comm
      Returns:
      the log fields
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • rowsetOutputSize

      public int rowsetOutputSize()
      Specified by:
      rowsetOutputSize in interface StepInterface
      Returns:
      The total amount of rows in the output buffers
    • rowsetInputSize

      public int rowsetInputSize()
      Specified by:
      rowsetInputSize in interface StepInterface
      Returns:
      The total amount of rows in the input buffers
    • stopRunning

      public void stopRunning(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface) throws org.pentaho.di.core.exception.KettleException
      Perform actions to stop a running step. This can be stopping running SQL queries (cancel), etc. Default it doesn't do anything.
      Specified by:
      stopRunning in interface StepInterface
      Parameters:
      stepDataInterface - The interface to the step data containing the connections, resultsets, open files, etc.
      stepMetaInterface - The metadata that might be needed by the step to stop running.
      Throws:
      org.pentaho.di.core.exception.KettleException - in case something goes wrong
    • stopRunning

      @Deprecated public void stopRunning()
      Stops running operations This method is deprecated, please use the method specifying the metadata and data interfaces.
    • logSummary

      public void logSummary()
      Log summary.
    • getStepID

      public String getStepID()
      Specified by:
      getStepID in interface StepInterface
      Returns:
      the type ID of the step...
    • getInputRowSets

      public List<org.pentaho.di.core.RowSet> getInputRowSets()
      Specified by:
      getInputRowSets in interface StepInterface
      Returns:
      Returns the inputRowSets.
    • addRowSetToInputRowSets

      public void addRowSetToInputRowSets(org.pentaho.di.core.RowSet rowSet)
      Specified by:
      addRowSetToInputRowSets in interface StepInterface
    • getFirstInputRowSet

      protected org.pentaho.di.core.RowSet getFirstInputRowSet()
    • clearInputRowSets

      protected void clearInputRowSets()
    • swapFirstInputRowSetIfExists

      protected void swapFirstInputRowSetIfExists(String stepName)
    • setInputRowSets

      public void setInputRowSets(List<org.pentaho.di.core.RowSet> inputRowSets)
      Parameters:
      inputRowSets - The inputRowSets to set.
    • getOutputRowSets

      public List<org.pentaho.di.core.RowSet> getOutputRowSets()
      Specified by:
      getOutputRowSets in interface StepInterface
      Returns:
      Returns the outputRowSets.
    • addRowSetToOutputRowSets

      public void addRowSetToOutputRowSets(org.pentaho.di.core.RowSet rowSet)
      Specified by:
      addRowSetToOutputRowSets in interface StepInterface
    • clearOutputRowSets

      protected void clearOutputRowSets()
    • setOutputRowSets

      public void setOutputRowSets(List<org.pentaho.di.core.RowSet> outputRowSets)
      Parameters:
      outputRowSets - The outputRowSets to set.
    • isDistributed

      public boolean isDistributed()
      Returns:
      Returns the distributed.
    • setDistributed

      public void setDistributed(boolean distributed)
      Parameters:
      distributed - The distributed to set.
    • addRowListener

      public void addRowListener(RowListener rowListener)
      Description copied from interface: StepInterface
      Add a rowlistener to the step allowing you to inspect (or manipulate, be careful) the rows coming in or exiting the step.
      Specified by:
      addRowListener in interface StepInterface
      Parameters:
      rowListener - the rowlistener to add
    • removeRowListener

      public void removeRowListener(RowListener rowListener)
      Description copied from interface: StepInterface
      Remove a rowlistener from this step.
      Specified by:
      removeRowListener in interface StepInterface
      Parameters:
      rowListener - the rowlistener to remove
    • getRowListeners

      public List<RowListener> getRowListeners()
      Specified by:
      getRowListeners in interface StepInterface
      Returns:
      a list of the installed RowListeners
    • addResultFile

      public void addResultFile(org.pentaho.di.core.ResultFile resultFile)
      Adds the result file.
      Parameters:
      resultFile - the result file
    • getResultFiles

      public Map<String,org.pentaho.di.core.ResultFile> getResultFiles()
      Specified by:
      getResultFiles in interface StepInterface
      Returns:
      The result files for this step
    • getStatus

      Specified by:
      getStatus in interface StepInterface
      Returns:
      the description as in StepDataInterface
    • getPartitionID

      public String getPartitionID()
      Specified by:
      getPartitionID in interface StepInterface
      Returns:
      the partitionID
    • setPartitionID

      public void setPartitionID(String partitionID)
      Specified by:
      setPartitionID in interface StepInterface
      Parameters:
      partitionID - the partitionID to set
    • getPartitionTargets

      public Map<String,org.pentaho.di.core.BlockingRowSet> getPartitionTargets()
      Returns:
      the partitionTargets
    • setPartitionTargets

      public void setPartitionTargets(Map<String,org.pentaho.di.core.BlockingRowSet> partitionTargets)
      Parameters:
      partitionTargets - the partitionTargets to set
    • getRepartitioning

      public int getRepartitioning()
      Returns:
      the repartitioning type
    • setRepartitioning

      public void setRepartitioning(int repartitioning)
      Specified by:
      setRepartitioning in interface StepInterface
      Parameters:
      repartitioning - the repartitioning type to set
    • isPartitioned

      public boolean isPartitioned()
      Specified by:
      isPartitioned in interface StepInterface
      Returns:
      the partitioned
    • setPartitioned

      public void setPartitioned(boolean partitioned)
      Specified by:
      setPartitioned in interface StepInterface
      Parameters:
      partitioned - the partitioned to set
    • checkFeedback

      protected boolean checkFeedback(long lines)
      Check feedback.
      Parameters:
      lines - the lines
      Returns:
      true, if successful
    • getInputRowMeta

      public org.pentaho.di.core.row.RowMetaInterface getInputRowMeta()
      Returns:
      the rowMeta
    • setInputRowMeta

      public void setInputRowMeta(org.pentaho.di.core.row.RowMetaInterface rowMeta)
      Parameters:
      rowMeta - the rowMeta to set
    • getErrorRowMeta

      public org.pentaho.di.core.row.RowMetaInterface getErrorRowMeta()
      Returns:
      the errorRowMeta
    • setErrorRowMeta

      public void setErrorRowMeta(org.pentaho.di.core.row.RowMetaInterface errorRowMeta)
      Parameters:
      errorRowMeta - the errorRowMeta to set
    • getPreviewRowMeta

      public org.pentaho.di.core.row.RowMetaInterface getPreviewRowMeta()
      Returns:
      the previewRowMeta
    • setPreviewRowMeta

      public void setPreviewRowMeta(org.pentaho.di.core.row.RowMetaInterface previewRowMeta)
      Parameters:
      previewRowMeta - the previewRowMeta to set
    • copyVariablesFrom

      public void copyVariablesFrom(org.pentaho.di.core.variables.VariableSpace space)
      Specified by:
      copyVariablesFrom in interface org.pentaho.di.core.variables.VariableSpace
    • environmentSubstitute

      public String environmentSubstitute(String aString)
      Specified by:
      environmentSubstitute in interface org.pentaho.di.core.variables.VariableSpace
    • environmentSubstitute

      public String environmentSubstitute(String aString, boolean escapeHexDelimiter)
      Specified by:
      environmentSubstitute in interface org.pentaho.di.core.variables.VariableSpace
    • environmentSubstitute

      public String[] environmentSubstitute(String[] aString)
      Specified by:
      environmentSubstitute in interface org.pentaho.di.core.variables.VariableSpace
    • fieldSubstitute

      public String fieldSubstitute(String aString, org.pentaho.di.core.row.RowMetaInterface rowMeta, Object[] rowData) throws org.pentaho.di.core.exception.KettleValueException
      Specified by:
      fieldSubstitute in interface org.pentaho.di.core.variables.VariableSpace
      Throws:
      org.pentaho.di.core.exception.KettleValueException
    • getParentVariableSpace

      public org.pentaho.di.core.variables.VariableSpace getParentVariableSpace()
      Specified by:
      getParentVariableSpace in interface org.pentaho.di.core.variables.VariableSpace
    • setParentVariableSpace

      public void setParentVariableSpace(org.pentaho.di.core.variables.VariableSpace parent)
      Specified by:
      setParentVariableSpace in interface org.pentaho.di.core.variables.VariableSpace
    • getVariable

      public String getVariable(String variableName, String defaultValue)
      Specified by:
      getVariable in interface org.pentaho.di.core.variables.VariableSpace
    • getVariable

      public String getVariable(String variableName)
      Specified by:
      getVariable in interface org.pentaho.di.core.variables.VariableSpace
    • getBooleanValueOfVariable

      public boolean getBooleanValueOfVariable(String variableName, boolean defaultValue)
      Specified by:
      getBooleanValueOfVariable in interface org.pentaho.di.core.variables.VariableSpace
    • initializeVariablesFrom

      public void initializeVariablesFrom(org.pentaho.di.core.variables.VariableSpace parent)
      Specified by:
      initializeVariablesFrom in interface org.pentaho.di.core.variables.VariableSpace
    • listVariables

      public String[] listVariables()
      Specified by:
      listVariables in interface org.pentaho.di.core.variables.VariableSpace
    • setVariable

      public void setVariable(String variableName, String variableValue)
      Specified by:
      setVariable in interface org.pentaho.di.core.variables.VariableSpace
    • shareVariablesWith

      public void shareVariablesWith(org.pentaho.di.core.variables.VariableSpace space)
      Specified by:
      shareVariablesWith in interface org.pentaho.di.core.variables.VariableSpace
    • injectVariables

      public void injectVariables(Map<String,String> prop)
      Specified by:
      injectVariables in interface org.pentaho.di.core.variables.VariableSpace
    • getTypeId

      public String getTypeId()
      Returns the step ID via the getStepID() method call. Support for CheckResultSourceInterface.
      Returns:
      getStepID()
    • getSlaveNr

      public int getSlaveNr()
      Returns the unique slave number in the cluster.
      Returns:
      the unique slave number in the cluster
    • getClusterSize

      public int getClusterSize()
      Returns the cluster size.
      Returns:
      the cluster size
    • getUniqueStepNrAcrossSlaves

      public int getUniqueStepNrAcrossSlaves()
      Returns a unique step number across all slave servers: slaveNr * nrCopies + copyNr.
      Returns:
      a unique step number across all slave servers: slaveNr * nrCopies + copyNr
    • getUniqueStepCountAcrossSlaves

      public int getUniqueStepCountAcrossSlaves()
      Returns the number of unique steps across all slave servers.
      Returns:
      the number of unique steps across all slave servers
    • getServerSockets

      public List<ServerSocket> getServerSockets()
      Returns the serverSockets.
      Returns:
      the serverSockets
    • setServerSockets

      public void setServerSockets(List<ServerSocket> serverSockets)
      Parameters:
      serverSockets - the serverSockets to set
    • setUsingThreadPriorityManagment

      public void setUsingThreadPriorityManagment(boolean usingThreadPriorityManagment)
      Set to true to actively manage priorities of step threads.
      Specified by:
      setUsingThreadPriorityManagment in interface StepInterface
      Parameters:
      usingThreadPriorityManagment - set to true to actively manage priorities of step threads
    • isUsingThreadPriorityManagment

      public boolean isUsingThreadPriorityManagment()
      Retusn true if we are actively managing priorities of step threads.
      Specified by:
      isUsingThreadPriorityManagment in interface StepInterface
      Returns:
      true if we are actively managing priorities of step threads
    • initBeforeStart

      public void initBeforeStart() throws org.pentaho.di.core.exception.KettleStepException
      This method is executed by Trans right before the threads start and right after initialization.

      More to the point: here we open remote output step sockets.

      Specified by:
      initBeforeStart in interface StepInterface
      Throws:
      org.pentaho.di.core.exception.KettleStepException - In case there is an error
    • getStepListeners

      public List<StepListener> getStepListeners()
      Returns the step listeners.
      Returns:
      the stepListeners
    • setStepListeners

      public void setStepListeners(List<StepListener> stepListeners)
      Sets the step listeners.
      Parameters:
      stepListeners - the stepListeners to set
    • processRow

      public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws org.pentaho.di.core.exception.KettleException
      Description copied from interface: StepInterface
      Perform the equivalent of processing one row. Typically this means reading a row from input (getRow()) and passing a row to output (putRow)).
      Specified by:
      processRow in interface StepInterface
      Parameters:
      smi - The steps metadata to work with
      sdi - The steps temporary working data to work with (database connections, result sets, caches, temporary variables, etc.)
      Returns:
      false if no more rows can be processed or an error occurred.
      Throws:
      org.pentaho.di.core.exception.KettleException
    • beforeStartProcessing

      public boolean beforeStartProcessing(StepMetaInterface smi, StepDataInterface sdi) throws org.pentaho.di.core.exception.KettleException
      Description copied from interface: StepInterface
      This method is executed by Trans right before starting processing rows.
      Specified by:
      beforeStartProcessing in interface StepInterface
      Parameters:
      smi - The steps metadata to work with
      sdi - The steps temporary working data to work with (database connections, result sets, caches, temporary variables, etc.)
      Throws:
      org.pentaho.di.core.exception.KettleException
    • canProcessOneRow

      public boolean canProcessOneRow()
      Description copied from interface: StepInterface
      This method checks if the step is capable of processing at least one row.

      For example, if a step has no input records but needs at least one to function, it will return false.

      Specified by:
      canProcessOneRow in interface StepInterface
      Returns:
      true if the step can process a row.
    • addStepListener

      public void addStepListener(StepListener stepListener)
      Description copied from interface: StepInterface
      Attach a step listener to be notified when a step arrives in a certain state. (finished)
      Specified by:
      addStepListener in interface StepInterface
      Parameters:
      stepListener - The listener to add to the step
    • isMapping

      public boolean isMapping()
      Specified by:
      isMapping in interface StepInterface
      Returns:
      true if the thread is a special mapping step
    • getSocketRepository

      public SocketRepository getSocketRepository()
      Retutns the socket repository.
      Returns:
      the socketRepository
    • setSocketRepository

      public void setSocketRepository(SocketRepository socketRepository)
      Sets the socket repository.
      Parameters:
      socketRepository - the socketRepository to set
    • getObjectName

      public String getObjectName()
      Specified by:
      getObjectName in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getLogChannel

      public org.pentaho.di.core.logging.LogChannelInterface getLogChannel()
      Specified by:
      getLogChannel in interface HasLogChannelInterface
      Specified by:
      getLogChannel in interface StepInterface
      Returns:
      the logging channel for this step
    • getFilename

      public String getFilename()
      Specified by:
      getFilename in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getLogChannelId

      public String getLogChannelId()
      Specified by:
      getLogChannelId in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getObjectId

      public org.pentaho.di.repository.ObjectId getObjectId()
      Specified by:
      getObjectId in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getObjectRevision

      public org.pentaho.di.repository.ObjectRevision getObjectRevision()
      Specified by:
      getObjectRevision in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getObjectType

      public org.pentaho.di.core.logging.LoggingObjectType getObjectType()
      Specified by:
      getObjectType in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getParent

      public org.pentaho.di.core.logging.LoggingObjectInterface getParent()
      Specified by:
      getParent in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getRepositoryDirectory

      public org.pentaho.di.repository.RepositoryDirectory getRepositoryDirectory()
      Specified by:
      getRepositoryDirectory in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getObjectCopy

      public String getObjectCopy()
      Specified by:
      getObjectCopy in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getLogLevel

      public org.pentaho.di.core.logging.LogLevel getLogLevel()
      Specified by:
      getLogLevel in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • setLogLevel

      public void setLogLevel(org.pentaho.di.core.logging.LogLevel logLevel)
      Sets the log level.
      Parameters:
      logLevel - the new log level
    • closeQuietly

      public static void closeQuietly(Closeable cl)
      Close quietly.
      Parameters:
      cl - the object that can be closed.
    • getContainerObjectId

      public String getContainerObjectId()
      Returns the container object ID.
      Specified by:
      getContainerObjectId in interface org.pentaho.di.core.logging.LoggingObjectInterface
      Returns:
      the containerObjectId
    • setCarteObjectId

      public void setCarteObjectId(String containerObjectId)
      Sets the container object ID.
      Parameters:
      containerObjectId - the containerObjectId to set
    • batchComplete

      public void batchComplete() throws org.pentaho.di.core.exception.KettleException
      Description copied from interface: StepInterface
      Calling this method will alert the step that we finished passing a batch of records to the step. Specifically for steps like "Sort Rows" it means that the buffered rows can be sorted and passed on.
      Specified by:
      batchComplete in interface StepInterface
      Throws:
      org.pentaho.di.core.exception.KettleException - In case an error occurs during the processing of the batch of rows.
    • getRemoteInputSteps

      public List<RemoteStep> getRemoteInputSteps()
      Gets the remote input steps.
      Returns:
      the remote input steps
    • getRemoteOutputSteps

      public List<RemoteStep> getRemoteOutputSteps()
      Gets the remote output steps.
      Returns:
      the remote output steps
    • getRegistrationDate

      public Date getRegistrationDate()
      Returns the registration date
      Specified by:
      getRegistrationDate in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • isGatheringMetrics

      public boolean isGatheringMetrics()
      Specified by:
      isGatheringMetrics in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • setGatheringMetrics

      public void setGatheringMetrics(boolean gatheringMetrics)
      Specified by:
      setGatheringMetrics in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • isForcingSeparateLogging

      public boolean isForcingSeparateLogging()
      Specified by:
      isForcingSeparateLogging in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • setForcingSeparateLogging

      public void setForcingSeparateLogging(boolean forcingSeparateLogging)
      Specified by:
      setForcingSeparateLogging in interface org.pentaho.di.core.logging.LoggingObjectInterface
    • getRepository

      public Repository getRepository()
      Specified by:
      getRepository in interface StepInterface
      Returns:
      The repository used by the step to load and reference Kettle objects with at runtime
    • setRepository

      public void setRepository(Repository repository)
      Specified by:
      setRepository in interface StepInterface
      Parameters:
      repository - The repository used by the step to load and reference Kettle objects with at runtime
    • getMetaStore

      public org.pentaho.metastore.api.IMetaStore getMetaStore()
      Specified by:
      getMetaStore in interface StepInterface
      Returns:
      The metastore that the step uses to load external elements from.
    • setMetaStore

      public void setMetaStore(org.pentaho.metastore.api.IMetaStore metaStore)
      Description copied from interface: StepInterface
      Pass along the metastore to use when loading external elements at runtime.
      Specified by:
      setMetaStore in interface StepInterface
      Parameters:
      metaStore - The metastore to use
    • getCurrentOutputRowSetNr

      public int getCurrentOutputRowSetNr()
      Specified by:
      getCurrentOutputRowSetNr in interface StepInterface
      Returns:
      the index of the active (current) output row set
    • setCurrentOutputRowSetNr

      public void setCurrentOutputRowSetNr(int index)
      Specified by:
      setCurrentOutputRowSetNr in interface StepInterface
      Parameters:
      index - Sets the index of the active (current) output row set to use.
    • getCurrentInputRowSetNr

      public int getCurrentInputRowSetNr()
      Specified by:
      getCurrentInputRowSetNr in interface StepInterface
      Returns:
      the index of the active (current) input row set
    • setCurrentInputRowSetNr

      public void setCurrentInputRowSetNr(int index)
      Specified by:
      setCurrentInputRowSetNr in interface StepInterface
      Parameters:
      index - Sets the index of the active (current) input row set to use.
    • getExtensionDataMap

      public Map<String,Object> getExtensionDataMap()
      Specified by:
      getExtensionDataMap in interface org.pentaho.di.core.ExtensionDataInterface