Interface StepInterface

All Superinterfaces:
HasLogChannelInterface, org.pentaho.di.core.variables.VariableSpace
All Known Subinterfaces:
ScriptInterface
All Known Implementing Classes:
AbstractStep, BaseDatabaseStep, BaseFileInputStep, BaseStep, BaseStreamStep, Calculator, Constant, CsvInput, DatabaseJoin, DatabaseLookup, DataGrid, DBProc, Delete, Denormaliser, DetectEmptyStream, DetectLastRow, DimensionLookup, DummyTrans, DynamicSQLRow, ExecProcess, ExecSQL, ExecSQLRow, FieldsChangeSequence, FieldSplitter, FileExists, FileLocked, FilesFromResult, FilesToResult, FilterRows, FixedInput, Flattener, Formula, FuzzyMatch, GetFileNames, GetFilesRowsCount, GetSlaveSequence, GetSubFolders, GetTableNames, GetVariable, GroupBy, HTTP, HTTPPOST, IfNull, Injector, InsertUpdate, Janino, JavaFilter, JobExecutor, JoinRows, LDIFInput, LoadFileInput, Mapping, MappingInput, MappingOutput, MemoryGroupBy, MergeJoin, MergeRows, MissingTransStep, MultiMergeJoin, Normaliser, NullIf, NumberRange, OlapInput, ParGzipCsvInput, PGPDecryptStream, PGPEncryptStream, PrioritizeStreams, ProcessFiles, PropertyInput, PropertyOutput, RandomValue, RecordsFromStream, RegexEval, ReplaceString, ReservoirSampling, RowGenerator, RowsFromResult, RowsToResult, SampleRows, SasInput, Script, ScriptDummy, ScriptValuesMod, ScriptValuesModDummy, SecretKeyGenerator, SelectValues, SetValueConstant, SetValueField, SetVariable, SimpleMapping, SingleThreader, SocketReader, SocketWriter, SortedMerge, SortRows, SplitFieldToRows, SQLFileOutput, SSH, StepMetastructure, StepsMetrics, StreamLookup, StringCut, StringOperations, SwitchCase, SymmetricCryptoTrans, SynchronizeAfterMerge, SyslogMessage, SystemData, TableCompare, TableExists, TableInput, TableOutput, TextFileInput, TextFileInput, TextFileOutput, TextFileOutputLegacy, TransExecutor, UniqueRows, UniqueRowsByHashSet, UnivariateStats, Update, UserDefinedJavaClass, Validator, ValueMapper, WebService, WebServiceAvailable, WriteToLog, XBaseInput, ZipFile

public interface StepInterface extends org.pentaho.di.core.variables.VariableSpace, HasLogChannelInterface
The interface that any transformation step or plugin needs to implement. Created on 12-AUG-2004
Author:
Matt
  • Method Details

    • getTrans

      Trans getTrans()
      Returns:
      the transformation that is executing this step
    • processRow

      boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws org.pentaho.di.core.exception.KettleException
      Perform the equivalent of processing one row. Typically this means reading a row from input (getRow()) and passing a row to output (putRow)).
      Parameters:
      smi - The steps metadata to work with
      sdi - The steps temporary working data to work with (database connections, result sets, caches, temporary variables, etc.)
      Returns:
      false if no more rows can be processed or an error occurred.
      Throws:
      org.pentaho.di.core.exception.KettleException
    • beforeStartProcessing

      default boolean beforeStartProcessing(StepMetaInterface smi, StepDataInterface sdi) throws org.pentaho.di.core.exception.KettleException
      This method is executed by Trans right before starting processing rows.
      Parameters:
      smi - The steps metadata to work with
      sdi - The steps temporary working data to work with (database connections, result sets, caches, temporary variables, etc.)
      Throws:
      org.pentaho.di.core.exception.KettleException
    • afterFinishProcessing

      default boolean afterFinishProcessing(StepMetaInterface smi, StepDataInterface sdi)
      This method is executed by Trans after finishing processing rows.
      Parameters:
      smi - The steps metadata to work with
      sdi - The steps temporary working data to work with (database connections, result sets, caches, temporary variables, etc.)
      Throws:
      org.pentaho.di.core.exception.KettleException
    • canProcessOneRow

      boolean canProcessOneRow()
      This method checks if the step is capable of processing at least one row.

      For example, if a step has no input records but needs at least one to function, it will return false.

      Returns:
      true if the step can process a row.
    • init

      boolean init(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface)
      Initialize and do work where other steps need to wait for...
      Parameters:
      stepMetaInterface - The metadata to work with
      stepDataInterface - The data to initialize
    • dispose

      void dispose(StepMetaInterface sii, StepDataInterface sdi)
      Dispose of this step: close files, empty logs, etc.
      Parameters:
      sii - The metadata to work with
      sdi - The data to dispose of
    • markStart

      void markStart()
      Mark the start time of the step.
    • markStop

      void markStop()
      Mark the end time of the step.
    • stopRunning

      void stopRunning(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface) throws org.pentaho.di.core.exception.KettleException
      Stop running operations...
      Parameters:
      stepMetaInterface - The metadata that might be needed by the step to stop running.
      stepDataInterface - The interface to the step data containing the connections, resultsets, open files, etc.
      Throws:
      org.pentaho.di.core.exception.KettleException
    • isRunning

      boolean isRunning()
      Returns:
      true if the step is running after having been initialized
    • setRunning

      void setRunning(boolean running)
      Flag the step as running or not
      Parameters:
      running - the running flag to set
    • isStopped

      boolean isStopped()
      Returns:
      True if the step is marked as stopped. Execution should stop immediate.
    • setStopped

      void setStopped(boolean stopped)
      Parameters:
      stopped - true if the step needs to be stopped
    • setSafeStopped

      default void setSafeStopped(boolean stopped)
      Parameters:
      stopped - true if the step needs to be safe stopped
    • isSafeStopped

      default boolean isSafeStopped()
      Returns:
      true if step is safe stopped.
    • isPaused

      boolean isPaused()
      Returns:
      True if the step is paused
    • stopAll

      void stopAll()
      Flags all rowsets as stopped/completed/finished.
    • pauseRunning

      void pauseRunning()
      Pause a running step
    • resumeRunning

      void resumeRunning()
      Resume a running step
    • getStepname

      String getStepname()
      Get the name of the step.
      Returns:
      the name of the step
    • getCopy

      int getCopy()
      Returns:
      The steps copy number (default 0)
    • getStepID

      String getStepID()
      Returns:
      the type ID of the step...
    • getErrors

      long getErrors()
      Get the number of errors
      Returns:
      the number of errors
    • setErrors

      void setErrors(long errors)
      Sets the number of errors
      Parameters:
      errors - the number of errors to set
    • getLinesInput

      long getLinesInput()
      Returns:
      Returns the linesInput.
    • getLinesOutput

      long getLinesOutput()
      Returns:
      Returns the linesOutput.
    • getLinesRead

      long getLinesRead()
      Returns:
      Returns the linesRead.
    • getLinesWritten

      long getLinesWritten()
      Returns:
      Returns the linesWritten.
    • getLinesUpdated

      long getLinesUpdated()
      Returns:
      Returns the linesUpdated.
    • setLinesRejected

      void setLinesRejected(long linesRejected)
      Parameters:
      linesRejected - steps the lines rejected by error handling.
    • getLinesRejected

      long getLinesRejected()
      Returns:
      Returns the lines rejected by error handling.
    • putRow

      void putRow(org.pentaho.di.core.row.RowMetaInterface row, Object[] data) throws org.pentaho.di.core.exception.KettleException
      Put a row on the destination rowsets.
      Parameters:
      row - The row to send to the destinations steps
      Throws:
      org.pentaho.di.core.exception.KettleException
    • getRow

      Object[] getRow() throws org.pentaho.di.core.exception.KettleException
      Returns:
      a row from the source step(s).
      Throws:
      org.pentaho.di.core.exception.KettleException
    • setOutputDone

      void setOutputDone()
      Signal output done to destination steps
    • addRowListener

      void addRowListener(RowListener rowListener)
      Add a rowlistener to the step allowing you to inspect (or manipulate, be careful) the rows coming in or exiting the step.
      Parameters:
      rowListener - the rowlistener to add
    • removeRowListener

      void removeRowListener(RowListener rowListener)
      Remove a rowlistener from this step.
      Parameters:
      rowListener - the rowlistener to remove
    • getRowListeners

      List<RowListener> getRowListeners()
      Returns:
      a list of the installed RowListeners
    • getInputRowSets

      List<org.pentaho.di.core.RowSet> getInputRowSets()
      Returns:
      The list of active input rowsets for the step
    • getOutputRowSets

      List<org.pentaho.di.core.RowSet> getOutputRowSets()
      Returns:
      The list of active output rowsets for the step
    • isPartitioned

      boolean isPartitioned()
      Returns:
      true if the step is running partitioned
    • setPartitionID

      void setPartitionID(String partitionID)
      Parameters:
      partitionID - the partitionID to set
    • getPartitionID

      String getPartitionID()
      Returns:
      the steps partition ID
    • cleanup

      void cleanup()
      Call this method typically, after ALL the slave transformations in a clustered run have finished.
    • initBeforeStart

      void initBeforeStart() throws org.pentaho.di.core.exception.KettleStepException
      This method is executed by Trans right before the threads start and right after initialization.

      !!! A plugin implementing this method should make sure to also call super.initBeforeStart(); !!!
      Throws:
      org.pentaho.di.core.exception.KettleStepException - In case there is an error
    • addStepListener

      void addStepListener(StepListener stepListener)
      Attach a step listener to be notified when a step arrives in a certain state. (finished)
      Parameters:
      stepListener - The listener to add to the step
    • isMapping

      boolean isMapping()
      Returns:
      true if the thread is a special mapping step
    • getStepMeta

      StepMeta getStepMeta()
      Returns:
      The metadata for this step
    • getLogChannel

      org.pentaho.di.core.logging.LogChannelInterface getLogChannel()
      Specified by:
      getLogChannel in interface HasLogChannelInterface
      Returns:
      the logging channel for this step
    • setUsingThreadPriorityManagment

      void setUsingThreadPriorityManagment(boolean usingThreadPriorityManagment)
      Parameters:
      usingThreadPriorityManagment - set to true to actively manage priorities of step threads
    • isUsingThreadPriorityManagment

      boolean isUsingThreadPriorityManagment()
      Returns:
      true if we are actively managing priorities of step threads
    • rowsetInputSize

      int rowsetInputSize()
      Returns:
      The total amount of rows in the input buffers
    • rowsetOutputSize

      int rowsetOutputSize()
      Returns:
      The total amount of rows in the output buffers
    • getProcessed

      long getProcessed()
      Returns:
      The number of "processed" lines of a step. Well, a representable metric for that anyway.
    • getResultFiles

      Map<String,org.pentaho.di.core.ResultFile> getResultFiles()
      Returns:
      The result files for this step
    • getStatus

      Returns:
      the description as in StepDataInterface
    • getRuntime

      long getRuntime()
      Returns:
      The number of ms that this step has been running
    • identifyErrorOutput

      void identifyErrorOutput()
      To be used to flag an error output channel of a step prior to execution for performance reasons.
    • setPartitioned

      void setPartitioned(boolean partitioned)
      Parameters:
      partitioned - true if this step is partitioned
    • setRepartitioning

      void setRepartitioning(int partitioningMethod)
      Parameters:
      partitioningMethod - The repartitioning method
    • batchComplete

      void batchComplete() throws org.pentaho.di.core.exception.KettleException
      Calling this method will alert the step that we finished passing a batch of records to the step. Specifically for steps like "Sort Rows" it means that the buffered rows can be sorted and passed on.
      Throws:
      org.pentaho.di.core.exception.KettleException - In case an error occurs during the processing of the batch of rows.
    • setMetaStore

      void setMetaStore(org.pentaho.metastore.api.IMetaStore metaStore)
      Pass along the metastore to use when loading external elements at runtime.
      Parameters:
      metaStore - The metastore to use
    • getMetaStore

      org.pentaho.metastore.api.IMetaStore getMetaStore()
      Returns:
      The metastore that the step uses to load external elements from.
    • setRepository

      void setRepository(Repository repository)
      Parameters:
      repository - The repository used by the step to load and reference Kettle objects with at runtime
    • getRepository

      Repository getRepository()
      Returns:
      The repository used by the step to load and reference Kettle objects with at runtime
    • getCurrentOutputRowSetNr

      int getCurrentOutputRowSetNr()
      Returns:
      the index of the active (current) output row set
    • setCurrentOutputRowSetNr

      void setCurrentOutputRowSetNr(int index)
      Parameters:
      index - Sets the index of the active (current) output row set to use.
    • getCurrentInputRowSetNr

      int getCurrentInputRowSetNr()
      Returns:
      the index of the active (current) input row set
    • setCurrentInputRowSetNr

      void setCurrentInputRowSetNr(int index)
      Parameters:
      index - Sets the index of the active (current) input row set to use.
    • subStatuses

      default Collection<StepStatus> subStatuses()
    • addRowSetToInputRowSets

      default void addRowSetToInputRowSets(org.pentaho.di.core.RowSet rowSet)
    • addRowSetToOutputRowSets

      default void addRowSetToOutputRowSets(org.pentaho.di.core.RowSet rowSet)