org.pentaho.di.trans.step
Interface StepInterface

All Superinterfaces:
HasLogChannelInterface, VariableSpace
All Known Subinterfaces:
ScriptInterface
All Known Implementing Classes:
Abort, AbstractStep, AccessInput, AccessOutput, AddSequence, AddXML, AggregateRows, AnalyticQuery, Append, AutoDoc, BaseStep, BlockingStep, BlockUntilStepsFinish, Calculator, ChangeFileEncoding, CheckSum, CloneRow, ClosureGenerator, ColumnExists, CombinationLookup, Constant, CreditCardValidator, CsvInput, CubeInput, CubeOutput, DatabaseJoin, DatabaseLookup, DataGrid, DBProc, Delay, Delete, Denormaliser, DetectEmptyStream, DetectLastRow, DimensionLookup, DummyTrans, DynamicSQLRow, ElasticSearchBulk, ExcelInput, ExcelOutput, ExcelWriterStep, ExecProcess, ExecSQL, ExecSQLRow, FieldsChangeSequence, FieldSplitter, FileExists, FileLocked, FilesFromResult, FilesToResult, FilterRows, FixedInput, Flattener, Formula, FuzzyMatch, GaInputStep, GetFileNames, GetFilesRowsCount, GetPreviousRowField, GetRepositoryNames, GetSlaveSequence, GetSubFolders, GetTableNames, GetVariable, GetXMLData, GPBulkLoader, GroupBy, HTTP, HTTPPOST, IfNull, InfobrightLoader, IngresVectorwiseLoader, Injector, InsertUpdate, Janino, JavaFilter, JoinRows, JsonInput, JsonOutput, LDAPInput, LDAPOutput, LDIFInput, LoadFileInput, LucidDBBulkLoader, LucidDBStreamingLoader, Mail, MailInput, MailValidator, Mapping, MappingInput, MappingOutput, MemoryGroupBy, MergeJoin, MergeRows, MetaInject, MondrianInput, MonetDBBulkLoader, MultiMergeJoin, MySQLBulkLoader, Normaliser, NullIf, NumberRange, OlapInput, OraBulkLoader, ParGzipCsvInput, PentahoReportingOutput, PGBulkLoader, PrioritizeStreams, ProcessFiles, PropertyInput, PropertyOutput, RandomCCNumberGenerator, RandomValue, RegexEval, ReplaceString, ReservoirSampling, Rest, RowGenerator, RowsFromResult, RowsToResult, RssInput, RssOutput, Rules, SalesforceDelete, SalesforceInput, SalesforceInsert, SalesforceUpdate, SalesforceUpsert, SampleRows, SapInput, Script, ScriptDummy, ScriptValuesMod, ScriptValuesModDummy, SecretKeyGenerator, SelectValues, SetValueConstant, SetValueField, SetVariable, SingleThreader, SocketReader, SocketWriter, SortedMerge, SortRows, 
SplitFieldToRows, SQLFileOutput, SSH, StepMetastructure, StepsMetrics, StreamLookup, StringCut, StringOperations, SwitchCase, SymmetricCryptoTrans, SynchronizeAfterMerge, SyslogMessage, SystemData, TableExists, TableInput, TableOutput, TeraFast, TextFileInput, TextFileOutput, UniqueRows, UniqueRowsByHashSet, UnivariateStats, Update, UserDefinedJavaClass, Validator, ValueMapper, WebService, WebServiceAvailable, WriteToLog, XBaseInput, XMLInput, XMLInputSax, XMLInputStream, XMLJoin, XMLOutput, XsdValidator, Xslt, YamlInput

public interface StepInterface
extends VariableSpace, HasLogChannelInterface

The interface that any transformation step or plugin needs to implement. Created on 12-AUG-2004

Author:
Matt

Method Summary
 void addRowListener(RowListener rowListener)
          Add a rowlistener to the step, allowing you to inspect (or manipulate, with care) the rows entering or leaving the step.
 void addStepListener(StepListener stepListener)
          Attach a step listener to be notified when a step arrives in a certain state.
 void batchComplete()
          Calling this method will alert the step that we finished passing a batch of records to the step.
 boolean canProcessOneRow()
          This method checks if the step is capable of processing at least one row.
 void cleanup()
          Typically called after ALL the slave transformations in a clustered run have finished.
 void dispose(StepMetaInterface sii, StepDataInterface sdi)
          Dispose of this step: close files, empty logs, etc.
 int getCopy()
           
 long getErrors()
          Get the number of errors
 List<RowSet> getInputRowSets()
           
 long getLinesInput()
           
 long getLinesOutput()
           
 long getLinesRead()
           
 long getLinesRejected()
           
 long getLinesUpdated()
           
 long getLinesWritten()
           
 LogChannelInterface getLogChannel()
           
 List<RowSet> getOutputRowSets()
           
 String getPartitionID()
           
 long getProcessed()
           
 Map<String,ResultFile> getResultFiles()
           
 Object[] getRow()
           
 List<RowListener> getRowListeners()
           
 long getRuntime()
           
 BaseStepData.StepExecutionStatus getStatus()
           
 String getStepID()
           
 StepMeta getStepMeta()
           
 String getStepname()
          Get the name of the step.
 Trans getTrans()
           
 void identifyErrorOutput()
          To be used to flag an error output channel of a step prior to execution for performance reasons.
 boolean init(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface)
          Initialize the step and do any work that other steps need to wait for.
 void initBeforeStart()
          This method is executed by Trans right before the threads start and right after initialization.

Note: a plugin that overrides this method should also call super.initBeforeStart().
 boolean isMapping()
           
 boolean isPartitioned()
           
 boolean isPaused()
           
 boolean isRunning()
           
 boolean isStopped()
           
 boolean isUsingThreadPriorityManagment()
           
 void markStart()
          Mark the start time of the step.
 void markStop()
          Mark the end time of the step.
 void pauseRunning()
          Pause a running step
 boolean processRow(StepMetaInterface smi, StepDataInterface sdi)
          Perform the equivalent of processing one row.
 void putRow(RowMetaInterface row, Object[] data)
          Put a row on the destination rowsets.
 void removeRowListener(RowListener rowListener)
          Remove a rowlistener from this step.
 void resumeRunning()
          Resume a paused step
 int rowsetInputSize()
           
 int rowsetOutputSize()
           
 void setErrors(long errors)
          Sets the number of errors
 void setLinesRejected(long linesRejected)
           
 void setOutputDone()
          Signal output done to destination steps
 void setPartitioned(boolean partitioned)
           
 void setPartitionID(String partitionID)
           
 void setRepartitioning(int partitioningMethod)
           
 void setRunning(boolean running)
          Flag the step as running or not
 void setStopped(boolean stopped)
           
 void setUsingThreadPriorityManagment(boolean usingThreadPriorityManagment)
           
 void stopAll()
          Flags all rowsets as stopped/completed/finished.
 void stopRunning(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface)
          Stop running operations...
 
Methods inherited from interface org.pentaho.di.core.variables.VariableSpace
copyVariablesFrom, environmentSubstitute, environmentSubstitute, getBooleanValueOfVariable, getParentVariableSpace, getVariable, getVariable, initializeVariablesFrom, injectVariables, listVariables, setParentVariableSpace, setVariable, shareVariablesWith
 

Method Detail

getTrans

Trans getTrans()
Returns:
the transformation that is executing this step

processRow

boolean processRow(StepMetaInterface smi,
                   StepDataInterface sdi)
                   throws KettleException
Perform the equivalent of processing one row. Typically this means reading a row from input (getRow()) and passing a row to output (putRow()).

Parameters:
smi - The step's metadata to work with
sdi - The step's temporary working data to work with (database connections, result sets, caches, temporary variables, etc.)
Returns:
false if no more rows can be processed or an error occurred.
Throws:
KettleException
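
The read-one, write-one contract described above can be sketched without the Kettle classes. The class below is a hypothetical stand-in, not part of this API: it models getRow(), putRow() and setOutputDone() with plain collections, drops the smi/sdi and row metadata arguments, and assumes that getRow() returns null once the input is exhausted. Calling setOutputDone() before returning false is shown as a common convention, not as a requirement stated on this page.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a step; only the calls used by the loop are modelled.
class UpperCaseStepSketch {
    private final Deque<Object[]> input = new ArrayDeque<>();
    final List<Object[]> output = new ArrayList<>();
    boolean outputDone = false;

    UpperCaseStepSketch(List<Object[]> rows) {
        input.addAll(rows);
    }

    // Mirrors getRow(): returns null once the source has no more rows (assumption).
    Object[] getRow() {
        return input.poll();
    }

    // Mirrors putRow(row, data); the row metadata argument is left out here.
    void putRow(Object[] data) {
        output.add(data);
    }

    // Mirrors setOutputDone().
    void setOutputDone() {
        outputDone = true;
    }

    // Mirrors processRow(smi, sdi): handle one row, return false when finished.
    boolean processRow() {
        Object[] row = getRow();
        if (row == null) {
            setOutputDone();
            return false;
        }
        Object[] out = Arrays.copyOf(row, row.length);
        out[0] = String.valueOf(out[0]).toUpperCase(Locale.ROOT);
        putRow(out);
        return true;
    }
}

class ProcessRowDemo {
    static List<Object[]> run(List<Object[]> rows) {
        UpperCaseStepSketch step = new UpperCaseStepSketch(rows);
        while (step.processRow()) {
            // each call handles exactly one row
        }
        return step.output;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"alpha"});
        rows.add(new Object[] {"beta"});
        for (Object[] r : run(rows)) {
            System.out.println(r[0]);
        }
    }
}
```

The caller keeps invoking processRow() until it returns false, so a step with N input rows sees N + 1 calls in this sketch.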

canProcessOneRow

boolean canProcessOneRow()
This method checks if the step is capable of processing at least one row.

For example, if a step has no input records but needs at least one to function, it will return false.

Returns:
true if the step can process a row.

init

boolean init(StepMetaInterface stepMetaInterface,
             StepDataInterface stepDataInterface)
Initialize the step and do any work that other steps need to wait for.

Parameters:
stepMetaInterface - The metadata to work with
stepDataInterface - The data to initialize

dispose

void dispose(StepMetaInterface sii,
             StepDataInterface sdi)
Dispose of this step: close files, empty logs, etc.

Parameters:
sii - The metadata to work with
sdi - The data to dispose of
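
How init, processRow and dispose fit together can be illustrated with a small driver. Everything below is a hypothetical sketch: LifecycleSketch is a cut-down stand-in for this interface with the smi/sdi arguments omitted, and the ordering (init, then the row loop, then dispose in a finally block) is an assumption about how a caller might use these methods, not a description of what Trans does. It also assumes that init returns true on success.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, cut-down view of the lifecycle methods.
interface LifecycleSketch {
    boolean init();
    boolean processRow();
    void dispose();
    boolean isStopped();
}

// Records every call so the ordering can be inspected afterwards.
class CountingStep implements LifecycleSketch {
    final List<String> calls = new ArrayList<>();
    private final boolean initOk;
    private int remaining;

    CountingStep(boolean initOk, int rows) {
        this.initOk = initOk;
        this.remaining = rows;
    }

    public boolean init() {
        calls.add("init");
        return initOk;
    }

    public boolean processRow() {
        calls.add("processRow");
        if (remaining == 0) {
            return false; // no more rows
        }
        remaining--;
        return true;
    }

    public void dispose() {
        calls.add("dispose");
    }

    public boolean isStopped() {
        return false;
    }
}

class LifecycleDriver {
    // Returns true if the step initialized and ran to completion.
    static boolean run(LifecycleSketch step) {
        try {
            if (!step.init()) {
                return false; // assumption: a failed init means no rows are processed
            }
            while (!step.isStopped() && step.processRow()) {
                // keep going until the step reports it is done
            }
            return true;
        } finally {
            step.dispose(); // always release files, connections, etc.
        }
    }

    public static void main(String[] args) {
        CountingStep step = new CountingStep(true, 2);
        run(step);
        System.out.println(step.calls);
    }
}
```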

markStart

void markStart()
Mark the start time of the step.


markStop

void markStop()
Mark the end time of the step.


stopRunning

void stopRunning(StepMetaInterface stepMetaInterface,
                 StepDataInterface stepDataInterface)
                 throws KettleException
Stop running operations...

Parameters:
stepMetaInterface - The metadata that might be needed by the step to stop running.
stepDataInterface - The interface to the step data containing the connections, resultsets, open files, etc.
Throws:
KettleException

isRunning

boolean isRunning()
Returns:
true if the step is running after having been initialized

setRunning

void setRunning(boolean running)
Flag the step as running or not

Parameters:
running - the running flag to set

isStopped

boolean isStopped()
Returns:
True if the step is marked as stopped. Execution should stop immediately.

setStopped

void setStopped(boolean stopped)
Parameters:
stopped - true if the step needs to be stopped
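
A step can only stop promptly if its own loop looks at the stopped flag. The sketch below is a hypothetical stand-in, not the real implementation: it keeps the flag in an AtomicBoolean so that another thread could set it safely, and re-checks isStopped() before each row. The stopAfter parameter is invented for the example and simulates an external stop request.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in showing cooperative stopping via isStopped()/setStopped().
class StoppableStepSketch {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private long linesRead = 0;

    boolean isStopped() {
        return stopped.get();
    }

    void setStopped(boolean value) {
        stopped.set(value);
    }

    long getLinesRead() {
        return linesRead;
    }

    // Reads up to 'available' rows, re-checking the flag before each one.
    void drain(int available, int stopAfter) {
        for (int i = 0; i < available; i++) {
            if (isStopped()) {
                return; // stop as soon as the flag is seen
            }
            linesRead++;
            if (linesRead == stopAfter) {
                setStopped(true); // stand-in for a stop request from outside
            }
        }
    }

    public static void main(String[] args) {
        StoppableStepSketch step = new StoppableStepSketch();
        step.drain(10, 3);
        System.out.println(step.getLinesRead());
    }
}
```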

isPaused

boolean isPaused()
Returns:
True if the step is paused

stopAll

void stopAll()
Flags all rowsets as stopped/completed/finished.


pauseRunning

void pauseRunning()
Pause a running step


resumeRunning

void resumeRunning()
Resume a paused step


getStepname

String getStepname()
Get the name of the step.

Returns:
the name of the step

getCopy

int getCopy()
Returns:
The step's copy number (default 0)

getStepID

String getStepID()
Returns:
the type ID of the step...

getErrors

long getErrors()
Get the number of errors

Returns:
the number of errors

setErrors

void setErrors(long errors)
Sets the number of errors

Parameters:
errors - the number of errors to set

getLinesInput

long getLinesInput()
Returns:
Returns the linesInput.

getLinesOutput

long getLinesOutput()
Returns:
Returns the linesOutput.

getLinesRead

long getLinesRead()
Returns:
Returns the linesRead.

getLinesWritten

long getLinesWritten()
Returns:
Returns the linesWritten.

getLinesUpdated

long getLinesUpdated()
Returns:
Returns the linesUpdated.

setLinesRejected

void setLinesRejected(long linesRejected)
Parameters:
linesRejected - the number of lines rejected by error handling.

getLinesRejected

long getLinesRejected()
Returns:
Returns the lines rejected by error handling.

putRow

void putRow(RowMetaInterface row,
            Object[] data)
            throws KettleException
Put a row on the destination rowsets.

Parameters:
row - The metadata of the row to send to the destination steps
data - The row data to send
Throws:
KettleException

getRow

Object[] getRow()
                throws KettleException
Returns:
a row from the source step(s).
Throws:
KettleException

setOutputDone

void setOutputDone()
Signal output done to destination steps


addRowListener

void addRowListener(RowListener rowListener)
Add a rowlistener to the step, allowing you to inspect (or manipulate, with care) the rows entering or leaving the step.

Parameters:
rowListener - the rowlistener to add

removeRowListener

void removeRowListener(RowListener rowListener)
Remove a rowlistener from this step.

Parameters:
rowListener - the rowlistener to remove

getRowListeners

List<RowListener> getRowListeners()
Returns:
a list of the installed RowListeners
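
The three listener methods above follow the usual observer pattern. The sketch below is a simplified, hypothetical model: RowListenerSketch and its single rowWritten callback are invented for the example and are not the real RowListener interface, and the stand-in step notifies listeners only when a row is written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener with one callback; not the real RowListener interface.
interface RowListenerSketch {
    void rowWritten(Object[] row);
}

// Hypothetical stand-in for the listener-related part of a step.
class ListenableStepSketch {
    private final List<RowListenerSketch> rowListeners = new ArrayList<>();

    void addRowListener(RowListenerSketch listener) {
        rowListeners.add(listener);
    }

    void removeRowListener(RowListenerSketch listener) {
        rowListeners.remove(listener);
    }

    List<RowListenerSketch> getRowListeners() {
        return rowListeners;
    }

    // Notifies every listener before the row would be handed to the output.
    void putRow(Object[] data) {
        for (RowListenerSketch listener : rowListeners) {
            listener.rowWritten(data);
        }
    }
}

class RowListenerDemo {
    // Counts the rows seen while the listener is attached.
    static int countWritten(int rows) {
        ListenableStepSketch step = new ListenableStepSketch();
        int[] seen = {0};
        RowListenerSketch counter = row -> seen[0]++;
        step.addRowListener(counter);
        for (int i = 0; i < rows; i++) {
            step.putRow(new Object[] {i});
        }
        step.removeRowListener(counter);
        step.putRow(new Object[] {"not counted"});
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(countWritten(3));
    }
}
```

The listener receives the same array instance that the step passes on, which is why manipulating rows inside a listener calls for care.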

getInputRowSets

List<RowSet> getInputRowSets()
Returns:
The list of active input rowsets for the step

getOutputRowSets

List<RowSet> getOutputRowSets()
Returns:
The list of active output rowsets for the step

isPartitioned

boolean isPartitioned()
Returns:
true if the step is running partitioned

setPartitionID

void setPartitionID(String partitionID)
Parameters:
partitionID - the partitionID to set

getPartitionID

String getPartitionID()
Returns:
the step's partition ID

cleanup

void cleanup()
Typically called after ALL the slave transformations in a clustered run have finished.


initBeforeStart

void initBeforeStart()
                     throws KettleStepException
This method is executed by Trans right before the threads start and right after initialization.

Note: a plugin that overrides this method should also call super.initBeforeStart().

Throws:
KettleStepException - In case there is an error
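
The requirement to call the superclass version can be shown with two small classes. Both are hypothetical stand-ins: BaseStepSketch plays the role of whatever base class the plugin extends, and the flags exist only to make the effect visible. The checked KettleStepException is left out so the sketch compiles on its own.

```java
// Hypothetical base class standing in for the plugin's real superclass.
class BaseStepSketch {
    private boolean baseReady = false;

    void initBeforeStart() {
        baseReady = true; // stands in for the base class's own preparation
    }

    boolean isBaseReady() {
        return baseReady;
    }
}

// Hypothetical plugin step that adds its own preparation.
class PluginStepSketch extends BaseStepSketch {
    private boolean pluginReady = false;

    @Override
    void initBeforeStart() {
        super.initBeforeStart(); // without this line the base preparation is skipped
        pluginReady = true;
    }

    boolean isPluginReady() {
        return pluginReady;
    }

    public static void main(String[] args) {
        PluginStepSketch step = new PluginStepSketch();
        step.initBeforeStart();
        System.out.println(step.isBaseReady() && step.isPluginReady());
    }
}
```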

addStepListener

void addStepListener(StepListener stepListener)
Attach a step listener to be notified when a step arrives in a certain state (for example, finished).

Parameters:
stepListener - The listener to add to the step

isMapping

boolean isMapping()
Returns:
true if the thread is a special mapping step

getStepMeta

StepMeta getStepMeta()
Returns:
The metadata for this step

getLogChannel

LogChannelInterface getLogChannel()
Specified by:
getLogChannel in interface HasLogChannelInterface
Returns:
the logging channel for this step

setUsingThreadPriorityManagment

void setUsingThreadPriorityManagment(boolean usingThreadPriorityManagment)
Parameters:
usingThreadPriorityManagment - set to true to actively manage priorities of step threads

isUsingThreadPriorityManagment

boolean isUsingThreadPriorityManagment()
Returns:
true if we are actively managing priorities of step threads

rowsetInputSize

int rowsetInputSize()
Returns:
The total number of rows in the input buffers

rowsetOutputSize

int rowsetOutputSize()
Returns:
The total number of rows in the output buffers

getProcessed

long getProcessed()
Returns:
The number of "processed" lines of a step, or at least a representative metric for it.

getResultFiles

Map<String,ResultFile> getResultFiles()
Returns:
The result files for this step

getStatus

BaseStepData.StepExecutionStatus getStatus()
Returns:
the execution status of the step, described as in StepDataInterface

getRuntime

long getRuntime()
Returns:
The number of ms that this step has been running
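
Because the runtime is reported in milliseconds, it can be combined with one of the line counters to estimate throughput. The helper below is a hypothetical illustration, not part of this interface; it takes the two values as plain arguments (for instance the results of getLinesWritten() and getRuntime()) and treats a non-positive runtime as zero throughput.

```java
// Hypothetical helper: rows per second from a line counter and a runtime in ms.
class ThroughputSketch {
    static double rowsPerSecond(long lines, long runtimeMs) {
        if (runtimeMs <= 0) {
            return 0.0; // avoid dividing by zero for a step that has not run yet
        }
        return lines * 1000.0 / runtimeMs;
    }

    public static void main(String[] args) {
        System.out.println(rowsPerSecond(5000, 2000));
    }
}
```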

identifyErrorOutput

void identifyErrorOutput()
To be used to flag an error output channel of a step prior to execution for performance reasons.


setPartitioned

void setPartitioned(boolean partitioned)
Parameters:
partitioned - true if this step is partitioned

setRepartitioning

void setRepartitioning(int partitioningMethod)
Parameters:
partitioningMethod - The repartitioning method

batchComplete

void batchComplete()
                   throws KettleException
Calling this method will alert the step that we finished passing a batch of records to the step. Specifically for steps like "Sort Rows" it means that the buffered rows can be sorted and passed on.

Throws:
KettleException - In case an error occurs during the processing of the batch of rows.
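
The "Sort Rows" case mentioned above can be pictured as a step that buffers rows until it is told the batch is complete. The class below is a hypothetical sketch with integer keys standing in for rows; it is not how the real step is implemented, and the acceptRow method and emitted list are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical buffering step: rows are held back until batchComplete() is called.
class BufferingSortSketch {
    private final List<Integer> buffer = new ArrayList<>();
    final List<Integer> emitted = new ArrayList<>();

    // Stands in for receiving one row of the current batch.
    void acceptRow(int key) {
        buffer.add(key);
    }

    // Sorts the buffered batch, passes it on, and starts a new batch.
    void batchComplete() {
        Collections.sort(buffer);
        emitted.addAll(buffer);
        buffer.clear();
    }

    public static void main(String[] args) {
        BufferingSortSketch step = new BufferingSortSketch();
        step.acceptRow(3);
        step.acceptRow(1);
        step.acceptRow(2);
        step.batchComplete();
        System.out.println(step.emitted);
    }
}
```

Each batch is sorted on its own, so rows from a later batch are appended after the earlier ones regardless of their keys.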