Package org.pentaho.di.trans.step
Interface StepInterface
-
- All Superinterfaces:
HasLogChannelInterface
,org.pentaho.di.core.variables.VariableSpace
- All Known Subinterfaces:
ScriptInterface
- All Known Implementing Classes:
AbstractStep
,BaseFileInputStep
,BaseStep
,BaseStreamStep
,Calculator
,Constant
,CreditCardValidator
,CsvInput
,DatabaseJoin
,DatabaseLookup
,DataGrid
,DBProc
,Delete
,Denormaliser
,DetectEmptyStream
,DetectLastRow
,DimensionLookup
,DummyTrans
,DynamicSQLRow
,ExecProcess
,ExecSQL
,ExecSQLRow
,FieldsChangeSequence
,FieldSplitter
,FileExists
,FileLocked
,FilesFromResult
,FilesToResult
,FilterRows
,FixedInput
,Flattener
,Formula
,FuzzyMatch
,GetFileNames
,GetFilesRowsCount
,GetRepositoryNames
,GetSlaveSequence
,GetSubFolders
,GetTableNames
,GetVariable
,GroupBy
,HTTP
,HTTPPOST
,IfNull
,Injector
,InsertUpdate
,Janino
,JavaFilter
,JobExecutor
,JoinRows
,LDIFInput
,LoadFileInput
,Mail
,MailInput
,MailValidator
,Mapping
,MappingInput
,MappingOutput
,MemoryGroupBy
,MergeJoin
,MergeRows
,MissingTransStep
,MultiMergeJoin
,Normaliser
,NullIf
,NumberRange
,OlapInput
,ParGzipCsvInput
,PGPDecryptStream
,PGPEncryptStream
,PrioritizeStreams
,ProcessFiles
,PropertyInput
,PropertyOutput
,RandomCCNumberGenerator
,RandomValue
,RecordsFromStream
,RegexEval
,ReplaceString
,ReservoirSampling
,Rest
,RowGenerator
,RowsFromResult
,RowsToResult
,SampleRows
,SasInput
,Script
,ScriptDummy
,ScriptValuesMod
,ScriptValuesModDummy
,SecretKeyGenerator
,SelectValues
,SetValueConstant
,SetValueField
,SetVariable
,SFTPPut
,SimpleMapping
,SingleThreader
,SocketReader
,SocketWriter
,SortedMerge
,SortRows
,SplitFieldToRows
,SQLFileOutput
,SSH
,StepMetastructure
,StepsMetrics
,StreamLookup
,StringCut
,StringOperations
,SwitchCase
,SymmetricCryptoTrans
,SynchronizeAfterMerge
,SyslogMessage
,SystemData
,TableCompare
,TableExists
,TableInput
,TableOutput
,TextFileInput
,TextFileInput
,TextFileOutput
,TextFileOutputLegacy
,TransExecutor
,UniqueRows
,UniqueRowsByHashSet
,UnivariateStats
,Update
,UserDefinedJavaClass
,Validator
,ValueMapper
,WebService
,WebServiceAvailable
,WriteToLog
,XBaseInput
,ZipFile
public interface StepInterface extends org.pentaho.di.core.variables.VariableSpace, HasLogChannelInterface
The interface that any transformation step or plugin needs to implement. Created on 12-AUG-2004- Author:
- Matt
-
-
Method Summary
All Methods Instance Methods Abstract Methods Default Methods Modifier and Type Method Description void
addRowListener(RowListener rowListener)
Add a rowlistener to the step allowing you to inspect (or manipulate, be careful) the rows coming in or exiting the step.default void
addRowSetToInputRowSets(org.pentaho.di.core.RowSet rowSet)
default void
addRowSetToOutputRowSets(org.pentaho.di.core.RowSet rowSet)
void
addStepListener(StepListener stepListener)
Attach a step listener to be notified when a step arrives in a certain state.void
batchComplete()
Calling this method will alert the step that we finished passing a batch of records to the step.boolean
canProcessOneRow()
This method checks if the step is capable of processing at least one row.void
cleanup()
Call this method typically, after ALL the slave transformations in a clustered run have finished.void
dispose(StepMetaInterface sii, StepDataInterface sdi)
Dispose of this step: close files, empty logs, etc.int
getCopy()
int
getCurrentInputRowSetNr()
int
getCurrentOutputRowSetNr()
long
getErrors()
Get the number of errorsList<org.pentaho.di.core.RowSet>
getInputRowSets()
long
getLinesInput()
long
getLinesOutput()
long
getLinesRead()
long
getLinesRejected()
long
getLinesUpdated()
long
getLinesWritten()
org.pentaho.di.core.logging.LogChannelInterface
getLogChannel()
org.pentaho.metastore.api.IMetaStore
getMetaStore()
List<org.pentaho.di.core.RowSet>
getOutputRowSets()
String
getPartitionID()
long
getProcessed()
Repository
getRepository()
Map<String,org.pentaho.di.core.ResultFile>
getResultFiles()
Object[]
getRow()
List<RowListener>
getRowListeners()
long
getRuntime()
BaseStepData.StepExecutionStatus
getStatus()
String
getStepID()
StepMeta
getStepMeta()
String
getStepname()
Get the name of the step.Trans
getTrans()
void
identifyErrorOutput()
To be used to flag an error output channel of a step prior to execution for performance reasons.boolean
init(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface)
Initialize and do work where other steps need to wait for...void
initBeforeStart()
This method is executed by Trans right before the threads start and right after initialization.
!!! A plugin implementing this method should make sure to also call super.initBeforeStart(); !!!boolean
isMapping()
boolean
isPartitioned()
boolean
isPaused()
boolean
isRunning()
default boolean
isSafeStopped()
boolean
isStopped()
boolean
isUsingThreadPriorityManagment()
void
markStart()
Mark the start time of the step.void
markStop()
Mark the end time of the step.void
pauseRunning()
Pause a running stepboolean
processRow(StepMetaInterface smi, StepDataInterface sdi)
Perform the equivalent of processing one row.void
putRow(org.pentaho.di.core.row.RowMetaInterface row, Object[] data)
Put a row on the destination rowsets.void
removeRowListener(RowListener rowListener)
Remove a rowlistener from this step.void
resumeRunning()
Resume a running stepint
rowsetInputSize()
int
rowsetOutputSize()
void
setCurrentInputRowSetNr(int index)
void
setCurrentOutputRowSetNr(int index)
void
setErrors(long errors)
Sets the number of errorsvoid
setLinesRejected(long linesRejected)
void
setMetaStore(org.pentaho.metastore.api.IMetaStore metaStore)
Pass along the metastore to use when loading external elements at runtime.void
setOutputDone()
Signal output done to destination stepsvoid
setPartitioned(boolean partitioned)
void
setPartitionID(String partitionID)
void
setRepartitioning(int partitioningMethod)
void
setRepository(Repository repository)
void
setRunning(boolean running)
Flag the step as running or notdefault void
setSafeStopped(boolean stopped)
void
setStopped(boolean stopped)
void
setUsingThreadPriorityManagment(boolean usingThreadPriorityManagment)
void
stopAll()
Flags all rowsets as stopped/completed/finished.void
stopRunning(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface)
Stop running operations...default Collection<StepStatus>
subStatuses()
-
Methods inherited from interface org.pentaho.di.core.variables.VariableSpace
copyVariablesFrom, environmentSubstitute, environmentSubstitute, environmentSubstitute, fieldSubstitute, getBooleanValueOfVariable, getParentVariableSpace, getVariable, getVariable, initializeVariablesFrom, injectVariables, listVariables, setParentVariableSpace, setVariable, shareVariablesWith
-
-
-
-
Method Detail
-
getTrans
Trans getTrans()
- Returns:
- the transformation that is executing this step
-
processRow
boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws org.pentaho.di.core.exception.KettleException
Perform the equivalent of processing one row. Typically this means reading a row from input (getRow()) and passing a row to output (putRow)).- Parameters:
smi
- The steps metadata to work withsdi
- The steps temporary working data to work with (database connections, result sets, caches, temporary variables, etc.)- Returns:
- false if no more rows can be processed or an error occurred.
- Throws:
org.pentaho.di.core.exception.KettleException
-
canProcessOneRow
boolean canProcessOneRow()
This method checks if the step is capable of processing at least one row.For example, if a step has no input records but needs at least one to function, it will return false.
- Returns:
- true if the step can process a row.
-
init
boolean init(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface)
Initialize and do work where other steps need to wait for...- Parameters:
stepMetaInterface
- The metadata to work withstepDataInterface
- The data to initialize
-
dispose
void dispose(StepMetaInterface sii, StepDataInterface sdi)
Dispose of this step: close files, empty logs, etc.- Parameters:
sii
- The metadata to work withsdi
- The data to dispose of
-
markStart
void markStart()
Mark the start time of the step.
-
markStop
void markStop()
Mark the end time of the step.
-
stopRunning
void stopRunning(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface) throws org.pentaho.di.core.exception.KettleException
Stop running operations...- Parameters:
stepMetaInterface
- The metadata that might be needed by the step to stop running.stepDataInterface
- The interface to the step data containing the connections, resultsets, open files, etc.- Throws:
org.pentaho.di.core.exception.KettleException
-
isRunning
boolean isRunning()
- Returns:
- true if the step is running after having been initialized
-
setRunning
void setRunning(boolean running)
Flag the step as running or not- Parameters:
running
- the running flag to set
-
isStopped
boolean isStopped()
- Returns:
- True if the step is marked as stopped. Execution should stop immediate.
-
setStopped
void setStopped(boolean stopped)
- Parameters:
stopped
- true if the step needs to be stopped
-
setSafeStopped
default void setSafeStopped(boolean stopped)
- Parameters:
stopped
- true if the step needs to be safe stopped
-
isSafeStopped
default boolean isSafeStopped()
- Returns:
- true if step is safe stopped.
-
isPaused
boolean isPaused()
- Returns:
- True if the step is paused
-
stopAll
void stopAll()
Flags all rowsets as stopped/completed/finished.
-
pauseRunning
void pauseRunning()
Pause a running step
-
resumeRunning
void resumeRunning()
Resume a running step
-
getStepname
String getStepname()
Get the name of the step.- Returns:
- the name of the step
-
getCopy
int getCopy()
- Returns:
- The steps copy number (default 0)
-
getStepID
String getStepID()
- Returns:
- the type ID of the step...
-
getErrors
long getErrors()
Get the number of errors- Returns:
- the number of errors
-
setErrors
void setErrors(long errors)
Sets the number of errors- Parameters:
errors
- the number of errors to set
-
getLinesInput
long getLinesInput()
- Returns:
- Returns the linesInput.
-
getLinesOutput
long getLinesOutput()
- Returns:
- Returns the linesOutput.
-
getLinesRead
long getLinesRead()
- Returns:
- Returns the linesRead.
-
getLinesWritten
long getLinesWritten()
- Returns:
- Returns the linesWritten.
-
getLinesUpdated
long getLinesUpdated()
- Returns:
- Returns the linesUpdated.
-
setLinesRejected
void setLinesRejected(long linesRejected)
- Parameters:
linesRejected
- steps the lines rejected by error handling.
-
getLinesRejected
long getLinesRejected()
- Returns:
- Returns the lines rejected by error handling.
-
putRow
void putRow(org.pentaho.di.core.row.RowMetaInterface row, Object[] data) throws org.pentaho.di.core.exception.KettleException
Put a row on the destination rowsets.- Parameters:
row
- The row to send to the destinations steps- Throws:
org.pentaho.di.core.exception.KettleException
-
getRow
Object[] getRow() throws org.pentaho.di.core.exception.KettleException
- Returns:
- a row from the source step(s).
- Throws:
org.pentaho.di.core.exception.KettleException
-
setOutputDone
void setOutputDone()
Signal output done to destination steps
-
addRowListener
void addRowListener(RowListener rowListener)
Add a rowlistener to the step allowing you to inspect (or manipulate, be careful) the rows coming in or exiting the step.- Parameters:
rowListener
- the rowlistener to add
-
removeRowListener
void removeRowListener(RowListener rowListener)
Remove a rowlistener from this step.- Parameters:
rowListener
- the rowlistener to remove
-
getRowListeners
List<RowListener> getRowListeners()
- Returns:
- a list of the installed RowListeners
-
getInputRowSets
List<org.pentaho.di.core.RowSet> getInputRowSets()
- Returns:
- The list of active input rowsets for the step
-
getOutputRowSets
List<org.pentaho.di.core.RowSet> getOutputRowSets()
- Returns:
- The list of active output rowsets for the step
-
isPartitioned
boolean isPartitioned()
- Returns:
- true if the step is running partitioned
-
setPartitionID
void setPartitionID(String partitionID)
- Parameters:
partitionID
- the partitionID to set
-
getPartitionID
String getPartitionID()
- Returns:
- the steps partition ID
-
cleanup
void cleanup()
Call this method typically, after ALL the slave transformations in a clustered run have finished.
-
initBeforeStart
void initBeforeStart() throws org.pentaho.di.core.exception.KettleStepException
This method is executed by Trans right before the threads start and right after initialization.
!!! A plugin implementing this method should make sure to also call super.initBeforeStart(); !!!- Throws:
org.pentaho.di.core.exception.KettleStepException
- In case there is an error
-
addStepListener
void addStepListener(StepListener stepListener)
Attach a step listener to be notified when a step arrives in a certain state. (finished)- Parameters:
stepListener
- The listener to add to the step
-
isMapping
boolean isMapping()
- Returns:
- true if the thread is a special mapping step
-
getStepMeta
StepMeta getStepMeta()
- Returns:
- The metadata for this step
-
getLogChannel
org.pentaho.di.core.logging.LogChannelInterface getLogChannel()
- Specified by:
getLogChannel
in interfaceHasLogChannelInterface
- Returns:
- the logging channel for this step
-
setUsingThreadPriorityManagment
void setUsingThreadPriorityManagment(boolean usingThreadPriorityManagment)
- Parameters:
usingThreadPriorityManagment
- set to true to actively manage priorities of step threads
-
isUsingThreadPriorityManagment
boolean isUsingThreadPriorityManagment()
- Returns:
- true if we are actively managing priorities of step threads
-
rowsetInputSize
int rowsetInputSize()
- Returns:
- The total amount of rows in the input buffers
-
rowsetOutputSize
int rowsetOutputSize()
- Returns:
- The total amount of rows in the output buffers
-
getProcessed
long getProcessed()
- Returns:
- The number of "processed" lines of a step. Well, a representable metric for that anyway.
-
getResultFiles
Map<String,org.pentaho.di.core.ResultFile> getResultFiles()
- Returns:
- The result files for this step
-
getStatus
BaseStepData.StepExecutionStatus getStatus()
- Returns:
- the description as in
StepDataInterface
-
getRuntime
long getRuntime()
- Returns:
- The number of ms that this step has been running
-
identifyErrorOutput
void identifyErrorOutput()
To be used to flag an error output channel of a step prior to execution for performance reasons.
-
setPartitioned
void setPartitioned(boolean partitioned)
- Parameters:
partitioned
- true if this step is partitioned
-
setRepartitioning
void setRepartitioning(int partitioningMethod)
- Parameters:
partitioningMethod
- The repartitioning method
-
batchComplete
void batchComplete() throws org.pentaho.di.core.exception.KettleException
Calling this method will alert the step that we finished passing a batch of records to the step. Specifically for steps like "Sort Rows" it means that the buffered rows can be sorted and passed on.- Throws:
org.pentaho.di.core.exception.KettleException
- In case an error occurs during the processing of the batch of rows.
-
setMetaStore
void setMetaStore(org.pentaho.metastore.api.IMetaStore metaStore)
Pass along the metastore to use when loading external elements at runtime.- Parameters:
metaStore
- The metastore to use
-
getMetaStore
org.pentaho.metastore.api.IMetaStore getMetaStore()
- Returns:
- The metastore that the step uses to load external elements from.
-
setRepository
void setRepository(Repository repository)
- Parameters:
repository
- The repository used by the step to load and reference Kettle objects with at runtime
-
getRepository
Repository getRepository()
- Returns:
- The repository used by the step to load and reference Kettle objects with at runtime
-
getCurrentOutputRowSetNr
int getCurrentOutputRowSetNr()
- Returns:
- the index of the active (current) output row set
-
setCurrentOutputRowSetNr
void setCurrentOutputRowSetNr(int index)
- Parameters:
index
- Sets the index of the active (current) output row set to use.
-
getCurrentInputRowSetNr
int getCurrentInputRowSetNr()
- Returns:
- the index of the active (current) input row set
-
setCurrentInputRowSetNr
void setCurrentInputRowSetNr(int index)
- Parameters:
index
- Sets the index of the active (current) input row set to use.
-
subStatuses
default Collection<StepStatus> subStatuses()
-
addRowSetToInputRowSets
default void addRowSetToInputRowSets(org.pentaho.di.core.RowSet rowSet)
-
addRowSetToOutputRowSets
default void addRowSetToOutputRowSets(org.pentaho.di.core.RowSet rowSet)
-
-