java.lang.Object
  org.pentaho.di.trans.step.BaseStepData
    org.pentaho.di.trans.steps.reservoirsampling.ReservoirSamplingData
public class ReservoirSamplingData extends BaseStepData implements StepDataInterface
Holds temporary data (i.e. the sampled rows) for the Reservoir Sampling step. Implements reservoir sampling algorithm "R" as described by Jeffrey Scott Vitter.
For more information, see:
Vitter, J. S. Random Sampling with a Reservoir. ACM Transactions on Mathematical Software, Vol. 11, No. 1, March 1985, pp. 37-57.
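The following is a minimal, self-contained sketch of algorithm R as it is commonly presented, included only to illustrate the idea behind this class. It is not the Pentaho implementation; the class name `AlgorithmRSketch`, the method `sample`, and its parameters are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; all names here are assumptions, not part of the Pentaho API.
class AlgorithmRSketch {

    /**
     * Returns a uniform random sample of at most k items from the stream.
     * Assumes the stream holds fewer than Integer.MAX_VALUE items.
     */
    static <T> List<T> sample(Iterable<T> stream, int k, long seed) {
        Random random = new Random(seed);
        List<T> reservoir = new ArrayList<>(Math.max(k, 0));
        int seen = 0; // number of items read so far

        for (T item : stream) {
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k items go straight into the reservoir.
                reservoir.add(item);
            } else {
                // Replacement phase: draw j uniformly from [0, seen).
                // The new item replaces slot j only when j < k,
                // which happens with probability k / seen.
                int j = random.nextInt(seen);
                if (j < k) {
                    reservoir.set(j, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            numbers.add(i);
        }
        // Draw 5 of the 1000 numbers; the seed makes the run repeatable.
        System.out.println(sample(numbers, 5, 42L));
    }
}
```

The reservoir never grows beyond k entries, so memory use stays fixed regardless of how many rows pass through, and the input is read only once.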
Nested Class Summary | |
---|---|
static class | ReservoirSamplingData.PROC_MODE |
Nested classes/interfaces inherited from class org.pentaho.di.trans.step.BaseStepData |
---|
BaseStepData.StepExecutionStatus |
Constructor Summary | |
---|---|
ReservoirSamplingData() | |
Method Summary | |
---|---|
void | cleanUp() |
RowMetaInterface | getOutputRowMeta() Get the output metadata. |
ReservoirSamplingData.PROC_MODE | getProcessingMode() Determine the current operational state of the Reservoir Sampling step. |
List<Object[]> | getSample() Get the sample as a list of rows. |
void | initialize(int sampleSize, int seed) Initialize this data object. |
void | processRow(Object[] row) Process an incoming row; this is where the sampling happens. |
void | setOutputRowMeta(RowMetaInterface rmi) Set the metadata for the output format. |
void | setProcessingMode(ReservoirSamplingData.PROC_MODE state) Set this component to sample, pass through, or be disabled. |
Methods inherited from class org.pentaho.di.trans.step.BaseStepData |
---|
getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, isStopped, setStatus |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.pentaho.di.trans.step.StepDataInterface |
---|
getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, setStatus |
Constructor Detail |
---|
public ReservoirSamplingData()
Method Detail |
---|
public void setOutputRowMeta(RowMetaInterface rmi)
Parameters: rmi - a RowMetaInterface value

public RowMetaInterface getOutputRowMeta()
Returns: a RowMetaInterface value

public List<Object[]> getSample()

public void initialize(int sampleSize, int seed)
Parameters:
sampleSize - the number of rows to sample
seed - the seed for the random number generator

public ReservoirSamplingData.PROC_MODE getProcessingMode()

public void setProcessingMode(ReservoirSamplingData.PROC_MODE state)
Parameters: state - member of the PROC_MODE enumeration indicating the desired operational state

public void processRow(Object[] row)
Parameters: row - an incoming row

public void cleanUp()
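The methods above suggest a call sequence of initialize, then processRow for each incoming row, then getSample, then cleanUp. The sketch below illustrates that sequence with a hypothetical stand-in class so that it compiles without the Pentaho libraries. Everything in it is an assumption made for illustration: the class `SamplingDataSketch`, the enum `ProcMode` and its constant names (guessed from the "sample, pass through or be disabled" description), the caller-side `run` helper, and the behaviour of each mode. It should not be read as the actual behaviour of ReservoirSamplingData.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in mirroring the documented method names; NOT the Pentaho class.
class SamplingDataSketch {

    // Constant names are assumed, not taken from the PROC_MODE documentation.
    enum ProcMode { SAMPLING, PASSTHROUGH, DISABLED }

    private ProcMode mode = ProcMode.SAMPLING;
    private List<Object[]> sample;
    private Random random;
    private int sampleSize;
    private int rowsSeen;

    void initialize(int sampleSize, int seed) {
        this.sampleSize = sampleSize;
        this.sample = new ArrayList<>(Math.max(sampleSize, 0));
        this.random = new Random(seed); // same seed gives the same random sequence
        this.rowsSeen = 0;
    }

    ProcMode getProcessingMode() { return mode; }

    void setProcessingMode(ProcMode state) { this.mode = state; }

    void processRow(Object[] row) {
        rowsSeen++;
        if (sample.size() < sampleSize) {
            sample.add(row); // still filling the reservoir
        } else if (sampleSize > 0) {
            int j = random.nextInt(rowsSeen); // uniform in [0, rowsSeen)
            if (j < sampleSize) {
                sample.set(j, row); // replace an existing entry
            }
        }
    }

    List<Object[]> getSample() { return sample; }

    void cleanUp() { sample = null; } // release the held rows

    /** Assumed caller-side loop: returns the rows that would be emitted downstream. */
    static List<Object[]> run(SamplingDataSketch data, List<Object[]> input) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] row : input) {
            switch (data.getProcessingMode()) {
                case SAMPLING:
                    data.processRow(row); // hold rows back until input ends
                    break;
                case PASSTHROUGH:
                    out.add(row); // forward every row unchanged
                    break;
                case DISABLED:
                    break; // emit nothing
            }
        }
        if (data.getProcessingMode() == ProcMode.SAMPLING) {
            out.addAll(data.getSample()); // the sample is only complete at end of input
        }
        data.cleanUp();
        return out;
    }
}
```

In this sketch, two runs initialized with the same sampleSize and seed over the same input select the same rows, because java.util.Random produces the same sequence for a given seed.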