Class ReservoirSampling

java.lang.Object
org.pentaho.di.trans.step.BaseStep
org.pentaho.di.trans.steps.reservoirsampling.ReservoirSampling
All Implemented Interfaces:
org.pentaho.di.core.ExtensionDataInterface, HasLogChannelInterface, org.pentaho.di.core.logging.LoggingObjectInterface, org.pentaho.di.core.logging.LoggingObjectLifecycleInterface, org.pentaho.di.core.variables.VariableSpace, StepInterface

public class ReservoirSampling extends BaseStep implements StepInterface
  • Constructor Details

    • ReservoirSampling

      public ReservoirSampling(StepMeta stepMeta, StepDataInterface stepDataInterface, int copyNr, TransMeta transMeta, Trans trans)
      Creates a new ReservoirSampling instance.

      Implements the reservoir sampling algorithm "R" by Jeffrey Scott Vitter. (The algorithm itself is implemented in ReservoirSamplingData.java.)

      For more information see:

      Vitter, J. S. Random Sampling with a Reservoir. ACM Transactions on Mathematical Software, Vol. 11, No. 1, March 1985. Pages 37-57.
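
      For orientation, the following is a minimal, self-contained sketch of Algorithm R as it is usually described. It is not the code in ReservoirSamplingData.java; the class name ReservoirSketch, the method sample, and its parameters are hypothetical names chosen for illustration. The first k items fill the reservoir, and each later item replaces a randomly chosen slot with probability k / (items seen so far).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of the Pentaho/Kettle API.
class ReservoirSketch {

    /**
     * Draws a uniform random sample of at most k items from a stream of
     * unknown length in a single pass (Algorithm R).
     */
    static <T> List<T> sample(Iterable<T> stream, int k, long seed) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        Random random = new Random(seed);
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0; // number of items read from the stream so far

        for (T item : stream) {
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k items are always kept.
                reservoir.add(item);
            } else {
                // Replacement phase: pick j uniformly in [0, seen).
                // The new item is kept only if j falls inside the reservoir,
                // which happens with probability k / seen.
                long j = (long) (random.nextDouble() * seen);
                if (j < k) {
                    reservoir.set((int) j, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(i);
        }
        List<Integer> picked = sample(rows, 10, 42L);
        System.out.println(picked.size()); // prints 10
    }
}
```

      If the stream holds fewer than k items, the sketch returns all of them; the sample is only smaller than k in that case.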

      Parameters:
      stepMeta - holds the step's meta data
      stepDataInterface - holds the step's temporary data
      copyNr - the copy number assigned to this instance of the step
      transMeta - meta data for the transformation
      trans - the Trans (running transformation) that this step belongs to
  • Method Details

    • processRow

      public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws org.pentaho.di.core.exception.KettleException
      Process an incoming row of data.
      Specified by:
      processRow in interface StepInterface
      Overrides:
      processRow in class BaseStep
      Parameters:
      smi - a StepMetaInterface value
      sdi - a StepDataInterface value
      Returns:
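      true if the step should be called again to process more rows; false if no more rows can be processed or an error occurred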
      Throws:
      org.pentaho.di.core.exception.KettleException - if an error occurs
    • init

      public boolean init(StepMetaInterface smi, StepDataInterface sdi)
      Initialize the step.
      Specified by:
      init in interface StepInterface
      Overrides:
      init in class BaseStep
      Parameters:
      smi - a StepMetaInterface value
      sdi - a StepDataInterface value
      Returns:
      true if the step was initialized successfully, false otherwise
    • run

      public void run()
      Run is where the action happens!