Class WekaForecastingData

java.lang.Object
org.pentaho.di.trans.step.BaseStepData
org.pentaho.di.forecasting.WekaForecastingData
All Implemented Interfaces:
org.pentaho.di.trans.step.StepDataInterface

public class WekaForecastingData extends org.pentaho.di.trans.step.BaseStepData implements org.pentaho.di.trans.step.StepDataInterface
Holds temporary data and has routines for loading serialized models.
Version:
$Revision$
Author:
Mark Hall (mhall{[at]}pentaho{[dot]}com)
  • Nested Class Summary

    Nested classes/interfaces inherited from class org.pentaho.di.trans.step.BaseStepData

    org.pentaho.di.trans.step.BaseStepData.StepExecutionStatus
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected org.pentaho.di.core.row.RowMetaInterface
    m_outputRowMeta
     
    static final int
    NO_MATCH
     
    static final int
    TYPE_MISMATCH
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    WekaForecastingData()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    protected weka.core.Instance
    constructInstance(org.pentaho.di.core.row.RowMetaInterface inputMeta, Object[] inputRow, int[] mappingIndexes, WekaForecastingModel model)
    Helper method that constructs an Instance to input to the Weka model based on incoming Kettle fields and pre-constructed attribute-to-field mapping data.
    static int[]
    findMappings(weka.core.Instances header, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
    Finds a mapping between the attributes that a forecasting model has been trained with and the incoming Kettle row format.
    void
    fixTypesForTargets(Object[] row, List<String> targetFields, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
    Fixes the type of forecasted fields (if necessary).
    List<Object[]>
    generateForecast(org.pentaho.di.core.row.RowMetaInterface inputMeta, org.pentaho.di.core.row.RowMetaInterface outputMeta, WekaForecastingMeta meta, List<Object[]> overlayData, org.pentaho.di.trans.TransMeta transMeta, PrintStream... progress)
    Generates a forecast given a forecasting model (sourced from the meta object).
    org.pentaho.di.core.row.RowMetaInterface
    getOutputRowMeta()
    Get the meta data for the output format.
    static WekaForecastingModel
    loadSerializedModel(File modelFile, org.pentaho.di.core.logging.LogChannelInterface log)
    Loads a serialized model.
    static void
    saveSerializedModel(WekaForecastingModel wsm, File saveTo)
     
    void
    setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
    Set the meta data for the output format.
    boolean
    sortCheck(weka.classifiers.timeseries.TSForecaster forecaster, weka.core.Instances data)
     

    Methods inherited from class org.pentaho.di.trans.step.BaseStepData

    getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, isStopped, setStatus

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.pentaho.di.trans.step.StepDataInterface

    getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, setStatus
  • Field Details

    • NO_MATCH

      public static final int NO_MATCH
    • TYPE_MISMATCH

      public static final int TYPE_MISMATCH
    • m_outputRowMeta

      protected org.pentaho.di.core.row.RowMetaInterface m_outputRowMeta
  • Constructor Details

    • WekaForecastingData

      public WekaForecastingData()
  • Method Details

    • getOutputRowMeta

      public org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
      Get the meta data for the output format.
      Returns:
      a RowMetaInterface value
    • setOutputRowMeta

      public void setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
      Set the meta data for the output format.
      Parameters:
      rmi - a RowMetaInterface value
    • fixTypesForTargets

      public void fixTypesForTargets(Object[] row, List<String> targetFields, org.pentaho.di.core.row.RowMetaInterface inputRowMeta) throws org.pentaho.di.core.exception.KettleException
      Fixes the type of forecasted fields (if necessary). A forecaster forecasts target values as doubles. If incoming target field values are non-double numeric types (as they might be for historical priming rows), they need to be converted to Double to match the output row meta data.
      Parameters:
      row - row to check
      targetFields - list of target fields predicted by the forecaster
      inputRowMeta - the input row meta data
      Throws:
      org.pentaho.di.core.exception.KettleException - if a problem occurs.
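
The conversion described above can be illustrated with a minimal, dependency-free sketch. It is not the step's actual implementation: `TargetTypeFixSketch`, `fixTypes`, and the `targetIndexes` parameter are hypothetical names, and the sketch assumes the target field names have already been resolved to row positions (the real method resolves them through `inputRowMeta`).

```java
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not part of the Kettle API. */
final class TargetTypeFixSketch {

    /**
     * Converts non-Double numeric values at the given target positions to Double, in place.
     * targetIndexes is an assumed stand-in for looking up each target field name
     * in the input row meta data.
     */
    static void fixTypes(Object[] row, List<Integer> targetIndexes) {
        for (int index : targetIndexes) {
            Object value = row[index];
            // Nulls, non-numeric values, and existing Doubles are left untouched.
            if (value instanceof Number && !(value instanceof Double)) {
                row[index] = ((Number) value).doubleValue();
            }
        }
    }
}
```

With a priming row such as `{ "2024-01", 42L, 7 }` and target positions 1 and 2, the Long and Integer become the Doubles 42.0 and 7.0, while the non-target field at position 0 is unchanged.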
    • sortCheck

      public boolean sortCheck(weka.classifiers.timeseries.TSForecaster forecaster, weka.core.Instances data) throws org.pentaho.di.core.exception.KettleException
      Throws:
      org.pentaho.di.core.exception.KettleException
    • loadSerializedModel

      public static WekaForecastingModel loadSerializedModel(File modelFile, org.pentaho.di.core.logging.LogChannelInterface log) throws Exception
      Loads a serialized model. Models can either be binary serialized Java objects, objects deep-serialized to xml, or PMML.
      Parameters:
      modelFile - a File value
      log - a LogChannelInterface value
      Returns:
      the model
      Throws:
      Exception - if there is a problem loading the model.
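
For the first of the three formats mentioned above (binary serialized Java objects), loading typically comes down to standard Java object serialization. The following is a minimal sketch under that assumption, using only `java.io`. `ModelSerializationSketch`, `save`, and `load` are hypothetical names; the sketch does not cover the XML or PMML cases, and it is not the code this class uses.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/** Illustrative sketch only; names are hypothetical, not part of the Kettle or Weka API. */
final class ModelSerializationSketch {

    /** Writes any Serializable object to a file using Java binary serialization. */
    static void save(Object model, File saveTo) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(saveTo)))) {
            out.writeObject(model);
        }
    }

    /** Reads back an object previously written with save(). */
    static Object load(File modelFile) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(modelFile)))) {
            return in.readObject();
        }
    }
}
```

A caller would cast the result of `load` to the expected model type. Deserializing files from untrusted sources is unsafe with this mechanism, so model files should come from a trusted location.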
    • saveSerializedModel

      public static void saveSerializedModel(WekaForecastingModel wsm, File saveTo) throws Exception
      Throws:
      Exception
    • findMappings

      public static int[] findMappings(weka.core.Instances header, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
      Finds a mapping between the attributes that a forecasting model has been trained with and the incoming Kettle row format. Returns an array of indices, where the element at index 0 of the array is the index of the Kettle field that corresponds to the first attribute in the Instances structure, the element at index 1 is the index of the Kettle field that corresponds to the second attribute, and so on.
      Parameters:
      header - the Instances header
      inputRowMeta - the meta data for the incoming rows
      Returns:
      the mapping as an array of integer indices
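
The shape of the returned array can be illustrated with a small sketch that matches attributes to fields by name. Matching by name is an assumption, as is the sentinel value: this page does not show the values of NO_MATCH or TYPE_MISMATCH, so the sketch uses its own hypothetical constant and skips type checking entirely. `MappingSketch` and its members are illustrative names, and plain string lists stand in for the Instances header and the row meta data.

```java
import java.util.List;

/** Illustrative sketch only; names and the sentinel value are hypothetical. */
final class MappingSketch {

    // Assumed sentinel for "no field with this name"; not the real NO_MATCH constant.
    static final int NO_MATCH_SKETCH = -1;

    /**
     * For each model attribute (by position), returns the position of the incoming
     * field with the same name, or NO_MATCH_SKETCH if no such field exists.
     */
    static int[] findMappings(List<String> attributeNames, List<String> fieldNames) {
        int[] mapping = new int[attributeNames.size()];
        for (int i = 0; i < mapping.length; i++) {
            int fieldIndex = fieldNames.indexOf(attributeNames.get(i));
            mapping[i] = fieldIndex >= 0 ? fieldIndex : NO_MATCH_SKETCH;
        }
        return mapping;
    }
}
```

For attributes `date, sales, promo` and incoming fields `sales, region, date`, the sketch returns `{2, 0, -1}`: the first attribute is found at field position 2, the second at position 0, and the third has no matching field.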
    • generateForecast

      public List<Object[]> generateForecast(org.pentaho.di.core.row.RowMetaInterface inputMeta, org.pentaho.di.core.row.RowMetaInterface outputMeta, WekaForecastingMeta meta, List<Object[]> overlayData, org.pentaho.di.trans.TransMeta transMeta, PrintStream... progress) throws Exception
      Generates a forecast given a forecasting model (sourced from the meta object).
      Parameters:
      inputMeta - the incoming row meta data
      outputMeta - the outgoing row meta data
      meta - the forecasting meta
      overlayData - a list of rows for future time steps (in the same format as the incoming rows) containing values for "overlay" fields. May be null if overlay data is not in use.
      transMeta - a TransMeta value
      progress - zero or more PrintStream values
      Returns:
      a List of rows containing the forecast.
      Throws:
      Exception - if a problem occurs.
    • constructInstance

      protected weka.core.Instance constructInstance(org.pentaho.di.core.row.RowMetaInterface inputMeta, Object[] inputRow, int[] mappingIndexes, WekaForecastingModel model)
      Helper method that constructs an Instance to input to the Weka model based on incoming Kettle fields and pre-constructed attribute-to-field mapping data.
      Parameters:
      inputMeta - a RowMetaInterface value
      inputRow - an Object[] value
      mappingIndexes - an int[] value
      model - a WekaForecastingModel value
      Returns:
      an Instance value
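
As a rough illustration of how a row and a mapping array combine, the sketch below builds the array of attribute values that a Weka instance would hold, handling numeric fields only. Weka stores attribute values as doubles and represents a missing value as NaN; everything else here is an assumption. `InstanceValuesSketch` and `toAttributeValues` are hypothetical names, a negative mapping entry is treated as "no matching field", and nominal, string, and date attributes (which the real helper has to handle through the model header) are simply treated as missing.

```java
/** Illustrative sketch only; names are hypothetical, and only numeric fields are handled. */
final class InstanceValuesSketch {

    /**
     * Builds one double per model attribute. mappingIndexes[i] is the position in
     * inputRow of the field for attribute i; unmapped, null, or non-numeric fields
     * become NaN, which is how Weka encodes a missing value.
     */
    static double[] toAttributeValues(Object[] inputRow, int[] mappingIndexes) {
        double[] values = new double[mappingIndexes.length];
        for (int i = 0; i < mappingIndexes.length; i++) {
            int fieldIndex = mappingIndexes[i];
            Object field = (fieldIndex >= 0 && fieldIndex < inputRow.length)
                    ? inputRow[fieldIndex]
                    : null;
            values[i] = (field instanceof Number)
                    ? ((Number) field).doubleValue()
                    : Double.NaN;
        }
        return values;
    }
}
```

The resulting array is in attribute order rather than field order, which is the point of the pre-constructed mapping: the model sees values where it expects them regardless of how the incoming Kettle row is laid out.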