Class WekaForecastingData

  • All Implemented Interfaces:
    org.pentaho.di.trans.step.StepDataInterface

    public class WekaForecastingData
    extends org.pentaho.di.trans.step.BaseStepData
    implements org.pentaho.di.trans.step.StepDataInterface
    Holds temporary data and has routines for loading serialized models.
    Version:
    $Revision$
    Author:
    Mark Hall (mhall{[at]}pentaho{[dot]}com)
    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.pentaho.di.trans.step.BaseStepData

        org.pentaho.di.trans.step.BaseStepData.StepExecutionStatus
    • Method Summary

      Modifier and Type Method Description
      protected weka.core.Instance constructInstance​(org.pentaho.di.core.row.RowMetaInterface inputMeta, Object[] inputRow, int[] mappingIndexes, WekaForecastingModel model)
      Helper method that constructs an Instance to input to the Weka model based on incoming Kettle fields and pre-constructed attribute-to-field mapping data.
      static int[] findMappings​(weka.core.Instances header, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
      Finds a mapping between the attributes that a forecasting model has been trained with and the incoming Kettle row format.
      void fixTypesForTargets​(Object[] row, List<String> targetFields, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
      Fixes the type of forecasted fields (if necessary).
      List<Object[]> generateForecast​(org.pentaho.di.core.row.RowMetaInterface inputMeta, org.pentaho.di.core.row.RowMetaInterface outputMeta, WekaForecastingMeta meta, List<Object[]> overlayData, org.pentaho.di.trans.TransMeta transMeta, PrintStream... progress)
      Generates a forecast given a forecasting model (sourced from the meta object).
      org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
      Gets the meta data for the output format.
      static WekaForecastingModel loadSerializedModel​(File modelFile, org.pentaho.di.core.logging.LogChannelInterface log)
      Loads a serialized model.
      static void saveSerializedModel​(WekaForecastingModel wsm, File saveTo)  
      void setOutputRowMeta​(org.pentaho.di.core.row.RowMetaInterface rmi)
      Sets the meta data for the output format.
      boolean sortCheck​(weka.classifiers.timeseries.TSForecaster forecaster, weka.core.Instances data)  
      • Methods inherited from class org.pentaho.di.trans.step.BaseStepData

        getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, isStopped, setStatus
      • Methods inherited from interface org.pentaho.di.trans.step.StepDataInterface

        getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, setStatus
    • Field Detail

      • m_outputRowMeta

        protected org.pentaho.di.core.row.RowMetaInterface m_outputRowMeta
    • Constructor Detail

      • WekaForecastingData

        public WekaForecastingData()
    • Method Detail

      • getOutputRowMeta

        public org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
        Gets the meta data for the output format.
        Returns:
        a RowMetaInterface value
      • setOutputRowMeta

        public void setOutputRowMeta​(org.pentaho.di.core.row.RowMetaInterface rmi)
        Sets the meta data for the output format.
        Parameters:
        rmi - a RowMetaInterface value
      • fixTypesForTargets

        public void fixTypesForTargets​(Object[] row,
                                       List<String> targetFields,
                                       org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
                                throws org.pentaho.di.core.exception.KettleException
        Fixes the type of forecasted fields (if necessary). A forecaster forecasts target values as doubles. If incoming target fields values are non-double numeric types (as they might be for historical priming rows), then values need to be converted to Double to match the output row meta data.
        Parameters:
        row - row to check
        targetFields - list of target fields predicted by the forecaster
        inputRowMeta - the input row meta data
        Throws:
        org.pentaho.di.core.exception.KettleException - if a problem occurs.
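        The conversion described above can be illustrated with a minimal, library-free sketch. It is not the step's actual implementation: the class name FixTypesSketch and the helper fixTypes are hypothetical, and target fields are identified by position rather than by name and row meta data, purely to keep the example self-contained.

```java
import java.util.Arrays;

class FixTypesSketch {

    // Hypothetical helper: converts non-Double numeric values at the given
    // target positions to Double, leaving every other value untouched.
    static void fixTypes(Object[] row, int[] targetIndexes) {
        for (int idx : targetIndexes) {
            Object value = row[idx];
            if (value instanceof Number && !(value instanceof Double)) {
                row[idx] = ((Number) value).doubleValue();
            }
        }
    }

    public static void main(String[] args) {
        // A historical priming row whose target field (position 1) holds a Long.
        Object[] row = {"2020-01", 42L, 7};
        fixTypes(row, new int[] {1});
        System.out.println(Arrays.toString(row)); // prints [2020-01, 42.0, 7]
    }
}
```

        Only the position listed as a target is converted; the Integer at position 2 is left as it was because it is not a forecasted field.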
      • sortCheck

        public boolean sortCheck​(weka.classifiers.timeseries.TSForecaster forecaster,
                                 weka.core.Instances data)
                          throws org.pentaho.di.core.exception.KettleException
        Throws:
        org.pentaho.di.core.exception.KettleException
      • loadSerializedModel

        public static WekaForecastingModel loadSerializedModel​(File modelFile,
                                                               org.pentaho.di.core.logging.LogChannelInterface log)
                                                        throws Exception
        Loads a serialized model. Models can either be binary serialized Java objects, objects deep-serialized to xml, or PMML.
        Parameters:
        modelFile - a File value
        log - the log channel to use
        Returns:
        the model
        Throws:
        Exception - if there is a problem loading the model.
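        For the binary case, a model file is an ordinary serialized Java object. The sketch below shows such a save and load round trip using only the JDK. It is an illustration under stated assumptions, not the code behind loadSerializedModel or saveSerializedModel: DummyModel stands in for a real forecasting model, the helper names are hypothetical, and the XML and PMML formats are not covered.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializedModelSketch {

    // Hypothetical stand-in for a forecasting model.
    static class DummyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        DummyModel(String name) {
            this.name = name;
        }
    }

    // Writes the object to disk with standard Java serialization.
    static void save(Object model, File saveTo) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(saveTo)))) {
            out.writeObject(model);
        }
    }

    // Reads a serialized object back from disk.
    static Object load(File modelFile) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(modelFile)))) {
            return in.readObject();
        }
    }

    // Saves a DummyModel to a temporary file, loads it, and returns its name.
    static String roundTrip(String name) {
        try {
            File tmp = File.createTempFile("forecaster", ".model");
            tmp.deleteOnExit();
            save(new DummyModel(name), tmp);
            return ((DummyModel) load(tmp)).name;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("demo")); // prints demo
    }
}
```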
      • findMappings

        public static int[] findMappings​(weka.core.Instances header,
                                         org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
        Finds a mapping between the attributes that a forecasting model has been trained with and the incoming Kettle row format. Returns an array of indices, where the element at index 0 is the index of the Kettle field that corresponds to the first attribute in the Instances structure, the element at index 1 is the index of the Kettle field that corresponds to the second attribute, and so on.
        Parameters:
        header - the Instances header
        inputRowMeta - the meta data for the incoming rows
        Returns:
        the mapping as an array of integer indices
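        The shape of the returned array can be illustrated with plain lists of names. This is a sketch only: matching by exact name and using -1 for an attribute with no corresponding field are assumptions made for the example, and FindMappingsSketch is a hypothetical class that does not touch the Weka or Kettle APIs.

```java
import java.util.Arrays;
import java.util.List;

class FindMappingsSketch {

    // attributeNames stands in for the model's Instances header,
    // fieldNames for the incoming Kettle row meta data.
    static int[] findMappings(List<String> attributeNames, List<String> fieldNames) {
        int[] mapping = new int[attributeNames.size()];
        for (int i = 0; i < mapping.length; i++) {
            // List.indexOf returns -1 when the name is absent (assumed convention here).
            mapping[i] = fieldNames.indexOf(attributeNames.get(i));
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<String> attributes = Arrays.asList("date", "sales", "promo");
        List<String> fields = Arrays.asList("promo", "region", "date", "sales");
        System.out.println(Arrays.toString(findMappings(attributes, fields)));
        // prints [2, 3, 0]
    }
}
```

        Element i of the result answers the question "which incoming field feeds attribute i", so the array always has one entry per model attribute regardless of how many fields the row contains.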
      • generateForecast

        public List<Object[]> generateForecast​(org.pentaho.di.core.row.RowMetaInterface inputMeta,
                                               org.pentaho.di.core.row.RowMetaInterface outputMeta,
                                               WekaForecastingMeta meta,
                                               List<Object[]> overlayData,
                                               org.pentaho.di.trans.TransMeta transMeta,
                                               PrintStream... progress)
                                        throws Exception
        Generates a forecast given a forecasting model (sourced from the meta object).
        Parameters:
        inputMeta - the incoming row meta data
        outputMeta - the outgoing row meta data
        meta - the forecasting meta
        overlayData - a list of rows for future time steps (in the same format as the incoming rows) containing values for "overlay" fields. May be null if overlay data is not in use.
        Returns:
        a List of rows containing the forecast.
        Throws:
        Exception - if a problem occurs.
      • constructInstance

        protected weka.core.Instance constructInstance​(org.pentaho.di.core.row.RowMetaInterface inputMeta,
                                                       Object[] inputRow,
                                                       int[] mappingIndexes,
                                                       WekaForecastingModel model)
        Helper method that constructs an Instance to input to the Weka model based on incoming Kettle fields and pre-constructed attribute-to-field mapping data.
        Parameters:
        inputMeta - a RowMetaInterface value
        inputRow - the incoming row as an Object array
        mappingIndexes - an int array mapping model attributes to incoming field indexes
        model - a WekaForecastingModel value
        Returns:
        an Instance value
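        How a mapping array drives the construction of model input can be sketched without the Weka classes. The example assumes numeric fields only and represents an unmapped or non-numeric field as NaN, which is the value Weka uses for missing data; whether constructInstance treats such fields this way is an assumption, and ConstructInstanceSketch and toValues are hypothetical names.

```java
import java.util.Arrays;

class ConstructInstanceSketch {

    // Builds the attribute value vector for one row: position i holds the value
    // of the field that mappingIndexes[i] points to, or NaN when there is none.
    static double[] toValues(Object[] inputRow, int[] mappingIndexes) {
        double[] values = new double[mappingIndexes.length];
        for (int i = 0; i < mappingIndexes.length; i++) {
            int fieldIndex = mappingIndexes[i];
            Object field = (fieldIndex >= 0 && fieldIndex < inputRow.length)
                    ? inputRow[fieldIndex]
                    : null;
            values[i] = (field instanceof Number)
                    ? ((Number) field).doubleValue()
                    : Double.NaN;
        }
        return values;
    }

    public static void main(String[] args) {
        Object[] row = {5L, "north", 12.5};
        int[] mapping = {2, 0, -1}; // third attribute has no matching field
        System.out.println(Arrays.toString(toValues(row, mapping)));
        // prints [12.5, 5.0, NaN]
    }
}
```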