Package org.pentaho.di.forecasting
Class WekaForecastingData
java.lang.Object
org.pentaho.di.trans.step.BaseStepData
org.pentaho.di.forecasting.WekaForecastingData
- All Implemented Interfaces:
org.pentaho.di.trans.step.StepDataInterface
public class WekaForecastingData
extends org.pentaho.di.trans.step.BaseStepData
implements org.pentaho.di.trans.step.StepDataInterface
Holds temporary data and has routines for loading serialized models.
- Version:
- $Revision$
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}com)
-
Nested Class Summary
Nested classes/interfaces inherited from class org.pentaho.di.trans.step.BaseStepData
org.pentaho.di.trans.step.BaseStepData.StepExecutionStatus
-
Field Summary
Modifier and Type / Field / Description
protected org.pentaho.di.core.row.RowMetaInterface m_outputRowMeta
static final int NO_MATCH
static final int TYPE_MISMATCH
-
Constructor Summary
-
Method Summary
Modifier and Type / Method / Description
protected weka.core.Instance constructInstance(org.pentaho.di.core.row.RowMetaInterface inputMeta, Object[] inputRow, int[] mappingIndexes, WekaForecastingModel model)
Helper method that constructs an Instance to input to the Weka model based on incoming Kettle fields and pre-constructed attribute-to-field mapping data.
static int[] findMappings(weka.core.Instances header, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
Finds a mapping between the attributes that a forecasting model has been trained with and the incoming Kettle row format.
void fixTypesForTargets(Object[] row, List<String> targetFields, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
Fixes the type of forecasted fields (if necessary).
List<Object[]> generateForecast(org.pentaho.di.core.row.RowMetaInterface inputMeta, org.pentaho.di.core.row.RowMetaInterface outputMeta, WekaForecastingMeta meta, List<Object[]> overlayData, org.pentaho.di.trans.TransMeta transMeta, PrintStream... progress)
Generates a forecast given a forecasting model (sourced from the meta object).
org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
Get the meta data for the output format.
static WekaForecastingModel loadSerializedModel(File modelFile, org.pentaho.di.core.logging.LogChannelInterface log)
Loads a serialized model.
static void saveSerializedModel(WekaForecastingModel wsm, File saveTo)
void setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
Set the meta data for the output format.
boolean sortCheck(weka.classifiers.timeseries.TSForecaster forecaster, weka.core.Instances data)
Methods inherited from class org.pentaho.di.trans.step.BaseStepData
getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, isStopped, setStatus
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.pentaho.di.trans.step.StepDataInterface
getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, setStatus
-
Field Details
-
NO_MATCH
public static final int NO_MATCH
- See Also:
Constant Field Values
-
TYPE_MISMATCH
public static final int TYPE_MISMATCH
- See Also:
Constant Field Values
-
m_outputRowMeta
protected org.pentaho.di.core.row.RowMetaInterface m_outputRowMeta
-
-
Constructor Details
-
WekaForecastingData
public WekaForecastingData()
-
-
Method Details
-
getOutputRowMeta
public org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
Get the meta data for the output format.
- Returns:
- a RowMetaInterface value
-
setOutputRowMeta
public void setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
Set the meta data for the output format.
- Parameters:
rmi
- a RowMetaInterface value
-
fixTypesForTargets
public void fixTypesForTargets(Object[] row, List<String> targetFields, org.pentaho.di.core.row.RowMetaInterface inputRowMeta) throws org.pentaho.di.core.exception.KettleException
Fixes the type of forecasted fields (if necessary). A forecaster produces target values as doubles. If incoming target field values are non-double numeric types (as they might be for historical priming rows), they need to be converted to Double to match the output row meta data.
- Parameters:
row
- row to check
targetFields
- list of target fields predicted by the forecaster
inputRowMeta
- the input row meta data
- Throws:
org.pentaho.di.core.exception.KettleException
- if a problem occurs.
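The conversion described above can be illustrated with a minimal sketch in plain Java. It is not the plugin's implementation: the class `TargetTypeFixSketch`, the method `fixTypes`, and the use of a simple list of field names in place of a `RowMetaInterface` are all assumptions made so that the example runs without Kettle on the classpath.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the type fix described above. It does not use
// the Kettle RowMetaInterface and is not the plugin's implementation.
class TargetTypeFixSketch {

    /**
     * Converts any non-Double numeric value at a target position to Double.
     * fieldNames plays the role of the input row meta data (an assumption).
     */
    static void fixTypes(Object[] row, List<String> targetFields, List<String> fieldNames) {
        for (String target : targetFields) {
            int idx = fieldNames.indexOf(target);
            if (idx < 0 || idx >= row.length) {
                continue; // target field is not present in this row
            }
            Object value = row[idx];
            if (value instanceof Number && !(value instanceof Double)) {
                row[idx] = Double.valueOf(((Number) value).doubleValue());
            }
        }
    }

    public static void main(String[] args) {
        List<String> fieldNames = Arrays.asList("date", "sales", "region");
        Object[] row = {"2024-01-01", Long.valueOf(42L), "north"};
        fixTypes(row, Arrays.asList("sales"), fieldNames);
        // prints "42.0 Double"
        System.out.println(row[1] + " " + row[1].getClass().getSimpleName());
    }
}
```

Only the positions named in the target list are touched, so non-target fields keep their original types.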
-
sortCheck
public boolean sortCheck(weka.classifiers.timeseries.TSForecaster forecaster, weka.core.Instances data) throws org.pentaho.di.core.exception.KettleException
- Throws:
org.pentaho.di.core.exception.KettleException
-
loadSerializedModel
public static WekaForecastingModel loadSerializedModel(File modelFile, org.pentaho.di.core.logging.LogChannelInterface log) throws Exception
Loads a serialized model. Models can be binary serialized Java objects, objects deep-serialized to XML, or PMML.
- Parameters:
modelFile
- a File value
- Returns:
- the model
- Throws:
Exception
- if there is a problem loading the model.
-
saveSerializedModel
public static void saveSerializedModel(WekaForecastingModel wsm, File saveTo) throws Exception
- Throws:
Exception
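For the binary case mentioned under loadSerializedModel, saving and loading can be sketched with standard Java object serialization. This is a hedged illustration only: `ModelSerializationSketch`, `DummyModel`, `save`, `load`, and `roundTrip` are hypothetical names, the sketch does not cover the XML or PMML formats, and it does not reflect how the plugin chooses between formats.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch of a binary save/load round trip. DummyModel stands in
// for a forecasting model; none of these names come from the plugin.
class ModelSerializationSketch {

    static class DummyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int lag;

        DummyModel(String name, int lag) {
            this.name = name;
            this.lag = lag;
        }
    }

    /** Writes the model to the given file as a binary serialized object. */
    static void save(Serializable model, File saveTo) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(saveTo))) {
            out.writeObject(model);
        }
    }

    /** Reads back whatever object was serialized into the file. */
    static Object load(File modelFile) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(modelFile))) {
            return in.readObject();
        }
    }

    /** Saves to a temporary file and loads the model again. */
    static DummyModel roundTrip(DummyModel model) {
        try {
            File tmp = File.createTempFile("model", ".bin");
            tmp.deleteOnExit();
            save(model, tmp);
            return (DummyModel) load(tmp);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        DummyModel restored = roundTrip(new DummyModel("demo", 12));
        // prints "demo 12"
        System.out.println(restored.name + " " + restored.lag);
    }
}
```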
-
findMappings
public static int[] findMappings(weka.core.Instances header, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
Finds a mapping between the attributes that a forecasting model has been trained with and the incoming Kettle row format. Returns an array of indices, where the element at index 0 is the index of the Kettle field that corresponds to the first attribute in the Instances structure, the element at index 1 is the index of the Kettle field that corresponds to the second attribute, and so on.
- Parameters:
header
- the Instances header
inputRowMeta
- the meta data for the incoming rows
- Returns:
- the mapping as an array of integer indices
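The shape of such a mapping array can be shown with a small plain-Java sketch. Several things here are assumptions rather than documented behaviour: matching attributes to fields by name, comparing only a numeric/non-numeric flag, storing sentinel constants in the array for unmatched entries, and the sentinel values -1 and -2 themselves (the actual values of NO_MATCH and TYPE_MISMATCH are not given on this page). The class `FieldMappingSketch` is hypothetical and uses lists in place of `Instances` and `RowMetaInterface`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of an attribute-to-field mapping. Lists replace the
// Weka Instances header and the Kettle row meta data.
class FieldMappingSketch {

    // Assumed sentinel values; the real constant values are not documented here.
    static final int NO_MATCH = -1;
    static final int TYPE_MISMATCH = -2;

    /**
     * For each model attribute, returns the index of the incoming field with
     * the same name, or a sentinel if there is no such field or its type differs.
     */
    static int[] findMappings(List<String> attNames, List<Boolean> attIsNumeric,
                              List<String> fieldNames, List<Boolean> fieldIsNumeric) {
        int[] mapping = new int[attNames.size()];
        for (int i = 0; i < attNames.size(); i++) {
            int fieldIdx = fieldNames.indexOf(attNames.get(i));
            if (fieldIdx < 0) {
                mapping[i] = NO_MATCH;
            } else if (!attIsNumeric.get(i).equals(fieldIsNumeric.get(fieldIdx))) {
                mapping[i] = TYPE_MISMATCH;
            } else {
                mapping[i] = fieldIdx;
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        int[] mapping = findMappings(
                Arrays.asList("sales", "region", "promo", "holiday"),
                Arrays.asList(true, false, true, true),
                Arrays.asList("region", "sales", "promo"),
                Arrays.asList(false, true, false));
        // prints "[1, 0, -2, -1]"
        System.out.println(Arrays.toString(mapping));
    }
}
```

In the example, "sales" is the first attribute and the second incoming field, so the element at index 0 of the mapping is 1.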
-
generateForecast
public List<Object[]> generateForecast(org.pentaho.di.core.row.RowMetaInterface inputMeta, org.pentaho.di.core.row.RowMetaInterface outputMeta, WekaForecastingMeta meta, List<Object[]> overlayData, org.pentaho.di.trans.TransMeta transMeta, PrintStream... progress) throws Exception
Generates a forecast given a forecasting model (sourced from the meta object).
- Parameters:
inputMeta
- the incoming row meta data
outputMeta
- the outgoing row meta data
meta
- the forecasting meta
overlayData
- a list of rows for future time steps (in the same format as the incoming rows) containing values for "overlay" fields. May be null if overlay data is not in use.
- Returns:
- a List of rows containing the forecast.
- Throws:
Exception
- if a problem occurs.
-
constructInstance
protected weka.core.Instance constructInstance(org.pentaho.di.core.row.RowMetaInterface inputMeta, Object[] inputRow, int[] mappingIndexes, WekaForecastingModel model)
Helper method that constructs an Instance to input to the Weka model based on incoming Kettle fields and pre-constructed attribute-to-field mapping data.
- Parameters:
inputMeta
- a RowMetaInterface value
inputRow
- an Object[] value
mappingIndexes
- an int[] value
model
- a WekaForecastingModel value
- Returns:
- an Instance value
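How a row and a mapping array might combine into the numeric vector behind an Instance can be sketched in plain Java. This is an illustration under stated assumptions, not the helper's actual logic: `InstanceValuesSketch` and `toValues` are hypothetical names, only numeric fields are handled, and unmapped or non-numeric entries are simply marked as missing. The one Weka fact relied on is that Weka represents a missing value as Double.NaN.

```java
import java.util.Arrays;

// Hypothetical sketch: builds the double[] that would back a Weka Instance.
// Nominal and date attributes are deliberately left out of this example.
class InstanceValuesSketch {

    /**
     * Produces one value per model attribute. mappingIndexes[i] is the index
     * of the incoming field for attribute i; negative entries are treated as
     * unmapped (an assumption) and yield a missing value (NaN).
     */
    static double[] toValues(Object[] inputRow, int[] mappingIndexes) {
        double[] vals = new double[mappingIndexes.length];
        for (int i = 0; i < mappingIndexes.length; i++) {
            int fieldIdx = mappingIndexes[i];
            Object value = (fieldIdx >= 0 && fieldIdx < inputRow.length)
                    ? inputRow[fieldIdx]
                    : null;
            vals[i] = (value instanceof Number)
                    ? ((Number) value).doubleValue()
                    : Double.NaN;
        }
        return vals;
    }

    public static void main(String[] args) {
        Object[] row = {"north", 42L, 3.5};
        int[] mapping = {1, 2, -1};
        // prints "[42.0, 3.5, NaN]"
        System.out.println(Arrays.toString(toValues(row, mapping)));
    }
}
```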
-