Package org.pentaho.di.forecasting
Class WekaForecastingData
java.lang.Object
  org.pentaho.di.trans.step.BaseStepData
    org.pentaho.di.forecasting.WekaForecastingData
All Implemented Interfaces: org.pentaho.di.trans.step.StepDataInterface
public class WekaForecastingData extends org.pentaho.di.trans.step.BaseStepData implements org.pentaho.di.trans.step.StepDataInterface
Holds temporary data and has routines for loading serialized models.
Version: $Revision$
Author: Mark Hall (mhall{[at]}pentaho{[dot]}com)
Field Summary
Fields (modifier and type, field):
- protected org.pentaho.di.core.row.RowMetaInterface m_outputRowMeta
- static int NO_MATCH
- static int TYPE_MISMATCH
-
Constructor Summary
- WekaForecastingData()
-
Method Summary
Methods (modifier and type, method, description):
- protected weka.core.Instance constructInstance(org.pentaho.di.core.row.RowMetaInterface inputMeta, Object[] inputRow, int[] mappingIndexes, WekaForecastingModel model)
  Helper method that constructs an Instance to input to the Weka model based on incoming Kettle fields and pre-constructed attribute-to-field mapping data.
- static int[] findMappings(weka.core.Instances header, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
  Finds a mapping between the attributes that a forecasting model has been trained with and the incoming Kettle row format.
- void fixTypesForTargets(Object[] row, List<String> targetFields, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
  Fixes the type of forecasted fields (if necessary).
- List<Object[]> generateForecast(org.pentaho.di.core.row.RowMetaInterface inputMeta, org.pentaho.di.core.row.RowMetaInterface outputMeta, WekaForecastingMeta meta, List<Object[]> overlayData, org.pentaho.di.trans.TransMeta transMeta, PrintStream... progress)
  Generates a forecast given a forecasting model (sourced from the meta object).
- org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
  Get the meta data for the output format.
- static WekaForecastingModel loadSerializedModel(File modelFile, org.pentaho.di.core.logging.LogChannelInterface log)
  Loads a serialized model.
- static void saveSerializedModel(WekaForecastingModel wsm, File saveTo)
- void setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
  Set the meta data for the output format.
- boolean sortCheck(weka.classifiers.timeseries.TSForecaster forecaster, weka.core.Instances data)
-
Methods inherited from class org.pentaho.di.trans.step.BaseStepData
getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, isStopped, setStatus
-
Field Detail
-
NO_MATCH
public static final int NO_MATCH
See Also: Constant Field Values
-
TYPE_MISMATCH
public static final int TYPE_MISMATCH
See Also: Constant Field Values
-
m_outputRowMeta
protected org.pentaho.di.core.row.RowMetaInterface m_outputRowMeta
-
-
Method Detail
-
getOutputRowMeta
public org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
Get the meta data for the output format.
Returns: a RowMetaInterface value
-
setOutputRowMeta
public void setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
Set the meta data for the output format.
Parameters:
rmi - a RowMetaInterface value
-
fixTypesForTargets
public void fixTypesForTargets(Object[] row, List<String> targetFields, org.pentaho.di.core.row.RowMetaInterface inputRowMeta) throws org.pentaho.di.core.exception.KettleException
Fixes the type of forecasted fields (if necessary). A forecaster forecasts target values as doubles. If incoming target field values are non-double numeric types (as they might be for historical priming rows), then the values need to be converted to Double to match the output row meta data.
Parameters:
row - row to check
targetFields - list of target fields predicted by the forecaster
inputRowMeta - the input row meta data
Throws:
org.pentaho.di.core.exception.KettleException - if a problem occurs.
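The conversion described above can be sketched without any Kettle dependencies. This is a minimal illustration, not the class's actual implementation: FixTypesSketch, fixTypes and the plain list of field names (standing in for RowMetaInterface) are hypothetical names introduced for the example.

```java
import java.util.Arrays;
import java.util.List;

class FixTypesSketch {

    // Hypothetical helper: converts non-Double numeric values of the target
    // fields to Double, in place. Field positions are looked up in a plain
    // list of names, which stands in for the Kettle row meta data.
    static void fixTypes(Object[] row, List<String> fieldNames, List<String> targetFields) {
        for (String target : targetFields) {
            int index = fieldNames.indexOf(target);
            if (index < 0) {
                continue; // target field is not part of this row layout
            }
            Object value = row[index];
            if (value instanceof Number && !(value instanceof Double)) {
                row[index] = Double.valueOf(((Number) value).doubleValue());
            }
        }
    }

    public static void main(String[] args) {
        List<String> fieldNames = Arrays.asList("date", "sales", "units");
        Object[] row = { "2011-01-01", Long.valueOf(120L), Integer.valueOf(7) };

        // Only "sales" is a forecast target, so only that value is converted.
        fixTypes(row, fieldNames, Arrays.asList("sales"));

        System.out.println(row[1].getClass().getSimpleName() + " " + row[1]); // Double 120.0
        System.out.println(row[2].getClass().getSimpleName() + " " + row[2]); // Integer 7
    }
}
```

Non-target fields are left untouched, so a historical priming row keeps its original types everywhere except in the forecasted columns.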
-
sortCheck
public boolean sortCheck(weka.classifiers.timeseries.TSForecaster forecaster, weka.core.Instances data) throws org.pentaho.di.core.exception.KettleException
Throws:
org.pentaho.di.core.exception.KettleException
-
loadSerializedModel
public static WekaForecastingModel loadSerializedModel(File modelFile, org.pentaho.di.core.logging.LogChannelInterface log) throws Exception
Loads a serialized model. Models can be binary serialized Java objects, objects deep-serialized to XML, or PMML.
Parameters:
modelFile - a File value
Returns: the model
Throws:
Exception - if there is a problem loading the model.
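For the binary case, a save and load round trip with standard Java object serialization might look like the sketch below. It is illustrative only: ToyModel is a hypothetical stand-in for WekaForecastingModel, and the save and load helpers are assumptions that do not cover the XML or PMML formats.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializedModelSketch {

    // Hypothetical stand-in for a forecasting model.
    static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int maxLag;

        ToyModel(String name, int maxLag) {
            this.name = name;
            this.maxLag = maxLag;
        }
    }

    // Writes the model to the file as a binary serialized Java object.
    static void save(ToyModel model, File saveTo) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(saveTo)))) {
            out.writeObject(model);
        }
    }

    // Reads a binary serialized model back from the file.
    static ToyModel load(File modelFile) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(modelFile)))) {
            return (ToyModel) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("toy-model", ".model");
        file.deleteOnExit();

        save(new ToyModel("sales-forecaster", 12), file);
        ToyModel restored = load(file);

        System.out.println(restored.name + " " + restored.maxLag); // sales-forecaster 12
    }
}
```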
-
saveSerializedModel
public static void saveSerializedModel(WekaForecastingModel wsm, File saveTo) throws Exception
Throws:
Exception
-
findMappings
public static int[] findMappings(weka.core.Instances header, org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
Finds a mapping between the attributes that a forecasting model has been trained with and the incoming Kettle row format. Returns an array of indices, where the element at index 0 of the array is the index of the Kettle field that corresponds to the first attribute in the Instances structure, the element at index 1 is the index of the Kettle field that corresponds to the second attribute, and so on.
Parameters:
header - the Instances header
inputRowMeta - the meta data for the incoming rows
Returns: the mapping as an array of integer indices
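The shape of the returned array can be illustrated with a small, dependency-free sketch. Several things here are assumptions rather than documented behaviour: attributes are matched to fields by name, NO_MATCH and TYPE_MISMATCH are stored in the array for attributes that cannot be mapped, and their values (-1 and -2) are made up, since this page does not state the real constant values. Plain lists stand in for the Instances header and the RowMetaInterface.

```java
import java.util.Arrays;
import java.util.List;

class FindMappingsSketch {

    // Assumed sentinel values; the real constant values are not given here.
    static final int NO_MATCH = -1;
    static final int TYPE_MISMATCH = -2;

    // For each model attribute, finds the index of the incoming field with the
    // same name. The boolean lists say whether each attribute/field is numeric.
    static int[] findMappings(List<String> attrNames, List<Boolean> attrNumeric,
                              List<String> fieldNames, List<Boolean> fieldNumeric) {
        int[] mapping = new int[attrNames.size()];
        for (int i = 0; i < attrNames.size(); i++) {
            int fieldIndex = fieldNames.indexOf(attrNames.get(i));
            if (fieldIndex < 0) {
                mapping[i] = NO_MATCH;
            } else if (!attrNumeric.get(i).equals(fieldNumeric.get(fieldIndex))) {
                mapping[i] = TYPE_MISMATCH;
            } else {
                mapping[i] = fieldIndex;
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Attributes the (hypothetical) model was trained with.
        List<String> attrNames = Arrays.asList("date", "sales", "promo", "price");
        List<Boolean> attrNumeric = Arrays.asList(false, true, true, true);

        // Fields of the incoming rows, in a different order.
        List<String> fieldNames = Arrays.asList("sales", "region", "date", "promo");
        List<Boolean> fieldNumeric = Arrays.asList(true, false, false, false);

        int[] mapping = findMappings(attrNames, attrNumeric, fieldNames, fieldNumeric);
        System.out.println(Arrays.toString(mapping)); // [2, 0, -2, -1]
    }
}
```

Reading the result: attribute 0 ("date") comes from field 2, attribute 1 ("sales") from field 0, "promo" exists but with a different type, and "price" has no corresponding field.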
-
generateForecast
public List<Object[]> generateForecast(org.pentaho.di.core.row.RowMetaInterface inputMeta, org.pentaho.di.core.row.RowMetaInterface outputMeta, WekaForecastingMeta meta, List<Object[]> overlayData, org.pentaho.di.trans.TransMeta transMeta, PrintStream... progress) throws Exception
Generates a forecast given a forecasting model (sourced from the meta object).
Parameters:
inputMeta - the incoming row meta data
outputMeta - the outgoing row meta data
meta - the forecasting meta
overlayData - a list of rows for future time steps (in the same format as the incoming rows) containing values for "overlay" fields. May be null if overlay data is not in use.
Returns: a List of rows containing the forecast.
Throws:
Exception - if a problem occurs.
-
constructInstance
protected weka.core.Instance constructInstance(org.pentaho.di.core.row.RowMetaInterface inputMeta, Object[] inputRow, int[] mappingIndexes, WekaForecastingModel model)
Helper method that constructs an Instance to input to the Weka model based on incoming Kettle fields and pre-constructed attribute-to-field mapping data.
Parameters:
inputMeta - a RowMetaInterface value
inputRow - an Object[] value
mappingIndexes - an int[] value
model - a WekaForecastingModel value
Returns: an Instance value
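How a mapping array might be applied to a row can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual method: it produces a plain double[] instead of a weka.core.Instance, handles numeric fields only, and assumes that attributes without a usable field become missing values. Weka represents a missing value as NaN, which is what the sketch uses. ConstructInstanceSketch and toValues are hypothetical names.

```java
import java.util.Arrays;

class ConstructInstanceSketch {

    // Builds the attribute value vector for one row. mappingIndexes[i] is the
    // index of the incoming field for attribute i; a negative index (assumed
    // to mean "no usable field") yields a missing value, encoded as NaN.
    static double[] toValues(Object[] inputRow, int[] mappingIndexes) {
        double[] values = new double[mappingIndexes.length];
        for (int i = 0; i < mappingIndexes.length; i++) {
            int fieldIndex = mappingIndexes[i];
            Object field = fieldIndex >= 0 ? inputRow[fieldIndex] : null;
            values[i] = (field instanceof Number)
                    ? ((Number) field).doubleValue()
                    : Double.NaN; // null, unmapped, or non-numeric in this sketch
        }
        return values;
    }

    public static void main(String[] args) {
        Object[] row = { Double.valueOf(12.5), "north", Long.valueOf(3L) };
        int[] mapping = { 2, 0, -1 }; // attribute 0 <- field 2, attribute 1 <- field 0

        System.out.println(Arrays.toString(toValues(row, mapping))); // [3.0, 12.5, NaN]
    }
}
```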