Package org.pentaho.di.scoring
Class WekaScoringMeta
java.lang.Object
org.pentaho.di.trans.step.BaseStepMeta
org.pentaho.di.scoring.WekaScoringMeta
- All Implemented Interfaces:
Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface
@Step(id="WekaScoring",
image="WEKAS.svg",
name="Weka Scoring",
description="Appends predictions from a pre-built Weka model",
categoryDescription="Data Mining",
documentationUrl="https://pentaho-community.atlassian.net/wiki/display/DATAMINING/Using+the+Weka+Scoring+Plugin")
public class WekaScoringMeta
extends org.pentaho.di.trans.step.BaseStepMeta
implements org.pentaho.di.trans.step.StepMetaInterface
Contains the meta data for the WekaScoring step.
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}org)
-
Field Summary
Modifier and Type | Field | Description
static final int | DEFAULT_BATCH_SCORING_SIZE | Batch scoring size
protected static Class<?> | PKG |
static final String | XML_TAG |
Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta
attributes, databases, log, loggingObject, parentStepMeta, repository, STEP_ATTRIBUTES_FILE
-
Constructor Summary
Constructor | Description
WekaScoringMeta() | Creates a new WekaScoringMeta instance.
-
Method Summary
Modifier and Type | Method | Description
void | check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info) | Check the settings of this step and put findings in a remarks list.
Object | clone() | Clone this step's meta data
protected void | deSerializeBase64Model(String base64modelXML) |
boolean | equals(Object) | Check for equality
String | getBatchScoringSize() | Get the batch size to use if the model is a batch scoring model
boolean | getCacheLoadedModels() | Get whether to cache loaded models in memory
WekaScoringModel | getDefaultModel() | Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
String | getDialogClassName() |
String | getFieldNameToLoadModelFrom() | Get the name of the incoming field that holds paths to model files
void | getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) | Generates row meta data to represent the fields output by this step
boolean | getFileNameFromField() | Get whether filename is coming from an incoming field
WekaScoringModel | getModel() | Get the Weka model
boolean | getOutputProbabilities() | Get whether to predict probabilities
String | getSavedModelFileName() | Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
String | getSerializedModelFileName() | Get the filename of the serialized Weka model to load/import from
org.pentaho.di.trans.step.StepInterface | getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans) | Get the executing step, needed by Trans to launch a step.
org.pentaho.di.trans.step.StepDataInterface | getStepData() | Get a new instance of the appropriate data class.
boolean | getStoreModelInStepMetaData() |
boolean | getUpdateIncrementalModel() | Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
String | getXML() | Return the XML describing this (configured) step
protected String | getXML(boolean logging) |
int | hashCode() | Hash code method
protected void | loadModelFile |
void | loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String, org.pentaho.di.core.Counter> counters) | Loads the meta data for this (configured) step from XML.
void | readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String, org.pentaho.di.core.Counter> counters) | Read this step's configuration from a repository
void | saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) | Save this step's meta data to a repository
void | setBatchScoringSize(String size) | Set the batch size to use if the model is a batch scoring model
void | setCacheLoadedModels(boolean l) | Set whether to cache loaded models in memory
void | setDefault() |
void | setDefaultModel(WekaScoringModel defaultM) | Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
void | setFieldNameToLoadModelFrom(String fn) | Set the name of the incoming field that holds paths to model files
void | setFileNameFromField(boolean f) | Set whether filename is coming from an incoming field
void | setModel(WekaScoringModel model) | Set the Weka model
void | setOutputProbabilities(boolean b) | Set whether to predict probabilities
void | setSavedModelFileName(String savedM) | Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
void | setSerializedModelFileName(String mfile) | Set the file name of the serialized Weka model to load/import from
void | setStoreModelInStepMetaData(boolean b) |
void | setUpdateIncrementalModel(boolean u) | Set whether to update the model incrementally
Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta
analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface
analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
-
Field Details
-
PKG
-
XML_TAG
- See Also:
-
DEFAULT_BATCH_SCORING_SIZE
public static final int DEFAULT_BATCH_SCORING_SIZE
Batch scoring size
- See Also:
-
-
Constructor Details
-
WekaScoringMeta
public WekaScoringMeta()
Creates a new WekaScoringMeta instance.
-
-
Method Details
-
setStoreModelInStepMetaData
public void setStoreModelInStepMetaData(boolean b)
-
getStoreModelInStepMetaData
public boolean getStoreModelInStepMetaData()
-
setBatchScoringSize
public void setBatchScoringSize(String size)
Set the batch size to use if the model is a batch scoring model
- Parameters:
size - the size of the batch to use
-
getBatchScoringSize
Get the batch size to use if the model is a batch scoring model
- Returns:
- the size of the batch to use
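The method summary lists the setter as setBatchScoringSize(String size), while DEFAULT_BATCH_SCORING_SIZE is an int, so a caller has to turn the configured string into a number at some point. The following is a minimal sketch of one way to do that, falling back to a default when the string is empty or not a positive integer. The class BatchSizeSketch, the method resolveBatchSize, and the value 100 are assumptions made for illustration; the actual value of DEFAULT_BATCH_SCORING_SIZE and the plugin's own parsing logic are not given on this page.

```java
// Hypothetical helper for illustration only; not part of the Weka Scoring plugin.
final class BatchSizeSketch {

    // Assumed placeholder: the real DEFAULT_BATCH_SCORING_SIZE value is not shown on this page.
    static final int ASSUMED_DEFAULT_BATCH_SIZE = 100;

    /**
     * Converts a configured batch size string to an int.
     * Returns the fallback when the string is null, blank, non-numeric, or not positive.
     */
    static int resolveBatchSize(String configured, int fallback) {
        if (configured == null || configured.trim().isEmpty()) {
            return fallback;
        }
        try {
            int parsed = Integer.parseInt(configured.trim());
            return parsed > 0 ? parsed : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveBatchSize("250", ASSUMED_DEFAULT_BATCH_SIZE)); // prints 250
        System.out.println(resolveBatchSize("", ASSUMED_DEFAULT_BATCH_SIZE));    // prints 100
        System.out.println(resolveBatchSize("abc", ASSUMED_DEFAULT_BATCH_SIZE)); // prints 100
    }
}
```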
-
setFileNameFromField
public void setFileNameFromField(boolean f)
Set whether filename is coming from an incoming field
- Parameters:
f - true if the model to use is specified via path in an incoming field value
-
getFileNameFromField
public boolean getFileNameFromField()
Get whether filename is coming from an incoming field
- Returns:
- true if the model to use is specified via path in an incoming field value
-
setCacheLoadedModels
public void setCacheLoadedModels(boolean l)
Set whether to cache loaded models in memory
- Parameters:
l - true if models are to be cached in memory
-
getCacheLoadedModels
public boolean getCacheLoadedModels()
Get whether to cache loaded models in memory
- Returns:
- true if models are to be cached in memory
-
setFieldNameToLoadModelFrom
Set the name of the incoming field that holds paths to model files
- Parameters:
fn - the name of the incoming field that holds model paths
-
getFieldNameToLoadModelFrom
Get the name of the incoming field that holds paths to model files
- Returns:
- the name of the incoming field that holds model paths
-
setSerializedModelFileName
public void setSerializedModelFileName(String mfile)
Set the file name of the serialized Weka model to load/import from
- Parameters:
mfile - the file name
-
getSerializedModelFileName
Get the file name of the serialized Weka model to load/import from
- Returns:
- the file name
-
setSavedModelFileName
public void setSavedModelFileName(String savedM)
Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
- Parameters:
savedM - the file name to save to
-
getSavedModelFileName
Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
- Returns:
- the file name to save to
-
setModel
public void setModel(WekaScoringModel model)
Set the Weka model
- Parameters:
model - a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
-
getModel
public WekaScoringModel getModel()
Get the Weka model
- Returns:
- a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
-
getDefaultModel
Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
- Returns:
- the default model to use when there is no filename provided in the incoming data row.
-
setDefaultModel
public void setDefaultModel(WekaScoringModel defaultM)
Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
- Parameters:
defaultM - the default model to use.
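The settings above (file name from field, cache loaded models, default model) describe a per-row lookup: take the model path from the incoming field, fall back to the default model when the row carries no path, and optionally keep loaded models in memory. The sketch below illustrates that idea only. ModelResolver, its loader function, and the example path are hypothetical stand-ins and do not reflect how the step itself is implemented; the generic type M takes the place of WekaScoringModel so the example runs without the plugin on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration; M stands in for WekaScoringModel.
final class ModelResolver<M> {
    private final boolean cacheLoadedModels;   // mirrors the "cache loaded models" setting
    private final M defaultModel;              // mirrors the "default model" setting
    private final Function<String, M> loader;  // assumed loader: model path -> model
    private final Map<String, M> cache = new HashMap<>();

    ModelResolver(boolean cacheLoadedModels, M defaultModel, Function<String, M> loader) {
        this.cacheLoadedModels = cacheLoadedModels;
        this.defaultModel = defaultModel;
        this.loader = loader;
    }

    /** Returns the model for one row, given the value of the model-path field (may be null or blank). */
    M modelFor(String pathFromRow) {
        if (pathFromRow == null || pathFromRow.trim().isEmpty()) {
            return defaultModel; // no path in this row: fall back to the default model
        }
        if (!cacheLoadedModels) {
            return loader.apply(pathFromRow); // caching off: load again on every request
        }
        return cache.computeIfAbsent(pathFromRow, loader); // caching on: load each path once
    }

    public static void main(String[] args) {
        int[] loads = {0};
        ModelResolver<String> resolver = new ModelResolver<>(
                true, "default-model", path -> { loads[0]++; return "model@" + path; });
        System.out.println(resolver.modelFor("/models/a.model")); // prints model@/models/a.model
        System.out.println(resolver.modelFor("/models/a.model")); // served from the cache
        System.out.println(resolver.modelFor(null));              // prints default-model
        System.out.println("loads = " + loads[0]);                // prints loads = 1
    }
}
```

Caching trades memory for load time, which matters most when many rows point at a small set of model files.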
-
setOutputProbabilities
public void setOutputProbabilities(boolean b)
Set whether to predict probabilities
- Parameters:
b - true if a probability distribution is to be output
-
getOutputProbabilities
public boolean getOutputProbabilities()
Get whether to predict probabilities
- Returns:
- true if a probability distribution is to be output
-
getUpdateIncrementalModel
public boolean getUpdateIncrementalModel()
Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
- Returns:
- true if the model is to be updated incrementally with each incoming row
-
setUpdateIncrementalModel
public void setUpdateIncrementalModel(boolean u)
Set whether to update the model incrementally
- Parameters:
u - true if the model should be updated with each incoming row (after predicting it)
-
getXML
-
getXML
public String getXML()
Return the XML describing this (configured) step
- Specified by:
getXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getXML in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- a String containing the XML
-
equals
Check for equality
-
hashCode
public int hashCode()
Hash code method
-
clone
Clone this step's meta data
- Specified by:
clone in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
clone in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- the cloned meta data
-
setDefault
public void setDefault()
- Specified by:
setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
-
loadXML
public void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String, org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleXMLException
Loads the meta data for this (configured) step from XML.
- Specified by:
loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
loadXML in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
stepnode - the step to load
- Throws:
org.pentaho.di.core.exception.KettleXMLException - if an error occurs
-
loadModelFile
- Throws:
Exception
-
deSerializeBase64Model
protected void deSerializeBase64Model(String base64modelXML) throws Exception
- Throws:
Exception
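The name deSerializeBase64Model and its String parameter suggest that a model can be carried inside the step's XML as Base64 text. This page does not describe the encoding the plugin actually uses (for example, whether the bytes are compressed or how they are wrapped in XML), so the following is only a generic sketch of the round trip using standard Java serialization and java.util.Base64. The class Base64ModelSketch and its two methods are hypothetical, and an ArrayList stands in for a real model object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical illustration of a Base64 round trip; not the plugin's implementation.
final class Base64ModelSketch {

    /** Serializes an object with Java serialization and encodes the bytes as Base64 text. */
    static String toBase64(Serializable model) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Decodes Base64 text and deserializes the object it contains. */
    static Object fromBase64(String encoded) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a serializable model object.
        ArrayList<String> standIn = new ArrayList<>(Arrays.asList("weights", "labels"));
        String encoded = toBase64(standIn);
        Object restored = fromBase64(encoded);
        System.out.println(standIn.equals(restored)); // prints true
    }
}
```

Java deserialization executes code from the classes it reads, so text handled this way should come from a trusted source.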
-
readRep
public void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String, org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleException
Read this step's configuration from a repository
- Specified by:
readRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
readRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep - the repository to access
id_step - the id for this step
- Throws:
org.pentaho.di.core.exception.KettleException - if an error occurs
-
saveRep
public void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) throws org.pentaho.di.core.exception.KettleException
Save this step's meta data to a repository
- Specified by:
saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
saveRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep - the repository to save to
id_transformation - transformation id
id_step - step id
- Throws:
org.pentaho.di.core.exception.KettleException - if an error occurs
-
getFields
public void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) throws org.pentaho.di.core.exception.KettleStepException
Generates row meta data to represent the fields output by this step
- Specified by:
getFields in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getFields in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
row - the meta data for the output produced
origin - the name of the step to be used as the origin
info - the input row metadata that enters the step through the specified channels, in the same order as in method getInfoSteps(). The step metadata can then choose whether to use it or ignore it.
nextStep - if non-null, the next step in the transformation: the step that is asking and that the data is targeted towards.
space - the variable space used to resolve variables
- Throws:
org.pentaho.di.core.exception.KettleStepException - if an error occurs
-
check
public void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
Check the settings of this step and put findings in a remarks list.
- Specified by:
check in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
check in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
remarks - the list to put the remarks in. See org.pentaho.di.core.CheckResult
transmeta - the transform meta data
stepMeta - the step meta data
prev - the fields coming from a previous step
input - the input step names
output - the output step names
info - the fields that are used as information by the step
-
getDialogClassName
- Specified by:
getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
-
getStep
public org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
Get the executing step, needed by Trans to launch a step.
- Specified by:
getStep in interface org.pentaho.di.trans.step.StepMetaInterface
- Parameters:
stepMeta - the step info
stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
cnr - the copy number to get.
tr - the transformation info.
trans - the launching transformation
- Returns:
- a StepInterface value
-
getStepData
public org.pentaho.di.trans.step.StepDataInterface getStepData()
Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It holds the data that must persist even if a worker thread is terminated.
- Specified by:
getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
- Returns:
- a StepDataInterface value
-