Package org.pentaho.di.scoring
Class WekaScoringMeta
java.lang.Object
org.pentaho.di.trans.step.BaseStepMeta
org.pentaho.di.scoring.WekaScoringMeta
- All Implemented Interfaces:
Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface
@Step(id="WekaScoring",
image="WEKAS.svg",
name="Weka Scoring",
description="Appends predictions from a pre-built Weka model",
categoryDescription="Data Mining",
documentationUrl="https://pentaho-community.atlassian.net/wiki/display/DATAMINING/Using+the+Weka+Scoring+Plugin")
public class WekaScoringMeta
extends org.pentaho.di.trans.step.BaseStepMeta
implements org.pentaho.di.trans.step.StepMetaInterface
Contains the meta data for the WekaScoring step.
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}org)
-
Field Summary
Fields
static final int DEFAULT_BATCH_SCORING_SIZE: Batch scoring size
protected static Class<?> PKG
static final String XML_TAG
Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta
attributes, databases, log, loggingObject, parentStepMeta, repository, STEP_ATTRIBUTES_FILE
-
Constructor Summary
Constructors
WekaScoringMeta(): Creates a new WekaScoringMeta instance.
-
Method Summary
void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info): Check the settings of this step and put findings in a remarks list.
clone(): Clone this step's meta data
protected void deSerializeBase64Model(String base64modelXML)
boolean equals: Check for equality
getBatchScoringSize(): Get the batch size to use if the model is a batch scoring model
boolean getCacheLoadedModels(): Get whether to cache loaded models in memory
getDefaultModel(): Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
getDialogClassName()
getFieldNameToLoadModelFrom(): Get the name of the incoming field that holds paths to model files
void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space): Generates row meta data to represent the fields output by this step
boolean getFileNameFromField(): Get whether filename is coming from an incoming field
getModel(): Get the Weka model
boolean getOutputProbabilities(): Get whether to predict probabilities
getSavedModelFileName(): Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
getSerializedModelFileName(): Get the filename of the serialized Weka model to load/import from
org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans): Get the executing step, needed by Trans to launch a step.
org.pentaho.di.trans.step.StepDataInterface getStepData(): Get a new instance of the appropriate data class.
boolean getStoreModelInStepMetaData()
boolean getUpdateIncrementalModel(): Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
getXML(): Return the XML describing this (configured) step
protected String getXML(boolean logging)
int hashCode(): Hash code method
protected void loadModelFile
void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String, org.pentaho.di.core.Counter> counters): Loads the meta data for this (configured) step from XML.
void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String, org.pentaho.di.core.Counter> counters): Read this step's configuration from a repository
void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step): Save this step's meta data to a repository
void setBatchScoringSize(String size): Set the batch size to use if the model is a batch scoring model
void setCacheLoadedModels(boolean l): Set whether to cache loaded models in memory
void setDefault()
void setDefaultModel(WekaScoringModel defaultM): Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
void setFieldNameToLoadModelFrom: Set the name of the incoming field that holds paths to model files
void setFileNameFromField(boolean f): Set whether filename is coming from an incoming field
void setModel(WekaScoringModel model): Set the Weka model
void setOutputProbabilities(boolean b): Set whether to predict probabilities
void setSavedModelFileName(String savedM): Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
void setSerializedModelFileName(String mfile): Set the file name of the serialized Weka model to load/import from
void setStoreModelInStepMetaData(boolean b)
void setUpdateIncrementalModel(boolean u): Set whether to update the model incrementally
Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta
analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface
analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
-
Field Details
-
PKG
-
XML_TAG
-
DEFAULT_BATCH_SCORING_SIZE
public static final int DEFAULT_BATCH_SCORING_SIZE
Batch scoring size
-
-
Constructor Details
-
WekaScoringMeta
public WekaScoringMeta()
Creates a new WekaScoringMeta instance.
-
-
Method Details
-
setStoreModelInStepMetaData
public void setStoreModelInStepMetaData(boolean b)
-
getStoreModelInStepMetaData
public boolean getStoreModelInStepMetaData()
-
setBatchScoringSize
Set the batch size to use if the model is a batch scoring model
- Parameters:
size - the size of the batch to use
-
getBatchScoringSize
Get the batch size to use if the model is a batch scoring model
- Returns:
- the size of the batch to use
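The batch size is supplied as a String, while DEFAULT_BATCH_SCORING_SIZE is an int, so the configured text presumably has to be parsed before it is used. The sketch below shows one way this could be done, assuming a blank, non-numeric, or non-positive value falls back to a default. The class BatchSizeResolver and its resolve method are hypothetical and are not part of the plugin; the fallback value is passed in by the caller because this page does not state the constant's value.

```java
/** Hypothetical helper, not part of the plugin API. */
final class BatchSizeResolver {
    private BatchSizeResolver() {}

    /**
     * Parses a configured batch size, falling back to defaultSize when the
     * text is missing, blank, non-numeric, or not positive.
     */
    static int resolve(String configured, int defaultSize) {
        if (configured == null || configured.trim().isEmpty()) {
            return defaultSize;
        }
        try {
            int parsed = Integer.parseInt(configured.trim());
            return parsed > 0 ? parsed : defaultSize;
        } catch (NumberFormatException e) {
            // Assumption: unparseable text is treated like a missing value.
            return defaultSize;
        }
    }
}
```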
-
setFileNameFromField
public void setFileNameFromField(boolean f)
Set whether filename is coming from an incoming field
- Parameters:
f - true if the model to use is specified via path in an incoming field value
-
getFileNameFromField
public boolean getFileNameFromField()
Get whether filename is coming from an incoming field
- Returns:
- true if the model to use is specified via path in an incoming field value
-
setCacheLoadedModels
public void setCacheLoadedModels(boolean l)
Set whether to cache loaded models in memory
- Parameters:
l - true if models are to be cached in memory
-
getCacheLoadedModels
public boolean getCacheLoadedModels()
Get whether to cache loaded models in memory
- Returns:
- true if models are to be cached in memory
-
setFieldNameToLoadModelFrom
Set the name of the incoming field that holds paths to model files
- Parameters:
fn - the name of the incoming field that holds model paths
-
getFieldNameToLoadModelFrom
Get the name of the incoming field that holds paths to model files
- Returns:
- the name of the incoming field that holds model paths
-
setSerializedModelFileName
Set the file name of the serialized Weka model to load/import from
- Parameters:
mfile - the file name
-
getSerializedModelFileName
Get the file name of the serialized Weka model to load/import from
- Returns:
- the file name
-
setSavedModelFileName
Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
- Parameters:
savedM - the file name to save to
-
getSavedModelFileName
Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
- Returns:
- the file name to save to
-
setModel
Set the Weka model
- Parameters:
model - a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
-
getModel
Get the Weka model
- Returns:
- a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
-
getDefaultModel
Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
- Returns:
- the default model to use when there is no filename provided in the incoming data row.
-
setDefaultModel
Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
- Parameters:
defaultM - the default model to use.
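The settings above (file name from field, cache loaded models, default model) interact when a model is chosen per row. The following is a minimal sketch of that interaction under stated assumptions: a row without a path gets the default model, and loaded models are kept in a map keyed by path when caching is on. ModelSelector, its loader function, and the select method are hypothetical names for illustration and do not reflect the plugin's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of per-row model selection; not the plugin's code. */
final class ModelSelector<M> {
    private final Map<String, M> cache = new HashMap<>();
    private final Function<String, M> loader; // loads a model from a path
    private final M defaultModel;             // used when the row gives no path
    private final boolean cacheLoadedModels;

    ModelSelector(Function<String, M> loader, M defaultModel, boolean cacheLoadedModels) {
        this.loader = loader;
        this.defaultModel = defaultModel;
        this.cacheLoadedModels = cacheLoadedModels;
    }

    /** Returns the model for the path found in the current row. */
    M select(String pathFromRow) {
        if (pathFromRow == null || pathFromRow.trim().isEmpty()) {
            return defaultModel;
        }
        if (!cacheLoadedModels) {
            return loader.apply(pathFromRow); // reload on every request
        }
        return cache.computeIfAbsent(pathFromRow, loader); // load once per path
    }
}
```

With caching enabled, repeated rows that name the same path trigger a single load; with caching disabled, memory use stays flat at the cost of reloading.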
-
setOutputProbabilities
public void setOutputProbabilities(boolean b)
Set whether to predict probabilities
- Parameters:
b - true if a probability distribution is to be output
-
getOutputProbabilities
public boolean getOutputProbabilities()
Get whether to predict probabilities
- Returns:
- true if a probability distribution is to be output
-
getUpdateIncrementalModel
public boolean getUpdateIncrementalModel()
Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
- Returns:
- true if the model is to be updated incrementally with each incoming row
-
setUpdateIncrementalModel
public void setUpdateIncrementalModel(boolean u)
Set whether to update the model incrementally
- Parameters:
u - true if the model should be updated with each incoming row (after predicting it)
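The ordering described above (predict a row first, then update the model with it) means each prediction is made by a model that has not yet seen that row. A small sketch of the pattern follows. IncrementalModel, RunningMeanModel, and IncrementalScorer are hypothetical stand-ins used only to show the ordering; they are not Weka or plugin classes.

```java
/** Hypothetical stand-in for an incrementally updatable model. */
interface IncrementalModel {
    double predict(double[] features);
    void update(double[] features, double target);
}

/** Toy model: predicts the running mean of the targets seen so far. */
final class RunningMeanModel implements IncrementalModel {
    private double sum;
    private long count;

    public double predict(double[] features) {
        return count == 0 ? 0.0 : sum / count;
    }

    public void update(double[] features, double target) {
        sum += target;
        count++;
    }
}

final class IncrementalScorer {
    private IncrementalScorer() {}

    /** Scores each row first, then lets the model learn from it. */
    static double[] scoreAndUpdate(IncrementalModel model, double[][] rows, double[] targets) {
        double[] predictions = new double[rows.length];
        for (int i = 0; i < rows.length; i++) {
            // The prediction uses the model as it was before this row.
            predictions[i] = model.predict(rows[i]);
            model.update(rows[i], targets[i]);
        }
        return predictions;
    }
}
```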
-
getXML
-
getXML
Return the XML describing this (configured) step
- Specified by:
getXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getXML in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- a String containing the XML
-
equals
Check for equality
-
hashCode
public int hashCode()
Hash code method
-
clone
Clone this step's meta data
- Specified by:
clone in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
clone in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- the cloned meta data
-
setDefault
public void setDefault()
- Specified by:
setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
-
loadXML
public void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String, org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleXMLException
Loads the meta data for this (configured) step from XML.
- Specified by:
loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
loadXML in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
stepnode - the step to load
- Throws:
org.pentaho.di.core.exception.KettleXMLException - if an error occurs
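getXML and loadXML form a round trip: settings written by one are read back by the other. The sketch below illustrates that idea with the JDK's DOM parser only. ScoringSettings, its two fields, and the tag names output_probabilities and model_file_name are invented for illustration; the tags the plugin actually writes are not documented on this page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical settings holder; tag names are invented for illustration. */
final class ScoringSettings {
    boolean outputProbabilities;
    String modelFileName = "";

    /** Writes the settings as a small XML fragment. */
    String toXml() {
        return "<step>"
            + "<output_probabilities>" + (outputProbabilities ? "Y" : "N") + "</output_probabilities>"
            + "<model_file_name>" + escape(modelFileName) + "</model_file_name>"
            + "</step>";
    }

    /** Restores the settings from a parsed step element. */
    void loadXml(Node stepNode) {
        Element step = (Element) stepNode;
        outputProbabilities = "Y".equalsIgnoreCase(text(step, "output_probabilities"));
        modelFileName = text(step, "model_file_name");
    }

    /** Parses an XML string and returns its root element. */
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    private static String text(Element parent, String tag) {
        Node n = parent.getElementsByTagName(tag).item(0);
        return n == null ? "" : n.getTextContent();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

A missing tag is read as an empty string or false here, so older XML without a newer setting would still load; whether the plugin behaves this way is an assumption.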
-
loadModelFile
- Throws:
Exception
-
deSerializeBase64Model
- Throws:
Exception
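The method name suggests that a model stored in the step meta data is held as Base64 text and turned back into an object on load. The sketch below shows the general technique with standard Java serialization and java.util.Base64. Base64ModelCodec and its methods are hypothetical, and the plugin's real encoding may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

/** Hypothetical codec; illustrates the technique, not the plugin's format. */
final class Base64ModelCodec {
    private Base64ModelCodec() {}

    /** Serializes an object and encodes the bytes as Base64 text. */
    static String serialize(Serializable model) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Decodes Base64 text and deserializes the object it contains. */
    static Object deserialize(String base64) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(base64);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }
}
```

Java deserialization can execute code from the classes it loads, so text of this kind should only be decoded when it comes from a trusted source.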
-
readRep
public void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String, org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleException
Read this step's configuration from a repository
- Specified by:
readRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
readRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep - the repository to access
id_step - the id for this step
- Throws:
org.pentaho.di.core.exception.KettleException - if an error occurs
-
saveRep
public void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) throws org.pentaho.di.core.exception.KettleException
Save this step's meta data to a repository
- Specified by:
saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
saveRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep - the repository to save to
id_transformation - transformation id
id_step - step id
- Throws:
org.pentaho.di.core.exception.KettleException - if an error occurs
-
getFields
public void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) throws org.pentaho.di.core.exception.KettleStepException
Generates row meta data to represent the fields output by this step
- Specified by:
getFields in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getFields in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
row - the meta data for the output produced
origin - the name of the step to be used as the origin
info - the metadata of the input rows that enter the step through the specified channels, in the same order as in method getInfoSteps(). The step metadata can use it or ignore it.
nextStep - if non-null, the next step in the transformation: the step that is asking, and the one the data is targeted towards.
space - the variable space, used to resolve variables
- Throws:
org.pentaho.di.core.exception.KettleStepException - if an error occurs
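Since the step appends predictions to incoming rows, the output layout is the incoming fields plus one or more prediction fields, and the output-probabilities setting would decide how many. The sketch below illustrates this with plain field names. OutputFieldPlanner and the suffixes _predicted and _prob are assumptions made for illustration; the names the plugin really generates are not given on this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of output field planning; naming is assumed. */
final class OutputFieldPlanner {
    private OutputFieldPlanner() {}

    /**
     * Returns the incoming field names followed by the prediction fields:
     * one field per class label when probabilities are requested,
     * otherwise a single predicted-value field.
     */
    static List<String> plan(List<String> incoming, String classAttribute,
                             List<String> classLabels, boolean outputProbabilities) {
        List<String> out = new ArrayList<>(incoming);
        if (outputProbabilities) {
            for (String label : classLabels) {
                out.add(classAttribute + ":" + label + "_prob");
            }
        } else {
            out.add(classAttribute + "_predicted");
        }
        return out;
    }
}
```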
-
check
public void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
Check the settings of this step and put findings in a remarks list.
- Specified by:
check in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
check in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
remarks - the list to put the remarks in. See org.pentaho.di.core.CheckResult
transmeta - the transform meta data
stepMeta - the step meta data
prev - the fields coming from a previous step
input - the input step names
output - the output step names
info - the fields that are used as information by the step
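check reports its findings by appending to the caller's remarks list rather than returning a value or throwing. A minimal sketch of that pattern follows. Remark and StepChecker are hypothetical stand-ins for the Kettle check-result types, and the two conditions tested (input present, model available) are assumptions about what such a check might cover.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a check result. */
final class Remark {
    final boolean ok;
    final String text;

    Remark(boolean ok, String text) {
        this.ok = ok;
        this.text = text;
    }
}

/** Hypothetical checker that appends one remark per finding. */
final class StepChecker {
    private StepChecker() {}

    static void check(List<Remark> remarks, String[] input, boolean modelAvailable) {
        if (input != null && input.length > 0) {
            remarks.add(new Remark(true, "Step is receiving input from other steps."));
        } else {
            remarks.add(new Remark(false, "No input received from other steps."));
        }
        if (modelAvailable) {
            remarks.add(new Remark(true, "A model is available."));
        } else {
            remarks.add(new Remark(false, "No model has been configured."));
        }
    }

    /** Convenience wrapper that starts from an empty list. */
    static List<Remark> check(String[] input, boolean modelAvailable) {
        List<Remark> remarks = new ArrayList<>();
        check(remarks, input, modelAvailable);
        return remarks;
    }
}
```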
-
getDialogClassName
- Specified by:
getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
-
getStep
public org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
Get the executing step, needed by Trans to launch a step.
- Specified by:
getStep in interface org.pentaho.di.trans.step.StepMetaInterface
- Parameters:
stepMeta - the step info
stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
cnr - the copy number to get.
tr - the transformation info.
trans - the launching transformation
- Returns:
- a StepInterface value
-
getStepData
public org.pentaho.di.trans.step.StepDataInterface getStepData()
Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It basically contains the persisting data that needs to live on, even if a worker thread is terminated.
- Specified by:
getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
- Returns:
- a StepDataInterface value
-