Package org.pentaho.di.scoring
Class WekaScoringMeta
- java.lang.Object
-
- org.pentaho.di.trans.step.BaseStepMeta
-
- org.pentaho.di.scoring.WekaScoringMeta
-
- All Implemented Interfaces:
Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface
@Step(id="WekaScoring", image="WEKAS.svg", name="Weka Scoring", description="Appends predictions from a pre-built Weka model", categoryDescription="Data Mining", documentationUrl="https://pentaho-community.atlassian.net/wiki/display/DATAMINING/Using+the+Weka+Scoring+Plugin")
public class WekaScoringMeta extends org.pentaho.di.trans.step.BaseStepMeta implements org.pentaho.di.trans.step.StepMetaInterface
Contains the meta data for the WekaScoring step.
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}org)
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static int | DEFAULT_BATCH_SCORING_SIZE | Batch scoring size
protected static Class<?> | PKG |
static String | XML_TAG |
-
Constructor Summary
Constructors
Constructor | Description
WekaScoringMeta() | Creates a new WekaScoringMeta instance.
-
Method Summary
All Methods / Instance Methods / Concrete Methods
Modifier and Type | Method | Description
void | check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info) | Check the settings of this step and put findings in a remarks list.
Object | clone() | Clone this step's meta data
protected void | deSerializeBase64Model(String base64modelXML) |
boolean | equals(Object obj) | Check for equality
String | getBatchScoringSize() | Get the batch size to use if the model is a batch scoring model
boolean | getCacheLoadedModels() | Get whether to cache loaded models in memory
WekaScoringModel | getDefaultModel() | Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
String | getDialogClassName() |
String | getFieldNameToLoadModelFrom() | Get the name of the incoming field that holds paths to model files
void | getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) | Generates row meta data to represent the fields output by this step
boolean | getFileNameFromField() | Get whether filename is coming from an incoming field
WekaScoringModel | getModel() | Get the Weka model
boolean | getOutputProbabilities() | Get whether to predict probabilities
String | getSavedModelFileName() | Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
String | getSerializedModelFileName() | Get the filename of the serialized Weka model to load/import from
org.pentaho.di.trans.step.StepInterface | getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans) | Get the executing step, needed by Trans to launch a step.
org.pentaho.di.trans.step.StepDataInterface | getStepData() | Get a new instance of the appropriate data class.
boolean | getStoreModelInStepMetaData() |
boolean | getUpdateIncrementalModel() | Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
String | getXML() | Return the XML describing this (configured) step
protected String | getXML(boolean logging) |
int | hashCode() | Hash code method
protected void | loadModelFile() |
void | loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters) | Loads the meta data for this (configured) step from XML.
void | readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters) | Read this step's configuration from a repository
void | saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) | Save this step's meta data to a repository
void | setBatchScoringSize(String size) | Set the batch size to use if the model is a batch scoring model
void | setCacheLoadedModels(boolean l) | Set whether to cache loaded models in memory
void | setDefault() |
void | setDefaultModel(WekaScoringModel defaultM) | Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
void | setFieldNameToLoadModelFrom(String fn) | Set the name of the incoming field that holds paths to model files
void | setFileNameFromField(boolean f) | Set whether filename is coming from an incoming field
void | setModel(WekaScoringModel model) | Set the Weka model
void | setOutputProbabilities(boolean b) | Set whether to predict probabilities
void | setSavedModelFileName(String savedM) | Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
void | setSerializedModelFileName(String mfile) | Set the file name of the serialized Weka model to load/import from
void | setStoreModelInStepMetaData(boolean b) |
void | setUpdateIncrementalModel(boolean u) | Set whether to update the model incrementally
Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta
analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling
-
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface
analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
-
-
-
-
Field Detail
-
PKG
protected static Class<?> PKG
-
XML_TAG
public static final String XML_TAG
- See Also:
- Constant Field Values
-
DEFAULT_BATCH_SCORING_SIZE
public static final int DEFAULT_BATCH_SCORING_SIZE
Batch scoring size
- See Also:
- Constant Field Values
-
-
Method Detail
-
setStoreModelInStepMetaData
public void setStoreModelInStepMetaData(boolean b)
-
getStoreModelInStepMetaData
public boolean getStoreModelInStepMetaData()
-
setBatchScoringSize
public void setBatchScoringSize(String size)
Set the batch size to use if the model is a batch scoring model
- Parameters:
- size - the size of the batch to use
-
getBatchScoringSize
public String getBatchScoringSize()
Get the batch size to use if the model is a batch scoring model
- Returns:
- the size of the batch to use
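
The batch size is exposed as a String rather than an int. One plausible reason, which is an assumption and not stated on this page, is that the configured value may be something other than a plain number until it is resolved at run time. The sketch below shows one defensive way a caller might turn such text into a usable size. The helper `resolveBatchSize` and its `fallback` parameter are hypothetical and not part of WekaScoringMeta; the actual value of DEFAULT_BATCH_SCORING_SIZE is not given here, so the example passes its own fallback.

```java
final class BatchSizeSketch {

    /**
     * Hypothetical helper: parses a batch size held as text.
     * Returns the fallback when the text is null, blank, not an integer, or not positive.
     */
    static int resolveBatchSize(String configured, int fallback) {
        if (configured == null || configured.trim().isEmpty()) {
            return fallback;
        }
        try {
            int parsed = Integer.parseInt(configured.trim());
            return parsed > 0 ? parsed : fallback;
        } catch (NumberFormatException e) {
            // Unresolved or malformed text: use the fallback instead of failing.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveBatchSize("250", 100));
        System.out.println(resolveBatchSize("not-a-number", 100));
    }
}
```

In real step code the text would come from getBatchScoringSize(), and the fallback would most naturally be DEFAULT_BATCH_SCORING_SIZE.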
-
setFileNameFromField
public void setFileNameFromField(boolean f)
Set whether filename is coming from an incoming field
- Parameters:
- f - true if the model to use is specified via path in an incoming field value
-
getFileNameFromField
public boolean getFileNameFromField()
Get whether filename is coming from an incoming field
- Returns:
- true if the model to use is specified via path in an incoming field value
-
setCacheLoadedModels
public void setCacheLoadedModels(boolean l)
Set whether to cache loaded models in memory
- Parameters:
- l - true if models are to be cached in memory
-
getCacheLoadedModels
public boolean getCacheLoadedModels()
Get whether to cache loaded models in memory
- Returns:
- true if models are to be cached in memory
-
setFieldNameToLoadModelFrom
public void setFieldNameToLoadModelFrom(String fn)
Set the name of the incoming field that holds paths to model files
- Parameters:
- fn - the name of the incoming field that holds model paths
-
getFieldNameToLoadModelFrom
public String getFieldNameToLoadModelFrom()
Get the name of the incoming field that holds paths to model files
- Returns:
- the name of the incoming field that holds model paths
-
setSerializedModelFileName
public void setSerializedModelFileName(String mfile)
Set the file name of the serialized Weka model to load/import from
- Parameters:
- mfile - the file name
-
getSerializedModelFileName
public String getSerializedModelFileName()
Get the file name of the serialized Weka model to load/import from
- Returns:
- the file name
-
setSavedModelFileName
public void setSavedModelFileName(String savedM)
Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
- Parameters:
- savedM - the file name to save to
-
getSavedModelFileName
public String getSavedModelFileName()
Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
- Returns:
- the file name to save to
-
setModel
public void setModel(WekaScoringModel model)
Set the Weka model
- Parameters:
- model - a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
-
getModel
public WekaScoringModel getModel()
Get the Weka model
- Returns:
- a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
-
getDefaultModel
public WekaScoringModel getDefaultModel()
Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
- Returns:
- the default model to use when there is no filename provided in the incoming data row.
-
setDefaultModel
public void setDefaultModel(WekaScoringModel defaultM)
Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
- Parameters:
- defaultM - the default model to use.
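
The file-name-from-field, cache-loaded-models and default-model settings describe a per-row lookup: take the path from the incoming field, fall back to the default model when the row supplies none, and optionally keep loaded models in memory. The sketch below illustrates that lookup in isolation. It is not the step's implementation: `ModelLookupSketch`, its generic model type `M` and the `loader` function are hypothetical stand-ins, and treating a blank path as "no filename provided" is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ModelLookupSketch<M> {

    private final Map<String, M> cache = new HashMap<>();
    private final Function<String, M> loader;   // hypothetical: loads a model from a path
    private final M defaultModel;               // used when the row carries no path
    private final boolean cacheLoadedModels;

    ModelLookupSketch(Function<String, M> loader, M defaultModel, boolean cacheLoadedModels) {
        this.loader = loader;
        this.defaultModel = defaultModel;
        this.cacheLoadedModels = cacheLoadedModels;
    }

    /** Returns the model for the path found in the current row. */
    M modelFor(String pathFromField) {
        if (pathFromField == null || pathFromField.trim().isEmpty()) {
            return defaultModel;
        }
        if (!cacheLoadedModels) {
            return loader.apply(pathFromField);   // reload on every call
        }
        return cache.computeIfAbsent(pathFromField, loader);   // load once per path
    }

    public static void main(String[] args) {
        ModelLookupSketch<String> lookup =
            new ModelLookupSketch<>(path -> "model@" + path, "default-model", true);
        System.out.println(lookup.modelFor("churn.model"));
        System.out.println(lookup.modelFor(""));
    }
}
```

Caching trades memory for load time, which matters most when many rows point at a small set of model files.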
-
setOutputProbabilities
public void setOutputProbabilities(boolean b)
Set whether to predict probabilities
- Parameters:
- b - true if a probability distribution is to be output
-
getOutputProbabilities
public boolean getOutputProbabilities()
Get whether to predict probabilities
- Returns:
- true if a probability distribution is to be output
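
To make the difference between a single prediction and a probability distribution concrete, the following sketch appends either the most likely class label or one value per class to a row. It is only an illustration: `appendPrediction`, the row layout and the label array are hypothetical, and this page does not describe the names or order of the fields the step actually appends.

```java
import java.util.Arrays;

final class PredictionOutputSketch {

    /**
     * Hypothetical helper: returns a copy of the row with prediction output appended.
     * With outputProbabilities, one probability per class is appended;
     * otherwise only the label with the highest probability is appended.
     */
    static Object[] appendPrediction(Object[] row, double[] distribution,
                                     String[] labels, boolean outputProbabilities) {
        if (outputProbabilities) {
            Object[] out = Arrays.copyOf(row, row.length + distribution.length);
            for (int i = 0; i < distribution.length; i++) {
                out[row.length + i] = distribution[i];
            }
            return out;
        }
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        Object[] out = Arrays.copyOf(row, row.length + 1);
        out[row.length] = labels[best];
        return out;
    }

    public static void main(String[] args) {
        Object[] row = {"customer-1"};
        double[] dist = {0.2, 0.7, 0.1};
        String[] labels = {"low", "medium", "high"};
        System.out.println(Arrays.toString(appendPrediction(row, dist, labels, false)));
        System.out.println(Arrays.toString(appendPrediction(row, dist, labels, true)));
    }
}
```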
-
getUpdateIncrementalModel
public boolean getUpdateIncrementalModel()
Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
- Returns:
- true if the model is to be updated incrementally with each incoming row
-
setUpdateIncrementalModel
public void setUpdateIncrementalModel(boolean u)
Set whether to update the model incrementally
- Parameters:
- u - true if the model should be updated with each incoming row (after predicting it)
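
The ordering described here, predict first and update afterwards, means each row is scored by a model that has not yet seen that row. The sketch below demonstrates the ordering with a toy model. Everything in it is hypothetical: `IncrementalModel` and `RunningMean` stand in for a real updateable Weka model, and `score` is not a method of this class.

```java
import java.util.ArrayList;
import java.util.List;

final class PredictThenUpdateSketch {

    /** Hypothetical stand-in for an incrementally updateable model. */
    interface IncrementalModel {
        double predict(double[] features);
        void update(double[] features, double actual);
    }

    /** Toy model: predicts the mean of the target values seen so far (0.0 before any). */
    static final class RunningMean implements IncrementalModel {
        private double sum;
        private int count;

        public double predict(double[] features) {
            return count == 0 ? 0.0 : sum / count;
        }

        public void update(double[] features, double actual) {
            sum += actual;
            count++;
        }
    }

    /** Scores each row, then optionally trains on it, in that order. */
    static List<Double> score(IncrementalModel model, double[][] rows,
                              double[] targets, boolean updateIncrementally) {
        List<Double> predictions = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            predictions.add(model.predict(rows[i]));   // 1. predict with the current model
            if (updateIncrementally) {
                model.update(rows[i], targets[i]);     // 2. then learn from the row
            }
        }
        return predictions;
    }

    public static void main(String[] args) {
        double[][] rows = {{1.0}, {2.0}, {3.0}};
        double[] targets = {2.0, 4.0, 6.0};
        System.out.println(score(new RunningMean(), rows, targets, true));
        System.out.println(score(new RunningMean(), rows, targets, false));
    }
}
```

With updating enabled, the model in memory differs from the file it was loaded from once the stream ends, which is where the saved model file name setting comes in.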
-
getXML
protected String getXML(boolean logging)
-
getXML
public String getXML()
Return the XML describing this (configured) step
- Specified by:
- getXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
- getXML in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- a String containing the XML
-
equals
public boolean equals(Object obj)
Check for equality
-
hashCode
public int hashCode()
Hash code method
-
clone
public Object clone()
Clone this step's meta data
- Specified by:
- clone in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
- clone in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- the cloned meta data
-
setDefault
public void setDefault()
- Specified by:
- setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
-
loadXML
public void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleXMLException
Loads the meta data for this (configured) step from XML.
- Specified by:
- loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
- loadXML in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
- stepnode - the step to load
- Throws:
- org.pentaho.di.core.exception.KettleXMLException - if an error occurs
-
deSerializeBase64Model
protected void deSerializeBase64Model(String base64modelXML) throws Exception
- Throws:
Exception
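
This method is undocumented, but its name and its String parameter suggest that a model stored inside the step's XML is carried as Base64 text. As an illustration of that general technique only, the sketch below round-trips a serializable Java object through Base64 using the standard library. This is an assumption about the approach, not the step's actual code: the real method may use a different encoding, compression or object layout, and `Base64ModelSketch` with its `encode` and `decode` helpers is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

final class Base64ModelSketch {

    /** Serializes an object with Java serialization and encodes the bytes as Base64 text. */
    static String encode(Serializable model) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Decodes Base64 text and deserializes the object it contains. */
    static Object decode(String base64) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(base64);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        String text = encode("stand-in for a model object");
        System.out.println(decode(text));
    }
}
```

Java deserialization executes code paths chosen by the incoming bytes, so text of this kind should only be decoded when it comes from a trusted transformation file or repository.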
-
readRep
public void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleException
Read this step's configuration from a repository
- Specified by:
- readRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
- readRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
- rep - the repository to access
- id_step - the id for this step
- Throws:
- org.pentaho.di.core.exception.KettleException - if an error occurs
-
saveRep
public void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) throws org.pentaho.di.core.exception.KettleException
Save this step's meta data to a repository
- Specified by:
- saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
- saveRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
- rep - the repository to save to
- id_transformation - transformation id
- id_step - step id
- Throws:
- org.pentaho.di.core.exception.KettleException - if an error occurs
-
getFields
public void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) throws org.pentaho.di.core.exception.KettleStepException
Generates row meta data to represent the fields output by this step
- Specified by:
- getFields in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
- getFields in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
- row - the meta data for the output produced
- origin - the name of the step to be used as the origin
- info - the input row metadata that enters the step through the specified channels, in the same order as in method getInfoSteps(). The step metadata can then choose whether to use it or ignore it.
- nextStep - if non-null, the next step in the transformation: the step that is asking, and the one the data is targeted towards.
- space - the variable space (org.pentaho.di.core.variables.VariableSpace), presumably used to resolve variables; the original documentation leaves this parameter undescribed
- Throws:
- org.pentaho.di.core.exception.KettleStepException - if an error occurs
-
check
public void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
Check the settings of this step and put findings in a remarks list.
- Specified by:
- check in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
- check in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
- remarks - the list to put the remarks in; see org.pentaho.di.core.CheckResult
- transmeta - the transform meta data
- stepMeta - the step meta data
- prev - the fields coming from a previous step
- input - the input step names
- output - the output step names
- info - the fields that are used as information by the step
-
getDialogClassName
public String getDialogClassName()
- Specified by:
- getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
- getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
-
getStep
public org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
Get the executing step, needed by Trans to launch a step.
- Specified by:
- getStep in interface org.pentaho.di.trans.step.StepMetaInterface
- Parameters:
- stepMeta - the step info
- stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
- cnr - the copy number to get.
- tr - the transformation info.
- trans - the launching transformation
- Returns:
- a StepInterface value
-
getStepData
public org.pentaho.di.trans.step.StepDataInterface getStepData()
Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It holds the data that must persist even if a worker thread is terminated.
- Specified by:
- getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
- Returns:
- a StepDataInterface value
-
-