Package org.pentaho.di.scoring
Class WekaScoringMeta
- java.lang.Object
-
- org.pentaho.di.trans.step.BaseStepMeta
-
- org.pentaho.di.scoring.WekaScoringMeta
-
- All Implemented Interfaces:
Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface
@Step(id="WekaScoring", image="WEKAS.svg", name="Weka Scoring", description="Appends predictions from a pre-built Weka model", categoryDescription="Data Mining", documentationUrl="https://pentaho-community.atlassian.net/wiki/display/DATAMINING/Using+the+Weka+Scoring+Plugin")
public class WekaScoringMeta
extends org.pentaho.di.trans.step.BaseStepMeta
implements org.pentaho.di.trans.step.StepMetaInterface
Contains the meta data for the WekaScoring step.
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}org)
-
-
Field Summary
Fields
Modifier and Type, Field, Description
static int DEFAULT_BATCH_SCORING_SIZE
Batch scoring size
protected static Class<?> PKG
static String XML_TAG
-
Constructor Summary
Constructors
Constructor, Description
WekaScoringMeta()
Creates a new WekaScoringMeta instance.
-
Method Summary
Modifier and Type, Method, Description
void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
Check the settings of this step and put findings in a remarks list.
Object clone()
Clone this step's meta data
protected void deSerializeBase64Model(String base64modelXML)
boolean equals(Object obj)
Check for equality
String getBatchScoringSize()
Get the batch size to use if the model is a batch scoring model
boolean getCacheLoadedModels()
Get whether to cache loaded models in memory
WekaScoringModel getDefaultModel()
Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
String getDialogClassName()
String getFieldNameToLoadModelFrom()
Get the name of the incoming field that holds paths to model files
void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space)
Generates row meta data to represent the fields output by this step
boolean getFileNameFromField()
Get whether filename is coming from an incoming field
WekaScoringModel getModel()
Get the Weka model
boolean getOutputProbabilities()
Get whether to predict probabilities
String getSavedModelFileName()
Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
String getSerializedModelFileName()
Get the filename of the serialized Weka model to load/import from
org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
Get the executing step, needed by Trans to launch a step.
org.pentaho.di.trans.step.StepDataInterface getStepData()
Get a new instance of the appropriate data class.
boolean getStoreModelInStepMetaData()
boolean getUpdateIncrementalModel()
Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
String getXML()
Return the XML describing this (configured) step
protected String getXML(boolean logging)
int hashCode()
Hash code method
protected void loadModelFile()
void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters)
Loads the meta data for this (configured) step from XML.
void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters)
Read this step's configuration from a repository
void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step)
Save this step's meta data to a repository
void setBatchScoringSize(String size)
Set the batch size to use if the model is a batch scoring model
void setCacheLoadedModels(boolean l)
Set whether to cache loaded models in memory
void setDefault()
void setDefaultModel(WekaScoringModel defaultM)
Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
void setFieldNameToLoadModelFrom(String fn)
Set the name of the incoming field that holds paths to model files
void setFileNameFromField(boolean f)
Set whether filename is coming from an incoming field
void setModel(WekaScoringModel model)
Set the Weka model
void setOutputProbabilities(boolean b)
Set whether to predict probabilities
void setSavedModelFileName(String savedM)
Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
void setSerializedModelFileName(String mfile)
Set the file name of the serialized Weka model to load/import from
void setStoreModelInStepMetaData(boolean b)
void setUpdateIncrementalModel(boolean u)
Set whether to update the model incrementally
-
Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta
analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling
-
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface
analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
-
Field Detail
-
PKG
protected static Class<?> PKG
-
XML_TAG
public static final String XML_TAG
- See Also:
- Constant Field Values
-
DEFAULT_BATCH_SCORING_SIZE
public static final int DEFAULT_BATCH_SCORING_SIZE
Batch scoring size
- See Also:
- Constant Field Values
-
-
Method Detail
-
setStoreModelInStepMetaData
public void setStoreModelInStepMetaData(boolean b)
-
getStoreModelInStepMetaData
public boolean getStoreModelInStepMetaData()
-
setBatchScoringSize
public void setBatchScoringSize(String size)
Set the batch size to use if the model is a batch scoring model
- Parameters:
size
- the size of the batch to use
-
getBatchScoringSize
public String getBatchScoringSize()
Get the batch size to use if the model is a batch scoring model
- Returns:
- the size of the batch to use
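The batch size is exchanged as a String, while DEFAULT_BATCH_SCORING_SIZE is an int, so a caller has to convert the value at some point. The sketch below shows one way to do that with a fallback. It is not the plugin's own code: BatchSizeSketch, parseBatchSize and the FALLBACK_BATCH_SIZE value are hypothetical (the real constant's value is not shown on this page), and treating an unparsable string, such as an unresolved variable expression, as "use the fallback" is an assumption.

```java
// Hypothetical helper for illustration; not part of WekaScoringMeta.
final class BatchSizeSketch {
    // Placeholder only: the actual DEFAULT_BATCH_SCORING_SIZE value is not given on this page.
    static final int FALLBACK_BATCH_SIZE = 100;

    /**
     * Converts a batch size string to an int. Returns the fallback when the
     * string is null, blank, not a number, or not strictly positive.
     */
    static int parseBatchSize(String size, int fallback) {
        if (size == null || size.trim().isEmpty()) {
            return fallback;
        }
        try {
            int parsed = Integer.parseInt(size.trim());
            return parsed > 0 ? parsed : fallback;
        } catch (NumberFormatException e) {
            // For example an unresolved variable expression.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseBatchSize("250", FALLBACK_BATCH_SIZE));
        System.out.println(parseBatchSize("${BATCH}", FALLBACK_BATCH_SIZE));
    }
}
```

In a real transformation the string would typically be passed through the step's variable substitution before parsing; that step is omitted here to keep the sketch self-contained.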
-
setFileNameFromField
public void setFileNameFromField(boolean f)
Set whether filename is coming from an incoming field
- Parameters:
f
- true if the model to use is specified via path in an incoming field value
-
getFileNameFromField
public boolean getFileNameFromField()
Get whether filename is coming from an incoming field
- Returns:
- true if the model to use is specified via path in an incoming field value
-
setCacheLoadedModels
public void setCacheLoadedModels(boolean l)
Set whether to cache loaded models in memory
- Parameters:
l
- true if models are to be cached in memory
-
getCacheLoadedModels
public boolean getCacheLoadedModels()
Get whether to cache loaded models in memory
- Returns:
- true if models are to be cached in memory
-
setFieldNameToLoadModelFrom
public void setFieldNameToLoadModelFrom(String fn)
Set the name of the incoming field that holds paths to model files
- Parameters:
fn
- the name of the incoming field that holds model paths
-
getFieldNameToLoadModelFrom
public String getFieldNameToLoadModelFrom()
Get the name of the incoming field that holds paths to model files
- Returns:
- the name of the incoming field that holds model paths
-
setSerializedModelFileName
public void setSerializedModelFileName(String mfile)
Set the file name of the serialized Weka model to load/import from
- Parameters:
mfile
- the file name
-
getSerializedModelFileName
public String getSerializedModelFileName()
Get the filename of the serialized Weka model to load/import from
- Returns:
- the file name
-
setSavedModelFileName
public void setSavedModelFileName(String savedM)
Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
- Parameters:
savedM
- the file name to save to
-
getSavedModelFileName
public String getSavedModelFileName()
Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
- Returns:
- the file name to save to
-
setModel
public void setModel(WekaScoringModel model)
Set the Weka model
- Parameters:
model
- a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
-
getModel
public WekaScoringModel getModel()
Get the Weka model
- Returns:
- a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
-
getDefaultModel
public WekaScoringModel getDefaultModel()
Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
- Returns:
- the default model to use when there is no filename provided in the incoming data row.
-
setDefaultModel
public void setDefaultModel(WekaScoringModel defaultM)
Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
- Parameters:
defaultM
- the default model to use.
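The file-name-from-field, cache and default-model settings described above interact: a row may carry a model path, loaded models may be kept in memory, and the default model is used when a row provides no file name. The following is a minimal sketch of that selection logic under stated assumptions. ModelSelectionSketch, modelFor and the loader function are hypothetical names, the model type is left generic instead of using WekaScoringModel, and the actual step may resolve and cache models differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of per-row model selection; not the plugin's implementation.
final class ModelSelectionSketch<M> {
    private final M defaultModel;              // plays the role of getDefaultModel()
    private final boolean cacheLoadedModels;   // plays the role of getCacheLoadedModels()
    private final Function<String, M> loader;  // assumed: loads a model from a path
    private final Map<String, M> cache = new HashMap<>();

    ModelSelectionSketch(M defaultModel, boolean cacheLoadedModels, Function<String, M> loader) {
        this.defaultModel = defaultModel;
        this.cacheLoadedModels = cacheLoadedModels;
        this.loader = loader;
    }

    /** Picks the model for one row, given the path found in the incoming field. */
    M modelFor(String pathFromField) {
        if (pathFromField == null || pathFromField.trim().isEmpty()) {
            // No file name in this row: fall back to the default model.
            return defaultModel;
        }
        if (!cacheLoadedModels) {
            // Caching disabled: load on every request.
            return loader.apply(pathFromField);
        }
        // Caching enabled: load once per distinct path.
        return cache.computeIfAbsent(pathFromField, loader);
    }

    public static void main(String[] args) {
        ModelSelectionSketch<String> selection =
            new ModelSelectionSketch<>("default", true, path -> "model:" + path);
        System.out.println(selection.modelFor(null));
        System.out.println(selection.modelFor("a.model"));
    }
}
```

Caching trades memory for load time, which matters most when many rows refer to the same few model files.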
-
setOutputProbabilities
public void setOutputProbabilities(boolean b)
Set whether to predict probabilities
- Parameters:
b
- true if a probability distribution is to be output
-
getOutputProbabilities
public boolean getOutputProbabilities()
Get whether to predict probabilities
- Returns:
- true if a probability distribution is to be output
-
getUpdateIncrementalModel
public boolean getUpdateIncrementalModel()
Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
- Returns:
- true if the model is to be updated incrementally with each incoming row
-
setUpdateIncrementalModel
public void setUpdateIncrementalModel(boolean u)
Set whether to update the model incrementally
- Parameters:
u
- true if the model should be updated with each incoming row (after predicting it)
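The ordering stated above, predict first and only then update the model with the same row, can be illustrated with a toy example. This is a sketch only: IncrementalScoringSketch and its running-mean "model" are hypothetical stand-ins chosen to keep the example self-contained, and they are not related to any Weka classifier.

```java
// Hypothetical toy model for illustration; not a Weka classifier.
final class IncrementalScoringSketch {
    private double sum;
    private long count;

    /** Predicts the mean of the values seen so far (0.0 before any update). */
    double predict() {
        return count == 0 ? 0.0 : sum / count;
    }

    /** Folds one observed value into the model. */
    void update(double actual) {
        sum += actual;
        count++;
    }

    /** Scores each row, then optionally updates the model with that same row. */
    static double[] score(double[] actuals, boolean updateIncrementalModel) {
        IncrementalScoringSketch model = new IncrementalScoringSketch();
        double[] predictions = new double[actuals.length];
        for (int i = 0; i < actuals.length; i++) {
            predictions[i] = model.predict();   // prediction is made first
            if (updateIncrementalModel) {
                model.update(actuals[i]);       // then the row updates the model
            }
        }
        return predictions;
    }

    public static void main(String[] args) {
        // Prints [0.0, 2.0, 3.0]: each prediction only reflects earlier rows.
        System.out.println(java.util.Arrays.toString(score(new double[] {2, 4, 6}, true)));
    }
}
```

Because each row is scored before it is learned from, a prediction never depends on the row's own target value.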
-
getXML
protected String getXML(boolean logging)
-
getXML
public String getXML()
Return the XML describing this (configured) step
- Specified by:
getXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getXML in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- a String containing the XML
-
equals
public boolean equals(Object obj)
Check for equality
-
hashCode
public int hashCode()
Hash code method
-
clone
public Object clone()
Clone this step's meta data
- Specified by:
clone in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
clone in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- the cloned meta data
-
setDefault
public void setDefault()
- Specified by:
setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
-
loadXML
public void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleXMLException
Loads the meta data for this (configured) step from XML.
- Specified by:
loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
loadXML in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
stepnode
- the step to load
- Throws:
org.pentaho.di.core.exception.KettleXMLException
- if an error occurs
-
deSerializeBase64Model
protected void deSerializeBase64Model(String base64modelXML) throws Exception
- Throws:
Exception
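This page does not document how deSerializeBase64Model decodes its argument. As a general illustration of the idea of carrying a serialized object as Base64 text (for example inside step XML), here is a minimal round-trip sketch using only the JDK. Base64ModelSketch, toBase64 and fromBase64 are hypothetical names, and the use of plain Java serialization with java.util.Base64 is an assumption, not a description of the plugin's actual encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical illustration; the plugin may encode its models differently.
final class Base64ModelSketch {
    /** Serializes an object and returns the bytes as Base64 text. */
    static String toBase64(Serializable model) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Decodes Base64 text and deserializes the object it contains. */
    static Object fromBase64(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = toBase64("stand-in for a model");
        System.out.println(fromBase64(encoded));
    }
}
```

Deserializing untrusted input with ObjectInputStream is unsafe in general, so a production version would restrict which classes may be read.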
-
readRep
public void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleException
Read this step's configuration from a repository
- Specified by:
readRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
readRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep
- the repository to access
id_step
- the id for this step
- Throws:
org.pentaho.di.core.exception.KettleException
- if an error occurs
-
saveRep
public void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) throws org.pentaho.di.core.exception.KettleException
Save this step's meta data to a repository
- Specified by:
saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
saveRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep
- the repository to save to
id_transformation
- transformation id
id_step
- step id
- Throws:
org.pentaho.di.core.exception.KettleException
- if an error occurs
-
getFields
public void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) throws org.pentaho.di.core.exception.KettleStepException
Generates row meta data to represent the fields output by this step
- Specified by:
getFields in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getFields in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
row
- the meta data for the output produced
origin
- the name of the step to be used as the origin
info
- the metadata of the input rows that enter the step through the specified channels, in the same order as in method getInfoSteps(). The step metadata can then choose whether to use it or ignore it.
nextStep
- if non-null, the next step in the transformation, that is, the step that is asking and to which the data is targeted
space
- the variable space, presumably used to resolve variables (the original documentation leaves this parameter undescribed)
- Throws:
org.pentaho.di.core.exception.KettleStepException
- if an error occurs
-
check
public void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
Check the settings of this step and put findings in a remarks list.
- Specified by:
check in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
check in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
remarks
- the list to put the remarks in. See org.pentaho.di.core.CheckResult
transmeta
- the transform meta data
stepMeta
- the step meta data
prev
- the fields coming from a previous step
input
- the input step names
output
- the output step names
info
- the fields that are used as information by the step
-
getDialogClassName
public String getDialogClassName()
- Specified by:
getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
-
getStep
public org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
Get the executing step, needed by Trans to launch a step.
- Specified by:
getStep in interface org.pentaho.di.trans.step.StepMetaInterface
- Parameters:
stepMeta
- the step info
stepDataInterface
- the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
cnr
- the copy number to get.
tr
- the transformation info.
trans
- the launching transformation
- Returns:
- a StepInterface value
-
getStepData
public org.pentaho.di.trans.step.StepDataInterface getStepData()
Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It holds the persistent data that must live on even if a worker thread is terminated.
- Specified by:
getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
- Returns:
- a StepDataInterface value
-
-