Class WekaScoringMeta

java.lang.Object
org.pentaho.di.trans.step.BaseStepMeta
org.pentaho.di.scoring.WekaScoringMeta
All Implemented Interfaces:
Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface

@Step(id="WekaScoring", image="WEKAS.svg", name="Weka Scoring", description="Appends predictions from a pre-built Weka model", categoryDescription="Data Mining", documentationUrl="https://pentaho-community.atlassian.net/wiki/display/DATAMINING/Using+the+Weka+Scoring+Plugin")
public class WekaScoringMeta
extends org.pentaho.di.trans.step.BaseStepMeta
implements org.pentaho.di.trans.step.StepMetaInterface
Contains the meta data for the WekaScoring step.
Author:
Mark Hall (mhall{[at]}pentaho{[dot]}org)
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    Batch scoring size
    protected static Class<?>
     
    static final String
     

    Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta

    attributes, databases, log, loggingObject, parentStepMeta, repository, STEP_ATTRIBUTES_FILE
  • Constructor Summary

    Constructors
    Constructor
    Description
    WekaScoringMeta()
    Creates a new WekaScoringMeta instance.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
    Check the settings of this step and put findings in a remarks list.
    Object
    clone()
    Clone this step's meta data
    protected void
    deSerializeBase64Model(String base64modelXML)
     
    boolean
    equals(Object obj)
    Check for equality
    String
    getBatchScoringSize()
    Get the batch size to use if the model is a batch scoring model
    boolean
    getCacheLoadedModels()
    Get whether to cache loaded models in memory
    WekaScoringModel
    getDefaultModel()
    Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
    String
    getDialogClassName()
     
    String
    getFieldNameToLoadModelFrom()
    Get the name of the incoming field that holds paths to model files
    void
    getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space)
    Generates row meta data to represent the fields output by this step
    boolean
    getFileNameFromField()
    Get whether filename is coming from an incoming field
    WekaScoringModel
    getModel()
    Get the Weka model
    boolean
    getOutputProbabilities()
    Get whether to predict probabilities
    String
    getSavedModelFileName()
    Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
    String
    getSerializedModelFileName()
    Get the filename of the serialized Weka model to load/import from
    org.pentaho.di.trans.step.StepInterface
    getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
    Get the executing step, needed by Trans to launch a step.
    org.pentaho.di.trans.step.StepDataInterface
    getStepData()
    Get a new instance of the appropriate data class.
    boolean
    getStoreModelInStepMetaData()
     
    boolean
    getUpdateIncrementalModel()
    Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
    String
    getXML()
    Return the XML describing this (configured) step
    protected String
    getXML(boolean logging)
     
    int
    hashCode()
    Hash code method
    protected void
    loadModelFile()
     
    void
    loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters)
    Loads the meta data for this (configured) step from XML.
    void
    readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters)
    Read this step's configuration from a repository
    void
    saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step)
    Save this step's meta data to a repository
    void
    setBatchScoringSize(String size)
    Set the batch size to use if the model is a batch scoring model
    void
    setCacheLoadedModels(boolean l)
    Set whether to cache loaded models in memory
    void
    setDefault()
     
    void
    setDefaultModel(WekaScoringModel defaultM)
    Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
    void
    setFieldNameToLoadModelFrom(String fn)
    Set the name of the incoming field that holds paths to model files
    void
    setFileNameFromField(boolean f)
    Set whether filename is coming from an incoming field
    void
    setModel(WekaScoringModel model)
    Set the Weka model
    void
    setOutputProbabilities(boolean b)
    Set whether to predict probabilities
    void
    setSavedModelFileName(String savedM)
    Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
    void
    setSerializedModelFileName(String mfile)
    Set the file name of the serialized Weka model to load/import from
    void
    setStoreModelInStepMetaData(boolean b)
     
    void
    setUpdateIncrementalModel(boolean u)
    Set whether to update the model incrementally

    Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta

    analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling

    Methods inherited from class java.lang.Object

    finalize, getClass, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface

    analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
  • Field Details

  • Constructor Details

    • WekaScoringMeta

      public WekaScoringMeta()
      Creates a new WekaScoringMeta instance.
  • Method Details

    • setStoreModelInStepMetaData

      public void setStoreModelInStepMetaData(boolean b)
    • getStoreModelInStepMetaData

      public boolean getStoreModelInStepMetaData()
    • setBatchScoringSize

      public void setBatchScoringSize(String size)
      Set the batch size to use if the model is a batch scoring model
      Parameters:
      size - the size of the batch to use
    • getBatchScoringSize

      public String getBatchScoringSize()
      Get the batch size to use if the model is a batch scoring model
      Returns:
      the size of the batch to use
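The batch size is passed around as a String rather than an int, so a caller will usually need to turn it into a number before use. The following is a minimal sketch of such parsing, not the step's own code: the class `BatchSizeSketch`, the helper `resolveBatchSize`, and the fallback value of 100 are all hypothetical and are not confirmed by this page.

```java
final class BatchSizeSketch {

    // Hypothetical fallback; the value of the real default constant is not given on this page.
    static final int FALLBACK_BATCH_SIZE = 100;

    /**
     * Parses a batch size supplied as text. Returns the fallback when the
     * text is null, blank, non-numeric, or not a positive number.
     */
    static int resolveBatchSize(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return FALLBACK_BATCH_SIZE;
        }
        try {
            int parsed = Integer.parseInt(configured.trim());
            return parsed > 0 ? parsed : FALLBACK_BATCH_SIZE;
        } catch (NumberFormatException e) {
            return FALLBACK_BATCH_SIZE;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveBatchSize("250"));          // prints 250
        System.out.println(resolveBatchSize("not a number")); // prints 100
    }
}
```

Falling back silently is only one possible design; a stricter caller might prefer to report the bad value instead.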
    • setFileNameFromField

      public void setFileNameFromField(boolean f)
      Set whether filename is coming from an incoming field
      Parameters:
      f - true if the model to use is specified via path in an incoming field value
    • getFileNameFromField

      public boolean getFileNameFromField()
      Get whether filename is coming from an incoming field
      Returns:
      true if the model to use is specified via path in an incoming field value
    • setCacheLoadedModels

      public void setCacheLoadedModels(boolean l)
      Set whether to cache loaded models in memory
      Parameters:
      l - true if models are to be cached in memory
    • getCacheLoadedModels

      public boolean getCacheLoadedModels()
      Get whether to cache loaded models in memory
      Returns:
      true if models are to be cached in memory
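Caching matters most when model paths arrive in a field, because the same path may then appear on many rows. The sketch below illustrates the general idea of loading each path once and reusing the result. It is an assumption about how such a cache could look, not the plugin's implementation; `ModelCacheSketch` and its loader function are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ModelCacheSketch<M> {

    private final Map<String, M> cache = new HashMap<>();
    private final Function<String, M> loader; // hypothetical: turns a path into a model
    private int loads = 0;

    ModelCacheSketch(Function<String, M> loader) {
        this.loader = loader;
    }

    /** Returns the model for the path, invoking the loader only on first use. */
    M get(String path) {
        return cache.computeIfAbsent(path, p -> {
            loads++;
            return loader.apply(p);
        });
    }

    /** Number of times the loader actually ran. */
    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        ModelCacheSketch<String> models = new ModelCacheSketch<>(p -> "model@" + p);
        models.get("a.model");
        models.get("a.model");
        models.get("b.model");
        System.out.println(models.loadCount()); // prints 2: three lookups, two loads
    }
}
```

The trade-off is memory: every distinct path keeps its model alive for as long as the cache exists.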
    • setFieldNameToLoadModelFrom

      public void setFieldNameToLoadModelFrom(String fn)
      Set the name of the incoming field that holds paths to model files
      Parameters:
      fn - the name of the incoming field that holds model paths
    • getFieldNameToLoadModelFrom

      public String getFieldNameToLoadModelFrom()
      Get the name of the incoming field that holds paths to model files
      Returns:
      the name of the incoming field that holds model paths
    • setSerializedModelFileName

      public void setSerializedModelFileName(String mfile)
      Set the file name of the serialized Weka model to load/import from
      Parameters:
      mfile - the file name
    • getSerializedModelFileName

      public String getSerializedModelFileName()
      Get the filename of the serialized Weka model to load/import from
      Returns:
      the file name
    • setSavedModelFileName

      public void setSavedModelFileName(String savedM)
      Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
      Parameters:
      savedM - the file name to save to
    • getSavedModelFileName

      public String getSavedModelFileName()
      Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
      Returns:
      the file name to save to
    • setModel

      public void setModel(WekaScoringModel model)
      Set the Weka model
      Parameters:
      model - a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
    • getModel

      public WekaScoringModel getModel()
      Get the Weka model
      Returns:
      a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
    • getDefaultModel

      public WekaScoringModel getDefaultModel()
      Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
      Returns:
      the default model to use when there is no filename provided in the incoming data row.
    • setDefaultModel

      public void setDefaultModel(WekaScoringModel defaultM)
      Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
      Parameters:
      defaultM - the default model to use.
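The fallback described for the default model can be pictured as a per-row choice: use the path in the incoming field when one is present, otherwise use the default. The sketch below works on plain path strings to stay self-contained; `DefaultModelSketch` and `chooseModelPath` are hypothetical names, and treating a blank value like a missing one is an assumption.

```java
final class DefaultModelSketch {

    /**
     * Picks the model path for one row: the value from the incoming field,
     * or the default when that value is missing or blank (an assumption).
     */
    static String chooseModelPath(String pathFromField, String defaultPath) {
        if (pathFromField == null || pathFromField.trim().isEmpty()) {
            return defaultPath;
        }
        return pathFromField.trim();
    }

    public static void main(String[] args) {
        System.out.println(chooseModelPath("/models/a.model", "/models/default.model"));
        System.out.println(chooseModelPath(null, "/models/default.model"));
    }
}
```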
    • setOutputProbabilities

      public void setOutputProbabilities(boolean b)
      Set whether to predict probabilities
      Parameters:
      b - true if a probability distribution is to be output
    • getOutputProbabilities

      public boolean getOutputProbabilities()
      Get whether to predict probabilities
      Returns:
      true if a probability distribution is to be output
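To illustrate the difference this flag makes for a classifier, the sketch below returns either a single predicted label or the whole probability distribution. It assumes that, with the flag off, the output is the label with the highest probability; this page does not spell that out, and `ProbabilityOutputSketch` with its methods is hypothetical.

```java
final class ProbabilityOutputSketch {

    /** Index of the largest probability, taken here as the predicted class. */
    static int predictedIndex(double[] distribution) {
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Returns one label, or one probability per class when outputProbabilities is true. */
    static Object[] outputValues(String[] labels, double[] distribution, boolean outputProbabilities) {
        if (!outputProbabilities) {
            return new Object[] { labels[predictedIndex(distribution)] };
        }
        Object[] out = new Object[distribution.length];
        for (int i = 0; i < distribution.length; i++) {
            out[i] = distribution[i]; // boxed as Double
        }
        return out;
    }

    public static void main(String[] args) {
        String[] labels = { "yes", "no" };
        double[] dist = { 0.3, 0.7 };
        System.out.println(outputValues(labels, dist, false)[0]);    // prints no
        System.out.println(outputValues(labels, dist, true).length); // prints 2
    }
}
```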
    • getUpdateIncrementalModel

      public boolean getUpdateIncrementalModel()
      Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
      Returns:
      true if the model is to be updated incrementally with each incoming row
    • setUpdateIncrementalModel

      public void setUpdateIncrementalModel(boolean u)
      Set whether to update the model incrementally
      Parameters:
      u - true if the model should be updated with each incoming row (after predicting it)
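The ordering described here, predict first and only then update, means each row is scored by a model that has not yet seen it. The sketch below shows that ordering with a deliberately trivial stand-in model that predicts the running mean of the values seen so far. `IncrementalUpdateSketch` and `RunningMeanModel` are hypothetical and unrelated to any Weka class.

```java
final class IncrementalUpdateSketch {

    /** Hypothetical stand-in for an updateable model: predicts the mean of values seen so far. */
    static final class RunningMeanModel {
        private double sum;
        private long count;

        double predict() {
            return count == 0 ? 0.0 : sum / count;
        }

        void update(double actual) {
            sum += actual;
            count++;
        }
    }

    /** Scores each value first, then trains on it if updateIncrementally is set. */
    static double[] scoreStream(double[] actuals, boolean updateIncrementally) {
        RunningMeanModel model = new RunningMeanModel();
        double[] predictions = new double[actuals.length];
        for (int i = 0; i < actuals.length; i++) {
            predictions[i] = model.predict();   // prediction uses the model as it was before this row
            if (updateIncrementally) {
                model.update(actuals[i]);       // the row influences later predictions only
            }
        }
        return predictions;
    }

    public static void main(String[] args) {
        double[] p = scoreStream(new double[] { 2, 4, 6 }, true);
        System.out.println(java.util.Arrays.toString(p)); // prints [0.0, 2.0, 3.0]
    }
}
```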
    • getXML

      protected String getXML(boolean logging)
    • getXML

      public String getXML()
      Return the XML describing this (configured) step
      Specified by:
      getXML in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      getXML in class org.pentaho.di.trans.step.BaseStepMeta
      Returns:
      a String containing the XML
    • equals

      public boolean equals(Object obj)
      Check for equality
      Overrides:
      equals in class Object
      Parameters:
      obj - an Object to compare with
      Returns:
      true if equal to the supplied object
    • hashCode

      public int hashCode()
      Hash code method
      Overrides:
      hashCode in class Object
      Returns:
      the hash code for this object
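Because equals and hashCode are both overridden, they have to agree: objects that compare equal must return the same hash code. The sketch below shows that contract on a small stand-in settings class. Which fields the real WekaScoringMeta compares is not documented on this page, so the three fields used in `ScoringSettingsSketch` are an assumption chosen for illustration.

```java
import java.util.Objects;

final class ScoringSettingsSketch {

    private final String modelFileName;
    private final boolean outputProbabilities;
    private final boolean updateIncrementalModel;

    ScoringSettingsSketch(String modelFileName, boolean outputProbabilities, boolean updateIncrementalModel) {
        this.modelFileName = modelFileName;
        this.outputProbabilities = outputProbabilities;
        this.updateIncrementalModel = updateIncrementalModel;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof ScoringSettingsSketch)) {
            return false;
        }
        ScoringSettingsSketch other = (ScoringSettingsSketch) obj;
        return outputProbabilities == other.outputProbabilities
            && updateIncrementalModel == other.updateIncrementalModel
            && Objects.equals(modelFileName, other.modelFileName);
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields that equals compares.
        return Objects.hash(modelFileName, outputProbabilities, updateIncrementalModel);
    }

    public static void main(String[] args) {
        ScoringSettingsSketch a = new ScoringSettingsSketch("m.model", true, false);
        ScoringSettingsSketch b = new ScoringSettingsSketch("m.model", true, false);
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // prints true
    }
}
```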
    • clone

      public Object clone()
      Clone this step's meta data
      Specified by:
      clone in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      clone in class org.pentaho.di.trans.step.BaseStepMeta
      Returns:
      the cloned meta data
    • setDefault

      public void setDefault()
      Specified by:
      setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
    • loadXML

      public void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleXMLException
      Loads the meta data for this (configured) step from XML.
      Specified by:
      loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      loadXML in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      stepnode - the XML node holding the step's settings
      Throws:
      org.pentaho.di.core.exception.KettleXMLException - if an error occurs
    • loadModelFile

      protected void loadModelFile() throws Exception
      Throws:
      Exception
    • deSerializeBase64Model

      protected void deSerializeBase64Model(String base64modelXML) throws Exception
      Throws:
      Exception
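The method name suggests that a model can be carried as Base64 text, for example inside the step's XML. The sketch below shows a generic round trip using Java serialization and java.util.Base64. It is an illustration of the technique only: the plugin's actual encoding (it may compress the bytes, for instance) is not described on this page, and `Base64ModelSketch` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;

final class Base64ModelSketch {

    /** Serializes an object and encodes the bytes as Base64 text. */
    static String toBase64(Serializable model) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Decodes Base64 text and deserializes the object. Only use on trusted input. */
    static Object fromBase64(String text) {
        byte[] raw = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of serialized object not found", e);
        }
    }

    public static void main(String[] args) {
        // An ArrayList stands in for a serializable model object.
        ArrayList<String> model = new ArrayList<>(Arrays.asList("a", "b"));
        Object restored = fromBase64(toBase64(model));
        System.out.println(restored.equals(model)); // prints true
    }
}
```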
    • readRep

      public void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleException
      Read this step's configuration from a repository
      Specified by:
      readRep in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      readRep in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      rep - the repository to access
      id_step - the id for this step
      Throws:
      org.pentaho.di.core.exception.KettleException - if an error occurs
    • saveRep

      public void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) throws org.pentaho.di.core.exception.KettleException
      Save this step's meta data to a repository
      Specified by:
      saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      saveRep in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      rep - the repository to save to
      id_transformation - transformation id
      id_step - step id
      Throws:
      org.pentaho.di.core.exception.KettleException - if an error occurs
    • getFields

      public void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) throws org.pentaho.di.core.exception.KettleStepException
      Generates row meta data to represent the fields output by this step
      Specified by:
      getFields in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      getFields in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      row - the meta data for the output produced
      origin - the name of the step to be used as the origin
      info - the metadata of the input rows that enter the step through the specified channels, in the same order as in method getInfoSteps(). The step metadata can use or ignore it.
      nextStep - if non-null, the next step in the transformation: the step that is asking for the fields and that the data is targeted at.
      space - the variable space, used to resolve variable references
      Throws:
      org.pentaho.di.core.exception.KettleStepException - if an error occurs
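Conceptually, getFields takes the incoming row layout and appends the fields this step produces. The sketch below models that with plain lists of field names. The naming pattern for the added fields and the choice of one field per class label when probabilities are requested are assumptions made for illustration; `OutputFieldsSketch` is a hypothetical name and no Kettle types are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OutputFieldsSketch {

    /**
     * Returns the incoming field names followed by the prediction fields.
     * The "_predicted" and "_predicted_prob" suffixes are hypothetical.
     */
    static List<String> outputFields(List<String> incoming, String classAttribute,
                                     List<String> classLabels, boolean outputProbabilities) {
        List<String> out = new ArrayList<>(incoming);
        if (outputProbabilities) {
            for (String label : classLabels) {
                out.add(classAttribute + ":" + label + "_predicted_prob");
            }
        } else {
            out.add(classAttribute + "_predicted");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> incoming = Arrays.asList("outlook", "humidity");
        System.out.println(outputFields(incoming, "play", Arrays.asList("yes", "no"), false));
        System.out.println(outputFields(incoming, "play", Arrays.asList("yes", "no"), true));
    }
}
```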
    • check

      public void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
      Check the settings of this step and put findings in a remarks list.
      Specified by:
      check in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      check in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      remarks - the list to put the remarks in. see org.pentaho.di.core.CheckResult
      transmeta - the transformation meta data
      stepMeta - the step meta data
      prev - the fields coming from a previous step
      input - the input step names
      output - the output step names
      info - the fields that are used as information by the step
    • getDialogClassName

      public String getDialogClassName()
      Specified by:
      getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
    • getStep

      public org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
      Get the executing step, needed by Trans to launch a step.
      Specified by:
      getStep in interface org.pentaho.di.trans.step.StepMetaInterface
      Parameters:
      stepMeta - the step info
      stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
      cnr - the copy number to get.
      tr - the transformation info.
      trans - the launching transformation
      Returns:
      a StepInterface value
    • getStepData

      public org.pentaho.di.trans.step.StepDataInterface getStepData()
      Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It holds the data that must persist even if a worker thread is terminated.
      Specified by:
      getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
      Returns:
      a StepDataInterface value