Class WekaScoringMeta

  • All Implemented Interfaces:
    Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface

    @Step(id="WekaScoring",
          image="WEKAS.svg",
          name="Weka Scoring",
          description="Appends predictions from a pre-built Weka model",
          categoryDescription="Data Mining",
          documentationUrl="https://pentaho-community.atlassian.net/wiki/display/DATAMINING/Using+the+Weka+Scoring+Plugin")
    public class WekaScoringMeta
    extends org.pentaho.di.trans.step.BaseStepMeta
    implements org.pentaho.di.trans.step.StepMetaInterface
    Contains the meta data for the WekaScoring step.
    Author:
    Mark Hall (mhall{[at]}pentaho{[dot]}org)
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int DEFAULT_BATCH_SCORING_SIZE
      Batch scoring size
      protected static Class<?> PKG  
      static String XML_TAG  
      • Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta

        attributes, databases, log, loggingObject, parentStepMeta, repository, STEP_ATTRIBUTES_FILE
    • Constructor Summary

      Constructors 
      Constructor Description
      WekaScoringMeta()
      Creates a new WekaScoringMeta instance.
    • Method Summary

      Modifier and Type Method Description
      void check​(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
      Check the settings of this step and put findings in a remarks list.
      Object clone()
      Clone this step's meta data
      protected void deSerializeBase64Model​(String base64modelXML)  
      boolean equals​(Object obj)
      Check for equality
      String getBatchScoringSize()
      Get the batch size to use if the model is a batch scoring model
      boolean getCacheLoadedModels()
      Get whether to cache loaded models in memory
      WekaScoringModel getDefaultModel()
      Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
      String getDialogClassName()  
      String getFieldNameToLoadModelFrom()
      Get the name of the incoming field that holds paths to model files
      void getFields​(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space)
      Generates row meta data to represent the fields output by this step
      boolean getFileNameFromField()
      Get whether filename is coming from an incoming field
      WekaScoringModel getModel()
      Get the Weka model
      boolean getOutputProbabilities()
      Get whether to predict probabilities
      String getSavedModelFileName()
      Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
      String getSerializedModelFileName()
      Get the file name of the serialized Weka model to load/import from
      org.pentaho.di.trans.step.StepInterface getStep​(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
      Get the executing step, needed by Trans to launch a step.
      org.pentaho.di.trans.step.StepDataInterface getStepData()
      Get a new instance of the appropriate data class.
      boolean getStoreModelInStepMetaData()  
      boolean getUpdateIncrementalModel()
      Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
      String getXML()
      Return the XML describing this (configured) step
      protected String getXML​(boolean logging)  
      int hashCode()
      Hash code method
      protected void loadModelFile()  
      void loadXML​(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,​org.pentaho.di.core.Counter> counters)
      Loads the meta data for this (configured) step from XML.
      void readRep​(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> databases, Map<String,​org.pentaho.di.core.Counter> counters)
      Read this step's configuration from a repository
      void saveRep​(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step)
      Save this step's meta data to a repository
      void setBatchScoringSize​(String size)
      Set the batch size to use if the model is a batch scoring model
      void setCacheLoadedModels​(boolean l)
      Set whether to cache loaded models in memory
      void setDefault()  
      void setDefaultModel​(WekaScoringModel defaultM)
      Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
      void setFieldNameToLoadModelFrom​(String fn)
      Set the name of the incoming field that holds paths to model files
      void setFileNameFromField​(boolean f)
      Set whether filename is coming from an incoming field
      void setModel​(WekaScoringModel model)
      Set the Weka model
      void setOutputProbabilities​(boolean b)
      Set whether to predict probabilities
      void setSavedModelFileName​(String savedM)
      Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
      void setSerializedModelFileName​(String mfile)
      Set the file name of the serialized Weka model to load/import from
      void setStoreModelInStepMetaData​(boolean b)  
      void setUpdateIncrementalModel​(boolean u)
      Set whether to update the model incrementally
      • Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta

        analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling
      • Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface

        analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
    • Constructor Detail

      • WekaScoringMeta

        public WekaScoringMeta()
        Creates a new WekaScoringMeta instance.
    • Method Detail

      • setStoreModelInStepMetaData

        public void setStoreModelInStepMetaData​(boolean b)
      • getStoreModelInStepMetaData

        public boolean getStoreModelInStepMetaData()
      • setBatchScoringSize

        public void setBatchScoringSize​(String size)
        Set the batch size to use if the model is a batch scoring model
        Parameters:
        size - the size of the batch to use
      • getBatchScoringSize

        public String getBatchScoringSize()
        Get the batch size to use if the model is a batch scoring model
        Returns:
        the size of the batch to use
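Because the batch size is held as a String rather than an int, a caller has to convert it before use. The following is a minimal sketch of one way to do that, assuming that a blank, non-numeric, or non-positive value should fall back to a default. BatchSizeSketch, resolveBatchSize, and the value 100 are illustrative assumptions: the actual value of DEFAULT_BATCH_SCORING_SIZE is not stated on this page, and the step itself may resolve the string differently (for example, by substituting variables first).

```java
final class BatchSizeSketch {
    // Hypothetical stand-in; the real DEFAULT_BATCH_SCORING_SIZE value is not given on this page.
    static final int ASSUMED_DEFAULT_BATCH_SIZE = 100;

    /** Converts a configured batch-size string to a positive int, falling back to the default. */
    static int resolveBatchSize(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return ASSUMED_DEFAULT_BATCH_SIZE;
        }
        try {
            int size = Integer.parseInt(configured.trim());
            // A batch of zero or fewer rows makes no sense, so use the default instead.
            return size > 0 ? size : ASSUMED_DEFAULT_BATCH_SIZE;
        } catch (NumberFormatException e) {
            return ASSUMED_DEFAULT_BATCH_SIZE;
        }
    }
}
```

With a real step, the input to such a helper would be the value returned by getBatchScoringSize().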
      • setFileNameFromField

        public void setFileNameFromField​(boolean f)
        Set whether filename is coming from an incoming field
        Parameters:
        f - true if the model to use is specified via path in an incoming field value
      • getFileNameFromField

        public boolean getFileNameFromField()
        Get whether filename is coming from an incoming field
        Returns:
        true if the model to use is specified via path in an incoming field value
      • setCacheLoadedModels

        public void setCacheLoadedModels​(boolean l)
        Set whether to cache loaded models in memory
        Parameters:
        l - true if models are to be cached in memory
      • getCacheLoadedModels

        public boolean getCacheLoadedModels()
        Get whether to cache loaded models in memory
        Returns:
        true if models are to be cached in memory
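When model paths arrive in a field of the incoming rows, the same path can recur many times, which is where caching loaded models pays off. The sketch below illustrates the general idea with a path-keyed map. ModelCacheSketch, modelFor, and the loader function are hypothetical names, and this is not the step's actual caching code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ModelCacheSketch<M> {
    private final Map<String, M> cache = new HashMap<>();
    private final Function<String, M> loader;   // hypothetical: loads a model from a path
    private final boolean cacheLoadedModels;    // mirrors the flag set via setCacheLoadedModels
    private int loadCount = 0;                  // counts real loads, for illustration only

    ModelCacheSketch(Function<String, M> loader, boolean cacheLoadedModels) {
        this.loader = loader;
        this.cacheLoadedModels = cacheLoadedModels;
    }

    /** Returns the model for a path, reusing a cached copy when caching is enabled. */
    M modelFor(String path) {
        if (!cacheLoadedModels) {
            loadCount++;
            return loader.apply(path);
        }
        M model = cache.get(path);
        if (model == null) {
            loadCount++;
            model = loader.apply(path);
            cache.put(path, model);
        }
        return model;
    }

    int loadCount() {
        return loadCount;
    }
}
```

The trade-off is memory for load time: with caching off, every row triggers a load; with caching on, each distinct path is loaded once and kept in memory.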
      • setFieldNameToLoadModelFrom

        public void setFieldNameToLoadModelFrom​(String fn)
        Set the name of the incoming field that holds paths to model files
        Parameters:
        fn - the name of the incoming field that holds model paths
      • getFieldNameToLoadModelFrom

        public String getFieldNameToLoadModelFrom()
        Get the name of the incoming field that holds paths to model files
        Returns:
        the name of the incoming field that holds model paths
      • setSerializedModelFileName

        public void setSerializedModelFileName​(String mfile)
        Set the file name of the serialized Weka model to load/import from
        Parameters:
        mfile - the file name
      • getSerializedModelFileName

        public String getSerializedModelFileName()
        Get the file name of the serialized Weka model to load/import from
        Returns:
        the file name
      • setSavedModelFileName

        public void setSavedModelFileName​(String savedM)
        Set the file name that the incrementally updated model will be saved to when the current stream of data terminates
        Parameters:
        savedM - the file name to save to
      • getSavedModelFileName

        public String getSavedModelFileName()
        Get the file name that the incrementally updated model will be saved to when the current stream of data terminates
        Returns:
        the file name to save to
      • setModel

        public void setModel​(WekaScoringModel model)
        Set the Weka model
        Parameters:
        model - a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
      • getModel

        public WekaScoringModel getModel()
        Get the Weka model
        Returns:
        a WekaScoringModel that encapsulates the actual Weka model (Classifier or Clusterer)
      • getDefaultModel

        public WekaScoringModel getDefaultModel()
        Gets the default model (only used when model file names are being sourced from a field in the incoming rows).
        Returns:
        the default model to use when there is no filename provided in the incoming data row.
      • setDefaultModel

        public void setDefaultModel​(WekaScoringModel defaultM)
        Sets the default model (only used when model file names are being sourced from a field in the incoming rows).
        Parameters:
        defaultM - the default model to use.
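The default model acts as a fallback for rows whose model-path field is empty. A minimal sketch of that selection logic follows; DefaultModelSketch, modelForRow, and the loader function are hypothetical, and treating a blank string the same as a missing value is an assumption.

```java
import java.util.function.Function;

final class DefaultModelSketch {
    /**
     * Picks the model for one row: the default model when the row carries no path,
     * otherwise whatever the (hypothetical) loader returns for that path.
     */
    static <M> M modelForRow(String pathFromField, Function<String, M> loader, M defaultModel) {
        if (pathFromField == null || pathFromField.trim().isEmpty()) {
            return defaultModel;
        }
        return loader.apply(pathFromField.trim());
    }
}
```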
      • setOutputProbabilities

        public void setOutputProbabilities​(boolean b)
        Set whether to predict probabilities
        Parameters:
        b - true if a probability distribution is to be output
      • getOutputProbabilities

        public boolean getOutputProbabilities()
        Get whether to predict probabilities
        Returns:
        true if a probability distribution is to be output
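The difference between a single prediction and a probability distribution can be shown with a small example. The sketch below assumes a classifier that yields one probability per class label; PredictionOutputSketch and its methods are hypothetical and do not reproduce the step's output code.

```java
final class PredictionOutputSketch {
    /** Returns the index of the largest probability (the first one wins on ties). */
    static int argMax(double[] distribution) {
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    /**
     * Builds the values appended to a row: one probability per class when
     * outputProbabilities is true, otherwise only the most likely label.
     */
    static Object[] outputValues(double[] distribution, String[] labels, boolean outputProbabilities) {
        if (outputProbabilities) {
            Object[] out = new Object[distribution.length];
            for (int i = 0; i < distribution.length; i++) {
                out[i] = Double.valueOf(distribution[i]);
            }
            return out;
        }
        return new Object[] { labels[argMax(distribution)] };
    }
}
```

For a distribution of {0.1, 0.7, 0.2} over labels {"a", "b", "c"}, the sketch appends either the three probabilities or the single label "b".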
      • getUpdateIncrementalModel

        public boolean getUpdateIncrementalModel()
        Get whether the model is to be incrementally updated with each incoming row (after making a prediction for it).
        Returns:
        true if the model is to be updated incrementally with each incoming row
      • setUpdateIncrementalModel

        public void setUpdateIncrementalModel​(boolean u)
        Set whether to update the model incrementally
        Parameters:
        u - true if the model should be updated with each incoming row (after predicting it)
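The order of operations matters for incremental updating: each row is predicted first, and only then used to update the model. The sketch below illustrates this predict-then-update loop with a toy model that predicts the running mean of the target. RunningMean and scoreStream are hypothetical stand-ins, not Weka or plugin classes.

```java
final class IncrementalUpdateSketch {
    /** Toy incremental "model": predicts the mean of the target values seen so far. */
    static final class RunningMean {
        private double sum;
        private long count;

        double predict() {
            return count == 0 ? 0.0 : sum / count;
        }

        void update(double observed) {
            sum += observed;
            count++;
        }
    }

    /** Scores each row, then (optionally) updates the model with that row's target value. */
    static double[] scoreStream(double[] targets, RunningMean model, boolean updateIncrementally) {
        double[] predictions = new double[targets.length];
        for (int i = 0; i < targets.length; i++) {
            predictions[i] = model.predict();   // predict before the model has seen this row
            if (updateIncrementally) {
                model.update(targets[i]);       // then learn from it
            }
        }
        return predictions;
    }
}
```

In the step, the updated model would then be written to the file named by getSavedModelFileName() once the stream of data terminates; the sketch leaves that part out.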
      • getXML

        protected String getXML​(boolean logging)
      • getXML

        public String getXML()
        Return the XML describing this (configured) step
        Specified by:
        getXML in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        getXML in class org.pentaho.di.trans.step.BaseStepMeta
        Returns:
        a String containing the XML
      • equals

        public boolean equals​(Object obj)
        Check for equality
        Overrides:
        equals in class Object
        Parameters:
        obj - an Object to compare with
        Returns:
        true if equal to the supplied object
      • hashCode

        public int hashCode()
        Hash code method
        Overrides:
        hashCode in class Object
        Returns:
        the hash code for this object
      • clone

        public Object clone()
        Clone this step's meta data
        Specified by:
        clone in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        clone in class org.pentaho.di.trans.step.BaseStepMeta
        Returns:
        the cloned meta data
      • setDefault

        public void setDefault()
        Specified by:
        setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
      • loadXML

        public void loadXML​(Node stepnode,
                            List<org.pentaho.di.core.database.DatabaseMeta> databases,
                            Map<String,​org.pentaho.di.core.Counter> counters)
                     throws org.pentaho.di.core.exception.KettleXMLException
        Loads the meta data for this (configured) step from XML.
        Specified by:
        loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        loadXML in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
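        stepnode - the XML node that holds the step's configuration
        databases - the database connections available for reference
        counters - the counters available for reference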
        Throws:
        org.pentaho.di.core.exception.KettleXMLException - if an error occurs
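getXML() and loadXML() form a round trip: settings written as XML must be readable back from the step's node. The sketch below illustrates that round trip using only the JDK's DOM parser. The element names (step, model_file, output_probabilities) and the Y/N boolean encoding are assumptions made for illustration, not the tags this step actually writes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class StepXmlRoundTripSketch {
    /** Writes two settings as an XML fragment (hypothetical tag names). */
    static String toXml(String modelFile, boolean outputProbabilities) {
        return "<step>"
            + "<model_file>" + escape(modelFile) + "</model_file>"
            + "<output_probabilities>" + (outputProbabilities ? "Y" : "N") + "</output_probabilities>"
            + "</step>";
    }

    /** Escapes the characters that are not allowed in XML text; '&' must be handled first. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Parses the fragment and returns its root node, as a step node would be handed to loadXML. */
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse step XML", e);
        }
    }

    /** Returns the text of the first element with the given tag, or null if there is none. */
    static String tagValue(Node stepnode, String tag) {
        Node n = ((Element) stepnode).getElementsByTagName(tag).item(0);
        return n == null ? null : n.getTextContent();
    }
}
```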
      • deSerializeBase64Model

        protected void deSerializeBase64Model​(String base64modelXML)
                                       throws Exception
        Throws:
        Exception
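This page does not describe the encoding that deSerializeBase64Model expects. As a general illustration of how a binary object can be carried inside text such as XML, the sketch below serializes an object with standard Java serialization and wraps the bytes in Base64. Base64ModelSketch and its methods are hypothetical, and the plugin's actual format may differ (for example, it may compress the bytes).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

final class Base64ModelSketch {
    /** Serializes an object and encodes the bytes as a Base64 string. */
    static String serializeToBase64(Serializable model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(model);
            }
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes a Base64 string and deserializes the object it contains. */
    static Object deserializeFromBase64(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of serialized object not found", e);
        }
    }
}
```

Java deserialization should only be applied to data from a trusted source.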
      • readRep

        public void readRep​(org.pentaho.di.repository.Repository rep,
                            org.pentaho.di.repository.ObjectId id_step,
                            List<org.pentaho.di.core.database.DatabaseMeta> databases,
                            Map<String,​org.pentaho.di.core.Counter> counters)
                     throws org.pentaho.di.core.exception.KettleException
        Read this step's configuration from a repository
        Specified by:
        readRep in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        readRep in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
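        rep - the repository to access
        id_step - the id for this step
        databases - the database connections available for reference
        counters - the counters available for reference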
        Throws:
        org.pentaho.di.core.exception.KettleException - if an error occurs
      • saveRep

        public void saveRep​(org.pentaho.di.repository.Repository rep,
                            org.pentaho.di.repository.ObjectId id_transformation,
                            org.pentaho.di.repository.ObjectId id_step)
                     throws org.pentaho.di.core.exception.KettleException
        Save this step's meta data to a repository
        Specified by:
        saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        saveRep in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
        rep - the repository to save to
        id_transformation - transformation id
        id_step - step id
        Throws:
        org.pentaho.di.core.exception.KettleException - if an error occurs
      • getFields

        public void getFields​(org.pentaho.di.core.row.RowMetaInterface row,
                              String origin,
                              org.pentaho.di.core.row.RowMetaInterface[] info,
                              org.pentaho.di.trans.step.StepMeta nextStep,
                              org.pentaho.di.core.variables.VariableSpace space)
                       throws org.pentaho.di.core.exception.KettleStepException
        Generates row meta data to represent the fields output by this step
        Specified by:
        getFields in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        getFields in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
        row - the meta data for the output produced
        origin - the name of the step to be used as the origin
        info - the row meta data entering the step through the specified channels, in the same order as returned by getInfoSteps(). The step may use or ignore it.
        nextStep - if non-null, the next step in the transformation, that is, the step requesting the fields and the one the data is targeted towards
        space - the variable space used to resolve variables
        Throws:
        org.pentaho.di.core.exception.KettleStepException - if an error occurs
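Since the step appends predictions to each row, its output layout is the incoming layout plus one or more prediction fields. The sketch below shows one plausible way to derive that layout with plain strings. The field-name patterns (a "_predicted" suffix, or one "_predicted_prob" field per class label) are assumptions for illustration, and OutputFieldsSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

final class OutputFieldsSketch {
    /**
     * Returns the incoming field names followed by the prediction field(s):
     * one field per class label when probabilities are requested, otherwise a single field.
     */
    static List<String> outputFields(List<String> incoming, String className,
                                     List<String> classLabels, boolean outputProbabilities) {
        List<String> out = new ArrayList<>(incoming);
        if (outputProbabilities) {
            for (String label : classLabels) {
                out.add(className + ":" + label + "_predicted_prob");   // assumed naming
            }
        } else {
            out.add(className + "_predicted");                          // assumed naming
        }
        return out;
    }
}
```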
      • check

        public void check​(List<org.pentaho.di.core.CheckResultInterface> remarks,
                          org.pentaho.di.trans.TransMeta transmeta,
                          org.pentaho.di.trans.step.StepMeta stepMeta,
                          org.pentaho.di.core.row.RowMetaInterface prev,
                          String[] input,
                          String[] output,
                          org.pentaho.di.core.row.RowMetaInterface info)
        Check the settings of this step and put findings in a remarks list.
        Specified by:
        check in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        check in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
        remarks - the list to add the remarks to; see org.pentaho.di.core.CheckResult
        transmeta - the transformation meta data
        stepMeta - the step meta data
        prev - the fields coming from a previous step
        input - the input step names
        output - the output step names
        info - the fields that are used as information by the step
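A check method of this kind typically inspects the configuration and appends one remark per finding instead of throwing. The sketch below shows that pattern with stand-in types. Severity, Remark, the specific conditions tested, and the message texts are all hypothetical and are not taken from the step's source.

```java
import java.util.List;

final class CheckSketch {
    enum Severity { OK, WARNING, ERROR }

    /** Hypothetical stand-in for a check result: a severity plus a message. */
    static final class Remark {
        final Severity severity;
        final String text;

        Remark(Severity severity, String text) {
            this.severity = severity;
            this.text = text;
        }
    }

    /** Appends one remark per finding to the supplied list. */
    static void check(List<Remark> remarks, int incomingFieldCount, String[] input, boolean modelAvailable) {
        if (incomingFieldCount == 0) {
            remarks.add(new Remark(Severity.WARNING, "No fields received from previous steps"));
        } else {
            remarks.add(new Remark(Severity.OK, "Receiving " + incomingFieldCount + " fields"));
        }
        if (input == null || input.length == 0) {
            remarks.add(new Remark(Severity.ERROR, "No input received from other steps"));
        } else {
            remarks.add(new Remark(Severity.OK, "Receiving input from other steps"));
        }
        if (!modelAvailable) {
            remarks.add(new Remark(Severity.ERROR, "No model is available"));
        } else {
            remarks.add(new Remark(Severity.OK, "A model is available"));
        }
    }
}
```

Collecting remarks in a caller-supplied list lets every problem be reported in one pass rather than stopping at the first one.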
      • getDialogClassName

        public String getDialogClassName()
        Specified by:
        getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
      • getStep

        public org.pentaho.di.trans.step.StepInterface getStep​(org.pentaho.di.trans.step.StepMeta stepMeta,
                                                               org.pentaho.di.trans.step.StepDataInterface stepDataInterface,
                                                               int cnr,
                                                               org.pentaho.di.trans.TransMeta tr,
                                                               org.pentaho.di.trans.Trans trans)
        Get the executing step, needed by Trans to launch a step.
        Specified by:
        getStep in interface org.pentaho.di.trans.step.StepMetaInterface
        Parameters:
        stepMeta - the step info
        stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
        cnr - the copy number to get.
        tr - the transformation info.
        trans - the launching transformation
        Returns:
        a StepInterface value
      • getStepData

        public org.pentaho.di.trans.step.StepDataInterface getStepData()
        Get a new instance of the appropriate data class. This data class implements StepDataInterface and holds the state that must persist even if a worker thread is terminated.
        Specified by:
        getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
        Returns:
        a StepDataInterface value