Class KFMeta

  • All Implemented Interfaces:
    Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface

    @Step(id="KF",
          image="KNWFL.svg",
          name="Knowledge Flow",
          description="Executes a Knowledge Flow data mining process",
          documentationUrl="https://pentaho-community.atlassian.net/wiki/display/EAI/Knowledge+Flow",
          categoryDescription="Data Mining")
    public class KFMeta
    extends org.pentaho.di.trans.step.BaseStepMeta
    implements org.pentaho.di.trans.step.StepMetaInterface
    Contains the meta data for the KF step.
    Author:
    Mark Hall (mhall{[at]}pentaho{[dot]}com)
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected org.pentaho.dm.commons.ArffMeta[] m_injectFields
      Meta data for the ARFF instances input to the inject step
      protected static Class<?> PKG  
      static String XML_TAG
      XML tag for the KF step
      • Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta

        attributes, databases, log, loggingObject, parentStepMeta, repository, STEP_ATTRIBUTES_FILE
    • Constructor Summary

      Constructors 
      Constructor Description
      KFMeta()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected void allocate​(int num)
      Allocate an array to hold meta data for the ARFF instances
      void check​(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
      Check the settings of this step and put findings in a remarks list.
      Object clone()
      Clone this step's meta data
      boolean equals​(Object obj)
      Check for equality
      protected String getClassAttributeName()
      Get the name of the attribute to be set as the class attribute.
      String getDialogClassName()  
      void getFields​(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space)
      Generates row meta data to represent the fields output by this step
      protected String getFlow()
      Get the knowledge flow to be run.
      protected Vector<Vector<?>> getFlow​(String xml, org.pentaho.di.core.variables.VariableSpace space)
      Load the flow (if we can).
      protected String getInjectEventName()
      Get the name of the event to use for injecting.
      protected org.pentaho.dm.commons.ArffMeta[] getInjectFields()
      Get the meta data for the inject step
      protected weka.gui.beans.BeanInstance getInjectStep​(Vector flow)
      Return the inject step from the supplied flow (or null if not found).
      protected String getInjectStepName()
      Get the name of the step to inject data into.
      protected String getOutputEventName()
      Get the name of the event to use for output.
      protected String getOutputStepName()
      Get the name of the step to listen to for output.
      protected boolean getPassRowsThrough()
      Get whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
      protected String getRandomSeed()
      Get the random seed to use for sampling.
      protected String getSampleRelationName()
      Get the relation name to use for the sampled data.
      protected String getSampleSize()
      Get the number of rows to randomly sample.
      protected String getSerializedFlowFileName()
      Get the file name of the serialized Weka flow to load/import from.
      protected boolean getSetClass()
      Get whether a class index is to be set in the sampled data.
      org.pentaho.di.trans.step.StepInterface getStep​(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
      Get the executing step, needed by Trans to launch a step.
      org.pentaho.di.trans.step.StepDataInterface getStepData()
      Get a new instance of the appropriate data class.
      protected boolean getStoreFlowInStepMetaData()
      Get whether to store the XML flow description as part of the step meta data.
      protected boolean getStreamData()
      Get whether data is to be streamed to the knowledge flow when injecting rather than batch injected.
      String getXML()
      Return the XML describing this (configured) step
      boolean isOutputStructureDetermined()
      Returns whether the structure of the output could be determined in advance of seeing all the input rows.
      void loadXML​(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String,​org.pentaho.di.core.Counter> counters)
      Loads the meta data for this (configured) step from XML.
      void readRep​(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String,​org.pentaho.di.core.Counter> counters)
      Read this step's configuration from a repository
      void saveRep​(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step)
      Save this step's meta data to a repository
      protected void setClassAttributeName​(String ca)
      Set the name of the attribute to be set as the class attribute.
      void setDefault()  
      protected void setFlow​(String flow)
      Set the knowledge flow to run.
      protected void setInjectEventName​(String ien)
      Set the name of the event to use for injecting.
      protected void setInjectFields​(org.pentaho.dm.commons.ArffMeta[] am)
      Set the array of meta data for the inject step
      protected void setInjectStepName​(String isn)
      Set the name of the step to inject data into.
      protected void setOutputEventName​(String oen)
      Set the name of the event to use for output.
      protected void setOutputStepName​(String osn)
      Set the name of the step to listen to for output.
      protected void setPassRowsThrough​(boolean p)
      Set whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
      protected void setRandomSeed​(String seed)
      Set the random seed to use for sampling rows.
      protected void setSampleRelationName​(String relationName)
      Set the relation name to use for the sampled data.
      protected void setSampleSize​(String size)
      Set the number of rows to randomly sample.
      protected void setSerializedFlowFileName​(String fFile)
      Set the file name of the serialized Weka flow to load/import from.
      protected void setSetClass​(boolean sc)
      Set whether to set a class index in the sampled data.
      protected void setStoreFlowInStepMetaData​(boolean s)
      Set whether to store the XML flow description as part of the step meta data.
      protected void setStreamData​(boolean sd)
      Set whether data should be streamed to the knowledge flow when injecting rather than batch injected.
      protected void setUpMetaData​(weka.core.Instances insts, org.pentaho.di.core.row.RowMetaInterface row)
      Set up the outgoing row meta data from the supplied Instances object.
      • Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta

        analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling
      • Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface

        analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
    • Field Detail

      • PKG

        protected static Class<?> PKG
      • m_injectFields

        protected org.pentaho.dm.commons.ArffMeta[] m_injectFields
        Meta data for the ARFF instances input to the inject step
    • Constructor Detail

      • KFMeta

        public KFMeta()
    • Method Detail

      • setStoreFlowInStepMetaData

        protected void setStoreFlowInStepMetaData​(boolean s)
        Set whether to store the XML flow description as part of the step meta data. When enabled, the source file path is ignored and cleared.
        Parameters:
        s - true if the flow should be stored in the step meta data
      • getStoreFlowInStepMetaData

        protected boolean getStoreFlowInStepMetaData()
        Get whether the XML flow description is stored as part of the step meta data. When it is, the source file path is ignored and cleared.
        Returns:
        true if the flow should be stored in the step meta data
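        The interaction between the store flag and the source file path can be sketched as follows. FlowSettings and describeSource are hypothetical stand-ins, not part of KFMeta; only the "ignored and cleared" behaviour is taken from the description above, and the rest is an assumption.

```java
// Hypothetical stand-in for the relevant KFMeta state; all names are illustrative.
final class FlowSettings {
    private boolean storeFlowInStepMetaData;
    private String serializedFlowFileName = "";

    void setSerializedFlowFileName(String fileName) {
        this.serializedFlowFileName = fileName == null ? "" : fileName;
    }

    // Assumption: enabling storage clears the file path, as the description suggests.
    void setStoreFlowInStepMetaData(boolean store) {
        this.storeFlowInStepMetaData = store;
        if (store) {
            this.serializedFlowFileName = "";
        }
    }

    String getSerializedFlowFileName() {
        return serializedFlowFileName;
    }

    // Describes where the flow would be read from under the current settings.
    String describeSource() {
        if (storeFlowInStepMetaData) {
            return "embedded";
        }
        return serializedFlowFileName.isEmpty() ? "none" : "file:" + serializedFlowFileName;
    }
}
```

        Under these assumptions, setting a file name and then enabling storage leaves the file name empty and the source reported as "embedded".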
      • setSampleRelationName

        protected void setSampleRelationName​(String relationName)
        Set the relation name to use for the sampled data.
        Parameters:
        relationName - the relation name to use
      • getSampleRelationName

        protected String getSampleRelationName()
        Get the relation name to use for the sampled data.
        Returns:
        the relation name to use
      • getSampleSize

        protected String getSampleSize()
        Get the number of rows to randomly sample.
        Returns:
        the number of rows to sample
      • setSampleSize

        protected void setSampleSize​(String size)
        Set the number of rows to randomly sample.
        Parameters:
        size - the number of rows to sample
      • getRandomSeed

        protected String getRandomSeed()
        Get the random seed to use for sampling.
        Returns:
        the random seed
      • setRandomSeed

        protected void setRandomSeed​(String seed)
        Set the random seed to use for sampling rows.
        Parameters:
        seed - the seed to use
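        This page does not name the sampling algorithm the step uses. As an illustration only, the sketch below shows reservoir sampling, one common way to draw a fixed number of rows from a stream of unknown length with a reproducible seed. ReservoirSampler is a hypothetical class; the String-typed arguments mirror the String-typed sample size and seed settings above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper; not part of KFMeta or the KF step.
final class ReservoirSampler {
    // Draws up to sampleSize rows uniformly at random from rows (Algorithm R).
    static <T> List<T> sample(Iterable<T> rows, String sampleSize, String seed) {
        int k = Integer.parseInt(sampleSize.trim());
        Random random = new Random(Long.parseLong(seed.trim()));
        List<T> reservoir = new ArrayList<T>(Math.max(k, 0));
        if (k <= 0) {
            return reservoir;
        }
        int seen = 0;
        for (T row : rows) {
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k rows.
                reservoir.add(row);
            } else {
                // Replace a random slot with probability k / seen.
                int j = random.nextInt(seen);
                if (j < k) {
                    reservoir.set(j, row);
                }
            }
        }
        return reservoir;
    }
}
```

        With the same seed and the same input order, two calls return the same sample, which is the usual reason for exposing a seed setting.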
      • getPassRowsThrough

        protected boolean getPassRowsThrough()
        Get whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
        Returns:
        true if rows are to be passed on to downstream kettle steps
      • setPassRowsThrough

        protected void setPassRowsThrough​(boolean p)
        Set whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
        Parameters:
        p - true if rows are to be passed on to downstream kettle steps
      • setSerializedFlowFileName

        protected void setSerializedFlowFileName​(String fFile)
        Set the file name of the serialized Weka flow to load/import from.
        Parameters:
        fFile - the file name
      • getSerializedFlowFileName

        protected String getSerializedFlowFileName()
        Get the file name of the serialized Weka flow to load/import from.
        Returns:
        the file name of the serialized Weka flow
      • setFlow

        protected void setFlow​(String flow)
        Set the knowledge flow to run.
        Parameters:
        flow - the flow to run
      • getFlow

        protected String getFlow()
        Get the knowledge flow to be run.
        Returns:
        the flow to be run
      • setInjectStepName

        protected void setInjectStepName​(String isn)
        Set the name of the step to inject data into.
        Parameters:
        isn - the name of the step to inject data into
      • getInjectStepName

        protected String getInjectStepName()
        Get the name of the step to inject data into.
        Returns:
        the name of the step to inject data into
      • setInjectEventName

        protected void setInjectEventName​(String ien)
        Set the name of the event to use for injecting.
        Parameters:
        ien - the name of the event to use for injecting
      • getInjectEventName

        protected String getInjectEventName()
        Get the name of the event to use for injecting.
        Returns:
        the name of the event to use for injecting
      • setOutputStepName

        protected void setOutputStepName​(String osn)
        Set the name of the step to listen to for output.
        Parameters:
        osn - the name of the step to listen to for output
      • getOutputStepName

        protected String getOutputStepName()
        Get the name of the step to listen to for output.
        Returns:
        the name of the step to listen to for output
      • setOutputEventName

        protected void setOutputEventName​(String oen)
        Set the name of the event to use for output.
        Parameters:
        oen - the name of the event to use for output
      • getOutputEventName

        protected String getOutputEventName()
        Get the name of the event to use for output.
        Returns:
        the name of the event to use for output
      • setSetClass

        protected void setSetClass​(boolean sc)
        Set whether to set a class index in the sampled data.
        Parameters:
        sc - true if a class index is to be set in the data
      • getSetClass

        protected boolean getSetClass()
        Get whether a class index is to be set in the sampled data.
        Returns:
        true if a class index is to be set in the sampled data
      • setClassAttributeName

        protected void setClassAttributeName​(String ca)
        Set the name of the attribute to be set as the class attribute.
        Parameters:
        ca - the name of the class attribute
      • getClassAttributeName

        protected String getClassAttributeName()
        Get the name of the attribute to be set as the class attribute.
        Returns:
        the name of the class attribute
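        How the set-class flag and the class attribute name might combine can be sketched as below. ClassIndexResolver is hypothetical, and returning -1 for "no class index" is an assumption chosen for the illustration; the step's actual behaviour when the name is not found is not documented here.

```java
import java.util.List;

// Hypothetical helper; not part of KFMeta.
final class ClassIndexResolver {
    // Returns the position of the class attribute, or -1 when no class index should be set.
    static int resolve(List<String> attributeNames, boolean setClass, String classAttributeName) {
        if (!setClass || classAttributeName == null) {
            return -1;
        }
        // indexOf also yields -1 when the name is not among the attributes.
        return attributeNames.indexOf(classAttributeName);
    }
}
```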
      • setStreamData

        protected void setStreamData​(boolean sd)
        Set whether data should be streamed to the knowledge flow when injecting rather than batch injected.
        Parameters:
        sd - true if data should be streamed
      • getStreamData

        protected boolean getStreamData()
        Get whether data is to be streamed to the knowledge flow when injecting rather than batch injected.
        Returns:
        true if data is to be streamed
      • getInjectStep

        protected weka.gui.beans.BeanInstance getInjectStep​(Vector flow)
        Return the inject step from the supplied flow (or null if not found).
        Parameters:
        flow - the flow to search
        Returns:
        the inject step or null if it is not in the flow
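        The "null if not found" contract means callers need a null check before using the result. A minimal sketch of such a lookup, assuming steps are matched by name; NamedBean and FlowSearch are hypothetical stand-ins for the Weka bean types, and the real flow structure may differ.

```java
import java.util.Vector;

// Hypothetical stand-in for a flow component that carries a name.
final class NamedBean {
    final String name;

    NamedBean(String name) {
        this.name = name;
    }
}

// Hypothetical helper; not part of KFMeta.
final class FlowSearch {
    // Returns the first bean with the given name, or null if the flow has none.
    static NamedBean findByName(Vector<NamedBean> flow, String name) {
        for (NamedBean bean : flow) {
            if (bean.name.equals(name)) {
                return bean;
            }
        }
        return null;
    }
}
```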
      • allocate

        protected void allocate​(int num)
        Allocate an array to hold meta data for the ARFF instances
        Parameters:
        num - number of meta data objects to allocate
      • setInjectFields

        protected void setInjectFields​(org.pentaho.dm.commons.ArffMeta[] am)
        Set the array of meta data for the inject step
        Parameters:
        am - an array of ArffMeta
      • getInjectFields

        protected org.pentaho.dm.commons.ArffMeta[] getInjectFields()
        Get the meta data for the inject step
        Returns:
        an array of ArffMeta
      • getXML

        public String getXML()
        Return the XML describing this (configured) step
        Specified by:
        getXML in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        getXML in class org.pentaho.di.trans.step.BaseStepMeta
        Returns:
        a String containing the XML
      • loadXML

        public void loadXML​(Node stepnode,
                            List<org.pentaho.di.core.database.DatabaseMeta> dbs,
                            Map<String,​org.pentaho.di.core.Counter> counters)
                     throws org.pentaho.di.core.exception.KettleXMLException
        Loads the meta data for this (configured) step from XML.
        Specified by:
        loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        loadXML in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
        stepnode - the step to load
        Throws:
        org.pentaho.di.core.exception.KettleXMLException - if an error occurs
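        getXML and loadXML form a round trip: what one writes, the other must be able to read back. The sketch below shows that round trip with the JDK's DOM parser only. StepXmlSketch, the tag names inject_step and stream_data, and the Y/N encoding of booleans are assumptions made for the illustration, not the tags KFMeta actually writes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper; tag names are illustrative only.
final class StepXmlSketch {
    // Serializes two settings as child elements of a <step> element.
    static String toXml(String injectStepName, boolean streamData) {
        return "<step>"
            + "<inject_step>" + escape(injectStepName) + "</inject_step>"
            + "<stream_data>" + (streamData ? "Y" : "N") + "</stream_data>"
            + "</step>";
    }

    // Parses the XML and returns its root element.
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Unable to parse step XML", e);
        }
    }

    // Returns the text of the first child element named tag, or null if absent.
    static String tagValue(Node stepnode, String tag) {
        NodeList children = stepnode.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE && child.getNodeName().equals(tag)) {
                return child.getTextContent();
            }
        }
        return null;
    }

    // Escapes the characters that are significant in element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

        Escaping on write matters because step names may contain characters such as < or &, which would otherwise make the saved XML unparseable.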
      • getFlow

        protected Vector<Vector<?>> getFlow​(String xml,
                                            org.pentaho.di.core.variables.VariableSpace space)
                                     throws Exception
        Load the flow (if we can).
        Throws:
        Exception - if there is a problem loading the flow
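        This page does not say how getFlow uses the space argument. In Kettle a VariableSpace generally supplies values for ${NAME} references, so one plausible use is resolving variables in a file name before loading. The sketch below illustrates that kind of substitution with a plain Map; VariableSubstitution is hypothetical, and leaving unknown variables untouched is an assumption.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a simplified stand-in for variable resolution.
final class VariableSubstitution {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${NAME} with its value; unknown names are left as written.
    static String substitute(String text, Map<String, String> variables) {
        Matcher matcher = VARIABLE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = variables.get(matcher.group(1));
            String replacement = value != null ? value : matcher.group(0);
            // quoteReplacement keeps '$' and '\' in values from being treated specially.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```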
      • readRep

        public void readRep​(org.pentaho.di.repository.Repository rep,
                            org.pentaho.di.repository.ObjectId id_step,
                            List<org.pentaho.di.core.database.DatabaseMeta> dbs,
                            Map<String,​org.pentaho.di.core.Counter> counters)
                     throws org.pentaho.di.core.exception.KettleException
        Read this step's configuration from a repository
        Specified by:
        readRep in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        readRep in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
        rep - the repository to access
        id_step - the id for this step
        Throws:
        org.pentaho.di.core.exception.KettleException - if an error occurs
      • saveRep

        public void saveRep​(org.pentaho.di.repository.Repository rep,
                            org.pentaho.di.repository.ObjectId id_transformation,
                            org.pentaho.di.repository.ObjectId id_step)
                     throws org.pentaho.di.core.exception.KettleException
        Save this step's meta data to a repository
        Specified by:
        saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        saveRep in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
        rep - the repository to save to
        id_transformation - transformation id
        id_step - step id
        Throws:
        org.pentaho.di.core.exception.KettleException - if an error occurs
      • setUpMetaData

        protected void setUpMetaData​(weka.core.Instances insts,
                                     org.pentaho.di.core.row.RowMetaInterface row)
        Set up the outgoing row meta data from the supplied Instances object.
        Parameters:
        insts - the Instances to use for setting up the outgoing row meta data
        row - holds the final outgoing row meta data
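        Building row meta data from an Instances object amounts to mapping each attribute's type to a row field type. The mapping below is an assumption for illustration, not the mapping setUpMetaData is documented to use; the enums are hypothetical stand-ins for the Weka attribute types and Kettle value types.

```java
// Hypothetical sketch; the enums stand in for the real Weka and Kettle type constants.
final class TypeMappingSketch {
    enum ArffType { NUMERIC, NOMINAL, STRING, DATE }

    enum RowType { NUMBER, STRING, DATE }

    // Assumed mapping: numeric to number, date to date, everything else to string.
    static RowType toRowType(ArffType type) {
        switch (type) {
            case NUMERIC:
                return RowType.NUMBER;
            case DATE:
                return RowType.DATE;
            default:
                return RowType.STRING;
        }
    }
}
```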
      • getFields

        public void getFields​(org.pentaho.di.core.row.RowMetaInterface row,
                              String origin,
                              org.pentaho.di.core.row.RowMetaInterface[] info,
                              org.pentaho.di.trans.step.StepMeta nextStep,
                              org.pentaho.di.core.variables.VariableSpace space)
                       throws org.pentaho.di.core.exception.KettleStepException
        Generates row meta data to represent the fields output by this step
        Specified by:
        getFields in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        getFields in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
        row - the meta data for the output produced
        origin - the name of the step to be used as the origin
        info - the row meta data entering the step through the specified info channels, in the same order as returned by getInfoSteps(). The step may use it or ignore it.
        nextStep - if non-null, the next step in the transformation: the step asking for the fields, to which the data is targeted.
        space - the variable space, which can be used to resolve variables
        Throws:
        org.pentaho.di.core.exception.KettleStepException - if an error occurs
      • isOutputStructureDetermined

        public boolean isOutputStructureDetermined()
        Returns whether the structure of the output could be determined in advance of seeing all the input rows.
        Returns:
        true if the output structure has been determined.
      • equals

        public boolean equals​(Object obj)
        Check for equality
        Overrides:
        equals in class Object
        Parameters:
        obj - an Object to compare with
        Returns:
        true if equal to the supplied object
      • clone

        public Object clone()
        Clone this step's meta data
        Specified by:
        clone in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        clone in class org.pentaho.di.trans.step.BaseStepMeta
        Returns:
        the cloned meta data
      • setDefault

        public void setDefault()
        Specified by:
        setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
      • check

        public void check​(List<org.pentaho.di.core.CheckResultInterface> remarks,
                          org.pentaho.di.trans.TransMeta transmeta,
                          org.pentaho.di.trans.step.StepMeta stepMeta,
                          org.pentaho.di.core.row.RowMetaInterface prev,
                          String[] input,
                          String[] output,
                          org.pentaho.di.core.row.RowMetaInterface info)
        Check the settings of this step and put findings in a remarks list.
        Specified by:
        check in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        check in class org.pentaho.di.trans.step.BaseStepMeta
        Parameters:
        remarks - the list to put the remarks in (see org.pentaho.di.core.CheckResult)
        transmeta - the transformation meta data
        stepMeta - the step meta data
        prev - the fields coming from a previous step
        input - the input step names
        output - the output step names
        info - the fields that are used as information by the step
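        The check method reports its findings by appending to the supplied list rather than returning a value or throwing. A minimal sketch of that pattern follows; CheckRemark, SettingsCheckSketch and the two specific checks are hypothetical, chosen only to show the shape of the call.

```java
import java.util.List;

// Hypothetical stand-in for a check result: an OK/error flag plus a message.
final class CheckRemark {
    final boolean ok;
    final String text;

    CheckRemark(boolean ok, String text) {
        this.ok = ok;
        this.text = text;
    }
}

// Hypothetical helper; the individual checks are assumptions.
final class SettingsCheckSketch {
    // Appends one remark per check; never removes or replaces existing remarks.
    static void check(List<CheckRemark> remarks, String injectStepName, String[] input) {
        if (injectStepName == null || injectStepName.isEmpty()) {
            remarks.add(new CheckRemark(false, "No inject step name has been set"));
        } else {
            remarks.add(new CheckRemark(true, "Inject step name is set"));
        }
        if (input == null || input.length == 0) {
            remarks.add(new CheckRemark(false, "No input received from other steps"));
        } else {
            remarks.add(new CheckRemark(true, "Step is receiving input from other steps"));
        }
    }
}
```

        Appending lets a caller run many steps' checks against one shared list and present all findings together.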
      • getDialogClassName

        public String getDialogClassName()
        Specified by:
        getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
        Overrides:
        getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
      • getStep

        public org.pentaho.di.trans.step.StepInterface getStep​(org.pentaho.di.trans.step.StepMeta stepMeta,
                                                               org.pentaho.di.trans.step.StepDataInterface stepDataInterface,
                                                               int cnr,
                                                               org.pentaho.di.trans.TransMeta tr,
                                                               org.pentaho.di.trans.Trans trans)
        Get the executing step, needed by Trans to launch a step.
        Specified by:
        getStep in interface org.pentaho.di.trans.step.StepMetaInterface
        Parameters:
        stepMeta - the step info
        stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
        cnr - the copy number to get.
        tr - the transformation info.
        trans - the launching transformation
        Returns:
        a StepInterface value
      • getStepData

        public org.pentaho.di.trans.step.StepDataInterface getStepData()
        Get a new instance of the appropriate data class. This data class implements StepDataInterface and holds the persistent data that must live on even if a worker thread is terminated.
        Specified by:
        getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
        Returns:
        a StepDataInterface value