Class KFMeta

java.lang.Object
org.pentaho.di.trans.step.BaseStepMeta
org.pentaho.di.kf.KFMeta
All Implemented Interfaces:
Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface

@Step(id="KF", image="KNWFL.svg", name="Knowledge Flow", description="Executes a Knowledge Flow data mining process", documentationUrl="https://pentaho-community.atlassian.net/wiki/display/EAI/Knowledge+Flow", categoryDescription="Data Mining")
public class KFMeta extends org.pentaho.di.trans.step.BaseStepMeta implements org.pentaho.di.trans.step.StepMetaInterface
Contains the meta data for the KF step.
Author:
Mark Hall (mhall{[at]}pentaho{[dot]}com)
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected org.pentaho.dm.commons.ArffMeta[]
    m_injectFields
    Meta data for the ARFF instances input to the inject step
    protected static Class<?>
    PKG
     
    static final String
    XML_TAG
    XML tag for the KF step

    Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta

    attributes, databases, log, loggingObject, parentStepMeta, repository, STEP_ATTRIBUTES_FILE
  • Constructor Summary

    Constructors
    Constructor
    Description
    KFMeta()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    protected void
    allocate(int num)
    Allocate an array to hold meta data for the ARFF instances
    void
    check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
    Check the settings of this step and put findings in a remarks list.
    Object
    clone()
    Clone this step's meta data
    boolean
    equals(Object obj)
    Check for equality
    protected String
    getClassAttributeName()
    Get the name of the attribute to be set as the class attribute.
    String
    getDialogClassName()
     
    void
    getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space)
    Generates row meta data to represent the fields output by this step
    protected String
    getFlow()
    Get the Knowledge Flow flow to be run.
    protected Vector<Vector<?>>
    getFlow(String xml, org.pentaho.di.core.variables.VariableSpace space)
    Load the flow (if we can).
    protected String
    getInjectEventName()
    Get the name of the event to use for injecting.
    protected org.pentaho.dm.commons.ArffMeta[]
    getInjectFields()
    Get the meta data for the inject step
    protected weka.gui.beans.BeanInstance
    getInjectStep(Vector flow)
    Return the inject step from the supplied flow (or null if not found).
    protected String
    getInjectStepName()
    Get the name of the step to inject data into.
    protected String
    getOutputEventName()
    Get the name of the event to use for output.
    protected String
    getOutputStepName()
    Get the name of the step to listen to for output.
    protected boolean
    getPassRowsThrough()
    Get whether incoming kettle rows are to be passed through to any downstream kettle steps (rather than the output of the knowledge flow being passed on).
    protected String
    getRandomSeed()
    Get the random seed to use for sampling.
    protected String
    getSampleRelationName()
    Get the relation name to use for the sampled data.
    protected String
    getSampleSize()
    Get the number of rows to randomly sample.
    protected String
    getSerializedFlowFileName()
    Get the file name of the serialized Weka flow to load/import from.
    protected boolean
    getSetClass()
    Get whether a class index is to be set in the sampled data.
    org.pentaho.di.trans.step.StepInterface
    getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
    Get the executing step, needed by Trans to launch a step.
    org.pentaho.di.trans.step.StepDataInterface
    getStepData()
    Get a new instance of the appropriate data class.
    protected boolean
    getStoreFlowInStepMetaData()
    Get whether to store the XML flow description as part of the step meta data.
    protected boolean
    getStreamData()
    Get whether data is to be streamed to the knowledge flow when injecting rather than batch injected.
    String
    getXML()
    Return the XML describing this (configured) step
    boolean
    isOutputStructureDetermined()
    Returns whether we have been able to successfully determine the structure of the output (in advance of seeing all the input rows).
    void
    loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String,org.pentaho.di.core.Counter> counters)
    Loads the meta data for this (configured) step from XML.
    void
    readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String,org.pentaho.di.core.Counter> counters)
    Read this step's configuration from a repository
    void
    saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step)
    Save this step's meta data to a repository
    protected void
    setClassAttributeName(String ca)
    Set the name of the attribute to be set as the class attribute.
    void
    setDefault()
     
    protected void
    setFlow(String flow)
    Set the Knowledge Flow flow to run.
    protected void
    setInjectEventName(String ien)
    Set the name of the event to use for injecting.
    protected void
    setInjectFields(org.pentaho.dm.commons.ArffMeta[] am)
    Set the array of meta data for the inject step
    protected void
    setInjectStepName(String isn)
    Set the name of the step to inject data into.
    protected void
    setOutputEventName(String oen)
    Set the name of the event to use for output.
    protected void
    setOutputStepName(String osn)
    Set the name of the step to listen to for output.
    protected void
    setPassRowsThrough(boolean p)
    Set whether incoming kettle rows are to be passed through to any downstream kettle steps (rather than output of the knowledge flow being passed on).
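    protected void
    setRandomSeed(String seed)
    Set the random seed to use for sampling rows.
    protected void
    setSampleRelationName(String relationName)
    Set the relation name to use for the sampled data.
    protected void
    setSampleSize(String size)
    Set the number of rows to randomly sample.
    protected void
    setSerializedFlowFileName(String fFile)
    Set the file name of the serialized Weka flow to load/import from.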
    protected void
    setSetClass(boolean sc)
    Set whether to set a class index in the sampled data.
    protected void
    setStoreFlowInStepMetaData(boolean s)
    Set whether to store the XML flow description as part of the step meta data.
    protected void
    setStreamData(boolean sd)
    Set whether data should be streamed to the knowledge flow when injecting rather than batch injected.
    protected void
    setUpMetaData(weka.core.Instances insts, org.pentaho.di.core.row.RowMetaInterface row)
    Set up the outgoing row meta data from the supplied Instances object.

    Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta

    analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling

    Methods inherited from class java.lang.Object

    finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface

    analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
  • Field Details

    • PKG

      protected static Class<?> PKG
    • XML_TAG

      public static final String XML_TAG
      XML tag for the KF step
    • m_injectFields

      protected org.pentaho.dm.commons.ArffMeta[] m_injectFields
      Meta data for the ARFF instances input to the inject step
  • Constructor Details

    • KFMeta

      public KFMeta()
  • Method Details

    • setStoreFlowInStepMetaData

      protected void setStoreFlowInStepMetaData(boolean s)
      Set whether to store the XML flow description as part of the step meta data. When this is true, the source file path is ignored (and cleared).
      Parameters:
      s - true if the flow should be stored in the step meta data
    • getStoreFlowInStepMetaData

      protected boolean getStoreFlowInStepMetaData()
      Get whether to store the XML flow description as part of the step meta data. When this is true, the source file path is ignored (and cleared).
      Returns:
      true if the flow should be stored in the step meta data
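The relationship between the stored flow and the source file path can be pictured with a small stand-alone sketch. It does not use KFMeta; `FlowSourceSketch` and its fields are hypothetical names, and clearing the path inside the setter is an assumption drawn only from the wording "ignored (and cleared)" above.

```java
/** Hypothetical stand-in for illustration; not part of the Pentaho API. */
class FlowSourceSketch {
  private boolean storeFlowInMeta;
  private String serializedFlowFileName = "";

  void setSerializedFlowFileName(String fileName) {
    this.serializedFlowFileName = fileName;
  }

  String getSerializedFlowFileName() {
    return serializedFlowFileName;
  }

  /** Assumption: enabling storage in the meta data clears the file path. */
  void setStoreFlowInStepMetaData(boolean store) {
    this.storeFlowInMeta = store;
    if (store) {
      serializedFlowFileName = "";
    }
  }

  boolean getStoreFlowInStepMetaData() {
    return storeFlowInMeta;
  }
}
```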
    • setSampleRelationName

      protected void setSampleRelationName(String relationName)
      Set the relation name to use for the sampled data.
      Parameters:
      relationName - the relation name to use
    • getSampleRelationName

      protected String getSampleRelationName()
      Get the relation name to use for the sampled data.
      Returns:
      the relation name to use
    • getSampleSize

      protected String getSampleSize()
      Get the number of rows to randomly sample.
      Returns:
      the number of rows to sample
    • setSampleSize

      protected void setSampleSize(String size)
      Set the number of rows to randomly sample.
      Parameters:
      size - the number of rows to sample
    • getRandomSeed

      protected String getRandomSeed()
      Get the random seed to use for sampling.
      Returns:
      the random seed
    • setRandomSeed

      protected void setRandomSeed(String seed)
      Set the random seed to use for sampling rows.
      Parameters:
      seed - the seed to use
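Both the sample size and the random seed are handled as strings by these accessors. The page does not say how the rows are sampled, so the following is only a sketch of one common approach, reservoir sampling, under the assumption that the two strings hold plain integers by the time they are used. `RowSamplerSketch` is a hypothetical helper, not a Pentaho or Weka class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical helper for illustration; not part of the Pentaho or Weka APIs. */
class RowSamplerSketch {
  /**
   * Draws up to sampleSize rows from the input using reservoir sampling.
   * Both settings arrive as strings, mirroring the accessors above.
   */
  static <T> List<T> sample(Iterable<T> rows, String sampleSize, String seed) {
    int k = Integer.parseInt(sampleSize.trim());
    Random random = new Random(Long.parseLong(seed.trim()));
    List<T> reservoir = new ArrayList<>(k);
    int seen = 0;
    for (T row : rows) {
      if (seen < k) {
        reservoir.add(row); // fill the reservoir with the first k rows
      } else {
        int j = random.nextInt(seen + 1); // uniform in [0, seen]
        if (j < k) {
          reservoir.set(j, row); // keep the new row with probability k / (seen + 1)
        }
      }
      seen++;
    }
    return reservoir;
  }
}
```

With a fixed seed the same input yields the same sample, which is the usual reason for exposing the seed as a setting.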
    • getPassRowsThrough

      protected boolean getPassRowsThrough()
      Get whether incoming kettle rows are to be passed through to any downstream kettle steps (rather than the output of the knowledge flow being passed on).
      Returns:
      true if rows are to be passed on to downstream kettle steps
    • setPassRowsThrough

      protected void setPassRowsThrough(boolean p)
      Set whether incoming kettle rows are to be passed through to any downstream kettle steps (rather than output of the knowledge flow being passed on).
      Parameters:
      p - true if rows are to be passed on to downstream kettle steps
    • setSerializedFlowFileName

      protected void setSerializedFlowFileName(String fFile)
      Set the file name of the serialized Weka flow to load/import from.
      Parameters:
      fFile - the file name
    • getSerializedFlowFileName

      protected String getSerializedFlowFileName()
      Get the file name of the serialized Weka flow to load/import from.
      Returns:
      the file name of the serialized Weka flow
    • setFlow

      protected void setFlow(String flow)
      Set the Knowledge Flow flow to run.
      Parameters:
      flow - the flow to run
    • getFlow

      protected String getFlow()
      Get the Knowledge Flow flow to be run.
      Returns:
      the flow to be run
    • setInjectStepName

      protected void setInjectStepName(String isn)
      Set the name of the step to inject data into.
      Parameters:
      isn - the name of the step to inject data into
    • getInjectStepName

      protected String getInjectStepName()
      Get the name of the step to inject data into.
      Returns:
      the name of the step to inject data into
    • setInjectEventName

      protected void setInjectEventName(String ien)
      Set the name of the event to use for injecting.
      Parameters:
      ien - the name of the event to use for injecting
    • getInjectEventName

      protected String getInjectEventName()
      Get the name of the event to use for injecting.
      Returns:
      the name of the event to use for injecting
    • setOutputStepName

      protected void setOutputStepName(String osn)
      Set the name of the step to listen to for output.
      Parameters:
      osn - the name of the step to listen to for output
    • getOutputStepName

      protected String getOutputStepName()
      Get the name of the step to listen to for output.
      Returns:
      the name of the step to listen to for output
    • setOutputEventName

      protected void setOutputEventName(String oen)
      Set the name of the event to use for output.
      Parameters:
      oen - the name of the event to use for output
    • getOutputEventName

      protected String getOutputEventName()
      Get the name of the event to use for output.
      Returns:
      the name of the event to use for output
    • setSetClass

      protected void setSetClass(boolean sc)
      Set whether to set a class index in the sampled data.
      Parameters:
      sc - true if a class index is to be set in the data
    • getSetClass

      protected boolean getSetClass()
      Get whether a class index is to be set in the sampled data.
      Returns:
      true if a class index is to be set in the sampled data
    • setClassAttributeName

      protected void setClassAttributeName(String ca)
      Set the name of the attribute to be set as the class attribute.
      Parameters:
      ca - the name of the class attribute
    • getClassAttributeName

      protected String getClassAttributeName()
      Get the name of the attribute to be set as the class attribute.
      Returns:
      the name of the class attribute
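The set-class flag and the class attribute name work as a pair: the name only matters when a class index is to be set. A minimal sketch of that decision follows, using a plain list of attribute names in place of a Weka data set. `ClassIndexSketch` and the -1 convention for "no class" are assumptions made for illustration. In Weka, an index resolved this way would typically be passed to `Instances.setClassIndex(int)`.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of the Pentaho or Weka APIs. */
class ClassIndexSketch {
  /**
   * Returns the index of the class attribute, or -1 if no class is to be set
   * or the name does not match any attribute.
   */
  static int resolveClassIndex(boolean setClass, String classAttributeName,
      List<String> attributeNames) {
    if (!setClass || classAttributeName == null) {
      return -1;
    }
    return attributeNames.indexOf(classAttributeName);
  }
}
```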
    • setStreamData

      protected void setStreamData(boolean sd)
      Set whether data should be streamed to the knowledge flow when injecting rather than batch injected.
      Parameters:
      sd - true if data should be streamed
    • getStreamData

      protected boolean getStreamData()
      Get whether data is to be streamed to the knowledge flow when injecting rather than batch injected.
      Returns:
      true if data is to be streamed
    • getInjectStep

      protected weka.gui.beans.BeanInstance getInjectStep(Vector flow)
      Return the inject step from the supplied flow (or null if not found).
      Parameters:
      flow - the flow to search
      Returns:
      the inject step or null if it is not in the flow
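The "or null if not found" contract can be sketched as a linear search by name. This is an illustration only: `NamedBeanSketch` is a hypothetical stand-in for a flow component, not `weka.gui.beans.BeanInstance`, and matching on the configured inject step name is an assumption.

```java
import java.util.List;

/** Hypothetical stand-in for a flow component; not weka.gui.beans.BeanInstance. */
class NamedBeanSketch {
  final String name;

  NamedBeanSketch(String name) {
    this.name = name;
  }
}

/** Hypothetical helper for illustration. */
class InjectStepLookupSketch {
  /** Returns the first bean whose name matches, or null if none does. */
  static NamedBeanSketch findStep(List<NamedBeanSketch> flow, String stepName) {
    if (flow == null || stepName == null) {
      return null;
    }
    for (NamedBeanSketch bean : flow) {
      if (stepName.equals(bean.name)) {
        return bean;
      }
    }
    return null;
  }
}
```

Callers need to check the result for null before using it.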
    • allocate

      protected void allocate(int num)
      Allocate an array to hold meta data for the ARFF instances
      Parameters:
      num - number of meta data objects to allocate
    • setInjectFields

      protected void setInjectFields(org.pentaho.dm.commons.ArffMeta[] am)
      Set the array of meta data for the inject step
      Parameters:
      am - an array of ArffMeta
    • getInjectFields

      protected org.pentaho.dm.commons.ArffMeta[] getInjectFields()
      Get the meta data for the inject step
      Returns:
      an array of ArffMeta
    • getXML

      public String getXML()
      Return the XML describing this (configured) step
      Specified by:
      getXML in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      getXML in class org.pentaho.di.trans.step.BaseStepMeta
      Returns:
      a String containing the XML
    • loadXML

      public void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String,org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleXMLException
      Loads the meta data for this (configured) step from XML.
      Specified by:
      loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      loadXML in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      stepnode - the step to load
      Throws:
      org.pentaho.di.core.exception.KettleXMLException - if an error occurs
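getXML and loadXML form a round trip: whatever the first writes, the second must be able to read back from the step node. The sketch below shows that idea with the JDK's DOM parser only. The class `StepXmlSketch`, its two fields, and the tag names are made up for illustration and are not the tags KFMeta writes. The "Y"/"N" encoding of booleans follows the usual Kettle convention.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical stand-in for illustration; tag names are invented. */
class StepXmlSketch {
  String injectStepName = "";
  boolean passRowsThrough;

  /** Serializes the settings as a fragment of child tags. */
  String getXML() {
    return "<inject_step>" + escape(injectStepName) + "</inject_step>"
        + "<pass_rows_through>" + (passRowsThrough ? "Y" : "N") + "</pass_rows_through>";
  }

  /** Restores the settings from the child elements of the supplied step node. */
  void loadXML(Node stepnode) {
    NodeList children = stepnode.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node child = children.item(i);
      if (child.getNodeType() != Node.ELEMENT_NODE) {
        continue;
      }
      String value = child.getTextContent();
      if ("inject_step".equals(child.getNodeName())) {
        injectStepName = value;
      } else if ("pass_rows_through".equals(child.getNodeName())) {
        passRowsThrough = "Y".equalsIgnoreCase(value);
      }
    }
  }

  /** Escapes the characters that would break element content. */
  static String escape(String s) {
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
  }

  /** Wraps the fragment in a step element, parses it, and returns the root node. */
  static Node parseStep(String fragment) {
    try {
      String xml = "<step>" + fragment + "</step>";
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return doc.getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse step XML", e);
    }
  }
}
```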
    • getFlow

      protected Vector<Vector<?>> getFlow(String xml, org.pentaho.di.core.variables.VariableSpace space) throws Exception
      Load the flow (if we can).
      Throws:
      Exception - if there is a problem loading the flow
    • readRep

      public void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String,org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleException
      Read this step's configuration from a repository
      Specified by:
      readRep in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      readRep in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      rep - the repository to access
      id_step - the id for this step
      Throws:
      org.pentaho.di.core.exception.KettleException - if an error occurs
    • saveRep

      public void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) throws org.pentaho.di.core.exception.KettleException
      Save this step's meta data to a repository
      Specified by:
      saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      saveRep in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      rep - the repository to save to
      id_transformation - transformation id
      id_step - step id
      Throws:
      org.pentaho.di.core.exception.KettleException - if an error occurs
    • setUpMetaData

      protected void setUpMetaData(weka.core.Instances insts, org.pentaho.di.core.row.RowMetaInterface row)
      Set up the outgoing row meta data from the supplied Instances object.
      Parameters:
      insts - the Instances to use for setting up the outgoing row meta data
      row - holds the final outgoing row meta data
    • getFields

      public void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) throws org.pentaho.di.core.exception.KettleStepException
      Generates row meta data to represent the fields output by this step
      Specified by:
      getFields in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      getFields in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      row - the meta data for the output produced
      origin - the name of the step to be used as the origin
      info - the metadata of the input rows that enter the step through the specified channels, in the same order as in method getInfoSteps(). The step metadata can use it or ignore it.
      nextStep - if non-null, the next step in the transformation, that is, the step that is asking and towards which the data is targeted.
      space - the variable space used to resolve variables
      Throws:
      org.pentaho.di.core.exception.KettleStepException - if an error occurs
    • isOutputStructureDetermined

      public boolean isOutputStructureDetermined()
      Returns whether we have been able to successfully determine the structure of the output (in advance of seeing all the input rows).
      Returns:
      true if the output structure has been determined.
    • equals

      public boolean equals(Object obj)
      Check for equality
      Overrides:
      equals in class Object
      Parameters:
      obj - an Object to compare with
      Returns:
      true if equal to the supplied object
    • clone

      public Object clone()
      Clone this step's meta data
      Specified by:
      clone in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      clone in class org.pentaho.di.trans.step.BaseStepMeta
      Returns:
      the cloned meta data
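Because the step meta data holds an array (m_injectFields), a clone that only copies the reference would share that array with the original. How KFMeta handles this is not stated on this page. The sketch below shows one conventional way to keep equals and clone consistent for a class with an array field. `MetaCopySketch` and its fields are hypothetical, and it also overrides hashCode, which this page lists as inherited from Object for KFMeta.

```java
import java.util.Arrays;

/** Hypothetical stand-in for illustration; not part of the Pentaho API. */
class MetaCopySketch implements Cloneable {
  String flow = "";
  String[] injectFields = new String[0];

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (!(obj instanceof MetaCopySketch)) {
      return false;
    }
    MetaCopySketch other = (MetaCopySketch) obj;
    // Compare array contents, not array references.
    return flow.equals(other.flow) && Arrays.equals(injectFields, other.injectFields);
  }

  @Override
  public int hashCode() {
    return 31 * flow.hashCode() + Arrays.hashCode(injectFields);
  }

  @Override
  public Object clone() {
    try {
      MetaCopySketch copy = (MetaCopySketch) super.clone();
      copy.injectFields = injectFields.clone(); // copy the array, not just the reference
      return copy;
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e); // cannot happen: the class is Cloneable
    }
  }
}
```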
    • setDefault

      public void setDefault()
      Specified by:
      setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
    • check

      public void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
      Check the settings of this step and put findings in a remarks list.
      Specified by:
      check in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      check in class org.pentaho.di.trans.step.BaseStepMeta
      Parameters:
      remarks - the list to put the remarks in. see org.pentaho.di.core.CheckResult
      transmeta - the transformation meta data
      stepMeta - the step meta data
      prev - the fields coming from a previous step
      input - the input step names
      output - the output step names
      info - the fields that are used as information by the step
    • getDialogClassName

      public String getDialogClassName()
      Specified by:
      getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
      Overrides:
      getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
    • getStep

      public org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
      Get the executing step, needed by Trans to launch a step.
      Specified by:
      getStep in interface org.pentaho.di.trans.step.StepMetaInterface
      Parameters:
      stepMeta - the step info
      stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
      cnr - the copy number to get.
      tr - the transformation info.
      trans - the launching transformation
      Returns:
      a StepInterface value
    • getStepData

      public org.pentaho.di.trans.step.StepDataInterface getStepData()
      Get a new instance of the appropriate data class. This data class implements StepDataInterface and holds the persistent data that needs to live on, even if a worker thread is terminated.
      Specified by:
      getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
      Returns:
      a StepDataInterface value