Package org.pentaho.di.kf
Class KFMeta
java.lang.Object
org.pentaho.di.trans.step.BaseStepMeta
org.pentaho.di.kf.KFMeta
- All Implemented Interfaces:
Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface
@Step(id="KF",
image="KNWFL.svg",
name="Knowledge Flow",
description="Executes a Knowledge Flow data mining process",
documentationUrl="https://pentaho-community.atlassian.net/wiki/display/EAI/Knowledge+Flow",
categoryDescription="Data Mining")
public class KFMeta
extends org.pentaho.di.trans.step.BaseStepMeta
implements org.pentaho.di.trans.step.StepMetaInterface
Contains the meta data for the KF step.
- Version:
- $Revision$
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}com)
-
Field Summary
Modifier and Type | Field | Description
protected org.pentaho.dm.commons.ArffMeta[] | m_injectFields | Meta data for the ARFF instances input to the inject step
protected static Class<?> | PKG |
static final String | XML_TAG | XML tag for the KF step
Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta
attributes, databases, log, loggingObject, parentStepMeta, repository, STEP_ATTRIBUTES_FILE
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
protected void | allocate(int num) | Allocate an array to hold meta data for the ARFF instances
void | check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info) | Check the settings of this step and put findings in a remarks list.
Object | clone() | Clone this step's meta data
boolean | equals(Object) | Check for equality
protected String | getClassAttributeName() | Get the name of the attribute to be set as the class attribute.
String | getDialogClassName() |
void | getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) | Generates row meta data to represent the fields output by this step
protected String | getFlow() | Get the knowledgeflow flow to be run.
protected Vector<Vector<?>> | getFlow(String xml, org.pentaho.di.core.variables.VariableSpace space) | Load the flow (if we can).
protected String | getInjectEventName() | Get the name of the event to use for injecting.
protected org.pentaho.dm.commons.ArffMeta[] | getInjectFields() | Get the meta data for the inject step
protected weka.gui.beans.BeanInstance | getInjectStep(Vector flow) | Return the inject step from the supplied flow (or null if not found).
protected String | getInjectStepName() | Get the name of the step to inject data into.
protected String | getOutputEventName() | Get the name of the event to use for output.
protected String | getOutputStepName() | Get the name of the step to listen to for output.
protected boolean | getPassRowsThrough() | Get whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on)
protected String | getRandomSeed() | Get the random seed to use for sampling.
protected String | getSampleRelationName() | Get the relation name to use for the sampled data.
protected String | getSampleSize() | Get the number of rows to randomly sample.
protected String | getSerializedFlowFileName() | Get the file name of the serialized Weka flow to load/import from.
protected boolean | getSetClass() | Get whether a class index is to be set in the sampled data.
org.pentaho.di.trans.step.StepInterface | getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans) | Get the executing step, needed by Trans to launch a step.
org.pentaho.di.trans.step.StepDataInterface | getStepData() | Get a new instance of the appropriate data class.
protected boolean | getStoreFlowInStepMetaData() | Get whether to store the XML flow description as part of the step meta data.
protected boolean | getStreamData() | Get whether data is to be streamed to the knowledge flow when injecting rather than batch injected.
String | getXML() | Return the XML describing this (configured) step
boolean | isOutputStructureDetermined() | Returns whether we have been able to successfully determine the structure of the output (in advance of seeing all the input rows).
void | loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String, org.pentaho.di.core.Counter> counters) | Loads the meta data for this (configured) step from XML.
void | readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String, org.pentaho.di.core.Counter> counters) | Read this step's configuration from a repository
void | saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) | Save this step's meta data to a repository
protected void | setClassAttributeName(String ca) | Set the name of the attribute to be set as the class attribute.
void | setDefault() |
protected void | setFlow(String flow) | Set the actual knowledgeflow flows to run.
protected void | setInjectEventName(String ien) | Set the name of the event to use for injecting.
protected void | setInjectFields(org.pentaho.dm.commons.ArffMeta[] am) | Set the array of meta data for the inject step
protected void | setInjectStepName(String isn) | Set the name of the step to inject data into.
protected void | setOutputEventName(String oen) | Set the name of the event to use for output.
protected void | setOutputStepName(String osn) | Set the name of the step to listen to for output.
protected void | setPassRowsThrough(boolean p) | Set whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
protected void | setRandomSeed(String seed) | Set the random seed to use for sampling rows.
protected void | setSampleRelationName(String relationName) | Set the relation name to use for the sampled data.
protected void | setSampleSize(String size) | Set the number of rows to randomly sample.
protected void | setSerializedFlowFileName(String fFile) | Set the file name of the serialized Weka flow to load/import from.
protected void | setSetClass(boolean sc) | Set whether to set a class index in the sampled data.
protected void | setStoreFlowInStepMetaData(boolean s) | Set whether to store the XML flow description as part of the step meta data.
protected void | setStreamData(boolean sd) | Set whether data should be streamed to the knowledge flow when injecting rather than batch injected.
protected void | setUpMetaData(weka.core.Instances insts, org.pentaho.di.core.row.RowMetaInterface row) | Set up the outgoing row meta data from the supplied Instances object.
Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta
analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling
Methods inherited from class java.lang.Object
finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface
analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
-
Field Details
-
PKG
-
XML_TAG
XML tag for the KF step
- See Also:
-
m_injectFields
protected org.pentaho.dm.commons.ArffMeta[] m_injectFields
Meta data for the ARFF instances input to the inject step
-
-
Constructor Details
-
KFMeta
public KFMeta()
-
-
Method Details
-
setStoreFlowInStepMetaData
protected void setStoreFlowInStepMetaData(boolean s)
Set whether to store the XML flow description as part of the step meta data. When true, the source file path is ignored (and cleared).
- Parameters:
s
- true if the flow should be stored in the step meta data
-
getStoreFlowInStepMetaData
protected boolean getStoreFlowInStepMetaData()
Get whether the XML flow description is stored as part of the step meta data. When true, the source file path is ignored (and cleared).
- Returns:
- true if the flow should be stored in the step meta data
-
setSampleRelationName
protected void setSampleRelationName(String relationName)
Set the relation name to use for the sampled data.
- Parameters:
relationName
- the relation name to use
-
getSampleRelationName
protected String getSampleRelationName()
Get the relation name to use for the sampled data.
- Returns:
- the relation name to use
-
getSampleSize
protected String getSampleSize()
Get the number of rows to randomly sample.
- Returns:
- the number of rows to sample
-
setSampleSize
protected void setSampleSize(String size)
Set the number of rows to randomly sample.
- Parameters:
size
- the number of rows to sample
-
getRandomSeed
protected String getRandomSeed()
Get the random seed to use for sampling.
- Returns:
- the random seed
-
setRandomSeed
protected void setRandomSeed(String seed)
Set the random seed to use for sampling rows.
- Parameters:
seed
- the seed to use
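The sample size and random seed are held as String values rather than numbers. As a rough illustration of how such settings could drive a reproducible sample, the stand-alone sketch below parses both strings and draws a fixed-size sample with reservoir sampling. The class SeededSampleSketch and its sample method are hypothetical and not part of KFMeta, and this page does not say which sampling algorithm the KF step actually uses, so reservoir sampling is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of KFMeta: seeded reservoir sampling. */
final class SeededSampleSketch {

    /**
     * Draws up to sampleSize rows from rows. Both settings arrive as
     * strings, mirroring getSampleSize() and getRandomSeed().
     */
    static <T> List<T> sample(List<T> rows, String sampleSize, String randomSeed) {
        int k = Integer.parseInt(sampleSize.trim());
        long seed = Long.parseLong(randomSeed.trim());
        Random rng = new Random(seed);

        List<T> reservoir = new ArrayList<>(k);
        for (int i = 0; i < rows.size(); i++) {
            if (i < k) {
                // Fill the reservoir with the first k rows.
                reservoir.add(rows.get(i));
            } else {
                // Replace an existing entry with probability k / (i + 1).
                int j = rng.nextInt(i + 1);
                if (j < k) {
                    reservoir.set(j, rows.get(i));
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        // The same seed yields the same sample on every run.
        System.out.println(sample(rows, "3", "42"));
    }
}
```

Fixing the seed makes the sample repeatable between runs, which is presumably why the step exposes it as a setting.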
-
getPassRowsThrough
protected boolean getPassRowsThrough()
Get whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
- Returns:
- true if rows are to be passed on to downstream kettle steps
-
setPassRowsThrough
protected void setPassRowsThrough(boolean p)
Set whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
- Parameters:
p
- true if rows are to be passed on to downstream kettle steps
-
setSerializedFlowFileName
protected void setSerializedFlowFileName(String fFile)
Set the file name of the serialized Weka flow to load/import from.
- Parameters:
fFile
- the file name
-
getSerializedFlowFileName
protected String getSerializedFlowFileName()
Get the file name of the serialized Weka flow to load/import from.
- Returns:
- the file name of the serialized Weka flow
-
setFlow
protected void setFlow(String flow)
Set the actual knowledgeflow flows to run.
- Parameters:
flow
- the flows to run
-
getFlow
protected String getFlow()
Get the knowledgeflow flow to be run.
- Returns:
- the flow to be run
-
setInjectStepName
protected void setInjectStepName(String isn)
Set the name of the step to inject data into.
- Parameters:
isn
- the name of the step to inject data into
-
getInjectStepName
protected String getInjectStepName()
Get the name of the step to inject data into.
- Returns:
- the name of the step to inject data into
-
setInjectEventName
protected void setInjectEventName(String ien)
Set the name of the event to use for injecting.
- Parameters:
ien
- the name of the event to use for injecting
-
getInjectEventName
protected String getInjectEventName()
Get the name of the event to use for injecting.
- Returns:
- the name of the event to use for injecting
-
setOutputStepName
protected void setOutputStepName(String osn)
Set the name of the step to listen to for output.
- Parameters:
osn
- the name of the step to listen to for output
-
getOutputStepName
protected String getOutputStepName()
Get the name of the step to listen to for output.
- Returns:
- the name of the step to listen to for output
-
setOutputEventName
protected void setOutputEventName(String oen)
Set the name of the event to use for output.
- Parameters:
oen
- the name of the event to use for output
-
getOutputEventName
protected String getOutputEventName()
Get the name of the event to use for output.
- Returns:
- the name of the event to use for output
-
setSetClass
protected void setSetClass(boolean sc)
Set whether to set a class index in the sampled data.
- Parameters:
sc
- true if a class index is to be set in the data
-
getSetClass
protected boolean getSetClass()
Get whether a class index is to be set in the sampled data.
- Returns:
- true if a class index is to be set in the sampled data
-
setClassAttributeName
protected void setClassAttributeName(String ca)
Set the name of the attribute to be set as the class attribute.
- Parameters:
ca
- the name of the class attribute
-
getClassAttributeName
protected String getClassAttributeName()
Get the name of the attribute to be set as the class attribute.
- Returns:
- the name of the class attribute
-
setStreamData
protected void setStreamData(boolean sd)
Set whether data should be streamed to the knowledge flow when injecting rather than batch injected.
- Parameters:
sd
- true if data should be streamed
-
getStreamData
protected boolean getStreamData()
Get whether data is to be streamed to the knowledge flow when injecting rather than batch injected.
- Returns:
- true if data is to be streamed
-
getInjectStep
protected weka.gui.beans.BeanInstance getInjectStep(Vector flow)
Return the inject step from the supplied flow (or null if not found).
- Parameters:
flow
- the flow to search
- Returns:
- the inject step or null if it is not in the flow
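The "return the match or null" contract of getInjectStep can be pictured with a small stand-alone sketch. It assumes the inject step is located by comparing a configured name (as returned by getInjectStepName) against each component in the flow; this page does not confirm how the lookup is done. NamedBean is a hypothetical stand-in for weka.gui.beans.BeanInstance, and the component names used are made up.

```java
import java.util.Vector;

/** Hypothetical illustration, not the KFMeta implementation. */
final class InjectStepLookupSketch {

    /** Stand-in for a flow component; not weka.gui.beans.BeanInstance. */
    static final class NamedBean {
        final String name;

        NamedBean(String name) {
            this.name = name;
        }
    }

    /** Returns the first component whose name matches, or null if none does. */
    static NamedBean findByName(Vector<NamedBean> flow, String injectStepName) {
        if (flow == null || injectStepName == null) {
            return null;
        }
        for (NamedBean bean : flow) {
            if (injectStepName.equals(bean.name)) {
                return bean;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Vector<NamedBean> flow = new Vector<>();
        flow.add(new NamedBean("Classifier"));
        flow.add(new NamedBean("Inject"));
        System.out.println(findByName(flow, "Inject") != null);  // found
        System.out.println(findByName(flow, "Missing") == null); // not found
    }
}
```

Callers of a method with this contract need a null check before using the result.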
-
allocate
protected void allocate(int num)
Allocate an array to hold meta data for the ARFF instances
- Parameters:
num
- number of meta data objects to allocate
-
setInjectFields
protected void setInjectFields(org.pentaho.dm.commons.ArffMeta[] am)
Set the array of meta data for the inject step
- Parameters:
am
- an array of ArffMeta
-
getInjectFields
protected org.pentaho.dm.commons.ArffMeta[] getInjectFields()
Get the meta data for the inject step
- Returns:
- an array of ArffMeta
-
getXML
public String getXML()
Return the XML describing this (configured) step
- Specified by:
getXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getXML in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- a String containing the XML
-
loadXML
public void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String, org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleXMLException
Loads the meta data for this (configured) step from XML.
- Specified by:
loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
loadXML in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
stepnode
- the step to load
- Throws:
org.pentaho.di.core.exception.KettleXMLException
- if an error occurs
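getXML and loadXML form a pair: whatever one writes, the other must be able to read back. The stand-alone sketch below shows that round trip for two settings using only the JDK's XML parser. Everything in it is an assumption made for illustration: the class StepXmlRoundTripSketch, the tag names step, inject_step and stream_data, and the Y/N boolean encoding are hypothetical and are not taken from KFMeta.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical illustration of a getXML/loadXML round trip. */
final class StepXmlRoundTripSketch {
    String injectStepName = "";
    boolean streamData;

    /** Serialize the settings; tag names here are made up. */
    String getXML() {
        return "<step>"
            + "<inject_step>" + escape(injectStepName) + "</inject_step>"
            + "<stream_data>" + (streamData ? "Y" : "N") + "</stream_data>"
            + "</step>";
    }

    /** Restore the settings from a parsed step node. */
    void loadXML(Node stepnode) {
        Element step = (Element) stepnode;
        injectStepName = text(step, "inject_step");
        streamData = "Y".equalsIgnoreCase(text(step, "stream_data"));
    }

    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    private static String escape(String s) {
        // Ampersand first, so already-escaped entities are not mangled.
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Parse an XML string and return its root element. */
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse step XML", e);
        }
    }

    public static void main(String[] args) {
        StepXmlRoundTripSketch saved = new StepXmlRoundTripSketch();
        saved.injectStepName = "Inject <1>";
        saved.streamData = true;

        StepXmlRoundTripSketch loaded = new StepXmlRoundTripSketch();
        loaded.loadXML(parse(saved.getXML()));
        System.out.println(loaded.injectStepName + " " + loaded.streamData);
    }
}
```

Escaping on write matters because step and event names are free text and may contain characters that are special in XML.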
-
getFlow
protected Vector<Vector<?>> getFlow(String xml, org.pentaho.di.core.variables.VariableSpace space) throws Exception
Load the flow (if we can).
- Throws:
Exception
- if there is a problem loading the flow
-
readRep
public void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String, org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleException
Read this step's configuration from a repository
- Specified by:
readRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
readRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep
- the repository to access
id_step
- the id for this step
- Throws:
org.pentaho.di.core.exception.KettleException
- if an error occurs
-
saveRep
public void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) throws org.pentaho.di.core.exception.KettleException
Save this step's meta data to a repository
- Specified by:
saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
saveRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep
- the repository to save to
id_transformation
- transformation id
id_step
- step id
- Throws:
org.pentaho.di.core.exception.KettleException
- if an error occurs
-
setUpMetaData
protected void setUpMetaData(weka.core.Instances insts, org.pentaho.di.core.row.RowMetaInterface row)
Set up the outgoing row meta data from the supplied Instances object.
- Parameters:
insts
- the Instances to use for setting up the outgoing row meta data
row
- holds the final outgoing row meta data
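Deriving row meta data from an Instances object amounts to walking its attributes and choosing an output field type for each. The stand-alone sketch below shows one plausible mapping. It is only an assumption: this page does not document the mapping setUpMetaData applies, and AttrKind, FieldType and OutputMetaSketch are hypothetical stand-ins rather than Weka or Kettle types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of mapping attribute kinds to field types. */
final class OutputMetaSketch {

    /** Stand-in for the kinds of attribute a data set may hold. */
    enum AttrKind { NUMERIC, NOMINAL, STRING, DATE }

    /** Stand-in for the value types of an outgoing row. */
    enum FieldType { NUMBER, STRING, DATE }

    /** Assumed mapping: numeric to number, date to date, the rest to string. */
    static FieldType toFieldType(AttrKind kind) {
        switch (kind) {
            case NUMERIC:
                return FieldType.NUMBER;
            case DATE:
                return FieldType.DATE;
            default:
                return FieldType.STRING;
        }
    }

    /** Builds the outgoing fields in the same order as the attributes. */
    static Map<String, FieldType> buildRowMeta(Map<String, AttrKind> attributes) {
        Map<String, FieldType> row = new LinkedHashMap<>();
        for (Map.Entry<String, AttrKind> attribute : attributes.entrySet()) {
            row.put(attribute.getKey(), toFieldType(attribute.getValue()));
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, AttrKind> attributes = new LinkedHashMap<>();
        attributes.put("age", AttrKind.NUMERIC);
        attributes.put("outlook", AttrKind.NOMINAL);
        attributes.put("recorded", AttrKind.DATE);
        System.out.println(buildRowMeta(attributes));
    }
}
```

An insertion-ordered map keeps the output fields in attribute order, which matters when downstream steps address fields by position.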
-
getFields
public void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) throws org.pentaho.di.core.exception.KettleStepException
Generates row meta data to represent the fields output by this step
- Specified by:
getFields in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getFields in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
row
- the meta data for the output produced
origin
- the name of the step to be used as the origin
info
- the meta data of the input rows that enter the step through the specified channels, in the same order as in method getInfoSteps(). The step meta data can then choose what to do with it: ignore it or not.
nextStep
- if non-null, the next step in the transformation: the step that is asking, and the one the data is targeted towards.
space
- the variable space used to resolve variables
- Throws:
org.pentaho.di.core.exception.KettleStepException
- if an error occurs
-
isOutputStructureDetermined
public boolean isOutputStructureDetermined()
Returns whether we have been able to successfully determine the structure of the output (in advance of seeing all the input rows).
- Returns:
- true if the output structure has been determined.
-
equals
Check for equality
-
clone
Clone this step's meta data
- Specified by:
clone in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
clone in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- the cloned meta data
-
setDefault
public void setDefault()
- Specified by:
setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
-
check
public void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
Check the settings of this step and put findings in a remarks list.
- Specified by:
check in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
check in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
remarks
- the list to put the remarks in; see org.pentaho.di.core.CheckResult
transmeta
- the transformation meta data
stepMeta
- the step meta data
prev
- the fields coming from a previous step
input
- the input step names
output
- the output step names
info
- the fields that are used as information by the step
-
getDialogClassName
- Specified by:
getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
-
getStep
public org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
Get the executing step, needed by Trans to launch a step.
- Specified by:
getStep in interface org.pentaho.di.trans.step.StepMetaInterface
- Parameters:
stepMeta
- the step info
stepDataInterface
- the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
cnr
- the copy number to get.
tr
- the transformation info.
trans
- the launching transformation
- Returns:
- a StepInterface value
-
getStepData
public org.pentaho.di.trans.step.StepDataInterface getStepData()
Get a new instance of the appropriate data class. This data class implements StepDataInterface and holds the data that must persist even if a worker thread is terminated.
- Specified by:
getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
- Returns:
- a StepDataInterface value
-