Package org.pentaho.di.kf
Class KFMeta
java.lang.Object
org.pentaho.di.trans.step.BaseStepMeta
org.pentaho.di.kf.KFMeta
- All Implemented Interfaces:
Cloneable, org.pentaho.di.trans.step.StepAttributesInterface, org.pentaho.di.trans.step.StepMetaInterface
@Step(id="KF",
image="KNWFL.svg",
name="Knowledge Flow",
description="Executes a Knowledge Flow data mining process",
documentationUrl="https://pentaho-community.atlassian.net/wiki/display/EAI/Knowledge+Flow",
categoryDescription="Data Mining")
public class KFMeta
extends org.pentaho.di.trans.step.BaseStepMeta
implements org.pentaho.di.trans.step.StepMetaInterface
Contains the meta data for the KF step.
- Version:
- $Revision$
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}com)
-
Field Summary
Fields
Modifier and Type Field - Description
protected org.pentaho.dm.commons.ArffMeta[] m_injectFields - Meta data for the ARFF instances input to the inject step
protected static Class<?> PKG
static final String XML_TAG - XML tag for the KF step
Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta
attributes, databases, log, loggingObject, parentStepMeta, repository, STEP_ATTRIBUTES_FILE
-
Constructor Summary
Constructors
KFMeta()
-
Method Summary
Modifier and Type Method - Description
protected void allocate(int num) - Allocate an array to hold meta data for the ARFF instances
void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info) - Check the settings of this step and put findings in a remarks list.
Object clone() - Clone this step's meta data
boolean equals - Check for equality
protected String getClassAttributeName() - Get the name of the attribute to be set as the class attribute.
String getDialogClassName()
void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) - Generates row meta data to represent the fields output by this step
protected String getFlow() - Get the knowledgeflow flow to be run.
protected Vector<Vector<?>> getFlow(String xml, org.pentaho.di.core.variables.VariableSpace space) - Load the flow (if we can).
protected String getInjectEventName() - Get the name of the event to use for injecting.
protected org.pentaho.dm.commons.ArffMeta[] getInjectFields() - Get the meta data for the inject step
protected weka.gui.beans.BeanInstance getInjectStep(Vector flow) - Return the inject step from the supplied flow (or null if not found).
protected String getInjectStepName() - Get the name of the step to inject data into.
protected String getOutputEventName() - Get the name of the event to use for output.
protected String getOutputStepName() - Get the name of the step to listen to for output.
protected boolean getPassRowsThrough() - Get whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on)
protected String getRandomSeed() - Get the random seed to use for sampling.
protected String getSampleRelationName() - Get the relation name to use for the sampled data.
protected String getSampleSize() - Get the number of rows to randomly sample.
protected String getSerializedFlowFileName() - Get the file name of the serialized Weka flow to load/import from.
protected boolean getSetClass() - Get whether a class index is to be set in the sampled data.
org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans) - Get the executing step, needed by Trans to launch a step.
org.pentaho.di.trans.step.StepDataInterface getStepData() - Get a new instance of the appropriate data class.
protected boolean getStoreFlowInStepMetaData() - Get whether to store the XML flow description as part of the step meta data.
protected boolean getStreamData() - Get whether data is to be streamed to the knowledge flow when injecting rather than batch injected.
String getXML() - Return the XML describing this (configured) step
boolean isOutputStructureDetermined() - Returns whether we have been able to successfully determine the structure of the output (in advance of seeing all the input rows).
void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String, org.pentaho.di.core.Counter> counters) - Loads the meta data for this (configured) step from XML.
void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String, org.pentaho.di.core.Counter> counters) - Read this step's configuration from a repository
void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) - Save this step's meta data to a repository
protected void setClassAttributeName - Set the name of the attribute to be set as the class attribute.
void setDefault()
protected void setFlow - Set the actual knowledgeflow flows to run.
protected void setInjectEventName(String ien) - Set the name of the event to use for injecting.
protected void setInjectFields(org.pentaho.dm.commons.ArffMeta[] am) - Set the array of meta data for the inject step
protected void setInjectStepName(String isn) - Set the name of the step to inject data into.
protected void setOutputEventName(String oen) - Set the name of the event to use for output.
protected void setOutputStepName(String osn) - Set the name of the step to listen to for output.
protected void setPassRowsThrough(boolean p) - Set whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
protected void setRandomSeed(String seed) - Set the random seed to use for sampling rows.
protected void setSampleRelationName(String relationName) - Set the relation name to use for the sampled data.
protected void setSampleSize(String size) - Set the number of rows to randomly sample.
protected void setSerializedFlowFileName(String fFile) - Set the file name of the serialized Weka flow to load/import from.
protected void setSetClass(boolean sc) - Set whether to set a class index in the sampled data.
protected void setStoreFlowInStepMetaData(boolean s) - Set whether to store the XML flow description as part of the step meta data.
protected void setStreamData(boolean sd) - Set whether data should be streamed to the knowledge flow when injecting rather than batch injected.
protected void setUpMetaData(weka.core.Instances insts, org.pentaho.di.core.row.RowMetaInterface row) - Set up the outgoing row meta data from the supplied Instances object.
Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta
analyseImpact, analyseImpact, cancelQueries, check, check, createEntry, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, findAttribute, findParent, findParentEntry, getActiveReferencedObjectDescription, getDescription, getFields, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getReferencedObjectDescriptions, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepInjectionMetadataEntries, getStepIOMeta, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isReferencedObjectEnabled, isRowLevel, loadReferencedObject, loadReferencedObject, loadStepAttributes, loadXML, loadXML, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, readRep, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, setStepIOMeta, supportsErrorHandling
Methods inherited from class java.lang.Object
finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface
analyseImpact, analyseImpact, cancelQueries, check, cleanAfterHopFromRemove, cleanAfterHopFromRemove, cleanAfterHopToRemove, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, exportResources, extractStepMetadataEntries, fetchTransMeta, getActiveReferencedObjectDescription, getFields, getOptionalStreams, getParentStepMeta, getReferencedObjectDescriptions, getRequiredFields, getResourceDependencies, getSQLStatements, getSQLStatements, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasChanged, hasRepositoryReferences, isReferencedObjectEnabled, loadReferencedObject, loadXML, lookupRepositoryReferences, passDataToServletOutput, readRep, resetStepIoMeta, saveRep, searchInfoAndTargetSteps, setChanged, setParentStepMeta, supportsErrorHandling
-
Field Details
-
PKG
protected static Class<?> PKG
-
XML_TAG
XML tag for the KF step
-
m_injectFields
protected org.pentaho.dm.commons.ArffMeta[] m_injectFields
Meta data for the ARFF instances input to the inject step
-
-
Constructor Details
-
KFMeta
public KFMeta()
-
-
Method Details
-
setStoreFlowInStepMetaData
protected void setStoreFlowInStepMetaData(boolean s)
Set whether to store the XML flow description as part of the step meta data. When the flow is stored, the source file path is ignored (and cleared).
- Parameters:
s - true if the flow should be stored in the step meta data
-
getStoreFlowInStepMetaData
protected boolean getStoreFlowInStepMetaData()
Get whether to store the XML flow description as part of the step meta data. When the flow is stored, the source file path is ignored (and cleared).
- Returns:
- true if the flow should be stored in the step meta data
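The interaction between the stored flow and the source file path might look like the sketch below. It is a minimal illustration only: the class FlowSourceSketch is hypothetical, and clearing the path inside the setter is an assumption drawn from the description above, not a statement about how KFMeta is implemented.

```java
/** Hypothetical stand-in for the flow-source settings; not the real KFMeta. */
final class FlowSourceSketch {
    private boolean storeFlowInStepMetaData;
    private String serializedFlowFileName = "";

    void setSerializedFlowFileName(String fileName) {
        serializedFlowFileName = fileName;
    }

    String getSerializedFlowFileName() {
        return serializedFlowFileName;
    }

    /** Assumption: enabling storage also clears the source path. */
    void setStoreFlowInStepMetaData(boolean store) {
        storeFlowInStepMetaData = store;
        if (store) {
            serializedFlowFileName = "";
        }
    }

    boolean getStoreFlowInStepMetaData() {
        return storeFlowInStepMetaData;
    }

    public static void main(String[] args) {
        FlowSourceSketch meta = new FlowSourceSketch();
        meta.setSerializedFlowFileName("/tmp/example.kfml"); // illustrative path only
        meta.setStoreFlowInStepMetaData(true);
        System.out.println("path cleared: " + meta.getSerializedFlowFileName().isEmpty());
    }
}
```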
-
setSampleRelationName
protected void setSampleRelationName(String relationName)
Set the relation name to use for the sampled data.
- Parameters:
relationName - the relation name to use
-
getSampleRelationName
protected String getSampleRelationName()
Get the relation name to use for the sampled data.
- Returns:
- the relation name to use
-
getSampleSize
protected String getSampleSize()
Get the number of rows to randomly sample.
- Returns:
- the number of rows to sample
-
setSampleSize
protected void setSampleSize(String size)
Set the number of rows to randomly sample.
- Parameters:
size - the number of rows to sample
-
getRandomSeed
protected String getRandomSeed()
Get the random seed to use for sampling.
- Returns:
- the random seed
-
setRandomSeed
protected void setRandomSeed(String seed)
Set the random seed to use for sampling rows.
- Parameters:
seed - the seed to use
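The sample size and random seed are held as strings, so a caller has to parse them before sampling. The sketch below shows one way a seeded sample of a fixed size could be drawn. Reservoir sampling is an assumption chosen for illustration (the documentation only says rows are randomly sampled), and RowSamplerSketch is a hypothetical class, not part of the step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of seeded row sampling; the real step may sample differently. */
final class RowSamplerSketch {

    /** Keeps a uniform random sample of at most sampleSize rows (reservoir sampling). */
    static <T> List<T> sample(Iterable<T> rows, String sampleSize, String randomSeed) {
        int size = Integer.parseInt(sampleSize.trim());
        long seed = Long.parseLong(randomSeed.trim());
        Random random = new Random(seed);
        List<T> reservoir = new ArrayList<>(size);
        int seen = 0;
        for (T row : rows) {
            seen++;
            if (reservoir.size() < size) {
                // Fill the reservoir with the first rows.
                reservoir.add(row);
            } else {
                // Replace an existing entry with probability size / seen.
                int slot = random.nextInt(seen);
                if (slot < size) {
                    reservoir.set(slot, row);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add(i);
        }
        System.out.println(sample(rows, "5", "1"));
    }
}
```

Because the generator is seeded, two runs with the same seed and the same input order yield the same sample.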
-
getPassRowsThrough
protected boolean getPassRowsThrough()
Get whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
- Returns:
- true if rows are to be passed on to downstream Kettle steps
-
setPassRowsThrough
protected void setPassRowsThrough(boolean p)
Set whether incoming Kettle rows are to be passed through to any downstream Kettle steps (rather than the output of the knowledge flow being passed on).
- Parameters:
p - true if rows are to be passed on to downstream Kettle steps
-
setSerializedFlowFileName
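protected void setSerializedFlowFileName(String fFile)
Set the file name of the serialized Weka flow to load/import from.
- Parameters:
fFile - the file name
-
getSerializedFlowFileName
protected String getSerializedFlowFileName()
Get the file name of the serialized Weka flow to load/import from.
- Returns: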
- the file name of the serialized Weka flow
-
setFlow
Set the actual Knowledge Flow flow to run.
- Parameters:
flow - the flow to run
-
getFlow
protected String getFlow()
Get the Knowledge Flow flow to be run.
- Returns:
- the flow to be run
-
setInjectStepName
protected void setInjectStepName(String isn)
Set the name of the step to inject data into.
- Parameters:
isn - the name of the step to inject data into
-
getInjectStepName
protected String getInjectStepName()
Get the name of the step to inject data into.
- Returns:
- the name of the step to inject data into
-
setInjectEventName
protected void setInjectEventName(String ien)
Set the name of the event to use for injecting.
- Parameters:
ien - the name of the event to use for injecting
-
getInjectEventName
protected String getInjectEventName()
Get the name of the event to use for injecting.
- Returns:
- the name of the event to use for injecting
-
setOutputStepName
protected void setOutputStepName(String osn)
Set the name of the step to listen to for output.
- Parameters:
osn - the name of the step to listen to for output
-
getOutputStepName
protected String getOutputStepName()
Get the name of the step to listen to for output.
- Returns:
- the name of the step to listen to for output
-
setOutputEventName
protected void setOutputEventName(String oen)
Set the name of the event to use for output.
- Parameters:
oen - the name of the event to use for output
-
getOutputEventName
protected String getOutputEventName()
Get the name of the event to use for output.
- Returns:
- the name of the event to use for output
-
setSetClass
protected void setSetClass(boolean sc)
Set whether to set a class index in the sampled data.
- Parameters:
sc - true if a class index is to be set in the data
-
getSetClass
protected boolean getSetClass()
Get whether a class index is to be set in the sampled data.
- Returns:
- true if a class index is to be set in the sampled data
-
setClassAttributeName
Set the name of the attribute to be set as the class attribute.
- Parameters:
ca - the name of the class attribute
-
getClassAttributeName
protected String getClassAttributeName()
Get the name of the attribute to be set as the class attribute.
- Returns:
- the name of the class attribute
-
setStreamData
protected void setStreamData(boolean sd)
Set whether data should be streamed to the knowledge flow when injecting rather than batch injected.
- Parameters:
sd - true if data should be streamed
-
getStreamData
protected boolean getStreamData()
Get whether data is to be streamed to the knowledge flow when injecting rather than batch injected.
- Returns:
- true if data is to be streamed
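The difference between streamed and batch injection can be pictured with the sketch below. It is an illustration under assumptions: InjectModeSketch and its callback are hypothetical, and the real step hands rows to a Weka flow rather than to a Consumer. In streaming mode each row is delivered as it arrives; in batch mode rows are buffered and delivered together once input is finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch contrasting streamed and batch delivery of rows. */
final class InjectModeSketch<T> {
    private final boolean streamData;
    private final Consumer<List<T>> target;
    private final List<T> buffer = new ArrayList<>();

    InjectModeSketch(boolean streamData, Consumer<List<T>> target) {
        this.streamData = streamData;
        this.target = target;
    }

    /** Called once per incoming row. */
    void rowArrived(T row) {
        if (streamData) {
            // Streaming: forward the row immediately, one at a time.
            target.accept(Collections.singletonList(row));
        } else {
            // Batch: hold the row until all input has been seen.
            buffer.add(row);
        }
    }

    /** Called when no more rows will arrive. */
    void inputFinished() {
        if (!streamData) {
            target.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        InjectModeSketch<String> batch =
            new InjectModeSketch<>(false, rows -> System.out.println("delivered " + rows.size() + " row(s)"));
        batch.rowArrived("a");
        batch.rowArrived("b");
        batch.inputFinished();
    }
}
```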
-
getInjectStep
protected weka.gui.beans.BeanInstance getInjectStep(Vector flow)
Return the inject step from the supplied flow (or null if not found).
- Parameters:
flow - the flow to search
- Returns:
- the inject step or null if it is not in the flow
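The "return the matching step or null" contract can be sketched as a simple linear search. Everything here is hypothetical: NamedStep stands in for a flow component, the step names are made up, and the real method works on Weka BeanInstance objects whose lookup details are not described on this page.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of looking up a named step in a flow; not the Weka API. */
final class InjectStepLookupSketch {

    /** Stand-in for a flow component that carries a display name. */
    static final class NamedStep {
        final String name;

        NamedStep(String name) {
            this.name = name;
        }
    }

    /** Returns the first step whose name matches, or null if none does. */
    static NamedStep findStep(List<NamedStep> flow, String injectStepName) {
        if (flow == null || injectStepName == null) {
            return null;
        }
        for (NamedStep step : flow) {
            if (injectStepName.equals(step.name)) {
                return step;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Step names are illustrative only.
        List<NamedStep> flow = Arrays.asList(new NamedStep("KettleInject"), new NamedStep("Classifier"));
        System.out.println("found: " + (findStep(flow, "KettleInject") != null));
        System.out.println("missing is null: " + (findStep(flow, "Missing") == null));
    }
}
```

Callers of such a method need a null check before using the result, since a missing step is reported by value rather than by exception.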
-
allocate
protected void allocate(int num)
Allocate an array to hold meta data for the ARFF instances
- Parameters:
num - number of meta data objects to allocate
-
setInjectFields
protected void setInjectFields(org.pentaho.dm.commons.ArffMeta[] am)
Set the array of meta data for the inject step
- Parameters:
am - an array of ArffMeta
-
getInjectFields
protected org.pentaho.dm.commons.ArffMeta[] getInjectFields()
Get the meta data for the inject step
- Returns:
- an array of ArffMeta
-
getXML
Return the XML describing this (configured) step
- Specified by:
getXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getXML in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- a String containing the XML
-
loadXML
public void loadXML(Node stepnode, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String, org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleXMLException
Loads the meta data for this (configured) step from XML.
- Specified by:
loadXML in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
loadXML in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
stepnode - the step to load
- Throws:
org.pentaho.di.core.exception.KettleXMLException - if an error occurs
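getXML and loadXML form a round trip: settings written by one should be restored by the other. The sketch below shows that idea using only the JDK's DOM parser. The class XmlRoundTripSketch, the tag names, the enclosing step element, and the Y/N encoding of booleans are all assumptions made for illustration; the actual XML layout of the KF step is not given on this page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch of an XML save/load round trip; tag names are made up. */
final class XmlRoundTripSketch {
    String injectStepName = "";
    boolean passRowsThrough;

    /** Serializes the two settings into a small XML fragment. */
    String getXML() {
        return "<step>"
            + "<inject_step_name>" + escape(injectStepName) + "</inject_step_name>"
            + "<pass_rows_through>" + (passRowsThrough ? "Y" : "N") + "</pass_rows_through>"
            + "</step>";
    }

    /** Restores the settings from a parsed step node. */
    void loadXML(Node stepnode) {
        injectStepName = childText(stepnode, "inject_step_name");
        passRowsThrough = "Y".equalsIgnoreCase(childText(stepnode, "pass_rows_through"));
    }

    private static String childText(Node parent, String tag) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (tag.equals(child.getNodeName())) {
                return child.getTextContent();
            }
        }
        return "";
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Parses an XML string and returns its root element. */
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        XmlRoundTripSketch saved = new XmlRoundTripSketch();
        saved.injectStepName = "KettleInject"; // illustrative name
        saved.passRowsThrough = true;
        XmlRoundTripSketch loaded = new XmlRoundTripSketch();
        loaded.loadXML(parse(saved.getXML()));
        System.out.println(loaded.injectStepName + " " + loaded.passRowsThrough);
    }
}
```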
-
getFlow
protected Vector<Vector<?>> getFlow(String xml, org.pentaho.di.core.variables.VariableSpace space) throws Exception
Load the flow (if we can).
- Throws:
Exception - if there is a problem loading the flow
-
readRep
public void readRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_step, List<org.pentaho.di.core.database.DatabaseMeta> dbs, Map<String, org.pentaho.di.core.Counter> counters) throws org.pentaho.di.core.exception.KettleException
Read this step's configuration from a repository
- Specified by:
readRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
readRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep - the repository to access
id_step - the id for this step
- Throws:
org.pentaho.di.core.exception.KettleException - if an error occurs
-
saveRep
public void saveRep(org.pentaho.di.repository.Repository rep, org.pentaho.di.repository.ObjectId id_transformation, org.pentaho.di.repository.ObjectId id_step) throws org.pentaho.di.core.exception.KettleException
Save this step's meta data to a repository
- Specified by:
saveRep in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
saveRep in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
rep - the repository to save to
id_transformation - transformation id
id_step - step id
- Throws:
org.pentaho.di.core.exception.KettleException - if an error occurs
-
setUpMetaData
protected void setUpMetaData(weka.core.Instances insts, org.pentaho.di.core.row.RowMetaInterface row)
Set up the outgoing row meta data from the supplied Instances object.
- Parameters:
insts - the Instances to use for setting up the outgoing row meta data
row - holds the final outgoing row meta data
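Deriving row meta data from a set of instances amounts to mapping each attribute to an output field with a suitable type. The sketch below shows that shape with plain Java types. The enums, the Field class, and the particular mapping (numeric to number, date to date, everything else to string) are assumptions for illustration and do not describe the mapping KFMeta actually applies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of building output fields from attribute descriptions. */
final class RowMetaSketch {

    /** Stand-in for the kinds of attribute a data set may contain. */
    enum AttributeKind { NUMERIC, NOMINAL, STRING, DATE }

    /** Stand-in for the value types of outgoing row fields. */
    enum FieldType { NUMBER, STRING, DATE }

    static final class Field {
        final String name;
        final FieldType type;

        Field(String name, FieldType type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Assumed mapping: numeric and date keep their type, the rest become strings. */
    static FieldType toFieldType(AttributeKind kind) {
        switch (kind) {
            case NUMERIC:
                return FieldType.NUMBER;
            case DATE:
                return FieldType.DATE;
            default:
                return FieldType.STRING;
        }
    }

    /** Builds one output field per attribute, preserving attribute order. */
    static List<Field> setUpMetaData(String[] names, AttributeKind[] kinds) {
        if (names.length != kinds.length) {
            throw new IllegalArgumentException("names and kinds must have the same length");
        }
        List<Field> row = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            row.add(new Field(names[i], toFieldType(kinds[i])));
        }
        return row;
    }

    public static void main(String[] args) {
        List<Field> row = setUpMetaData(
            new String[] {"age", "class"},
            new AttributeKind[] {AttributeKind.NUMERIC, AttributeKind.NOMINAL});
        for (Field f : row) {
            System.out.println(f.name + " -> " + f.type);
        }
    }
}
```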
-
getFields
public void getFields(org.pentaho.di.core.row.RowMetaInterface row, String origin, org.pentaho.di.core.row.RowMetaInterface[] info, org.pentaho.di.trans.step.StepMeta nextStep, org.pentaho.di.core.variables.VariableSpace space) throws org.pentaho.di.core.exception.KettleStepException
Generates row meta data to represent the fields output by this step
- Specified by:
getFields in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getFields in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
row - the meta data for the output produced
origin - the name of the step to be used as the origin
info - the metadata of the input rows that enter the step through the specified channels, in the same order as in method getInfoSteps(). The step metadata can then choose what to do with it: ignore it or not.
nextStep - if non-null, the next step in the transformation, that is, the step asking for the fields and the one the data is targeted towards
space - the variable space
- Throws:
org.pentaho.di.core.exception.KettleStepException - if an error occurs
-
isOutputStructureDetermined
public boolean isOutputStructureDetermined()
Returns whether we have been able to successfully determine the structure of the output (in advance of seeing all the input rows).
- Returns:
- true if the output structure has been determined.
-
equals
Check for equality
-
clone
Clone this step's meta data
- Specified by:
clone in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
clone in class org.pentaho.di.trans.step.BaseStepMeta
- Returns:
- the cloned meta data
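For a meta class that holds an array, such as the inject field meta data, equals and clone interact: a clone should compare equal to its source, yet changing the clone's array must not affect the original. The sketch below illustrates that with a hypothetical class, MetaCopySketch, that uses strings in place of ArffMeta objects. It is not KFMeta's implementation.

```java
import java.util.Arrays;
import java.util.Objects;

/** Hypothetical sketch of clone and equals for a meta class with an array field. */
final class MetaCopySketch implements Cloneable {
    String flow = "";
    String[] injectFields = new String[0];

    /** Copies the array so the clone does not share it with the original. */
    @Override
    public MetaCopySketch clone() {
        try {
            MetaCopySketch copy = (MetaCopySketch) super.clone();
            copy.injectFields = injectFields.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("Cloneable is implemented", e);
        }
    }

    /** Two instances are equal when the flow and every inject field match. */
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof MetaCopySketch)) {
            return false;
        }
        MetaCopySketch that = (MetaCopySketch) other;
        return Objects.equals(flow, that.flow) && Arrays.equals(injectFields, that.injectFields);
    }

    @Override
    public int hashCode() {
        return 31 * Objects.hashCode(flow) + Arrays.hashCode(injectFields);
    }

    public static void main(String[] args) {
        MetaCopySketch original = new MetaCopySketch();
        original.injectFields = new String[] {"sepallength", "class"}; // illustrative names
        MetaCopySketch copy = original.clone();
        System.out.println("equal after clone: " + original.equals(copy));
        copy.injectFields[0] = "changed";
        System.out.println("equal after edit: " + original.equals(copy));
    }
}
```

Copying the array is sufficient here because strings are immutable; an array of mutable objects would need each element copied as well.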
-
setDefault
public void setDefault()
- Specified by:
setDefault in interface org.pentaho.di.trans.step.StepMetaInterface
-
check
public void check(List<org.pentaho.di.core.CheckResultInterface> remarks, org.pentaho.di.trans.TransMeta transmeta, org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.core.row.RowMetaInterface prev, String[] input, String[] output, org.pentaho.di.core.row.RowMetaInterface info)
Check the settings of this step and put findings in a remarks list.
- Specified by:
check in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
check in class org.pentaho.di.trans.step.BaseStepMeta
- Parameters:
remarks - the list to put the remarks in (see org.pentaho.di.core.CheckResult)
transmeta - the transform meta data
stepMeta - the step meta data
prev - the fields coming from a previous step
input - the input step names
output - the output step names
info - the fields that are used as information by the step
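The "put findings in a remarks list" pattern means the method reports every problem it finds by appending to a caller-supplied list, rather than stopping at the first one. The sketch below illustrates the pattern with hypothetical types. Remark stands in for a Kettle check result, and the two rules shown are made up, since the checks KFMeta actually performs are not listed on this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of collecting validation findings in a remarks list. */
final class CheckSketch {

    enum Severity { OK, ERROR }

    /** Stand-in for a check result; not the Kettle CheckResultInterface. */
    static final class Remark {
        final Severity severity;
        final String text;

        Remark(Severity severity, String text) {
            this.severity = severity;
            this.text = text;
        }
    }

    /** Appends one remark per rule instead of failing on the first problem. */
    static void check(List<Remark> remarks, String[] input, String injectStepName) {
        if (input == null || input.length == 0) {
            remarks.add(new Remark(Severity.ERROR, "No input received from other steps"));
        } else {
            remarks.add(new Remark(Severity.OK, "Step is receiving input from other steps"));
        }
        if (injectStepName == null || injectStepName.trim().isEmpty()) {
            remarks.add(new Remark(Severity.ERROR, "No inject step name specified"));
        }
    }

    public static void main(String[] args) {
        List<Remark> remarks = new ArrayList<>();
        check(remarks, new String[0], "");
        for (Remark r : remarks) {
            System.out.println(r.severity + ": " + r.text);
        }
    }
}
```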
-
getDialogClassName
- Specified by:
getDialogClassName in interface org.pentaho.di.trans.step.StepMetaInterface
- Overrides:
getDialogClassName in class org.pentaho.di.trans.step.BaseStepMeta
-
getStep
public org.pentaho.di.trans.step.StepInterface getStep(org.pentaho.di.trans.step.StepMeta stepMeta, org.pentaho.di.trans.step.StepDataInterface stepDataInterface, int cnr, org.pentaho.di.trans.TransMeta tr, org.pentaho.di.trans.Trans trans)
Get the executing step, needed by Trans to launch a step.
- Specified by:
getStep in interface org.pentaho.di.trans.step.StepMetaInterface
- Parameters:
stepMeta - the step info
stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
cnr - the copy number to get
tr - the transformation info
trans - the launching transformation
- Returns:
- a StepInterface value
-
getStepData
public org.pentaho.di.trans.step.StepDataInterface getStepData()
Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It holds the data that must persist even if a worker thread is terminated.
- Specified by:
getStepData in interface org.pentaho.di.trans.step.StepMetaInterface
- Returns:
- a StepDataInterface value
-