public class KFData
extends org.pentaho.di.trans.step.BaseStepData
implements org.pentaho.di.trans.step.StepDataInterface
Modifier and Type | Field and Description |
---|---|
static String[] |
INJECT_EVENTS
Allowable types of events to generate to inject data into the knowledge flow
|
protected boolean |
m_bufferingComplete
True when buffering for header creation has been completed for streaming setups
|
protected String |
m_classAttributeName
The name of the class attribute (if any).
|
protected int |
m_currentRow
the current row number
|
protected boolean |
m_hasNominalAtts
True if incoming data has nominal attributes
|
protected int |
m_incrementalStatus
Status to use for incremental instance events
|
protected org.pentaho.dm.commons.ArffMeta[] |
m_injectArffMetas
Meta data for the fields to be injected into the knowledge flow
|
protected String |
m_injectEventName
The name of the event to use for injecting (must be instance for streaming)
|
protected int[] |
m_injectFieldIndexes
Mapping to fields in incoming kettle data
|
protected org.pentaho.dm.commons.ArffMeta[] |
m_injectFields
Meta data for the ARFF instances input to the inject step
|
protected boolean |
m_injectSetupOK
If the inject setup validates, then we can process rows for input into the flow
|
protected weka.gui.beans.BeanInstance |
m_injectStep
The step in the knowledge flow to inject into
|
protected int |
m_k
the size of the sample
|
protected org.pentaho.di.core.logging.LogChannelInterface |
m_log
logging
|
protected Map<Object,Object>[] |
m_nominalVals
Maps to hold the nominal values of the ARFF data for the inject step
|
protected org.pentaho.di.core.row.RowMetaInterface |
m_outputRowMeta
The (eventual) outgoing row meta data
|
protected Random |
m_random
random number generator
|
protected List<Object[]> |
m_sample
Holds sampled rows
|
protected String |
m_sampleRelationName
Relation name for Instances created from the sampled rows
|
protected boolean |
m_streamData
True if data is to be injected one instance at a time
|
protected weka.core.Instances |
m_streamingHeader
Header for instances that are to be streamed.
|
static String[] |
OUTPUT_EVENTS
Allowable types of events that we can listen for from the knowledge flow
|
Constructor and Description |
---|
KFData() |
Modifier and Type | Method and Description |
---|---|
protected void |
allocate(int num)
Allocate an array to hold meta data for the ARFF instances
|
void |
cleanUp()
Free memory used by the reservoir/cache
|
protected static weka.gui.beans.BeanInstance |
findKFStep(Vector flow,
String stepName) |
protected static String |
flowToXML(Vector<Vector<?>> flow) |
void |
flushStreamingBuffer(org.pentaho.di.core.row.RowMetaInterface inputMeta,
Object[] inputRow) |
protected static ArrayList<String> |
getAllAllowableOutputEventNames(Vector flow,
String compName)
Get a list of all event names that we can accept from the supplied knowledgeflow component.
|
protected static ArrayList<String> |
getAllAllowableOutputStepNames(Vector flow)
Return a list of all valid output steps in the flow (i.e.
|
protected static ArrayList<String> |
getAllInjectStepNames(Vector flow)
Look for all knowledge flow components that are of type "KettleInject"
|
protected boolean |
getBufferingForStreaming()
Returns true if streaming has been selected and there are nominal attributes in the incoming data.
|
protected static ArrayList<String> |
getEventsForNamedStep(Vector flow,
String compName)
Looks for the named step, and then returns all events (from those that we can generate) that this step can accept at present
|
org.pentaho.dm.commons.ArffMeta[] |
getInjectFields() |
protected boolean |
getInjectSetupOK()
Get the status of the flow with regards to the inject step.
|
protected org.pentaho.di.core.row.RowMetaInterface |
getOutputRowMeta()
Get the output row meta data.
|
protected boolean |
getStreamData()
Get whether data is to be streamed to the flow.
|
protected void |
initializeReservoir(int sampleSize,
int seed)
Initialize the reservoir/cache for the requested sample (or cache) size.
|
protected void |
injectData(weka.core.Instances data,
Vector beans)
Inject batch data into the knowledge flow.
|
protected void |
injectInstance(weka.core.Instance inst)
Inject a single instance into the knowledge flow.
|
protected static Vector<Vector<?>> |
loadSerializedKF(File kfFile)
Loads a serialized knowledge flow
|
protected void |
processRow(Object[] inputRow,
org.pentaho.di.core.row.RowMetaInterface inputMeta)
Processes a single incoming row.
|
protected weka.core.Instances |
reservoirToInstances(org.pentaho.di.core.row.RowMetaInterface inputRowMeta)
Convert the contents of the reservoir into a set of Instances.
|
protected void |
setClassAttributeName(String classAttName)
Set the name of the attribute to use as the class.
|
protected void |
setIncrementalStatus(int status)
Status of the current incremental event (FORMAT_AVAILABLE or INSTANCE_AVAILABLE).
|
protected void |
setInjectEventName(String injectEventName)
Set the name of the event to inject data using.
|
protected void |
setInjectFieldIndexes(int[] injectFieldIndexes,
org.pentaho.dm.commons.ArffMeta[] arffMetas)
Set the indexes of the fields to inject into the knowledge flow
|
void |
setInjectFields(org.pentaho.dm.commons.ArffMeta[] arffMeta) |
protected void |
setInjectSetupOK(boolean ok)
Sets the status of the flow with regards to the step to inject into.
|
protected void |
setInjectStep(weka.gui.beans.BeanInstance injectStep)
Set the knowledge flow step to inject data into
|
void |
setLog(org.pentaho.di.core.logging.LogChannelInterface log) |
protected void |
setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
Set the meta data for the incoming rows (later modified into the output format by getFields() in the meta class)
|
protected void |
setSampleRelationName(String relationName)
Set the relation name for the sample.
|
protected void |
setStreamData(boolean streamData)
Set whether the flow is to have data streamed to it (i.e.
|
protected void |
setupArffMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
Sets up the ArffMeta array based on the incoming Kettle row format.
|
protected static void |
validateInjectSetup(Vector flow,
String injectStepName,
String injectEventName)
Attempts to find the requested inject step and check if the requested event type is one that this step generates
(and will allow a connection).
|
protected static void |
validateOutputSetup(Vector flow,
Object outputListener,
String outputStepName,
String outputEventName)
Attempts to find the requested output step, checks that the requested output event type is generatable and then
tries to register the supplied listener with the output step.
|
protected static Vector<Vector<?>> |
xmlToFlow(String xml) |
Methods inherited from class org.pentaho.di.trans.step.BaseStepData:
getStatus, isDisposed, isEmpty, isFinished, isIdle, isInitialising, isRunning, isStopped, setStatus
protected org.pentaho.di.core.row.RowMetaInterface m_outputRowMeta
protected org.pentaho.dm.commons.ArffMeta[] m_injectFields
protected boolean m_injectSetupOK
protected org.pentaho.dm.commons.ArffMeta[] m_injectArffMetas
protected int[] m_injectFieldIndexes
protected boolean m_hasNominalAtts
protected boolean m_streamData
protected int m_incrementalStatus
protected weka.core.Instances m_streamingHeader
protected String m_classAttributeName
protected boolean m_bufferingComplete
protected weka.gui.beans.BeanInstance m_injectStep
protected String m_injectEventName
protected Map<Object,Object>[] m_nominalVals
protected int m_k
protected int m_currentRow
protected String m_sampleRelationName
protected Random m_random
protected org.pentaho.di.core.logging.LogChannelInterface m_log
public static final String[] INJECT_EVENTS
public static final String[] OUTPUT_EVENTS
public void setLog(org.pentaho.di.core.logging.LogChannelInterface log)
public void setInjectFields(org.pentaho.dm.commons.ArffMeta[] arffMeta)
public org.pentaho.dm.commons.ArffMeta[] getInjectFields()
protected void allocate(int num)
Parameters:
num - number of meta data objects to allocate
protected void setupArffMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
Parameters:
rmi - a RowMetaInterface value
protected void initializeReservoir(int sampleSize, int seed)
Parameters:
sampleSize - the number of rows to sample or to cache
seed - the random number seed to use
protected void setInjectFieldIndexes(int[] injectFieldIndexes, org.pentaho.dm.commons.ArffMeta[] arffMetas)
Parameters:
injectFieldIndexes - array of indexes
arffMetas - array of arff metas
protected void setInjectSetupOK(boolean ok)
Parameters:
ok - true if the inject step is good to go
protected boolean getInjectSetupOK()
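
The `m_injectFieldIndexes` field is described as a "mapping to fields in incoming kettle data", and `setInjectFieldIndexes` takes an array of indexes alongside the ARFF metas. This page does not show how the indexes are computed, so the following is only a minimal sketch of one plausible scheme: look up each inject field name in the incoming row's field names, then use the resulting indexes to pick values out of a row. The class `InjectFieldIndexSketch`, its methods, and the `-1` convention for missing fields are hypothetical and not part of KFData.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of KFData. */
final class InjectFieldIndexSketch {

  /**
   * For each inject field name, find its position among the incoming field
   * names. A name that is absent from the incoming row maps to -1 (assumed
   * convention).
   */
  static int[] mapFieldIndexes(List<String> incomingFieldNames, String[] injectFieldNames) {
    int[] indexes = new int[injectFieldNames.length];
    for (int i = 0; i < injectFieldNames.length; i++) {
      indexes[i] = incomingFieldNames.indexOf(injectFieldNames[i]);
    }
    return indexes;
  }

  /** Pick the mapped values out of one incoming row; unmapped fields become null. */
  static Object[] project(Object[] inputRow, int[] indexes) {
    Object[] out = new Object[indexes.length];
    for (int i = 0; i < indexes.length; i++) {
      out[i] = indexes[i] >= 0 ? inputRow[indexes[i]] : null;
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> incoming = Arrays.asList("id", "sepallength", "class");
    int[] idx = mapFieldIndexes(incoming, new String[] {"class", "sepallength", "missing"});
    System.out.println(Arrays.toString(idx)); // prints [2, 1, -1]
    Object[] row = {7, 5.1, "setosa"};
    System.out.println(Arrays.toString(project(row, idx))); // prints [setosa, 5.1, null]
  }
}
```

Keeping the index array separate from the metadata means the name lookup happens once per transformation rather than once per row.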
protected void setStreamData(boolean streamData)
Parameters:
streamData - true if data is to be streamed to the knowledge flow
protected boolean getStreamData()
protected void setIncrementalStatus(int status)
Parameters:
status - the status of the current incremental event
protected boolean getBufferingForStreaming()
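
According to the method summary, `getBufferingForStreaming()` returns true when streaming has been selected and the incoming data has nominal attributes, and `m_bufferingComplete` becomes true once buffering for header creation has finished. The sketch below illustrates that idea under stated assumptions: rows are held back until a fixed number has been seen, at which point the distinct nominal values can be collected and the buffered rows released. The class `StreamingBufferSketch`, the fixed `bufferSize` threshold, and all method names are hypothetical; this page does not say when KFData considers buffering complete.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch; the fixed buffer size is an assumption, not KFData's behaviour. */
final class StreamingBufferSketch {
  private final boolean streamData;
  private final boolean hasNominalAtts;
  private final int bufferSize; // assumed threshold for "enough rows to build a header"
  private final List<Object[]> buffer = new ArrayList<>();
  private boolean bufferingComplete = false;

  StreamingBufferSketch(boolean streamData, boolean hasNominalAtts, int bufferSize) {
    this.streamData = streamData;
    this.hasNominalAtts = hasNominalAtts;
    this.bufferSize = bufferSize;
  }

  /** Mirrors the documented condition: streaming selected and nominal attributes present. */
  boolean bufferingForStreaming() {
    return streamData && hasNominalAtts;
  }

  boolean isBufferingComplete() {
    return bufferingComplete;
  }

  /**
   * Accept one row. Returns the rows that may now be passed on: an empty list
   * while still buffering, the whole buffer when it fills, or just the row
   * itself when no buffering is needed or buffering has already completed.
   */
  List<Object[]> accept(Object[] row) {
    if (!bufferingForStreaming() || bufferingComplete) {
      return Collections.singletonList(row);
    }
    buffer.add(row);
    if (buffer.size() < bufferSize) {
      return Collections.emptyList();
    }
    bufferingComplete = true;
    return new ArrayList<>(buffer);
  }

  /** Distinct non-null values seen in one column of the buffered rows, in sorted order. */
  SortedSet<String> nominalValues(int column) {
    SortedSet<String> values = new TreeSet<>();
    for (Object[] row : buffer) {
      if (row[column] != null) {
        values.add(row[column].toString());
      }
    }
    return values;
  }
}
```

A nominal attribute needs its full set of labels before any instance can be encoded against it, which is why a streaming setup with nominal fields cannot emit the very first row immediately.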
protected void setClassAttributeName(String classAttName)
Parameters:
classAttName - the name of the class attribute
protected void setOutputRowMeta(org.pentaho.di.core.row.RowMetaInterface rmi)
Parameters:
rmi - the incoming row meta data
protected org.pentaho.di.core.row.RowMetaInterface getOutputRowMeta()
protected void setSampleRelationName(String relationName)
Parameters:
relationName - the relation name to use for the sample
protected void setInjectStep(weka.gui.beans.BeanInstance injectStep)
Parameters:
injectStep - the knowledge flow step to inject data into
protected void setInjectEventName(String injectEventName)
Parameters:
injectEventName - the name of the event to inject data with
protected static Vector<Vector<?>> loadSerializedKF(File kfFile) throws Exception
Parameters:
kfFile - a File value
Returns:
Vector containing a Vector of beans and a Vector of connections
Throws:
Exception - if there is a problem during loading
protected static String flowToXML(Vector<Vector<?>> flow) throws Exception
Throws:
Exception
protected static Vector<Vector<?>> xmlToFlow(String xml) throws Exception
Throws:
Exception
protected weka.core.Instances reservoirToInstances(org.pentaho.di.core.row.RowMetaInterface inputRowMeta) throws org.pentaho.di.core.exception.KettleException
Parameters:
inputRowMeta - the meta data for the incoming rows
Throws:
org.pentaho.di.core.exception.KettleException - if there is a problem during the conversion
protected void injectInstance(weka.core.Instance inst)
Parameters:
inst - the instance to inject
protected void injectData(weka.core.Instances data, Vector beans) throws Exception
Parameters:
data - the instances to inject
beans - the flow being executed
Throws:
Exception
protected void processRow(Object[] inputRow, org.pentaho.di.core.row.RowMetaInterface inputMeta) throws Exception
Parameters:
inputRow - the incoming Kettle row
inputMeta - the row meta data
Throws:
Exception - if something goes wrong
public void flushStreamingBuffer(org.pentaho.di.core.row.RowMetaInterface inputMeta, Object[] inputRow) throws org.pentaho.di.core.exception.KettleException
Throws:
org.pentaho.di.core.exception.KettleException
public void cleanUp()
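
The fields `m_k` (sample size), `m_random`, `m_sample` and `m_currentRow`, together with `initializeReservoir(int sampleSize, int seed)`, `processRow` and `cleanUp()`, point to a reservoir of sampled rows. This page does not state which sampling algorithm KFData uses, so the sketch below simply assumes classic reservoir sampling (Algorithm R) to show how those pieces could fit together. The class `ReservoirSketch` and its method bodies are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of reservoir sampling (Algorithm R); not KFData's actual code. */
final class ReservoirSketch {
  private final int k;                 // size of the sample (compare m_k)
  private final Random random;         // seeded generator (compare m_random)
  private final List<Object[]> sample; // sampled rows (compare m_sample)
  private int currentRow = 0;          // rows seen so far (compare m_currentRow)

  ReservoirSketch(int sampleSize, int seed) {
    if (sampleSize < 0) {
      throw new IllegalArgumentException("sampleSize must be non-negative");
    }
    this.k = sampleSize;
    this.random = new Random(seed);
    this.sample = new ArrayList<>(sampleSize);
  }

  /** Offer one row to the reservoir. */
  void processRow(Object[] row) {
    if (currentRow < k) {
      // Fill phase: the first k rows are always kept.
      sample.add(row);
    } else {
      // Replacement phase: keep the new row with probability k / (currentRow + 1).
      int slot = random.nextInt(currentRow + 1); // uniform in [0, currentRow]
      if (slot < k) {
        sample.set(slot, row);
      }
    }
    currentRow++;
  }

  List<Object[]> getSample() {
    return sample;
  }

  /** Release the sampled rows and reset the row counter. */
  void cleanUp() {
    sample.clear();
    currentRow = 0;
  }
}
```

With this scheme every row seen so far has the same chance of being in the reservoir, and memory use stays bounded by the sample size no matter how many rows pass through. If fewer rows than the sample size arrive, the reservoir simply holds all of them, which matches the "sample or cache" wording in the `initializeReservoir` description.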
protected static ArrayList<String> getAllAllowableOutputStepNames(Vector flow)
Parameters:
flow - the flow to use
protected static ArrayList<String> getAllAllowableOutputEventNames(Vector flow, String compName)
Parameters:
flow - the flow to use
compName - the name of the component to generate the list for
protected static ArrayList<String> getAllInjectStepNames(Vector flow)
Parameters:
flow - the flow to use
protected static ArrayList<String> getEventsForNamedStep(Vector flow, String compName)
Parameters:
flow - the flow to use
compName - the knowledge flow step in question
Returns:
an ArrayList of the acceptable events (connections)
protected static void validateOutputSetup(Vector flow, Object outputListener, String outputStepName, String outputEventName) throws org.pentaho.di.core.exception.KettleStepException
Parameters:
outputListener - the object to register with the output step (may be null if just validation is required)
flow - a Vector containing a Vector of beans and a Vector of connections
outputStepName - the name of the output Knowledge Flow step
outputEventName - the name of the event produced by the output step to listen for
Throws:
org.pentaho.di.core.exception.KettleStepException - if a problem occurs
protected static void validateInjectSetup(Vector flow, String injectStepName, String injectEventName) throws org.pentaho.di.core.exception.KettleStepException
Parameters:
flow - the flow to use
injectStepName - the name of the Knowledge Flow step to inject data into
injectEventName - the name of the event type to use to pass data to the inject step
Throws:
org.pentaho.di.core.exception.KettleStepException - if validation fails for some reason
Copyright © 2002–2019 Hitachi Vantara. All rights reserved.