org.pentaho.di.trans.steps.fuzzymatch
Class FuzzyMatchMeta

java.lang.Object
  extended by org.pentaho.di.trans.step.BaseStepMeta
      extended by org.pentaho.di.trans.steps.fuzzymatch.FuzzyMatchMeta
All Implemented Interfaces:
Cloneable, StepAttributesInterface, StepMetaInterface

public class FuzzyMatchMeta
extends BaseStepMeta
implements StepMetaInterface


Field Summary
static String[] algorithmCode
          The algorithm type codes
static String[] algorithmDesc
          The algorithm descriptions
static String DEFAULT_SEPARATOR
           
static int OPERATION_TYPE_DAMERAU_LEVENSHTEIN
           
static int OPERATION_TYPE_DOUBLE_METAPHONE
           
static int OPERATION_TYPE_JARO
           
static int OPERATION_TYPE_JARO_WINKLER
           
static int OPERATION_TYPE_LEVENSHTEIN
           
static int OPERATION_TYPE_METAPHONE
           
static int OPERATION_TYPE_NEEDLEMAN_WUNSH
           
static int OPERATION_TYPE_PAIR_SIMILARITY
           
static int OPERATION_TYPE_REFINED_SOUNDEX
           
static int OPERATION_TYPE_SOUNDEX
           
 
Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta
loggingObject, STEP_ATTRIBUTES_FILE
 
Constructor Summary
FuzzyMatchMeta()
           
 
Method Summary
 void allocate(int nrvalues)
           
 void check(List<CheckResultInterface> remarks, TransMeta transMeta, StepMeta stepMeta, RowMetaInterface prev, String[] input, String[] output, RowMetaInterface info)
          Checks the settings of this step and puts the findings in a remarks List.
 Object clone()
          Make an exact copy of this step; make sure to explicitly copy Collections etc.
 boolean excludeFromRowLayoutVerification()
          This method is added to exclude certain steps from layout checking.
 int getAlgorithmType()
           
static int getAlgorithmTypeByDesc(String tt)
           
static String getAlgorithmTypeDesc(int i)
           
 void getFields(RowMetaInterface inputRowMeta, String name, RowMetaInterface[] info, StepMeta nextStep, VariableSpace space)
          Get the fields that are emitted by this step
 String getLookupField()
           
 String getMainStreamField()
           
 String getMaximalValue()
           
 String getMinimalValue()
           
 String getOutputMatchField()
           
 String getOutputValueField()
           
 String getSeparator()
           
 StepInterface getStep(StepMeta stepMeta, StepDataInterface stepDataInterface, int cnr, TransMeta transMeta, Trans trans)
          Get the executing step, needed by Trans to launch a step.
 StepDataInterface getStepData()
          Get a new instance of the appropriate data class.
 StepIOMetaInterface getStepIOMeta()
          Returns the Input/Output metadata for this step.
 String[] getValue()
           
 String[] getValueName()
           
 String getXML()
          Produces the XML string that describes this step's information.
 boolean isCaseSensitive()
           
 boolean isGetCloserValue()
           
 void loadXML(Node stepnode, List<DatabaseMeta> databases, Map<String,Counter> counters)
          Load the values for this step from an XML Node
 void readRep(Repository rep, ObjectId id_step, List<DatabaseMeta> databases, Map<String,Counter> counters)
          Read the step's information from a Kettle repository
 void resetStepIoMeta()
          For steps that handle dynamic input (info) or output (target) streams, it is useful to be able to force the re-creation of the StepIoMeta definition.
 void saveRep(Repository rep, ObjectId id_transformation, ObjectId id_step)
          Save the step's data into a Kettle repository
 void searchInfoAndTargetSteps(List<StepMeta> steps)
          Change step names into step objects to allow them to be name-changed etc.
 void setAlgorithmType(int algorithm)
           
 void setCaseSensitive(boolean caseSensitive)
           
 void setDefault()
          Set default values
 void setGetCloserValue(boolean closervalue)
           
 void setLookupField(String lookupfield)
           
 void setMainStreamField(String mainstreamfield)
           
 void setMaximalValue(String maximalValue)
           
 void setMinimalValue(String minimalValue)
           
 void setOutputMatchField(String outputmatchfield)
           
 void setOutputValueField(String outputvaluefield)
           
 void setSeparator(String separator)
           
 void setValue(String[] value)
           
 void setValueName(String[] valueName)
           
 boolean supportsErrorHandling()
           
 
Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta
analyseImpact, cancelQueries, excludeFromCopyDistributeVerification, exportResources, findAttribute, findParent, getDescription, getDialogClassName, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getRepCode, getRepositoryDirectory, getRequiredFields, getRequiredFields, getResourceDependencies, getSQLStatements, getStepInjectionMetadataEntries, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isRowLevel, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, setChanged, setChanged, setParentStepMeta
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface
analyseImpact, cancelQueries, excludeFromCopyDistributeVerification, exportResources, getDialogClassName, getOptionalStreams, getParentStepMeta, getRequiredFields, getResourceDependencies, getSQLStatements, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedDatabaseConnections, getUsedLibraries, handleStreamSelection, hasRepositoryReferences, lookupRepositoryReferences, setParentStepMeta
 

Field Detail

DEFAULT_SEPARATOR

public static final String DEFAULT_SEPARATOR
See Also:
Constant Field Values

algorithmDesc

public static final String[] algorithmDesc
The algorithm descriptions


algorithmCode

public static final String[] algorithmCode
The algorithm type codes


OPERATION_TYPE_LEVENSHTEIN

public static final int OPERATION_TYPE_LEVENSHTEIN
See Also:
Constant Field Values
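
The page does not describe how the Levenshtein option scores two strings. As a rough illustration only, and not the implementation used by this step, the classic edit distance can be sketched as follows. The class name LevenshteinSketch and its method are hypothetical and are not part of the Kettle API.

```java
// Illustrative sketch only: NOT the code behind OPERATION_TYPE_LEVENSHTEIN.
// LevenshteinSketch is a hypothetical name used for this example.
final class LevenshteinSketch {

    // Returns the minimum number of single-character insertions,
    // deletions and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // delete all i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

For example, LevenshteinSketch.distance("kitten", "sitting") evaluates to 3.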

OPERATION_TYPE_DAMERAU_LEVENSHTEIN

public static final int OPERATION_TYPE_DAMERAU_LEVENSHTEIN
See Also:
Constant Field Values

OPERATION_TYPE_NEEDLEMAN_WUNSH

public static final int OPERATION_TYPE_NEEDLEMAN_WUNSH
See Also:
Constant Field Values

OPERATION_TYPE_JARO

public static final int OPERATION_TYPE_JARO
See Also:
Constant Field Values

OPERATION_TYPE_JARO_WINKLER

public static final int OPERATION_TYPE_JARO_WINKLER
See Also:
Constant Field Values

OPERATION_TYPE_PAIR_SIMILARITY

public static final int OPERATION_TYPE_PAIR_SIMILARITY
See Also:
Constant Field Values
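
The page gives no definition for the pair similarity option. One common reading is a Dice coefficient over adjacent letter pairs; the sketch below shows that technique under this assumption. It is a simplification (no word splitting) and may differ from what the step actually computes. PairSimilaritySketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only: a Dice coefficient over adjacent letter pairs.
// This is an assumption about what "pair similarity" means, not the
// step's actual implementation.
final class PairSimilaritySketch {

    // Splits a string into its adjacent two-character pairs, upper-cased.
    static List<String> pairs(String s) {
        List<String> out = new ArrayList<>();
        String u = s.toUpperCase(Locale.ROOT);
        for (int i = 0; i + 1 < u.length(); i++) {
            out.add(u.substring(i, i + 2));
        }
        return out;
    }

    // Returns 2 * |shared pairs| / (|pairs(a)| + |pairs(b)|), in [0, 1].
    static double similarity(String a, String b) {
        List<String> pa = pairs(a);
        List<String> pb = pairs(b);
        int total = pa.size() + pb.size();
        if (total == 0) {
            return 0.0; // neither string has a pair
        }
        int shared = 0;
        for (String p : pa) {
            // Remove the matched pair so it cannot be counted twice.
            if (pb.remove(p)) {
                shared++;
            }
        }
        return 2.0 * shared / total;
    }
}
```

With this definition, "night" and "nacht" share only the pair "HT", giving 2 * 1 / 8 = 0.25.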

OPERATION_TYPE_METAPHONE

public static final int OPERATION_TYPE_METAPHONE
See Also:
Constant Field Values

OPERATION_TYPE_DOUBLE_METAPHONE

public static final int OPERATION_TYPE_DOUBLE_METAPHONE
See Also:
Constant Field Values

OPERATION_TYPE_SOUNDEX

public static final int OPERATION_TYPE_SOUNDEX
See Also:
Constant Field Values
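
The phonetic options compare encoded forms of the strings rather than the strings themselves. As an illustration, the standard American Soundex encoding can be sketched as below. This is a hand-written sketch, not the encoder this step uses; SoundexSketch is a hypothetical name.

```java
import java.util.Locale;

// Illustrative sketch only: standard American Soundex, written by hand.
// Not the encoder used by OPERATION_TYPE_SOUNDEX.
final class SoundexSketch {

    // Soundex digit for each letter A..Z; '0' marks letters that are not coded.
    private static final String CODES = "01230120022455012623010202";

    static String encode(String s) {
        StringBuilder out = new StringBuilder();
        char prev = '0';
        for (char ch : s.toUpperCase(Locale.ROOT).toCharArray()) {
            if (ch < 'A' || ch > 'Z') {
                continue; // ignore anything that is not a basic Latin letter
            }
            char code = CODES.charAt(ch - 'A');
            if (out.length() == 0) {
                out.append(ch); // the first letter is kept as-is
                prev = code;
                continue;
            }
            if (ch == 'H' || ch == 'W') {
                continue; // H and W do not separate letters with the same code
            }
            if (code != '0' && code != prev) {
                out.append(code);
            }
            prev = code; // vowels reset prev, so a repeated code is emitted again
            if (out.length() == 4) {
                break;
            }
        }
        if (out.length() == 0) {
            return ""; // no letters in the input
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }
}
```

Under these rules "Robert" and "Rupert" both encode to R163, so a phonetic comparison treats them as a match.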

OPERATION_TYPE_REFINED_SOUNDEX

public static final int OPERATION_TYPE_REFINED_SOUNDEX
See Also:
Constant Field Values
Constructor Detail

FuzzyMatchMeta

public FuzzyMatchMeta()
Method Detail

getValue

public String[] getValue()
Returns:
Returns the value.

setValue

public void setValue(String[] value)
Parameters:
value - The value to set.

allocate

public void allocate(int nrvalues)

clone

public Object clone()
Description copied from interface: StepMetaInterface
Make an exact copy of this step; make sure to explicitly copy Collections etc.

Specified by:
clone in interface StepMetaInterface
Overrides:
clone in class BaseStepMeta
Returns:
an exact copy of this step
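
The advice to copy collections explicitly matters because Object.clone() is shallow: array fields such as those behind getValue() and getValueName() would otherwise be shared between the original and the copy. The sketch below shows the general pattern on a stand-in class. StepSettingsSketch and its fields are hypothetical and do not reproduce this class's source.

```java
// Illustrative sketch only: a stand-in class, not FuzzyMatchMeta itself.
final class StepSettingsSketch implements Cloneable {
    String[] value = new String[0];
    String[] valueName = new String[0];

    @Override
    public StepSettingsSketch clone() {
        try {
            // super.clone() copies the array references, not the arrays.
            StepSettingsSketch copy = (StepSettingsSketch) super.clone();
            // Copy each array so later edits to the copy do not leak back.
            copy.value = value.clone();
            copy.valueName = valueName.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

Without the two array copies, changing an element through the clone would also change the original.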

getMainStreamField

public String getMainStreamField()
Returns:
Returns the mainstreamfield.

setMainStreamField

public void setMainStreamField(String mainstreamfield)
Parameters:
mainstreamfield - The mainstreamfield to set.

getLookupField

public String getLookupField()
Returns:
Returns the lookupfield.

setLookupField

public void setLookupField(String lookupfield)
Parameters:
lookupfield - The lookupfield to set.

getOutputMatchField

public String getOutputMatchField()
Returns:
Returns the outputmatchfield.

setOutputMatchField

public void setOutputMatchField(String outputmatchfield)
Parameters:
outputmatchfield - The outputmatchfield to set.

getOutputValueField

public String getOutputValueField()
Returns:
Returns the outputvaluefield.

setOutputValueField

public void setOutputValueField(String outputvaluefield)
Parameters:
outputvaluefield - The outputvaluefield to set.

isGetCloserValue

public boolean isGetCloserValue()
Returns:
Returns the closervalue.

getValueName

public String[] getValueName()
Returns:
Returns the valueName.

setValueName

public void setValueName(String[] valueName)
Parameters:
valueName - The valueName to set.

setGetCloserValue

public void setGetCloserValue(boolean closervalue)
Parameters:
closervalue - The closervalue to set.

isCaseSensitive

public boolean isCaseSensitive()
Returns:
Returns the caseSensitive.

setCaseSensitive

public void setCaseSensitive(boolean caseSensitive)
Parameters:
caseSensitive - The caseSensitive to set.

getMinimalValue

public String getMinimalValue()
Returns:
Returns the minimalValue.

setMinimalValue

public void setMinimalValue(String minimalValue)
Parameters:
minimalValue - The minimalValue to set.

getMaximalValue

public String getMaximalValue()
Returns:
Returns the maximalValue.

setMaximalValue

public void setMaximalValue(String maximalValue)
Parameters:
maximalValue - The maximalValue to set.

getSeparator

public String getSeparator()
Returns:
Returns the separator.

setSeparator

public void setSeparator(String separator)
Parameters:
separator - The separator to set.
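
The page does not say how minimalValue, maximalValue, getCloserValue and separator interact. One plausible reading, stated here as an assumption, is that candidates whose score lies outside the minimal and maximal bounds are discarded, and that the step then returns either the single closest candidate or all remaining candidates joined by the separator. The sketch below illustrates that reading for a distance-style score. MatchSelectionSketch and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: one possible interpretation of the settings,
// not the step's documented behaviour.
final class MatchSelectionSketch {

    // distances maps each lookup candidate to its distance from the main
    // stream value (smaller is closer). Returns null when nothing qualifies.
    static String select(Map<String, Integer> distances,
                         String minimalValue,
                         String maximalValue,
                         boolean closest,
                         String separator) {
        // The bounds are Strings in the API; here they are parsed as integers.
        int min = Integer.parseInt(minimalValue.trim());
        int max = Integer.parseInt(maximalValue.trim());

        List<String> kept = new ArrayList<>();
        String best = null;
        int bestDistance = Integer.MAX_VALUE;

        for (Map.Entry<String, Integer> e : distances.entrySet()) {
            int d = e.getValue();
            if (d < min || d > max) {
                continue; // outside the accepted range
            }
            kept.add(e.getKey());
            if (d < bestDistance) {
                bestDistance = d;
                best = e.getKey();
            }
        }

        if (closest) {
            return best;
        }
        return kept.isEmpty() ? null : String.join(separator, kept);
    }
}
```

With candidates Smith (1), Smyth (2) and Schmidt (5) and bounds "0" to "2", the closest-value mode yields Smith, while the other mode yields "Smith,Smyth" for a comma separator.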

loadXML

public void loadXML(Node stepnode,
                    List<DatabaseMeta> databases,
                    Map<String,Counter> counters)
             throws KettleXMLException
Description copied from interface: StepMetaInterface
Load the values for this step from an XML Node

Specified by:
loadXML in interface StepMetaInterface
Parameters:
stepnode - the Node to get the info from
databases - The available list of databases to reference to
counters - Counters to reference.
Throws:
KettleXMLException - When an unexpected XML error occurred. (malformed etc.)

getAlgorithmType

public int getAlgorithmType()

setAlgorithmType

public void setAlgorithmType(int algorithm)

getAlgorithmTypeDesc

public static String getAlgorithmTypeDesc(int i)

getAlgorithmTypeByDesc

public static int getAlgorithmTypeByDesc(String tt)
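
getAlgorithmTypeDesc and getAlgorithmTypeByDesc are undocumented here. Given the parallel algorithmCode and algorithmDesc arrays, they presumably translate between an OPERATION_TYPE_* index and its text form. The sketch below shows that kind of lookup. The array contents, the case-insensitive match and the fallback to index 0 are all assumptions, and AlgorithmLookupSketch is a hypothetical name.

```java
// Illustrative sketch only: the array contents and fallback rules are
// assumptions, not values taken from FuzzyMatchMeta.
final class AlgorithmLookupSketch {
    static final String[] CODES = {"levenshtein", "jaro", "soundex"};
    static final String[] DESCS = {"Levenshtein", "Jaro", "SoundEx"};

    // Maps a description (or, failing that, a code) to its index.
    static int typeByDesc(String desc) {
        if (desc == null) {
            return 0;
        }
        for (int i = 0; i < DESCS.length; i++) {
            if (DESCS[i].equalsIgnoreCase(desc)) {
                return i;
            }
        }
        for (int i = 0; i < CODES.length; i++) {
            if (CODES[i].equalsIgnoreCase(desc)) {
                return i;
            }
        }
        return 0; // assumed default when nothing matches
    }

    // Maps an index back to its description, defaulting to the first entry.
    static String typeDesc(int i) {
        return (i < 0 || i >= DESCS.length) ? DESCS[0] : DESCS[i];
    }
}
```

Keeping the two arrays index-aligned lets a single int serve as the stored algorithm type.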

setDefault

public void setDefault()
Description copied from interface: StepMetaInterface
Set default values

Specified by:
setDefault in interface StepMetaInterface

getFields

public void getFields(RowMetaInterface inputRowMeta,
                      String name,
                      RowMetaInterface[] info,
                      StepMeta nextStep,
                      VariableSpace space)
               throws KettleStepException
Description copied from interface: StepMetaInterface
Get the fields that are emitted by this step

Specified by:
getFields in interface StepMetaInterface
Overrides:
getFields in class BaseStepMeta
Parameters:
inputRowMeta - The fields that are entering the step. These are changed to reflect the output metadata.
name - The name of the step to be used as origin
info - The metadata of the input rows that enter the step through the specified channels, in the same order as in method getInfoSteps(). The step metadata can then choose whether to use it or ignore it. Note that for database lookups, the layout of the target database table is put in info[0].
nextStep - if non-null, the next step in the transformation: the step that is asking, and towards which the data is targeted.
space - TODO
Throws:
KettleStepException - when an error occurred searching for the fields.

getXML

public String getXML()
Description copied from class: BaseStepMeta
Produces the XML string that describes this step's information.

Specified by:
getXML in interface StepMetaInterface
Overrides:
getXML in class BaseStepMeta
Returns:
String containing the XML describing this step.

readRep

public void readRep(Repository rep,
                    ObjectId id_step,
                    List<DatabaseMeta> databases,
                    Map<String,Counter> counters)
             throws KettleException
Description copied from interface: StepMetaInterface
Read the step's information from a Kettle repository

Specified by:
readRep in interface StepMetaInterface
Parameters:
rep - The repository to read from
id_step - The step ID
databases - The databases to reference
counters - The counters to reference
Throws:
KettleException - When an unexpected error occurred (database, network, etc)

saveRep

public void saveRep(Repository rep,
                    ObjectId id_transformation,
                    ObjectId id_step)
             throws KettleException
Description copied from interface: StepMetaInterface
Save the step's data into a Kettle repository

Specified by:
saveRep in interface StepMetaInterface
Parameters:
rep - The Kettle repository to save to
id_transformation - The transformation ID
id_step - The step ID
Throws:
KettleException - When an unexpected error occurred (database, network, etc)

check

public void check(List<CheckResultInterface> remarks,
                  TransMeta transMeta,
                  StepMeta stepMeta,
                  RowMetaInterface prev,
                  String[] input,
                  String[] output,
                  RowMetaInterface info)
Description copied from interface: StepMetaInterface
Checks the settings of this step and puts the findings in a remarks List.

Specified by:
check in interface StepMetaInterface
Parameters:
remarks - The list to put the remarks in (see org.pentaho.di.core.CheckResult)
stepMeta - The stepMeta to help checking
prev - The fields coming from the previous step
input - The input step names
output - The output step names
info - The fields that are used as information by the step

searchInfoAndTargetSteps

public void searchInfoAndTargetSteps(List<StepMeta> steps)
Description copied from class: BaseStepMeta
Change step names into step objects to allow them to be name-changed etc.

Specified by:
searchInfoAndTargetSteps in interface StepMetaInterface
Overrides:
searchInfoAndTargetSteps in class BaseStepMeta
Parameters:
steps - the steps to reference

getStep

public StepInterface getStep(StepMeta stepMeta,
                             StepDataInterface stepDataInterface,
                             int cnr,
                             TransMeta transMeta,
                             Trans trans)
Description copied from interface: StepMetaInterface
Get the executing step, needed by Trans to launch a step.

Specified by:
getStep in interface StepMetaInterface
Parameters:
stepMeta - The step info
stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
cnr - The copy nr to get
transMeta - The transformation info
trans - The launching transformation

getStepData

public StepDataInterface getStepData()
Description copied from interface: StepMetaInterface
Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It basically contains the persisting data that needs to live on, even if a worker thread is terminated.

Specified by:
getStepData in interface StepMetaInterface
Returns:
The appropriate StepDataInterface class.

excludeFromRowLayoutVerification

public boolean excludeFromRowLayoutVerification()
Description copied from class: BaseStepMeta
This method is added to exclude certain steps from layout checking.

Specified by:
excludeFromRowLayoutVerification in interface StepMetaInterface
Overrides:
excludeFromRowLayoutVerification in class BaseStepMeta

supportsErrorHandling

public boolean supportsErrorHandling()
Specified by:
supportsErrorHandling in interface StepMetaInterface
Overrides:
supportsErrorHandling in class BaseStepMeta
Returns:
true if this step supports error "reporting" on rows: the ability to send rows to a certain target step.

getStepIOMeta

public StepIOMetaInterface getStepIOMeta()
Returns the Input/Output metadata for this step.

Specified by:
getStepIOMeta in interface StepMetaInterface
Overrides:
getStepIOMeta in class BaseStepMeta

resetStepIoMeta

public void resetStepIoMeta()
Description copied from interface: StepMetaInterface
For steps that handle dynamic input (info) or output (target) streams, it is useful to be able to force the re-creation of the StepIoMeta definition. By default this definition is cached.

Specified by:
resetStepIoMeta in interface StepMetaInterface
Overrides:
resetStepIoMeta in class BaseStepMeta