org.pentaho.di.trans.steps.mysqlbulkloader
Class MySQLBulkLoaderMeta

java.lang.Object
  extended by org.pentaho.di.trans.step.BaseStepMeta
      extended by org.pentaho.di.trans.steps.mysqlbulkloader.MySQLBulkLoaderMeta
All Implemented Interfaces:
Cloneable, ProvidesDatabaseConnectionInformation, StepAttributesInterface, StepMetaInterface

public class MySQLBulkLoaderMeta
extends BaseStepMeta
implements StepMetaInterface, ProvidesDatabaseConnectionInformation

To make streaming bulk loading possible for MySQL, the step carries out the following at runtime:

- Create a unique FIFO file (using mkfifo; Linux only)
- Create a target table using standard Kettle SQL generation
- Execute the LOAD DATA SQL command in a separate SQL thread in the background to bulk load the data
- Write to the FIFO file
- At the end, close the output stream to the FIFO file
- At the end, remove the FIFO file

Created on 24-oct-2007
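
The runtime steps above revolve around a single LOAD DATA statement that reads from the FIFO file. The following is a minimal sketch of how such a statement could be assembled from settings like the ones this class exposes (fifoFileName, tableName, delimiter, enclosure, escapeChar, localFile, replacingData, ignoringErrors). The class LoadDataSketch and its methods are hypothetical and are not part of Kettle. The mapping of replacingData to REPLACE and of ignoringErrors to IGNORE is an assumption, and the encoding and schema name are left out for brevity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Kettle: assembles a MySQL LOAD DATA statement.
final class LoadDataSketch {

    /** Quotes a value as a MySQL string literal, escaping backslashes and single quotes. */
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    /**
     * Builds the statement text. The table and column names are inserted as-is,
     * so this sketch assumes they are trusted identifiers.
     */
    static String buildLoadData(String fifoFileName, String tableName, List<String> columns,
                                String delimiter, String enclosure, String escapeChar,
                                boolean localFile, boolean replacingData, boolean ignoringErrors) {
        StringBuilder sql = new StringBuilder("LOAD DATA ");
        if (localFile) {
            sql.append("LOCAL ");
        }
        sql.append("INFILE ").append(quote(fifoFileName)).append(' ');
        // Assumption: REPLACE takes precedence when both flags are set.
        if (replacingData) {
            sql.append("REPLACE ");
        } else if (ignoringErrors) {
            sql.append("IGNORE ");
        }
        sql.append("INTO TABLE ").append(tableName).append(' ');
        sql.append("FIELDS TERMINATED BY ").append(quote(delimiter));
        sql.append(" OPTIONALLY ENCLOSED BY ").append(quote(enclosure));
        sql.append(" ESCAPED BY ").append(quote(escapeChar)).append(' ');
        sql.append('(').append(String.join(", ", columns)).append(')');
        return sql.toString();
    }

    public static void main(String[] args) {
        // Example values only; the path and table name are made up.
        System.out.println(buildLoadData("/tmp/fifo", "sales", Arrays.asList("id", "name"),
                "\t", "\"", "\\", true, true, false));
    }
}
```

In the workflow described above, a statement like this would run on a background thread while the main thread writes rows to the FIFO file. Closing the output stream signals end of input to the server.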

Author:
Matt Casters

Field Summary
static int FIELD_FORMAT_TYPE_DATE
           
static int FIELD_FORMAT_TYPE_NUMBER
           
static int FIELD_FORMAT_TYPE_OK
           
static int FIELD_FORMAT_TYPE_STRING_ESCAPE
           
static int FIELD_FORMAT_TYPE_TIMESTAMP
           
 
Fields inherited from class org.pentaho.di.trans.step.BaseStepMeta
loggingObject, STEP_ATTRIBUTES_FILE
 
Constructor Summary
MySQLBulkLoaderMeta()
           
 
Method Summary
 void allocate(int nrvalues)
           
 void analyseImpact(List<DatabaseImpact> impact, TransMeta transMeta, StepMeta stepMeta, RowMetaInterface prev, String[] input, String[] output, RowMetaInterface info)
          Each step must be able to report on the impact it has on a database, table field, etc.
 void check(List<CheckResultInterface> remarks, TransMeta transMeta, StepMeta stepMeta, RowMetaInterface prev, String[] input, String[] output, RowMetaInterface info)
          Checks the settings of this step and puts the findings in a remarks List.
 Object clone()
          Make an exact copy of this step; make sure to explicitly copy Collections, etc.
 String getBulkSize()
           
 DatabaseMeta getDatabaseMeta()
          Returns the database meta for this step
 String getDelimiter()
           
 String getEnclosure()
           
 String getEncoding()
           
 String getEscapeChar()
           
 int[] getFieldFormatType()
           
static int getFieldFormatType(String codeOrDescription)
           
static String getFieldFormatTypeCode(int type)
           
static String[] getFieldFormatTypeCodes()
           
static String getFieldFormatTypeDescription(int type)
           
static String[] getFieldFormatTypeDescriptions()
           
 void getFields(RowMetaInterface rowMeta, String origin, RowMetaInterface[] info, StepMeta nextStep, VariableSpace space)
          Gets the fields.
 String[] getFieldStream()
           
 String[] getFieldTable()
           
 String getFifoFileName()
           
 String getMissingDatabaseConnectionInformationMessage()
          Provides a way for this object to return a custom message when database connection information is incomplete or missing.
 RowMetaInterface getRequiredFields(VariableSpace space)
          The natural way of data flow in a transformation is source-to-target.
 String getSchemaName()
          Returns the schema name for this step.
 SQLStatement getSQLStatements(TransMeta transMeta, StepMeta stepMeta, RowMetaInterface prev)
          Standard method to return one or more SQLStatement objects that the step needs in order to work correctly.
 StepInterface getStep(StepMeta stepMeta, StepDataInterface stepDataInterface, int cnr, TransMeta transMeta, Trans trans)
          Get the executing step, needed by Trans to launch a step.
 StepDataInterface getStepData()
          Get a new instance of the appropriate data class.
 String getTableName()
          Returns the table name for this step
 DatabaseMeta[] getUsedDatabaseConnections()
          This method returns all the database connections that are used by the step.
 String getXML()
          Produces the XML string that describes this step's information.
 boolean isIgnoringErrors()
           
 boolean isLocalFile()
           
 boolean isReplacingData()
           
 void loadXML(Node stepnode, List<DatabaseMeta> databases, Map<String,Counter> counters)
          Load the values for this step from an XML Node
 void readRep(Repository rep, ObjectId id_step, List<DatabaseMeta> databases, Map<String,Counter> counters)
          Read the step's information from a Kettle repository
 void saveRep(Repository rep, ObjectId id_transformation, ObjectId id_step)
          Save the step's data into a Kettle repository
 void setBulkSize(String bulkSize)
           
 void setDatabaseMeta(DatabaseMeta database)
           
 void setDefault()
          Set default values
 void setDelimiter(String delimiter)
           
 void setEnclosure(String enclosure)
           
 void setEncoding(String encoding)
           
 void setEscapeChar(String escapeChar)
           
 void setFieldFormatType(int[] fieldFormatType)
           
 void setFieldStream(String[] fieldStream)
           
 void setFieldTable(String[] fieldTable)
           
 void setFifoFileName(String fifoFileName)
           
 void setIgnoringErrors(boolean ignoringErrors)
           
 void setLocalFile(boolean localFile)
           
 void setReplacingData(boolean replacingData)
           
 void setSchemaName(String schemaName)
           
 void setTableName(String tableName)
           
 
Methods inherited from class org.pentaho.di.trans.step.BaseStepMeta
cancelQueries, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, findAttribute, findParent, getDescription, getDialogClassName, getLog, getLogChannelId, getName, getObjectCopy, getObjectId, getObjectRevision, getObjectType, getOptionalStreams, getParent, getParentStepMeta, getRepCode, getRepositoryDirectory, getRequiredFields, getResourceDependencies, getStepInjectionMetadataEntries, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getTooltip, getUsedArguments, getUsedLibraries, getXmlCode, handleStreamSelection, hasChanged, hasRepositoryReferences, isBasic, isDebug, isDetailed, isRowLevel, logBasic, logBasic, logDebug, logDebug, logDetailed, logDetailed, logError, logError, logError, logMinimal, logMinimal, logRowlevel, logRowlevel, lookupRepositoryReferences, resetStepIoMeta, searchInfoAndTargetSteps, setChanged, setChanged, setParentStepMeta, supportsErrorHandling
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.pentaho.di.trans.step.StepMetaInterface
cancelQueries, excludeFromCopyDistributeVerification, excludeFromRowLayoutVerification, exportResources, getDialogClassName, getOptionalStreams, getParentStepMeta, getResourceDependencies, getStepIOMeta, getStepMetaInjectionInterface, getSupportedTransformationTypes, getTableFields, getUsedArguments, getUsedLibraries, handleStreamSelection, hasRepositoryReferences, lookupRepositoryReferences, resetStepIoMeta, searchInfoAndTargetSteps, setParentStepMeta, supportsErrorHandling
 

Field Detail

FIELD_FORMAT_TYPE_OK

public static final int FIELD_FORMAT_TYPE_OK
See Also:
Constant Field Values

FIELD_FORMAT_TYPE_DATE

public static final int FIELD_FORMAT_TYPE_DATE
See Also:
Constant Field Values

FIELD_FORMAT_TYPE_TIMESTAMP

public static final int FIELD_FORMAT_TYPE_TIMESTAMP
See Also:
Constant Field Values

FIELD_FORMAT_TYPE_NUMBER

public static final int FIELD_FORMAT_TYPE_NUMBER
See Also:
Constant Field Values

FIELD_FORMAT_TYPE_STRING_ESCAPE

public static final int FIELD_FORMAT_TYPE_STRING_ESCAPE
See Also:
Constant Field Values
Constructor Detail

MySQLBulkLoaderMeta

public MySQLBulkLoaderMeta()
Method Detail

getDatabaseMeta

public DatabaseMeta getDatabaseMeta()
Description copied from interface: ProvidesDatabaseConnectionInformation
Returns the database meta for this step

Specified by:
getDatabaseMeta in interface ProvidesDatabaseConnectionInformation
Returns:
Returns the database.

setDatabaseMeta

public void setDatabaseMeta(DatabaseMeta database)
Parameters:
database - The database to set.

getTableName

public String getTableName()
Description copied from interface: ProvidesDatabaseConnectionInformation
Returns the table name for this step

Specified by:
getTableName in interface ProvidesDatabaseConnectionInformation
Returns:
Returns the tableName.

setTableName

public void setTableName(String tableName)
Parameters:
tableName - The tableName to set.

getFieldTable

public String[] getFieldTable()
Returns:
Returns the fieldTable.

setFieldTable

public void setFieldTable(String[] fieldTable)
Parameters:
fieldTable - The fieldTable to set.

getFieldStream

public String[] getFieldStream()
Returns:
Returns the fieldStream.

setFieldStream

public void setFieldStream(String[] fieldStream)
Parameters:
fieldStream - The fieldStream to set.

loadXML

public void loadXML(Node stepnode,
                    List<DatabaseMeta> databases,
                    Map<String,Counter> counters)
             throws KettleXMLException
Description copied from interface: StepMetaInterface
Load the values for this step from an XML Node

Specified by:
loadXML in interface StepMetaInterface
Parameters:
stepnode - the Node to get the info from
databases - The list of available databases to reference
counters - Counters to reference.
Throws:
KettleXMLException - When an unexpected XML error occurred (malformed XML, etc.)

allocate

public void allocate(int nrvalues)

clone

public Object clone()
Description copied from interface: StepMetaInterface
Make an exact copy of this step; make sure to explicitly copy Collections, etc.

Specified by:
clone in interface StepMetaInterface
Overrides:
clone in class BaseStepMeta
Returns:
an exact copy of this step
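
The advice to explicitly copy collections matters here because the step keeps its field mappings in arrays (see getFieldTable, getFieldStream, getFieldFormatType). The sketch below shows the general pattern on a hypothetical stand-in class, BulkLoaderSettingsSketch. It is not the actual Kettle implementation, and the behavior given to allocate here (sizing the three arrays) is an assumption.

```java
// Hypothetical stand-in for a step meta class; not the Kettle source.
final class BulkLoaderSettingsSketch implements Cloneable {
    private String[] fieldTable = new String[0];
    private String[] fieldStream = new String[0];
    private int[] fieldFormatType = new int[0];

    /** Sizes the parallel arrays for the given number of field mappings. */
    void allocate(int nrvalues) {
        fieldTable = new String[nrvalues];
        fieldStream = new String[nrvalues];
        fieldFormatType = new int[nrvalues];
    }

    String[] getFieldTable() { return fieldTable; }
    String[] getFieldStream() { return fieldStream; }
    int[] getFieldFormatType() { return fieldFormatType; }

    @Override
    public BulkLoaderSettingsSketch clone() {
        try {
            // super.clone() copies only the array references, so both objects
            // would share the same arrays without the explicit copies below.
            BulkLoaderSettingsSketch copy = (BulkLoaderSettingsSketch) super.clone();
            copy.fieldTable = fieldTable.clone();
            copy.fieldStream = fieldStream.clone();
            copy.fieldFormatType = fieldFormatType.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

With the explicit array copies, changing a field mapping on the clone leaves the original untouched.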

setDefault

public void setDefault()
Description copied from interface: StepMetaInterface
Set default values

Specified by:
setDefault in interface StepMetaInterface

getXML

public String getXML()
Description copied from class: BaseStepMeta
Produces the XML string that describes this step's information.

Specified by:
getXML in interface StepMetaInterface
Overrides:
getXML in class BaseStepMeta
Returns:
String containing the XML describing this step.

readRep

public void readRep(Repository rep,
                    ObjectId id_step,
                    List<DatabaseMeta> databases,
                    Map<String,Counter> counters)
             throws KettleException
Description copied from interface: StepMetaInterface
Read the step's information from a Kettle repository

Specified by:
readRep in interface StepMetaInterface
Parameters:
rep - The repository to read from
id_step - The step ID
databases - The databases to reference
counters - The counters to reference
Throws:
KettleException - When an unexpected error occurred (database, network, etc)

saveRep

public void saveRep(Repository rep,
                    ObjectId id_transformation,
                    ObjectId id_step)
             throws KettleException
Description copied from interface: StepMetaInterface
Save the step's data into a Kettle repository

Specified by:
saveRep in interface StepMetaInterface
Parameters:
rep - The Kettle repository to save to
id_transformation - The transformation ID
id_step - The step ID
Throws:
KettleException - When an unexpected error occurred (database, network, etc)

getFields

public void getFields(RowMetaInterface rowMeta,
                      String origin,
                      RowMetaInterface[] info,
                      StepMeta nextStep,
                      VariableSpace space)
               throws KettleStepException
Description copied from class: BaseStepMeta
Gets the fields.

Specified by:
getFields in interface StepMetaInterface
Overrides:
getFields in class BaseStepMeta
Parameters:
rowMeta - the input row meta
origin - the name
info - the info
nextStep - the next step
space - the space
Throws:
KettleStepException - the kettle step exception

check

public void check(List<CheckResultInterface> remarks,
                  TransMeta transMeta,
                  StepMeta stepMeta,
                  RowMetaInterface prev,
                  String[] input,
                  String[] output,
                  RowMetaInterface info)
Description copied from interface: StepMetaInterface
Checks the settings of this step and puts the findings in a remarks List.

Specified by:
check in interface StepMetaInterface
Parameters:
remarks - The list to put the remarks in (see org.pentaho.di.core.CheckResult)
stepMeta - The stepMeta to help checking
prev - The fields coming from the previous step
input - The input step names
output - The output step names
info - The fields that are used as information by the step

getSQLStatements

public SQLStatement getSQLStatements(TransMeta transMeta,
                                     StepMeta stepMeta,
                                     RowMetaInterface prev)
                              throws KettleStepException
Description copied from class: BaseStepMeta
Standard method to return one or more SQLStatement objects that the step needs in order to work correctly. This can mean "create table", "create index" statements but also "alter table ... add/drop/modify" statements.

Specified by:
getSQLStatements in interface StepMetaInterface
Overrides:
getSQLStatements in class BaseStepMeta
Parameters:
transMeta - TransInfo object containing the complete transformation
stepMeta - StepMeta object containing the complete step
prev - Row containing meta-data for the input fields (no data)
Returns:
The SQL statements for this step, or null if an error occurred. If nothing has to be done, SQLStatement.getSQL() returns null.
Throws:
KettleStepException

analyseImpact

public void analyseImpact(List<DatabaseImpact> impact,
                          TransMeta transMeta,
                          StepMeta stepMeta,
                          RowMetaInterface prev,
                          String[] input,
                          String[] output,
                          RowMetaInterface info)
                   throws KettleStepException
Description copied from class: BaseStepMeta
Each step must be able to report on the impact it has on a database, table field, etc.

Specified by:
analyseImpact in interface StepMetaInterface
Overrides:
analyseImpact in class BaseStepMeta
Parameters:
impact - The list of impacts (see org.pentaho.di.transMeta.DatabaseImpact)
transMeta - The transformation information
stepMeta - The step information
prev - The fields entering this step
input - The previous step names
output - The output step names
info - The fields used as information by this step
Throws:
KettleStepException

getStep

public StepInterface getStep(StepMeta stepMeta,
                             StepDataInterface stepDataInterface,
                             int cnr,
                             TransMeta transMeta,
                             Trans trans)
Description copied from interface: StepMetaInterface
Get the executing step, needed by Trans to launch a step.

Specified by:
getStep in interface StepMetaInterface
Parameters:
stepMeta - The step info
stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
cnr - The copy nr to get
transMeta - The transformation info
trans - The launching transformation

getStepData

public StepDataInterface getStepData()
Description copied from interface: StepMetaInterface
Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It basically contains the persisting data that needs to live on, even if a worker thread is terminated.

Specified by:
getStepData in interface StepMetaInterface
Returns:
The appropriate StepDataInterface class.

getUsedDatabaseConnections

public DatabaseMeta[] getUsedDatabaseConnections()
Description copied from class: BaseStepMeta
This method returns all the database connections that are used by the step.

Specified by:
getUsedDatabaseConnections in interface StepMetaInterface
Overrides:
getUsedDatabaseConnections in class BaseStepMeta
Returns:
an array of database connections meta-data. Return an empty array if no connections are used.

getRequiredFields

public RowMetaInterface getRequiredFields(VariableSpace space)
                                   throws KettleException
Description copied from class: BaseStepMeta
The natural way of data flow in a transformation is source-to-target. However, this makes mapping to target tables difficult to do. To help out here, we supply information to the transformation meta-data model about which fields are required for a step. This allows us to automate certain tasks like the mapping to pre-defined tables. The Table Output step in this case will output the fields in the target table using this method. This default implementation returns an empty row meaning that no fields are required for this step to operate.

Specified by:
getRequiredFields in interface StepMetaInterface
Overrides:
getRequiredFields in class BaseStepMeta
Parameters:
space - the variable space to use to do variable substitution.
Returns:
the required fields for this step's metadata.
Throws:
KettleException - in case the required fields can't be determined

getSchemaName

public String getSchemaName()
Description copied from interface: ProvidesDatabaseConnectionInformation
Returns the schema name for this step.

Specified by:
getSchemaName in interface ProvidesDatabaseConnectionInformation
Returns:
the schemaName

setSchemaName

public void setSchemaName(String schemaName)
Parameters:
schemaName - the schemaName to set

getEncoding

public String getEncoding()

setEncoding

public void setEncoding(String encoding)

getDelimiter

public String getDelimiter()

setDelimiter

public void setDelimiter(String delimiter)

getEnclosure

public String getEnclosure()

setEnclosure

public void setEnclosure(String enclosure)

getFifoFileName

public String getFifoFileName()
Returns:
the fifoFileName

setFifoFileName

public void setFifoFileName(String fifoFileName)
Parameters:
fifoFileName - the fifoFileName to set

isReplacingData

public boolean isReplacingData()
Returns:
the replacingData

setReplacingData

public void setReplacingData(boolean replacingData)
Parameters:
replacingData - the replacingData to set

getFieldFormatType

public int[] getFieldFormatType()

setFieldFormatType

public void setFieldFormatType(int[] fieldFormatType)

getFieldFormatTypeCodes

public static String[] getFieldFormatTypeCodes()

getFieldFormatTypeDescriptions

public static String[] getFieldFormatTypeDescriptions()

getFieldFormatTypeCode

public static String getFieldFormatTypeCode(int type)

getFieldFormatTypeDescription

public static String getFieldFormatTypeDescription(int type)

getFieldFormatType

public static int getFieldFormatType(String codeOrDescription)
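
The static methods above translate between the integer FIELD_FORMAT_TYPE_* constants and their textual codes or descriptions. The sketch below illustrates one way such a lookup could work. Everything in it is an assumption made for illustration: the numeric values (taken from the order of the constants in Field Detail), the code and description strings, and the fallback to FIELD_FORMAT_TYPE_OK for unknown input. The real values are listed under Constant Field Values.

```java
// Hypothetical lookup tables; the values and strings are assumptions, not the Kettle source.
final class FieldFormatTypeSketch {
    static final int FIELD_FORMAT_TYPE_OK = 0;
    static final int FIELD_FORMAT_TYPE_DATE = 1;
    static final int FIELD_FORMAT_TYPE_TIMESTAMP = 2;
    static final int FIELD_FORMAT_TYPE_NUMBER = 3;
    static final int FIELD_FORMAT_TYPE_STRING_ESCAPE = 4;

    // Index in each array corresponds to the constant value above.
    private static final String[] CODES = {"OK", "DATE", "TIMESTAMP", "NUMBER", "STRING_ESC"};
    private static final String[] DESCRIPTIONS = {
        "Don't change formatting", "Format as date", "Format as timestamp",
        "Format as number", "Escape enclosure characters"
    };

    static String[] getFieldFormatTypeCodes() { return CODES.clone(); }

    static String[] getFieldFormatTypeDescriptions() { return DESCRIPTIONS.clone(); }

    /** Returns the code for a type, falling back to the first code for out-of-range values. */
    static String getFieldFormatTypeCode(int type) {
        return (type >= 0 && type < CODES.length) ? CODES[type] : CODES[0];
    }

    /** Returns the description for a type, with the same fallback as above. */
    static String getFieldFormatTypeDescription(int type) {
        return (type >= 0 && type < DESCRIPTIONS.length) ? DESCRIPTIONS[type] : DESCRIPTIONS[0];
    }

    /** Accepts either a code or a description, ignoring case; unknown input maps to OK. */
    static int getFieldFormatType(String codeOrDescription) {
        for (int i = 0; i < CODES.length; i++) {
            if (CODES[i].equalsIgnoreCase(codeOrDescription)
                    || DESCRIPTIONS[i].equalsIgnoreCase(codeOrDescription)) {
                return i;
            }
        }
        return FIELD_FORMAT_TYPE_OK;
    }
}
```

Accepting both the code and the description in one method lets the same lookup serve the persisted XML or repository form and the text shown in a dialog.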

getEscapeChar

public String getEscapeChar()
Returns:
the escapeChar

setEscapeChar

public void setEscapeChar(String escapeChar)
Parameters:
escapeChar - the escapeChar to set

isIgnoringErrors

public boolean isIgnoringErrors()
Returns:
the ignoringErrors

setIgnoringErrors

public void setIgnoringErrors(boolean ignoringErrors)
Parameters:
ignoringErrors - the ignoringErrors to set

getBulkSize

public String getBulkSize()
Returns:
the bulkSize

setBulkSize

public void setBulkSize(String bulkSize)
Parameters:
bulkSize - the bulkSize to set

isLocalFile

public boolean isLocalFile()
Returns:
the localFile

setLocalFile

public void setLocalFile(boolean localFile)
Parameters:
localFile - the localFile to set

getMissingDatabaseConnectionInformationMessage

public String getMissingDatabaseConnectionInformationMessage()
Description copied from interface: ProvidesDatabaseConnectionInformation
Provides a way for this object to return a custom message when database connection information is incomplete or missing. If this returns null a default message will be displayed for missing information.

Specified by:
getMissingDatabaseConnectionInformationMessage in interface ProvidesDatabaseConnectionInformation
Returns:
A friendly message that describes that database connection information is missing and, potentially, why.
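
The null contract described above can be sketched from the caller's side as follows. The interface ConnectionInfoProvider, the method resolveMessage, and the default text are all hypothetical names used for illustration and do not come from Kettle.

```java
// Hypothetical caller-side sketch of the "null means default message" contract.
final class MissingInfoMessageSketch {

    /** Stand-in for the relevant part of ProvidesDatabaseConnectionInformation. */
    interface ConnectionInfoProvider {
        String getMissingDatabaseConnectionInformationMessage();
    }

    // Placeholder wording; the real default message is not given in this document.
    static final String DEFAULT_MESSAGE = "Database connection information is missing.";

    /** Uses the provider's custom message when present, otherwise the default. */
    static String resolveMessage(ConnectionInfoProvider provider) {
        String custom = provider.getMissingDatabaseConnectionInformationMessage();
        return custom != null ? custom : DEFAULT_MESSAGE;
    }
}
```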