Class MySQLBulkLoaderMeta

  • All Implemented Interfaces:
    Cloneable, ProvidesDatabaseConnectionInformation, StepAttributesInterface, StepMetaInterface

    public class MySQLBulkLoaderMeta
    extends BaseStepMeta
    implements StepMetaInterface, ProvidesDatabaseConnectionInformation
    To make streaming loading possible for MySQL, the step carries out the following actions at runtime:

    - Create a unique FIFO file (using mkfifo; Linux only)
    - Create the target table using standard Kettle SQL generation
    - Execute the LOAD DATA SQL command in a separate SQL thread in the background to bulk load the data
    - Write the rows to the FIFO file
    - At the end, close the output stream to the FIFO file
    - At the end, remove the FIFO file
    Created on 24-oct-2007
    Author:
    Matt Casters
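
    The LOAD DATA command mentioned in the class description can be sketched as plain string assembly. This is a minimal illustration, not the step's actual implementation: the class LoadDataSqlSketch, its method names, and the way the replacingData, ignoringErrors, and localFile settings map onto the REPLACE, IGNORE, and LOCAL keywords are assumptions. Only the clause order follows the MySQL LOAD DATA syntax.

```java
// Hypothetical helper, not part of the Kettle API.
final class LoadDataSqlSketch {

    /** Wraps a value in single quotes, escaping backslashes and single quotes for MySQL. */
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    /** Assembles a LOAD DATA statement that reads from the FIFO file (assumed mapping of settings). */
    static String build(String fifoFileName, String tableName, String[] columns,
                        String delimiter, String enclosure, String escapeChar,
                        String encoding, boolean localFile,
                        boolean replacingData, boolean ignoringErrors) {
        StringBuilder sql = new StringBuilder("LOAD DATA ");
        if (localFile) {
            sql.append("LOCAL ");
        }
        sql.append("INFILE ").append(quote(fifoFileName)).append(' ');
        // MySQL accepts either REPLACE or IGNORE here, never both.
        if (replacingData) {
            sql.append("REPLACE ");
        } else if (ignoringErrors) {
            sql.append("IGNORE ");
        }
        sql.append("INTO TABLE ").append(tableName).append(' ');
        if (encoding != null && !encoding.isEmpty()) {
            sql.append("CHARACTER SET ").append(encoding).append(' ');
        }
        sql.append("FIELDS TERMINATED BY ").append(quote(delimiter));
        sql.append(" ENCLOSED BY ").append(quote(enclosure));
        sql.append(" ESCAPED BY ").append(quote(escapeChar));
        sql.append(" (").append(String.join(", ", columns)).append(')');
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("/tmp/fifo", "sales", new String[] {"id", "name"},
                ";", "\"", "\\", "utf8", true, false, true));
    }
}
```

    In this sketch the table and column names are appended unquoted, so they are assumed to be trusted identifiers.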
    • Field Detail

      • FIELD_FORMAT_TYPE_TIMESTAMP

        public static final int FIELD_FORMAT_TYPE_TIMESTAMP
        See Also:
        Constant Field Values
      • FIELD_FORMAT_TYPE_NUMBER

        public static final int FIELD_FORMAT_TYPE_NUMBER
        See Also:
        Constant Field Values
      • FIELD_FORMAT_TYPE_STRING_ESCAPE

        public static final int FIELD_FORMAT_TYPE_STRING_ESCAPE
        See Also:
        Constant Field Values
    • Constructor Detail

      • MySQLBulkLoaderMeta

        public MySQLBulkLoaderMeta()
    • Method Detail

      • setDatabaseMeta

        public void setDatabaseMeta​(org.pentaho.di.core.database.DatabaseMeta database)
        Parameters:
        database - The database to set.
      • setTableName

        public void setTableName​(String tableName)
        Parameters:
        tableName - The tableName to set.
      • getFieldTable

        public String[] getFieldTable()
        Returns:
        Returns the fieldTable.
      • setFieldTable

        public void setFieldTable​(String[] fieldTable)
        Parameters:
        fieldTable - The fieldTable to set.
      • getFieldStream

        public String[] getFieldStream()
        Returns:
        Returns the fieldStream.
      • setFieldStream

        public void setFieldStream​(String[] fieldStream)
        Parameters:
        fieldStream - The fieldStream to set.
      • loadXML

        public void loadXML​(Node stepnode,
                            List<org.pentaho.di.core.database.DatabaseMeta> databases,
                            org.pentaho.metastore.api.IMetaStore metaStore)
                     throws org.pentaho.di.core.exception.KettleXMLException
        Description copied from interface: StepMetaInterface
        Load the values for this step from an XML Node
        Specified by:
        loadXML in interface StepMetaInterface
        Overrides:
        loadXML in class BaseStepMeta
        Parameters:
        stepnode - the Node to get the info from
        databases - The list of available databases to reference
        metaStore - the metastore to optionally load external reference metadata from
        Throws:
        org.pentaho.di.core.exception.KettleXMLException - When an unexpected XML error occurred (malformed XML, etc.)
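
        Reading step settings from an XML Node can be illustrated with the standard DOM API alone. The sketch below is an assumption-laden example, not the code behind loadXML: the class StepXmlSketch, its methods, and the tag names "table" and "delimiter" are hypothetical, and an IllegalArgumentException stands in for KettleXMLException.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the Kettle API.
final class StepXmlSketch {

    /** Parses an XML string and returns its root element; malformed input is reported as an unchecked exception. */
    static Node parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Unable to parse step XML", e);
        }
    }

    /** Returns the text of the first direct child element named tag, or null if there is none. */
    static String tagValue(Node parent, String tag) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE && tag.equals(child.getNodeName())) {
                return child.getTextContent();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The tag names below are illustrative only.
        Node step = parse("<step><table>sales</table><delimiter>;</delimiter></step>");
        System.out.println(tagValue(step, "table"));
        System.out.println(tagValue(step, "delimiter"));
    }
}
```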
      • allocate

        public void allocate​(int nrvalues)
      • getXML

        public String getXML()
        Description copied from class: BaseStepMeta
        Produces the XML string that describes this step's information.
        Specified by:
        getXML in interface StepMetaInterface
        Overrides:
        getXML in class BaseStepMeta
        Returns:
        String containing the XML describing this step.
      • readRep

        public void readRep​(Repository rep,
                            org.pentaho.metastore.api.IMetaStore metaStore,
                            org.pentaho.di.repository.ObjectId id_step,
                            List<org.pentaho.di.core.database.DatabaseMeta> databases)
                     throws org.pentaho.di.core.exception.KettleException
        Description copied from interface: StepMetaInterface
        Read the step's information from a Kettle repository
        Specified by:
        readRep in interface StepMetaInterface
        Overrides:
        readRep in class BaseStepMeta
        Parameters:
        rep - The repository to read from
        metaStore - The MetaStore to read external information from
        id_step - The step ID
        databases - The databases to reference
        Throws:
        org.pentaho.di.core.exception.KettleException - When an unexpected error occurred (database, network, etc)
      • saveRep

        public void saveRep​(Repository rep,
                            org.pentaho.metastore.api.IMetaStore metaStore,
                            org.pentaho.di.repository.ObjectId id_transformation,
                            org.pentaho.di.repository.ObjectId id_step)
                     throws org.pentaho.di.core.exception.KettleException
        Description copied from interface: StepMetaInterface
        Save the step's data into a Kettle repository
        Specified by:
        saveRep in interface StepMetaInterface
        Overrides:
        saveRep in class BaseStepMeta
        Parameters:
        rep - The Kettle repository to save to
        metaStore - the metaStore to optionally write to
        id_transformation - The transformation ID
        id_step - The step ID
        Throws:
        org.pentaho.di.core.exception.KettleException - When an unexpected error occurred (database, network, etc)
      • getFields

        public void getFields​(org.pentaho.di.core.row.RowMetaInterface rowMeta,
                              String origin,
                              org.pentaho.di.core.row.RowMetaInterface[] info,
                              StepMeta nextStep,
                              org.pentaho.di.core.variables.VariableSpace space,
                              Repository repository,
                              org.pentaho.metastore.api.IMetaStore metaStore)
                       throws org.pentaho.di.core.exception.KettleStepException
        Description copied from class: BaseStepMeta
        Gets the fields.
        Specified by:
        getFields in interface StepMetaInterface
        Overrides:
        getFields in class BaseStepMeta
        Parameters:
        rowMeta - the input row meta that is modified in this method to reflect the output row metadata of the step
        origin - Name of the step to use as input for the origin field in the values
        info - Fields used as extra lookup information
        nextStep - the next step that is targeted
        space - the variable space to use to replace variables
        repository - the repository to use to load Kettle metadata objects impacting the output fields
        metaStore - the MetaStore to use to load additional external data or metadata impacting the output fields
        Throws:
        org.pentaho.di.core.exception.KettleStepException - the kettle step exception
      • check

        public void check​(List<org.pentaho.di.core.CheckResultInterface> remarks,
                          TransMeta transMeta,
                          StepMeta stepMeta,
                          org.pentaho.di.core.row.RowMetaInterface prev,
                          String[] input,
                          String[] output,
                          org.pentaho.di.core.row.RowMetaInterface info,
                          org.pentaho.di.core.variables.VariableSpace space,
                          Repository repository,
                          org.pentaho.metastore.api.IMetaStore metaStore)
        Description copied from interface: StepMetaInterface
        Checks the settings of this step and puts the findings in a remarks List.
        Specified by:
        check in interface StepMetaInterface
        Overrides:
        check in class BaseStepMeta
        Parameters:
        remarks - The list to put the remarks in (see org.pentaho.di.core.CheckResult)
        stepMeta - The stepMeta to help checking
        prev - The fields coming from the previous step
        input - The input step names
        output - The output step names
        info - The fields that are used as information by the step
        space - the variable space to resolve variable expressions with
        repository - the repository to use to load Kettle metadata objects impacting the output fields
        metaStore - the MetaStore to use to load additional external data or metadata impacting the output fields
      • getSQLStatements

        public org.pentaho.di.core.SQLStatement getSQLStatements​(TransMeta transMeta,
                                                                 StepMeta stepMeta,
                                                                 org.pentaho.di.core.row.RowMetaInterface prev,
                                                                 Repository repository,
                                                                 org.pentaho.metastore.api.IMetaStore metaStore)
                                                          throws org.pentaho.di.core.exception.KettleStepException
        Description copied from class: BaseStepMeta
        Standard method to return an SQLStatement object with SQL statements that the step needs in order to work correctly. This can mean "create table", "create index" statements but also "alter table ... add/drop/modify" statements.
        Specified by:
        getSQLStatements in interface StepMetaInterface
        Overrides:
        getSQLStatements in class BaseStepMeta
        Parameters:
        transMeta - TransMeta object containing the complete transformation
        stepMeta - StepMeta object containing the complete step
        prev - Row containing meta-data for the input fields (no data)
        repository - the repository to use to load Kettle metadata objects impacting the output fields
        metaStore - the MetaStore to use to load additional external data or metadata impacting the output fields
        Returns:
        The SQL statements for this step. If nothing has to be done, SQLStatement.getSQL() returns null (see SQLStatement).
        Throws:
        org.pentaho.di.core.exception.KettleStepException
      • analyseImpact

        public void analyseImpact​(List<DatabaseImpact> impact,
                                  TransMeta transMeta,
                                  StepMeta stepMeta,
                                  org.pentaho.di.core.row.RowMetaInterface prev,
                                  String[] input,
                                  String[] output,
                                  org.pentaho.di.core.row.RowMetaInterface info,
                                  Repository repository,
                                  org.pentaho.metastore.api.IMetaStore metaStore)
                           throws org.pentaho.di.core.exception.KettleStepException
        Description copied from class: BaseStepMeta
        Each step must be able to report on the impact it has on a database, table field, etc.
        Specified by:
        analyseImpact in interface StepMetaInterface
        Overrides:
        analyseImpact in class BaseStepMeta
        Parameters:
        impact - The list of impacts (see org.pentaho.di.trans.DatabaseImpact)
        transMeta - The transformation information
        stepMeta - The step information
        prev - The fields entering this step
        input - The previous step names
        output - The output step names
        info - The fields used as information by this step
        repository - the repository to use to load Kettle metadata objects impacting the output fields
        metaStore - the MetaStore to use to load additional external data or metadata impacting the output fields
        Throws:
        org.pentaho.di.core.exception.KettleStepException
      • getStep

        public StepInterface getStep​(StepMeta stepMeta,
                                     StepDataInterface stepDataInterface,
                                     int cnr,
                                     TransMeta transMeta,
                                     Trans trans)
        Description copied from interface: StepMetaInterface
        Get the executing step, needed by Trans to launch a step.
        Specified by:
        getStep in interface StepMetaInterface
        Parameters:
        stepMeta - The step info
        stepDataInterface - the step data interface linked to this step. Here the step can store temporary data, database connections, etc.
        cnr - The copy nr to get
        transMeta - The transformation info
        trans - The launching transformation
      • getStepData

        public StepDataInterface getStepData()
        Description copied from interface: StepMetaInterface
        Get a new instance of the appropriate data class. This data class implements the StepDataInterface. It basically contains the persisting data that needs to live on, even if a worker thread is terminated.
        Specified by:
        getStepData in interface StepMetaInterface
        Returns:
        The appropriate StepDataInterface class.
      • getUsedDatabaseConnections

        public org.pentaho.di.core.database.DatabaseMeta[] getUsedDatabaseConnections()
        Description copied from class: BaseStepMeta
        This method returns all the database connections that are used by the step.
        Specified by:
        getUsedDatabaseConnections in interface StepMetaInterface
        Overrides:
        getUsedDatabaseConnections in class BaseStepMeta
        Returns:
        an array of database connections meta-data. Return an empty array if no connections are used.
      • getRequiredFields

        public org.pentaho.di.core.row.RowMetaInterface getRequiredFields​(org.pentaho.di.core.variables.VariableSpace space)
                                                                   throws org.pentaho.di.core.exception.KettleException
        Description copied from class: BaseStepMeta
        The natural way of data flow in a transformation is source-to-target. However, this makes mapping to target tables difficult to do. To help out here, we supply information to the transformation meta-data model about which fields are required for a step. This allows us to automate certain tasks like the mapping to pre-defined tables. The Table Output step in this case will output the fields in the target table using this method.

        This default implementation returns an empty row meaning that no fields are required for this step to operate.

        Specified by:
        getRequiredFields in interface StepMetaInterface
        Overrides:
        getRequiredFields in class BaseStepMeta
        Parameters:
        space - the variable space to use to do variable substitution.
        Returns:
        the required fields for this step's metadata.
        Throws:
        org.pentaho.di.core.exception.KettleException - in case the required fields can't be determined
      • setSchemaName

        public void setSchemaName​(String schemaName)
        Parameters:
        schemaName - the schemaName to set
      • getEncoding

        public String getEncoding()
      • setEncoding

        public void setEncoding​(String encoding)
      • getDelimiter

        public String getDelimiter()
      • setDelimiter

        public void setDelimiter​(String delimiter)
      • getEnclosure

        public String getEnclosure()
      • setEnclosure

        public void setEnclosure​(String enclosure)
      • getFifoFileName

        public String getFifoFileName()
        Returns:
        the fifoFileName
      • setFifoFileName

        public void setFifoFileName​(String fifoFileName)
        Parameters:
        fifoFileName - the fifoFileName to set
      • isReplacingData

        public boolean isReplacingData()
        Returns:
        the replacingData
      • setReplacingData

        public void setReplacingData​(boolean replacingData)
        Parameters:
        replacingData - the replacingData to set
      • getFieldFormatType

        public int[] getFieldFormatType()
      • setFieldFormatType

        public void setFieldFormatType​(int[] fieldFormatType)
      • getFieldFormatTypeCodes

        public static String[] getFieldFormatTypeCodes()
      • getFieldFormatTypeDescriptions

        public static String[] getFieldFormatTypeDescriptions()
      • getFieldFormatTypeCode

        public static String getFieldFormatTypeCode​(int type)
      • getFieldFormatTypeDescription

        public static String getFieldFormatTypeDescription​(int type)
      • getFieldFormatType

        public static int getFieldFormatType​(String codeOrDescription)
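
        The static helpers above convert between an integer format type, a short code, and a description. A minimal sketch of such a lookup follows. The class FieldFormatTypeSketch and every code and description string in it are hypothetical, as are the fallback to type 0 and the case-insensitive comparison; the real constant values are listed under "Constant Field Values".

```java
// Hypothetical stand-in for the lookup helpers; values are illustrative only.
final class FieldFormatTypeSketch {

    // The array index is the integer format type.
    static final String[] CODES = {"OK", "DATE", "TIMESTAMP", "NUMBER", "STRING_ESC"};
    static final String[] DESCRIPTIONS = {"Keep as is", "Date", "Timestamp", "Number", "Escaped string"};

    /** Returns the code for a type, falling back to the first code when the type is out of range. */
    static String getCode(int type) {
        return (type < 0 || type >= CODES.length) ? CODES[0] : CODES[type];
    }

    /** Resolves a code or a description to its type; unknown input maps to type 0. */
    static int getType(String codeOrDescription) {
        for (int i = 0; i < CODES.length; i++) {
            if (CODES[i].equalsIgnoreCase(codeOrDescription)) {
                return i;
            }
        }
        for (int i = 0; i < DESCRIPTIONS.length; i++) {
            if (DESCRIPTIONS[i].equalsIgnoreCase(codeOrDescription)) {
                return i;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(getType("NUMBER"));
        System.out.println(getCode(getType("Timestamp")));
    }
}
```

        Accepting either a code or a description in one method lets the same lookup serve both stored metadata and text chosen in a dialog.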
      • getEscapeChar

        public String getEscapeChar()
        Returns:
        the escapeChar
      • setEscapeChar

        public void setEscapeChar​(String escapeChar)
        Parameters:
        escapeChar - the escapeChar to set
      • isIgnoringErrors

        public boolean isIgnoringErrors()
        Returns:
        the ignoringErrors
      • setIgnoringErrors

        public void setIgnoringErrors​(boolean ignoringErrors)
        Parameters:
        ignoringErrors - the ignoringErrors to set
      • getBulkSize

        public String getBulkSize()
        Returns:
        the bulkSize
      • setBulkSize

        public void setBulkSize​(String bulkSize)
        Parameters:
        bulkSize - the bulkSize to set
      • isLocalFile

        public boolean isLocalFile()
        Returns:
        the localFile
      • setLocalFile

        public void setLocalFile​(boolean localFile)
        Parameters:
        localFile - the localFile to set
      • afterInjectionSynchronization

        public void afterInjectionSynchronization()
        If injection is used, the arrays can have different lengths. They need to be synchronized so that behavior stays consistent with the UI.
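
        One way to bring parallel arrays to a common length is to pad the shorter ones. The sketch below is an assumption about the general technique, not the method's actual code: the class InjectionSyncSketch is hypothetical, and padding to the longest array (with null for strings and 0 for format types) is a choice made for this example.

```java
import java.util.Arrays;

// Hypothetical holder for the three parallel arrays; not part of the Kettle API.
final class InjectionSyncSketch {

    String[] fieldTable = {};
    String[] fieldStream = {};
    int[] fieldFormatType = {};

    /** Pads every array to the length of the longest one so that index i refers to one field everywhere. */
    void synchronize() {
        int length = Math.max(fieldTable.length,
                Math.max(fieldStream.length, fieldFormatType.length));
        // Arrays.copyOf keeps existing values and fills new slots with null (String) or 0 (int).
        fieldTable = Arrays.copyOf(fieldTable, length);
        fieldStream = Arrays.copyOf(fieldStream, length);
        fieldFormatType = Arrays.copyOf(fieldFormatType, length);
    }

    public static void main(String[] args) {
        InjectionSyncSketch meta = new InjectionSyncSketch();
        meta.fieldTable = new String[] {"id", "name", "created"};
        meta.fieldStream = new String[] {"ID"};
        meta.synchronize();
        System.out.println(meta.fieldStream.length + " " + meta.fieldFormatType.length);
    }
}
```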