org.pentaho.di.trans.steps.univariatestats
Class UnivariateStatsMetaFunction

java.lang.Object
  extended by org.pentaho.di.trans.steps.univariatestats.UnivariateStatsMetaFunction
All Implemented Interfaces:
Cloneable

public class UnivariateStatsMetaFunction
extends Object
implements Cloneable

Holds meta information about one univariate stats calculation: source field name and what derived values are to be computed

Version:
1.0
Author:
Mark Hall (mhall{[at]}pentaho.org)

Field Summary
static String XML_TAG
           
 
Constructor Summary
UnivariateStatsMetaFunction(Node uniNode)
          Construct from an XML node
UnivariateStatsMetaFunction(Repository rep, ObjectId id_step, int nr)
          Construct using data stored in repository
UnivariateStatsMetaFunction(String sourceFieldName, boolean n, boolean mean, boolean stdDev, boolean min, boolean max, boolean median, double arbPercentile, boolean interpolate)
          Creates a new UnivariateStatsMetaFunction
 
Method Summary
 Object clone()
          Make a copy
 boolean equals(Object obj)
          Check for equality
 boolean getCalcMax()
          Get whether the maximum is to be calculated for this input value
 boolean getCalcMean()
          Get whether the mean is to be calculated for this input field
 boolean getCalcMedian()
          Get whether the median is to be calculated for this input value
 boolean getCalcMin()
          Get whether the minimum is to be calculated for this input value
 boolean getCalcN()
          Get whether N is to be calculated for this input field
 double getCalcPercentile()
          Gets the arbitrary percentile to be calculated for this input field
 boolean getCalcStdDev()
          Get whether the standard deviation is to be calculated for this input value
 boolean getInterpolatePercentile()
          Get whether interpolation is to be used in the computation of percentiles
 String getSourceFieldName()
          Return the name of the input field used by this UnivariateStatsMetaFunction
 String getXML()
          Return a String containing XML describing this UnivariateStatsMetaFunction
 int numberOfMetricsRequested()
          Returns the number of metrics to compute
 void saveRep(Repository rep, ObjectId id_transformation, ObjectId id_step, int nr)
          Save this UnivariateStatsMetaFunction to a repository
 void setCalcMax(boolean b)
          Set whether the maximum is to be calculated for this input value
 void setCalcMean(boolean b)
          Set whether to calculate the mean for this input field
 void setCalcMedian(boolean b)
          Set whether the median is to be calculated for this input value
 void setCalcMin(boolean b)
          Set whether the minimum is to be calculated for this input value
 void setCalcN(boolean n)
          Set whether to calculate N for this input field
 void setCalcPercentile(double percentile)
          Sets the arbitrary percentile to be calculated for this input field
 void setCalcStdDev(boolean b)
          Set whether the standard deviation is to be calculated for this input value
 void setInterpolatePercentile(boolean i)
          Set whether interpolation is to be used in the computation of percentiles
 void setSourceFieldName(String sn)
          Set the name of the input field used by this UnivariateStatsMetaFunction.
 
Methods inherited from class java.lang.Object
getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

XML_TAG

public static final String XML_TAG
See Also:
Constant Field Values
Constructor Detail

UnivariateStatsMetaFunction

public UnivariateStatsMetaFunction(String sourceFieldName,
                                   boolean n,
                                   boolean mean,
                                   boolean stdDev,
                                   boolean min,
                                   boolean max,
                                   boolean median,
                                   double arbPercentile,
                                   boolean interpolate)
Creates a new UnivariateStatsMetaFunction

Parameters:
sourceFieldName - the name of the input field to compute stats for
n - output N
mean - compute and output the mean
stdDev - compute and output the standard deviation
min - output the minimum value
max - output the maximum value
median - compute and output the median (requires data caching and sorting)
arbPercentile - compute and output a percentile (0 <= arbPercentile <= 1)
interpolate - true if interpolation is to be used for percentiles (rather than a simple method). See The Engineering Statistics Handbook for details.
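
The interpolate flag chooses between two ways of reading a percentile off sorted data. The sketch below is an illustration only, not this step's implementation: the class PercentileSketch and both method names are hypothetical. The interpolated variant follows the rank formula p * (N + 1) given in the Engineering Statistics Handbook, and the nearest-rank variant is an assumed stand-in for the "simple method", which this page does not define. Both take p as a fraction between 0 and 1.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of the Pentaho API.
class PercentileSketch {

    // Interpolated percentile: rank = p * (N + 1), split into an integer
    // part k and a fractional part d, then interpolate between the k-th
    // and (k+1)-th sorted values (1-based).
    static double interpolated(double[] values, double p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double rank = p * (n + 1);
        int k = (int) Math.floor(rank);
        double d = rank - k;
        if (k <= 0) {
            return sorted[0];          // below the first rank: smallest value
        }
        if (k >= n) {
            return sorted[n - 1];      // at or past the last rank: largest value
        }
        return sorted[k - 1] + d * (sorted[k] - sorted[k - 1]);
    }

    // Assumed "simple" alternative: nearest rank, no interpolation.
    static double nearestRank(double[] values, double p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        int k = (int) Math.ceil(p * n);
        k = Math.max(1, Math.min(n, k)); // clamp to a valid 1-based rank
        return sorted[k - 1];
    }

    public static void main(String[] args) {
        double[] data = {15, 20, 35, 40, 50};
        System.out.println(interpolated(data, 0.25)); // prints 17.5
        System.out.println(nearestRank(data, 0.25));  // prints 20.0
    }
}
```

For the 25th percentile of five values, the interpolated rank is 1.5, so the result lies halfway between the first and second sorted values, while the nearest-rank variant returns an actual data point.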

UnivariateStatsMetaFunction

public UnivariateStatsMetaFunction(Node uniNode)
Construct from an XML node

Parameters:
uniNode - an XML node

UnivariateStatsMetaFunction

public UnivariateStatsMetaFunction(Repository rep,
                                   ObjectId id_step,
                                   int nr)
                            throws KettleException
Construct using data stored in repository

Parameters:
rep - the repository
id_step - the id of the step
nr - the step number
Throws:
KettleException - if an error occurs
Method Detail

equals

public boolean equals(Object obj)
Check for equality

Overrides:
equals in class Object
Parameters:
obj - a UnivariateStatsMetaFunction to compare against
Returns:
true if this Object and the supplied one are the same

getXML

public String getXML()
Return a String containing XML describing this UnivariateStatsMetaFunction

Returns:
an XML description of this UnivariateStatsMetaFunction

saveRep

public void saveRep(Repository rep,
                    ObjectId id_transformation,
                    ObjectId id_step,
                    int nr)
             throws KettleException
Save this UnivariateStatsMetaFunction to a repository

Parameters:
rep - the repository to save to
id_transformation - the transformation id
id_step - the step id
nr - the step number
Throws:
KettleException - if an error occurs

clone

public Object clone()
Make a copy

Overrides:
clone in class Object
Returns:
a copy of this UnivariateStatsMetaFunction.

setSourceFieldName

public void setSourceFieldName(String sn)
Set the name of the input field used by this UnivariateStatsMetaFunction.

Parameters:
sn - the name of the source field to use

getSourceFieldName

public String getSourceFieldName()
Return the name of the input field used by this UnivariateStatsMetaFunction

Returns:
the name of the input field used

setCalcN

public void setCalcN(boolean n)
Set whether to calculate N for this input field

Parameters:
n - true if N is to be calculated

getCalcN

public boolean getCalcN()
Get whether N is to be calculated for this input field

Returns:
true if N is to be calculated

setCalcMean

public void setCalcMean(boolean b)
Set whether to calculate the mean for this input field

Parameters:
b - true if the mean is to be calculated

getCalcMean

public boolean getCalcMean()
Get whether the mean is to be calculated for this input field

Returns:
true if the mean is to be calculated

setCalcStdDev

public void setCalcStdDev(boolean b)
Set whether the standard deviation is to be calculated for this input value

Parameters:
b - true if the standard deviation is to be calculated

getCalcStdDev

public boolean getCalcStdDev()
Get whether the standard deviation is to be calculated for this input value

Returns:
true if the standard deviation is to be calculated

setCalcMin

public void setCalcMin(boolean b)
Set whether the minimum is to be calculated for this input value

Parameters:
b - true if the minimum is to be calculated

getCalcMin

public boolean getCalcMin()
Get whether the minimum is to be calculated for this input value

Returns:
true if the minimum is to be calculated

setCalcMax

public void setCalcMax(boolean b)
Set whether the maximum is to be calculated for this input value

Parameters:
b - true if the maximum is to be calculated

getCalcMax

public boolean getCalcMax()
Get whether the maximum is to be calculated for this input value

Returns:
true if the maximum is to be calculated

setCalcMedian

public void setCalcMedian(boolean b)
Set whether the median is to be calculated for this input value

Parameters:
b - true if the median is to be calculated

getCalcMedian

public boolean getCalcMedian()
Get whether the median is to be calculated for this input value

Returns:
true if the median is to be calculated

getInterpolatePercentile

public boolean getInterpolatePercentile()
Get whether interpolation is to be used in the computation of percentiles

Returns:
true if interpolation is to be used

setInterpolatePercentile

public void setInterpolatePercentile(boolean i)
Set whether interpolation is to be used in the computation of percentiles

Parameters:
i - true if interpolation is to be used

getCalcPercentile

public double getCalcPercentile()
Gets the arbitrary percentile to be calculated for this input field

Returns:
the percentile to be computed

setCalcPercentile

public void setCalcPercentile(double percentile)
Sets the arbitrary percentile to be calculated for this input field

Parameters:
percentile - the percentile to compute (0 <= percentile <= 100)

numberOfMetricsRequested

public int numberOfMetricsRequested()
Returns the number of metrics to compute

Returns:
the number of metrics to compute
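
One plausible reading of numberOfMetricsRequested is a count of the statistics that have been switched on. The sketch below is a guess at that behaviour, not the class's actual code: MetricCountSketch and countRequested are hypothetical names, and treating a negative percentile as "not requested" is an assumption that this page does not confirm.

```java
// Hypothetical illustration; not part of the Pentaho API.
class MetricCountSketch {

    // Counts one metric per enabled flag, plus one for the percentile
    // when it is non-negative (assumed convention for "requested").
    static int countRequested(boolean n, boolean mean, boolean stdDev,
                              boolean min, boolean max, boolean median,
                              double percentile) {
        int count = 0;
        for (boolean flag : new boolean[] {n, mean, stdDev, min, max, median}) {
            if (flag) {
                count++;
            }
        }
        if (percentile >= 0) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // N, mean and median enabled, no percentile
        System.out.println(countRequested(true, true, false, false, false, true, -1));   // prints 3
        // Same flags plus a percentile
        System.out.println(countRequested(true, true, false, false, false, true, 0.95)); // prints 4
    }
}
```

A caller could use such a count to size the set of output fields a step adds for each source field.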