Parallel Colt 0.7.2

cern.jet.stat.tdouble.quantile
Class DoubleEquiDepthHistogram

java.lang.Object
  extended by cern.colt.PersistentObject
      extended by cern.jet.stat.tdouble.quantile.DoubleEquiDepthHistogram
All Implemented Interfaces:
Serializable, Cloneable

public class DoubleEquiDepthHistogram
extends PersistentObject

Read-only equi-depth histogram for selectivity estimation. Assume you have collected statistics over a data set, among them a one-dimensional equi-depth histogram (quantiles). Then an applications or DBMS might want to estimate the selectivity of some range query [from,to], i.e. the percentage of data set elements contained in the query range. This class does not collect equi-depth histograms but only space efficiently stores already produced histograms and provides operations for selectivity estimation. Uses linear interpolation.

This class stores a list l of double values for which holds:

  • Let v be a list of values (sorted ascending) an equi-depth histogram has been computed over.
  • Let s=l.length.
  • Let p=(0, 1/s-1), 2/s-1,..., s-1/s-1=1.0) be a list of the s percentages.
  • Then for each i=0..s-1: l[i] = e : v.contains(e) && v[0],..., v[p[i]*v.length] <= e .
  • (In particular: l[0]=min(v)=v[0] and l[s-1]=max(v)=v[s-1].)
  • Version:
    1.0, 09/24/99
    Author:
    wolfgang.hoschek@cern.ch
    See Also:
    Serialized Form

    Field Summary
     
    Fields inherited from class cern.colt.PersistentObject
    serialVersionUID
     
    Constructor Summary
    DoubleEquiDepthHistogram(double[] quantileElements)
              Constructs an equi-depth histogram with the given quantile elements.
     
    Method Summary
     int binOfElement(double element)
              Returns the bin index of the given element.
     int bins()
              Returns the number of bins.
     double endOfBin(int binIndex)
              Returns the end of the range associated with the given bin.
     double percentFromTo(double from, double to)
              Returns the percentage of elements in the range (from,to].
     double phi(double element)
              Returns how many percent of the elements contained in the receiver are <= element.
     int size()
              Deprecated. Deprecated. Returns the number of bin boundaries.
     double startOfBin(int binIndex)
              Returns the start of the range associated with the given bin.
    static void test(double element)
              Not yet commented.
     
    Methods inherited from class cern.colt.PersistentObject
    clone
     
    Methods inherited from class java.lang.Object
    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
     

    Constructor Detail

    DoubleEquiDepthHistogram

    public DoubleEquiDepthHistogram(double[] quantileElements)
    Constructs an equi-depth histogram with the given quantile elements. Quantile elements must be sorted ascending and have the form specified in the class documentation.

    Method Detail

    binOfElement

    public int binOfElement(double element)
    Returns the bin index of the given element. In other words, returns a handle to the range the element falls into.

    Parameters:
    element - the element to search for.
    Throws:
    IllegalArgumentException - if the element is not contained in any bin.

    bins

    public int bins()
    Returns the number of bins. In other words, returns the number of subdomains partitioning the entire value domain.


    endOfBin

    public double endOfBin(int binIndex)
    Returns the end of the range associated with the given bin.

    Throws:
    ArrayIndexOutOfBoundsException - if binIndex < 0 || binIndex >= bins().

    percentFromTo

    public double percentFromTo(double from,
                                double to)
    Returns the percentage of elements in the range (from,to]. Does linear interpolation.

    Parameters:
    from - the start point (exclusive).
    to - the end point (inclusive).
    Returns:
    a number in the closed interval [0.0,1.0].

    phi

    public double phi(double element)
    Returns how many percent of the elements contained in the receiver are <= element. Does linear interpolation.

    Parameters:
    element - the element to search for.
    Returns:
    a number in the closed interval [0.0,1.0].

    size

    public int size()
    Deprecated. Deprecated. Returns the number of bin boundaries.


    startOfBin

    public double startOfBin(int binIndex)
    Returns the start of the range associated with the given bin.

    Throws:
    ArrayIndexOutOfBoundsException - if binIndex < 0 || binIndex >= bins().

    test

    public static void test(double element)
    Not yet commented.


    Parallel Colt 0.7.2

    Jump to the Parallel Colt Homepage