Parallel Colt 0.7.2

Package hep.aida.tdouble.bin

Multisets (bags) with efficient statistics operations defined on them. This package requires the Colt distribution.


Interface Summary
DoubleBinBinFunction1D Interface that represents a function object: a function that takes two bins as arguments and returns a single value.
DoubleBinFunction1D Interface that represents a function object: a function that takes a single bin as its argument and returns a single value.
 

Class Summary
AbstractDoubleBin Abstract base class for all arbitrary-dimensional bins consuming double elements.
AbstractDoubleBin1D Abstract base class for all 1-dimensional bins consuming double elements.
DoubleBinFunctions1D Function objects computing dynamic bin aggregations; to be passed to generic methods.
DynamicDoubleBin1D 1-dimensional rebinnable bin holding double elements. Efficiently computes advanced statistics of data sequences.
MightyStaticDoubleBin1D Static bin equivalent to its superclass, except that it additionally computes moments of arbitrary integer order, harmonic mean, geometric mean, etc.
QuantileDoubleBin1D 1-dimensional non-rebinnable bin holding double elements with scalable quantile operations. Using little main memory, it quickly computes approximate quantiles over very large data sequences, with or without a-priori knowledge of the number of elements to be filled. Conceptually a strongly lossily compressed multiset (or bag). Guarantees to respect the worst-case approximation error specified upon instance construction.
StaticDoubleBin1D 1-dimensional non-rebinnable bin consuming double elements. Efficiently computes basic statistics of data sequences.
 

Package hep.aida.tdouble.bin Description


Bins contain information about the data filled into them. They can be asked for various descriptive statistical measures, such as the minimum, maximum, size, mean, rms, variance, etc.

Static vs. Dynamic bins and their space-time-functionality trade-offs

Bins come in two flavours: Dynamic and Static. Dynamic bins preserve all the values filled into them and can return these exact values, when asked to do so. They are rebinnable.
Static bins do not preserve the values filled into them. They merely collect basic statistics incrementally while they are being filled. They immediately forget about the filled values and keep only the derived statistics. They are not rebinnable.

The data filled into static bins is not preserved. As a consequence, arbitrarily many elements can be added to such bins. As a further consequence, such bins cannot compute more than basic statistics, and they are not rebinnable. If these drawbacks matter, consider using a DynamicDoubleBin1D, which overcomes them at the expense of increased memory requirements.
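The idea of collecting statistics incrementally and forgetting the values can be sketched in a few lines. The class below is a hypothetical, self-contained illustration (RunningStatsSketch and its methods are invented names, not the StaticDoubleBin1D implementation); it assumes that a count, minimum, maximum, sum and sum of squares are enough to derive the basic measures.

```java
// Hypothetical illustration only; not the StaticDoubleBin1D implementation.
final class RunningStatsSketch {
    private long size;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private double sum;
    private double sumOfSquares;

    /** Folds one value into the running aggregates; the value itself is not stored. */
    void add(double value) {
        size++;
        if (value < min) min = value;
        if (value > max) max = value;
        sum += value;
        sumOfSquares += value * value;
    }

    long size() { return size; }
    double min() { return min; }
    double max() { return max; }
    double mean() { return sum / size; }

    /** Root mean square: sqrt(sum of squares / size). */
    double rms() { return Math.sqrt(sumOfSquares / size); }

    /** Sample variance (divides by size - 1), derived from the aggregates alone. */
    double variance() { return (sumOfSquares - mean() * sum) / (size - 1); }

    public static void main(String[] args) {
        RunningStatsSketch bin = new RunningStatsSketch();
        for (double v : new double[] {1.0, 2.0, 3.0, 4.0}) bin.add(v);
        System.out.println("size=" + bin.size() + " mean=" + bin.mean()
                + " rms=" + bin.rms() + " variance=" + bin.variance());
    }
}
```

The state is five fields regardless of how many values are added, which is why such a bin can accept arbitrarily long sequences but cannot answer questions (an exact median, for instance) that need the values themselves. The sum-of-squares variance formula is chosen here for brevity; it can lose precision when the values are large relative to their spread.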

The data filled into dynamic bins is fully preserved. Technically speaking, they are k-dimensional multisets (or bags) with efficient statistics operations defined on them. As a consequence, such bins can compute more than only basic statistics, and they are rebinnable. On the other hand, if many elements are filled into them, one may quickly run out of memory (each double element takes 8 bytes). If this drawback matters, consider using a StaticDoubleBin1D, which overcomes it at the expense of limited functionality.
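For contrast, a value-preserving bin can be sketched as a growable array. Again this is a hypothetical, self-contained illustration (StoringBinSketch is an invented name, not the DynamicDoubleBin1D implementation); it only shows why keeping the values enables exact order statistics and why memory grows by 8 bytes per element.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the DynamicDoubleBin1D implementation.
final class StoringBinSketch {
    private double[] values = new double[8];
    private int size;

    /** Appends the value, doubling the backing array when it is full. */
    void add(double value) {
        if (size == values.length) values = Arrays.copyOf(values, size * 2);
        values[size++] = value;
    }

    int size() { return size; }

    /** Returns a copy of exactly the values that were added, in insertion order. */
    double[] elements() { return Arrays.copyOf(values, size); }

    /** Exact median; possible only because the values are still available. */
    double median() {
        if (size == 0) return Double.NaN;
        double[] sorted = elements();
        Arrays.sort(sorted);
        int mid = size / 2;
        return (size % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Payload memory of the stored elements: 8 bytes per double (array slack excluded). */
    long payloadBytes() { return 8L * size; }
}
```

At 8 bytes per element, one hundred million values already need about 800 MB for the payload alone, which is the trade-off the paragraph above describes.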

Available bin types

All bins are derived from a common abstract base class AbstractDoubleBin. This base class is extended by AbstractDoubleBin1D, the common abstract base class for 1-dimensional bins.

Static 1-dimensional bins currently offered are: StaticDoubleBin1D, MightyStaticDoubleBin1D and QuantileDoubleBin1D.
The only dynamic 1-dimensional bin currently offered is DynamicDoubleBin1D.

Advanced statistics on dynamic bins

If a statistical measure you need is not directly provided by the methods of a bin, use a dynamic bin and retrieve its filled elements. From these elements you can compute whatever is necessary, either with a statistics library or with self-written functions. Use methods like DynamicDoubleBin1D.elements() and, for example, the descriptive statistics library DoubleDescriptive.
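As an example of a self-written function, the sketch below computes the median absolute deviation from a plain double[]. It is self-contained and does not call Parallel Colt; CustomStatisticSketch and its methods are hypothetical names, and the input array is assumed to hold the values retrieved from a dynamic bin (how the result of DynamicDoubleBin1D.elements() is converted to a double[] is left out here).

```java
import java.util.Arrays;

// Hypothetical helper; assumes the bin's elements are already available as a double[].
final class CustomStatisticSketch {

    /** Exact median of the data; the input array is not modified. */
    static double median(double[] data) {
        if (data.length == 0) throw new IllegalArgumentException("no elements");
        double[] sorted = data.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return (sorted.length % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Median absolute deviation: the median of |x - median(data)| over all x. */
    static double medianAbsoluteDeviation(double[] data) {
        double center = median(data);
        double[] deviations = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            deviations[i] = Math.abs(data[i] - center);
        }
        return median(deviations);
    }

    public static void main(String[] args) {
        double[] elements = {1.0, 2.0, 3.0, 4.0, 100.0}; // stand-in for retrieved bin elements
        System.out.println(medianAbsoluteDeviation(elements));
    }
}
```

A measure like this needs every element twice (once for the median, once for the deviations), so it can only be computed from a bin that preserves its data.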

