tango.math.Probability

Cumulative Probability Distribution Functions

License:

BSD style: see license.txt

Authors:

Stephen L. Moshier (original C code), Don Clugston
real normalDistribution(real a) #
real normalDistributionCompl(real a) #
Cumulative distribution function for the Normal distribution, and its complement.
The normal (or Gaussian, or bell-shaped) distribution is defined as:

normalDist(x) = 1/√ π ∫ exp( - t2/2) dt = 0.5 + 0.5 * erf(x/sqrt(2)) = 0.5 * erfc(- x/sqrt(2))

Note that normalDistribution(x) = 1 - normalDistribution(-x).

Accuracy:

Within a few bits of machine resolution over the entire range.

References:

http://www.netlib.org/cephes/ldoubdoc.html, G. Marsaglia, "Evaluating the Normal Distribution", Journal of Statistical Software 11, (July 2004).
real normalDistributionInv(real p) #
real normalDistributionComplInv(real p) #
Inverse of Normal distribution function
Returns the argument, x, for which the area under the Normal probability density function (integrated from minus infinity to x) is equal to p.

For small arguments 0 < p < exp(-2), the program computes z = sqrt( -2 log(p) ); then the approximation is x = z - log(z)/z - (1/z) P(1/z) / Q(1/z) . For larger arguments, x/sqrt(2 pi) = w + w^3 R(w^2)/S(w^2)) , where w = p - 0.5 .

real studentsTDistribution(int nu, real t) #
Student's t cumulative distribution function
Computes the integral from minus infinity to t of the Student t distribution with integer nu > 0 degrees of freedom:

Γ( (nu+1)/2) / ( sqrt(nu π) Γ(nu/2) ) * -∞t (1+x2/nu)-(nu+1)/2 dx

Can be used to test whether the means of two normally distributed populations are equal.

It is related to the incomplete beta integral: 1 - studentsDistribution(nu,t) = 0.5 * betaDistribution( nu/2, 1/2, z ) where z = nu/(nu + t2).

For t < -1.6, this is the method of computation. For higher t, a direct method is derived from integration by parts. Since the function is symmetric about t=0, the area under the right tail of the density is found by calling the function with -t instead of t.

real studentsTDistributionInv(int nu, real p) #
Inverse of Student's t distribution
Given probability p and degrees of freedom nu, finds the argument t such that the one-sided studentsDistribution(nu,t) is equal to p.

Params:

nudegrees of freedom. Must be >1
pprobability. 0 < p < 1
real fDistribution(int df1, int df2, real x) #
real fDistributionCompl(int df1, int df2, real x) #
real fDistributionComplInv(int df1, int df2, real p) #
The F distribution, its complement, and inverse.
The F density function (also known as Snedcor's density or the variance ratio density) is the density of x = (u1/df1)/(u2/df2), where u1 and u2 are random variables having χ2 distributions with df1 and df2 degrees of freedom, respectively.

fDistribution returns the area from zero to x under the F density function. The complementary function, fDistributionCompl, returns the area from x to ∞ under the F density function.

The inverse of the complemented F distribution, fDistributionComplInv, finds the argument x such that the integral from x to infinity of the F density is equal to the given probability y.

Can be used to test whether the means of multiple normally distributed populations, all with the same standard deviation, are equal; or to test that the standard deviations of two normally distributed populations are equal.

Params:

df1Degrees of freedom of the first variable. Must be >= 1
df2Degrees of freedom of the second variable. Must be >= 1
xMust be >= 0
real chiSqrDistribution(real v, real x) #
real chiSqrDistributionCompl(real v, real x) #
χ2 cumulative distribution function and its complement.
Returns the area under the left hand tail (from 0 to x) of the Chi square probability density function with v degrees of freedom. The complement returns the area under the right hand tail (from x to ∞).

chiSqrDistribution(x | v) = (0x tv/2-1 e-t/2 dt ) / 2v/2 Γ(v/2)

chiSqrDistributionCompl(x | v) = (x tv/2-1 e-t/2 dt ) / 2v/2 Γ(v/2)

Params:

vdegrees of freedom. Must be positive.
xthe χ2 variable. Must be positive.
real chiSqrDistributionComplInv(real v, real p) #
Inverse of complemented χ2 distribution
Finds the χ2 argument x such that the integral from x to ∞ of the χ2 density is equal to the given cumulative probability p.

Params:

pCumulative probability. 0<= p <=1.
vDegrees of freedom. Must be positive.
real gammaDistribution(real a, real b, real x) #
real gammaDistributionCompl(real a, real b, real x) #
The Γ distribution and its complement
The Γ distribution is defined as the integral from 0 to x of the gamma probability density function. The complementary function returns the integral from x to ∞

gammaDistribution = (0x tb-1e-at dt) ab/Γ(b)

x must be greater than 0.

real betaDistribution(real a, real b, real x) #
real betaDistributionCompl(real a, real b, real x) #
real betaDistributionInv(real a, real b, real y) #
real betaDistributionComplInv(real a, real b, real y) #
Beta distribution and its inverse
Returns the incomplete beta integral of the arguments, evaluated from zero to x. The function is defined as

betaDistribution = Γ(a+b)/(Γ(a) Γ(b)) * 0x ta-1(1-t)b-1 dt

The domain of definition is 0 <= x <= 1. In this implementation a and b are restricted to positive values. The integral from x to 1 may be obtained by the symmetry relation

betaDistributionCompl(a, b, x ) = betaDistribution( b, a, 1-x )

The integral is evaluated by a continued fraction expansion or, when b*x is small, by a power series.

The inverse finds the value of x for which betaDistribution(a,b,x) - y = 0

real poissonDistribution(int k, real m) #
real poissonDistributionCompl(int k, real m) #
real poissonDistributionInv(int k, real p) #
The Poisson distribution, its complement, and inverse
k is the number of events. m is the mean. The Poisson distribution is defined as the sum of the first k terms of the Poisson density function. The complement returns the sum of the terms k+1 to ∞.

poissonDistribution = Σ kj=0 e-m mj/j!

poissonDistributionCompl = Σ j=k+1 e-m mj/j!

The terms are not summed directly; instead the incomplete gamma integral is employed, according to the relation

y = poissonDistribution( k, m ) = gammaIncompleteCompl( k+1, m ).

The arguments must both be positive.

real binomialDistribution(int k, int n, real p) #
real binomialDistributionCompl(int k, int n, real p) #
Binomial distribution and complemented binomial distribution
The binomial distribution is defined as the sum of the terms 0 through k of the Binomial probability density. The complement returns the sum of the terms k+1 through n.

binomialDistribution = Σ kj=0 ( nj ) pj (1-p)n-j

binomialDistributionCompl = Σ nj=k+1 ( nj ) pj (1-p)n-j

The terms are not summed directly; instead the incomplete beta integral is employed, according to the formula

y = binomialDistribution( k, n, p ) = betaDistribution( n-k, k+1, 1-p ).

The arguments must be positive, with p ranging from 0 to 1, and k<=n.

real binomialDistributionInv(int k, int n, real y) #
Inverse binomial distribution
Finds the event probability p such that the sum of the terms 0 through k of the Binomial probability density is equal to the given cumulative probability y.

This is accomplished using the inverse beta integral function and the relation

1 - p = betaDistributionInv( n-k, k+1, y ).

The arguments must be positive, with 0 <= y <= 1, and k <= n.

real negativeBinomialDistribution(int k, int n, real p) #
real negativeBinomialDistributionInv(int k, int n, real p) #
Negative binomial distribution and its inverse
Returns the sum of the terms 0 through k of the negative binomial distribution:

Σ kj=0 ( n+j-1j-1 ) pn (1-p)j

In a sequence of Bernoulli trials, this is the probability that k or fewer failures precede the n-th success.

The arguments must be positive, with 0 < p < 1 and r>0.

The inverse finds the argument y such that negativeBinomialDistribution(k,n,y) is equal to p.

The Geometric Distribution is a special case of the negative binomial distribution.

1
geometricDistribution(k, p) = negativeBinomialDistribution(k, 1, p);

References:

http://mathworld.wolfram.com/NegativeBinomialDistribution.html