Relationships among probability distributions

Relationships among some univariate probability distributions are illustrated with connecting lines; dashed lines indicate approximate relationships. More info:^[1] Note: the transformation from N(μ, σ²) to N(0, 1) should be (X − μ)/σ, not (X − μ)/σ².

Relationships between univariate probability distributions in ProbOnto.^[2]

In probability theory and statistics, there are several relationships among probability distributions. These relations can be categorized in the following groups:

One distribution is a special case of another with a broader parameter space;
Transforms (function of a random variable);
Combinations (function of several variables);
Approximation (limit) relationships;
Compound relationships (useful for Bayesian inference);
Duality;
Conjugate priors.

Special case of distribution parametrization

A binomial distribution with parameters n = 1 and p is a Bernoulli distribution with parameter p.
A negative binomial distribution with parameters n = 1 and p is a geometric distribution with parameter p.
A gamma distribution with shape parameter α = 1 and rate parameter β is an exponential distribution with rate parameter β.
A gamma distribution with shape parameter α = ν/2 and rate parameter β = 1/2 is a chi-squared distribution with ν degrees of freedom.
A chi-squared distribution with 2 degrees of freedom (k = 2) is an exponential distribution with a mean value of 2 (rate λ = 1/2).
A Weibull distribution with shape parameter k = 1 and rate parameter β is an exponential distribution with rate parameter β.
A beta distribution with shape parameters α = β = 1 is a continuous uniform distribution over the real numbers 0 to 1.
A beta-binomial distribution with parameter n and shape parameters α = β = 1 is a discrete uniform distribution over the integers 0 to n.
A Student's t-distribution with one degree of freedom (ν = 1) is a Cauchy distribution with location parameter x₀ = 0 and scale parameter γ = 1.
A Burr distribution with parameters c = 1 and k (and scale λ) is a Lomax distribution with shape k (and scale λ).
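
Several of these special cases can be checked numerically by comparing density functions. The following is a minimal sketch using only the Python standard library; the helper names, the evaluation points, and the values of `rate` and `dof` are illustrative assumptions, and the densities are the usual textbook forms in the shape/rate parametrization.

```python
import math

def gamma_pdf(x, shape, rate):
    """Gamma density in the shape/rate parametrization."""
    return rate**shape * x**(shape - 1) * math.exp(-rate * x) / math.gamma(shape)

def exponential_pdf(x, rate):
    """Exponential density with the given rate."""
    return rate * math.exp(-rate * x)

def chi_squared_pdf(x, dof):
    """Chi-squared density with `dof` degrees of freedom."""
    half = dof / 2
    return x**(half - 1) * math.exp(-x / 2) / (2**half * math.gamma(half))

xs = [0.1, 0.5, 1.0, 2.5, 7.0]   # arbitrary evaluation points
rate = 1.7                        # arbitrary illustrative rate
dof = 5                           # arbitrary illustrative degrees of freedom

# gamma(shape=1, rate) versus exponential(rate)
gamma_vs_exponential = max(
    abs(gamma_pdf(x, 1, rate) - exponential_pdf(x, rate)) for x in xs)
# gamma(shape=dof/2, rate=1/2) versus chi-squared(dof)
gamma_vs_chi_squared = max(
    abs(gamma_pdf(x, dof / 2, 0.5) - chi_squared_pdf(x, dof)) for x in xs)
# chi-squared(2) versus exponential(rate=1/2)
chi_squared_vs_exponential = max(
    abs(chi_squared_pdf(x, 2) - exponential_pdf(x, 0.5)) for x in xs)

print(gamma_vs_exponential, gamma_vs_chi_squared, chi_squared_vs_exponential)
```

All three gaps should be zero up to floating-point rounding.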

Transform of a variable

Multiple of a random variable

Multiplying the variable by any positive real constant yields a scaling of the original distribution. Some are self-replicating, meaning that the scaling yields the same family of distributions, albeit with a different parameter: normal distribution, gamma distribution, Cauchy distribution, exponential distribution, Erlang distribution, Weibull distribution, logistic distribution, error distribution, power-law distribution, Rayleigh distribution.

Example:

If X is a gamma random variable with shape and rate parameters (α, β), then Y = aX is a gamma random variable with parameters (α,β/a).

If X is a gamma random variable with shape and scale parameters (k, θ), then Y = aX is a gamma random variable with parameters (k,aθ).
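
The gamma example follows from the change-of-variables formula for densities. As a short sketch, assuming a > 0 and writing f_X for the gamma density in the shape/rate parametrization (notation introduced here for illustration):

```latex
\begin{aligned}
f_Y(y) &= \frac{1}{a}\, f_X\!\left(\frac{y}{a}\right)
        = \frac{1}{a}\,\frac{\beta^{\alpha}}{\Gamma(\alpha)}
          \left(\frac{y}{a}\right)^{\alpha-1} e^{-\beta y/a} \\
       &= \frac{(\beta/a)^{\alpha}}{\Gamma(\alpha)}\, y^{\alpha-1}\, e^{-(\beta/a)\,y},
          \qquad y > 0 .
\end{aligned}
```

The last expression is the gamma density with shape α and rate β/a, or equivalently scale aθ when θ = 1/β.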

Linear function of a random variable

The affine transform aX + b yields a relocation and scaling of the original distribution. The following are self-replicating: normal distribution, Cauchy distribution, logistic distribution, error distribution, power distribution, Rayleigh distribution.

Example:

If Z is a normal random variable with parameters (μ = m, σ² = s²), then X = aZ + b is a normal random variable with parameters (μ = am + b, σ² = a²s²).
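
A quick simulation can illustrate the normal example. This is a rough sketch rather than a proof: it only compares the first two sample moments with the stated parameters, and the seed, sample size, and values of `m`, `s`, `a`, `b` are arbitrary assumptions.

```python
import random

rng = random.Random(0)          # fixed seed so the run is repeatable
m, s = 2.0, 3.0                 # mean and standard deviation of Z
a, b = -0.5, 4.0                # coefficients of the affine map X = a*Z + b
n = 200_000

xs = [a * rng.gauss(m, s) + b for _ in range(n)]

sample_mean = sum(xs) / n
sample_var = sum((x - sample_mean) ** 2 for x in xs) / n

expected_mean = a * m + b       # mu = a*m + b
expected_var = a * a * s * s    # sigma^2 = a^2 * s^2

print(sample_mean, expected_mean)
print(sample_var, expected_var)
```

Note that `random.gauss` takes the standard deviation, not the variance, as its second argument.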

Reciprocal of a random variable

The reciprocal 1/X of a random variable X is a member of the same family of distributions as X in the following cases: Cauchy distribution, F distribution, log-logistic distribution.

Examples:

If X is a Cauchy (μ, σ) random variable, then 1/X is a Cauchy (μ/C, σ/C) random variable where C = μ² + σ².
If X is an F(ν₁, ν₂) random variable then 1/X is an F(ν₂, ν₁) random variable.
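
Because the Cauchy distribution has no mean or variance, a simulation check of the first example has to rely on quantiles: a Cauchy (location, scale) variable has median equal to its location and quartiles at location ± scale. The sketch below assumes arbitrary values μ = 1 and σ = 2, samples by inverting the Cauchy distribution function, and uses a deliberately crude empirical quantile; all helper names are hypothetical.

```python
import math
import random

def cauchy_sample(rng, location, scale):
    """Draw one Cauchy(location, scale) variate by inverting its CDF."""
    return location + scale * math.tan(math.pi * (rng.random() - 0.5))

def quantile(sorted_values, q):
    """Crude empirical quantile: the order statistic at position q*n."""
    return sorted_values[int(q * len(sorted_values))]

rng = random.Random(1)
mu, sigma = 1.0, 2.0
c = mu**2 + sigma**2            # C = mu^2 + sigma^2

reciprocals = sorted(1.0 / cauchy_sample(rng, mu, sigma) for _ in range(200_000))

median = quantile(reciprocals, 0.5)
half_iqr = (quantile(reciprocals, 0.75) - quantile(reciprocals, 0.25)) / 2

print(median, mu / c)           # sample median versus claimed location
print(half_iqr, sigma / c)      # half interquartile range versus claimed scale
```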

Other cases

Some distributions are invariant under a specific transformation.

Examples:

If X is a beta (α, β) random variable then (1 − X) is a beta (β, α) random variable.
If X is a binomial (n, p) random variable then (n − X) is a binomial (n, 1 − p) random variable.
If X is a continuous random variable with cumulative distribution function F_X, then F_X(X) is a standard uniform (0,1) random variable (the probability integral transform).
If X is a normal (μ, σ²) random variable then e^X is a lognormal (μ, σ²) random variable.

Conversely, if X is a lognormal (μ, σ²) random variable then log X is a normal (μ, σ²) random variable.

If X is an exponential random variable with mean β, then X^(1/γ) is a Weibull (γ, β) random variable.
The square of a standard normal random variable has a chi-squared distribution with one degree of freedom.
If X is a Student’s t random variable with ν degrees of freedom, then X² is an F (1,ν) random variable.
If X is a double exponential random variable with mean 0 and scale λ, then |X| is an exponential random variable with mean λ.
A geometric random variable is the floor of an exponential random variable.
A rectangular random variable is the floor of a uniform random variable.
A reciprocal random variable is the exponential of a uniform random variable.
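
Two of the items above can be combined in one small experiment: inverting the exponential distribution function turns a standard uniform variable into an exponential one, and taking the floor of the result gives a geometric variable. The sketch assumes the geometric distribution is supported on 0, 1, 2, ... with success probability p = 1 − e^(−rate), which is one common convention and is not stated in the text; the rate, seed, and sample size are arbitrary.

```python
import math
import random
from collections import Counter

def exponential_by_inversion(u, rate):
    """Map a uniform(0, 1) value u to an exponential(rate) value via the inverse CDF."""
    return -math.log(1.0 - u) / rate

rng = random.Random(2)
rate = 0.7                       # arbitrary illustrative rate
p = 1 - math.exp(-rate)          # success probability of the implied geometric law
n = 200_000

counts = Counter(
    math.floor(exponential_by_inversion(rng.random(), rate)) for _ in range(n))

# Compare empirical frequencies with the geometric pmf p * (1 - p)**k.
max_gap = max(abs(counts[k] / n - p * (1 - p) ** k) for k in range(10))
print(max_gap)
```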

Functions of several variables

Sum of variables

The distribution of the sum of independent random variables is the convolution of their distributions. Suppose $X_{1},\dots ,X_{n}$ are independent random variables with probability mass functions $f_{X_{i}}(x)$, and let

$$Z=\sum _{i=1}^{n}{X_{i}}.$$

Then $Z$ has probability mass function

$$f_{Z}(z)=(f_{X_{1}}*\cdots *f_{X_{n}})(z).$$

If $Z$ has a distribution from the same family of distributions as the original variables, that family of distributions is said to be closed under convolution.

Examples of such univariate distributions are: normal distributions, Poisson distributions, binomial distributions (with common success probability), negative binomial distributions (with common success probability), gamma distributions (with common rate parameter), chi-squared distributions, Cauchy distributions, hyperexponential distributions.

Examples:^[3]^[4]

- If X₁ and X₂ are Poisson random variables with means μ₁ and μ₂ respectively, then X₁ + X₂ is a Poisson random variable with mean μ₁ + μ₂.
- The sum of gamma (α_i, β) random variables has a gamma (Σα_i, β) distribution.
- If X₁ is a Cauchy (μ₁, σ₁) random variable and X₂ is a Cauchy (μ₂, σ₂), then X₁ + X₂ is a Cauchy (μ₁ + μ₂, σ₁ + σ₂) random variable.
- If X₁ and X₂ are chi-squared random variables with ν₁ and ν₂ degrees of freedom respectively, then X₁ + X₂ is a chi-squared random variable with ν₁ + ν₂ degrees of freedom.
- If X₁ is a normal (μ₁, σ₁²) random variable and X₂ is a normal (μ₂, σ₂²) random variable, then X₁ + X₂ is a normal (μ₁ + μ₂, σ₁² + σ₂²) random variable.
- The sum of N chi-squared (1) random variables has a chi-squared distribution with N degrees of freedom.
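
Closure under convolution can be verified directly for discrete distributions by convolving probability mass functions. The following sketch does this for the Poisson example, under the assumption that truncating the support at 40 points is adequate for the chosen means; the function names and parameter values are illustrative.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability mass at k."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def convolve(f, g):
    """Discrete convolution of two pmfs given as lists indexed from 0."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

mu1, mu2 = 1.5, 2.5
support = 40                     # truncation point (assumption)

f1 = [poisson_pmf(k, mu1) for k in range(support)]
f2 = [poisson_pmf(k, mu2) for k in range(support)]
f_sum = convolve(f1, f2)

# For k < support the truncated convolution contains every term of the exact sum.
max_gap = max(abs(f_sum[k] - poisson_pmf(k, mu1 + mu2)) for k in range(support))
print(max_gap)
```

The gap should vanish up to rounding, in line with X₁ + X₂ being Poisson with mean μ₁ + μ₂.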

Other distributions are not closed under convolution, but their sum has a known distribution:

The sum of n Bernoulli (p) random variables is a binomial (n, p) random variable.
The sum of n geometric random variables with probability of success p is a negative binomial random variable with parameters n and p.
The sum of n exponential (β) random variables is a gamma (n, β) random variable. Since n is an integer, the gamma distribution is also an Erlang distribution.
The sum of the squares of N standard normal random variables has a chi-squared distribution with N degrees of freedom.
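
As a rough illustration of the exponential case, the sketch below sums independent exponential draws and compares the sample mean and variance with those of a gamma (n, β) distribution, assuming β is a rate so that the mean is n/β and the variance is n/β². Matching two moments is only a sanity check, not a proof of the distributional identity; the seed and parameter values are arbitrary.

```python
import random

rng = random.Random(4)
n_terms = 5          # number of exponential variables in each sum
beta = 2.0           # common rate parameter (assumed to be a rate, not a mean)
n_trials = 100_000

# random.expovariate takes the rate as its argument.
sums = [sum(rng.expovariate(beta) for _ in range(n_terms)) for _ in range(n_trials)]

sample_mean = sum(sums) / n_trials
sample_var = sum((s - sample_mean) ** 2 for s in sums) / n_trials

gamma_mean = n_terms / beta          # mean of gamma(shape n, rate beta)
gamma_var = n_terms / beta**2        # variance of gamma(shape n, rate beta)

print(sample_mean, gamma_mean)
print(sample_var, gamma_var)
```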

Product of variables

The product of independent random variables X and Y may belong to the same family of distributions as X and Y; this is the case for the Bernoulli distribution and the log-normal distribution.

Example:

If X₁ and X₂ are independent log-normal random variables with parameters (μ₁, σ₁²) and (μ₂, σ₂²) respectively, then X₁ X₂ is a log-normal random variable with parameters (μ₁ + μ₂, σ₁² + σ₂²).

(See also Product distribution.)
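
The log-normal example reduces to the closure of the normal family under sums. A one-line sketch, using the fact that log X_i is normal (μ_i, σ_i²) when X_i is log-normal (μ_i, σ_i²):

```latex
\log(X_1 X_2) = \log X_1 + \log X_2
\;\sim\; N\!\left(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2\right)
\quad\Longrightarrow\quad
X_1 X_2 \;\sim\; \text{log-normal}\!\left(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2\right).
```

Independence of X₁ and X₂ is what allows the variances to add.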

Minimum and maximum of independent random variables

For some distributions, the minimum value of several independent random variables is a member of the same family, with different parameters: Bernoulli distribution, geometric distribution, exponential distribution, extreme value distribution, Pareto distribution, Rayleigh distribution, Weibull distribution.

Examples:

If X₁ and X₂ are independent geometric random variables with probability of success p₁ and p₂ respectively, then min(X₁, X₂) is a geometric random variable with probability of success p = p₁ + p₂ − p₁ p₂. The relationship is simpler if expressed in terms of the probability of failure: q = q₁ q₂.
If X₁ and X₂ are independent exponential random variables with rate μ₁ and μ₂ respectively, then min(X₁, X₂) is an exponential random variable with rate μ = μ₁ + μ₂.
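
Both examples follow from the fact that the minimum exceeds a threshold exactly when every variable does, so the survival functions multiply. A brief sketch, assuming for the geometric case the convention that X_i counts trials up to and including the first success, so that P(X_i > k) = q_i^k:

```latex
\begin{aligned}
P\big(\min(X_1, X_2) > t\big) &= P(X_1 > t)\,P(X_2 > t)
  = e^{-\mu_1 t}\, e^{-\mu_2 t} = e^{-(\mu_1 + \mu_2)\,t}
  && \text{(exponential)} \\
P\big(\min(X_1, X_2) > k\big) &= q_1^{\,k}\, q_2^{\,k} = (q_1 q_2)^{k}
  && \text{(geometric)}
\end{aligned}
```

The second line gives q = q₁ q₂, and hence p = 1 − (1 − p₁)(1 − p₂) = p₁ + p₂ − p₁ p₂.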

Similarly, distributions for which the maximum value of several independent random variables is a member of the same family of distributions include: Bernoulli distribution, power law distribution.

Other

If X and Y are independent standard normal random variables, X/Y is a Cauchy (0,1) random variable.
If X₁ and X₂ are independent chi-squared random variables with ν₁ and ν₂ degrees of freedom respectively, then (X₁/ν₁)/(X₂/ν₂) is an F(ν₁, ν₂) random variable.
If X is a standard normal random variable and U is an independent chi-squared random variable with ν degrees of freedom, then $X/\sqrt{U/\nu}$ is a Student's t(ν) random variable.
If X₁ is a gamma (α₁, 1) random variable and X₂ is an independent gamma (α₂, 1) random variable then X₁/(X₁ + X₂) is a beta(α₁, α₂) random variable. More generally, if X₁ is a gamma(α₁, β₁) random variable and X₂ is an independent gamma(α₂, β₂) random variable then β₂ X₁/(β₂ X₁ + β₁ X₂) is a beta(α₁, α₂) random variable.
If X and Y are independent exponential random variables with mean μ, then X − Y is a double exponential random variable with mean 0 and scale μ.
If X_i are independent Bernoulli random variables then their parity (XOR) is a Bernoulli variable described by the piling-up lemma.

(See also ratio distribution.)
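
The parity statement in the list above can be checked by brute force. The sketch below assumes p_i denotes P(X_i = 1) and uses the form of the piling-up lemma P(X₁ ⊕ ... ⊕ X_n = 1) = 1/2 − (1/2) ∏ (1 − 2p_i); the function names and the probabilities in `ps` are illustrative choices.

```python
import itertools
import math

def xor_is_one_probability(ps):
    """Exact P(X_1 xor ... xor X_n = 1), enumerating all 2**n outcomes."""
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(ps)):
        if sum(bits) % 2 == 1:   # odd number of ones means the XOR equals 1
            total += math.prod(p if b else 1 - p for p, b in zip(ps, bits))
    return total

def piling_up(ps):
    """Closed form from the piling-up lemma: 1/2 - (1/2) * prod(1 - 2 p_i)."""
    return 0.5 - 0.5 * math.prod(1 - 2 * p for p in ps)

ps = [0.1, 0.35, 0.6, 0.8]       # P(X_i = 1) for four independent Bernoulli variables
brute_force = xor_is_one_probability(ps)
closed_form = piling_up(ps)
print(brute_force, closed_form)
```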

Approximate (limit) relationships

An approximate or limit relationship means

either that the combination of an infinite number of iid random variables tends to some distribution,
or that the distribution approaches a different distribution in the limit as a parameter tends to some value.

Combination of iid random variables:

Given certain conditions, the sum (hence the average) of a sufficiently large number of iid random variables, each with finite mean and variance, will be approximately normally distributed. This is the central limit theorem (CLT).
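
A small simulation can make the central limit theorem concrete. The sketch below standardizes sums of iid uniform (0, 1) variables and measures how often the result falls within ±1.96, which a standard normal variable does about 95% of the time. The choice of the uniform distribution, 30 terms per sum, the seed, and the number of trials are all assumptions made for illustration.

```python
import math
import random

rng = random.Random(3)
n_terms = 30                     # iid Uniform(0, 1) terms per sum
n_trials = 50_000
mean, var = 0.5, 1 / 12          # mean and variance of Uniform(0, 1)

def standardized_sum():
    """Sum n_terms uniforms, then center and scale to mean 0 and variance 1."""
    total = sum(rng.random() for _ in range(n_terms))
    return (total - n_terms * mean) / math.sqrt(n_terms * var)

zs = [standardized_sum() for _ in range(n_trials)]

# Fraction of standardized sums inside the central 95% band of N(0, 1).
coverage = sum(abs(z) <= 1.96 for z in zs) / n_trials
print(coverage)
```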

Special case of distribution parametrization:

X is a hypergeometric (m, N, n) random variable. If N and m are large compared to n, and p = m/N is not close to 0 or 1, then X approximately has a binomial(n, p) distribution.
X is a beta-binomial random variable with parameters (n, α, β). Let p = α/(α + β) and suppose α + β is large, then X approximately has a binomial(n, p) distribution.
If X is a binomial (n, p) random variable and if n is large and np is small then X approximately has a Poisson(np) distribution.
If X is a negative binomial random variable with r large, P near 1, and r(1 − P) = λ, then X approximately has a Poisson distribution with mean λ.
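
The Poisson approximation to the binomial can be examined by comparing the two probability mass functions directly. The following sketch assumes n = 1000 and p = 0.003, values chosen only so that n is large and np is small; it reports the largest pointwise difference over k = 0, ..., 20.

```python
import math

def binomial_pmf(k, n, p):
    """Binomial probability mass at k."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, mean):
    """Poisson probability mass at k."""
    return math.exp(-mean) * mean**k / math.factorial(k)

n, p = 1000, 0.003               # large n, small np (illustrative values)

max_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, n * p)) for k in range(21))
print(max_gap)
```

Repeating the comparison with a larger p for the same n should show the gap growing, which matches the condition that np be small.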

Consequences of the CLT:

If X is a Poisson random variable with large mean, then for integers j and k, P(j ≤ X ≤ k) approximately equals P(j − 1/2 ≤ Y ≤ k + 1/2) where Y is a normal random variable with the same mean and variance as X.
If X is a binomial(n, p) random variable with large np and n(1 − p), then for integers j and k, P(j ≤ X ≤ k) approximately equals P(j − 1/2 ≤ Y ≤ k + 1/2) where Y is a normal random variable with the same mean and variance as X, i.e. np and np(1 − p).
If X is a beta random variable with parameters α and β equal and large, then X approximately has a normal distribution with the same mean and variance, i.e. mean α/(α + β) and variance αβ/((α + β)²(α + β + 1)).
If X is a gamma(α, β) random variable and the shape parameter α is large relative to the scale parameter β, then X approximately has a normal distribution with the same mean and variance.
If X is a Student's t random variable with a large number of degrees of freedom ν then X approximately has a standard normal distribution.
If X is an F(ν, ω) random variable with ω large, then νX is approximately distributed as a chi-squared random variable with ν degrees of freedom.
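
The ±1/2 shift in the binomial and Poisson items above is the continuity correction. The sketch below compares an exact binomial probability with the normal approximation computed with and without that shift, assuming the illustrative values n = 100, p = 0.5, j = 45 and k = 55; the normal distribution function is built from `math.erf`.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal(mean, sd**2) variable."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def binomial_pmf(k, n, p):
    """Binomial probability mass at k."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 100, 0.5
j, k = 45, 55
mean, sd = n * p, math.sqrt(n * p * (1 - p))

exact = sum(binomial_pmf(i, n, p) for i in range(j, k + 1))
with_correction = normal_cdf(k + 0.5, mean, sd) - normal_cdf(j - 0.5, mean, sd)
without_correction = normal_cdf(k, mean, sd) - normal_cdf(j, mean, sd)

print(exact, with_correction, without_correction)
```

In this setting the corrected approximation should land much closer to the exact value than the uncorrected one.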

Compound (or Bayesian) relationships

When one or more parameter(s) of a distribution are random variables, the compound distribution is the marginal distribution of the variable.

Examples:

If X | N is a binomial (N,p) random variable, where parameter N is a random variable with negative-binomial (m, r) distribution, then X is distributed as a negative-binomial (m, r/(p + qr)), where q = 1 − p.
If X | N is a binomial (N,p) random variable, where parameter N is a random variable with Poisson(μ) distribution, then X is distributed as a Poisson (μp).
If X | μ is a Poisson(μ) random variable and parameter μ is random variable with gamma(m, θ) distribution (where θ is the scale parameter), then X is distributed as a negative-binomial (m, θ/(1 + θ)), sometimes called gamma-Poisson distribution.
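
A compound distribution is obtained by summing (or integrating) the conditional distribution against the distribution of the random parameter. The sketch below does this for the binomial-Poisson example, computing P(X = k) = Σ_N P(N) P(X = k | N) and comparing it with the Poisson (μp) mass function. The values μ = 4 and p = 0.3 and the truncation of the sum at N = 80 are assumptions made for illustration.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability mass at k."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def binomial_pmf(k, n, p):
    """Binomial probability mass at k."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

mu, p = 4.0, 0.3
n_max = 80                       # truncation of the sum over N (assumption)

def marginal_pmf(k):
    """P(X = k), marginalizing the binomial(N, p) law over N ~ Poisson(mu)."""
    return sum(
        poisson_pmf(n, mu) * binomial_pmf(k, n, p) for n in range(k, n_max + 1))

max_gap = max(abs(marginal_pmf(k) - poisson_pmf(k, mu * p)) for k in range(15))
print(max_gap)
```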

Some distributions have been specially named as compounds: beta-binomial distribution, gamma-normal distribution.

Examples:

If X is a Binomial(n,p) random variable, and parameter p is a random variable with beta(α, β) distribution, then X is distributed as a Beta-Binomial(α,β,n).
If X is a negative-binomial(m,p) random variable, and parameter p is a random variable with beta(α,β) distribution, then X is distributed as a Beta-Pascal(α,β,m).
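
For the beta-binomial case the marginal mass function has a standard closed form, P(X = k) = C(n, k) B(k + α, n − k + β) / B(α, β), where B is the beta function. That formula is not quoted in the text above and is supplied here as an assumption. The sketch evaluates it through `math.lgamma` and checks two things: that α = β = 1 gives the discrete uniform distribution on 0, ..., n, and that the masses sum to one for an arbitrary pair of shape parameters.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, n, alpha, beta):
    """Beta-binomial probability mass at k (standard closed form)."""
    log_ratio = log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)
    return math.comb(n, k) * math.exp(log_ratio)

n = 10
uniform_case = [beta_binomial_pmf(k, n, 1.0, 1.0) for k in range(n + 1)]
general_case = [beta_binomial_pmf(k, n, 2.5, 4.0) for k in range(n + 1)]

print(uniform_case)              # every entry should be close to 1 / (n + 1)
print(sum(general_case))         # should be close to 1
```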

References

[1] Leemis, Lawrence M.; McQueston, Jacquelyn T. (February 2008). "Univariate Distribution Relationships" (PDF). American Statistician. 62 (1): 45–53. doi:10.1198/000313008x270448.
[2] Swat, MJ; Grenon, P; Wimalaratne, S (2016). "ProbOnto: ontology and knowledge base of probability distributions". Bioinformatics. 32 (17): 2719–21. doi:10.1093/bioinformatics/btw170. PMC 5013898. PMID 27153608.
[3] Cook, John D. "Diagram of distribution relationships".
[4] Dinov, Ivo D.; Siegrist, Kyle; Pearl, Dennis; Kalinin, Alex; Christou, Nicolas (2015). "Probability Distributome: a web computational infrastructure for exploring the properties, interrelations, and applications of probability distributions". Computational Statistics. 594 (2): 249–271. doi:10.1007/s00180-015-0594-6. PMC 4856044. PMID 27158191.

External links

Interactive graphic: Univariate Distribution Relationships
ProbOnto - Ontology and knowledge base of probability distributions: ProbOnto
The Probability Distributome project includes calculators, simulators, experiments, and navigators for inter-distributional relations and distribution meta-data.

