The Dirichlet distribution is a multivariate continuous probability distribution often used to model the uncertainty about a vector of unknown probabilities.
The Dirichlet distribution is a multivariate generalization of the Beta distribution.
Denote by $p$ the probability of an event. If $p$ is unknown, we can treat it as a random variable and assign a Beta distribution to $p$.
If $p = (p_1, \ldots, p_K)$ is a vector of unknown probabilities of mutually exclusive events, we can treat $p$ as a random vector and assign a Dirichlet distribution to it.
The Dirichlet distribution is characterized as follows.
Definition Let $X$ be a $K$-dimensional continuous random vector. Let its support be
$$R_X = \left\{ x \in \mathbb{R}^K : \sum_{i=1}^K x_i \le 1 \text{ and } x_i \ge 0 \text{ for all } i = 1, \ldots, K \right\}.$$
Let $\alpha_1, \ldots, \alpha_{K+1}$ be strictly positive numbers. We say that $X$ has a Dirichlet distribution with parameters $\alpha_1, \ldots, \alpha_{K+1}$ if and only if its joint probability density function is
$$f_X(x_1, \ldots, x_K) = \begin{cases} c \displaystyle\prod_{i=1}^K x_i^{\alpha_i - 1} \left(1 - \sum_{i=1}^K x_i\right)^{\alpha_{K+1} - 1} & \text{if } (x_1, \ldots, x_K) \in R_X \\ 0 & \text{otherwise,} \end{cases}$$
where the normalizing constant is
$$c = \frac{\Gamma\left(\sum_{i=1}^{K+1} \alpha_i\right)}{\prod_{i=1}^{K+1} \Gamma(\alpha_i)}$$
and $\Gamma(\cdot)$ is the Gamma function.
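The density formula above is straightforward to check numerically. The following is a minimal sketch (assuming NumPy and SciPy are available, and using arbitrary illustrative parameter values) that implements the formula with log-Gamma functions and compares it with SciPy's `dirichlet.pdf`:

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import dirichlet

def dirichlet_pdf(x, alpha):
    """Joint density of the first K entries, as in the definition above.

    x     : array of length K with x_i >= 0 and sum(x) <= 1
    alpha : array of length K+1 with strictly positive entries
    """
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    x_full = np.append(x, 1.0 - x.sum())                 # append 1 - sum(x)
    log_c = gammaln(alpha.sum()) - gammaln(alpha).sum()  # log of the normalizing constant c
    return np.exp(log_c + np.sum((alpha - 1.0) * np.log(x_full)))

alpha = [2.0, 3.0, 4.0]   # K = 2 entries, K + 1 = 3 parameters (illustrative values)
x = [0.2, 0.5]

print(dirichlet_pdf(x, alpha))   # formula above
print(dirichlet.pdf(x, alpha))   # SciPy's implementation gives the same value
```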
In the above definition, the entries $x_1, \ldots, x_K$ of the vector $x$ are probabilities whose sum is less than or equal to 1:
$$\sum_{i=1}^K x_i \le 1.$$
If we want to have a vector of probabilities exactly summing up to 1, we can define an additional probability
$$x_{K+1} = 1 - \sum_{i=1}^K x_i \tag{1}$$
so that
$$\sum_{i=1}^{K+1} x_i = 1. \tag{2}$$
However, there is no way to rigorously define a probability density for the vector
$$(x_1, \ldots, x_K, x_{K+1})$$
because the constraint in equation (2) implies that the probability density should be zero everywhere on $\mathbb{R}^{K+1}$ except on a subset whose Lebesgue measure is equal to zero, and on the latter set the probability density should be infinite (something involving a Dirac delta function).
Therefore, the right way to deal with $K+1$ events whose probabilities sum up to 1 is to:
assign a Dirichlet density, as defined above, to the probabilities of $K$ of the events ($x_1, \ldots, x_K$);
define the probability of the $(K+1)$-th event as in equation (1) (a short numerical illustration is given after this list).
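As that illustration, here is a sketch (assuming NumPy, with illustrative parameter values): NumPy's Dirichlet sampler returns the full vector of $K+1$ probabilities summing to 1, of which the first $K$ entries are the ones receiving the Dirichlet density, and the last one coincides with the probability defined in equation (1):

```python
import numpy as np

rng = np.random.default_rng(0)

alpha = np.array([2.0, 3.0, 4.0])   # parameters alpha_1, ..., alpha_{K+1} (illustrative)
sample = rng.dirichlet(alpha)       # K + 1 probabilities summing exactly to 1

x = sample[:-1]             # the K entries to which the Dirichlet density is assigned
x_last = 1.0 - x.sum()      # the (K+1)-th probability, as in equation (1)

print(x, x_last)            # x_last equals sample[-1]
```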
We notice that several sources (including the Wikipedia page on the Dirichlet distribution) are not entirely clear about this point.
How do we come up with the above formula for the density of the Dirichlet distribution?
The next proposition provides some insights.
Proposition Let $Z_1, \ldots, Z_{K+1}$ be $K+1$ independent Gamma random variables having means $\alpha_1, \ldots, \alpha_{K+1}$ and degrees-of-freedom parameters $2\alpha_1, \ldots, 2\alpha_{K+1}$. Define
$$Z = \sum_{i=1}^{K+1} Z_i.$$
Then, the random vector
$$X = \left(\frac{Z_1}{Z}, \ldots, \frac{Z_K}{Z}\right)$$
has a Dirichlet distribution with parameters $\alpha_1, \ldots, \alpha_{K+1}$.
A Gamma random variable is supported on the set of positive real numbers. Moreover,
$$\frac{Z_i}{Z} > 0 \quad \text{for } i = 1, \ldots, K$$
and
$$\sum_{i=1}^{K} \frac{Z_i}{Z} = \frac{Z - Z_{K+1}}{Z} = 1 - \frac{Z_{K+1}}{Z} < 1.$$
Therefore, the support of $X$ coincides with that of a Dirichlet random vector. The probability density of a Gamma random variable $Z_i$ with mean parameter $\alpha_i$ and degrees-of-freedom parameter $2\alpha_i$ is
$$f_{Z_i}(z_i) = \frac{1}{\Gamma(\alpha_i)}\, z_i^{\alpha_i - 1} \exp(-z_i) \quad \text{for } z_i > 0.$$
Since the variables $Z_1, \ldots, Z_{K+1}$ are independent, their joint probability density is
$$f_{Z_1, \ldots, Z_{K+1}}(z_1, \ldots, z_{K+1}) = \prod_{i=1}^{K+1} \frac{1}{\Gamma(\alpha_i)} z_i^{\alpha_i - 1} \exp(-z_i) = \frac{1}{\prod_{i=1}^{K+1}\Gamma(\alpha_i)} \exp\left(-\sum_{i=1}^{K+1} z_i\right) \prod_{i=1}^{K+1} z_i^{\alpha_i - 1}.$$
Consider the one-to-one transformation
$$x_i = \frac{z_i}{z_1 + \cdots + z_{K+1}} \quad (i = 1, \ldots, K), \qquad y = z_1 + \cdots + z_{K+1},$$
whose inverse is
$$z_i = x_i\, y \quad (i = 1, \ldots, K), \qquad z_{K+1} = y \left(1 - \sum_{i=1}^K x_i\right).$$
The Jacobian matrix of the inverse transformation is
$$J = \begin{bmatrix} y & 0 & \cdots & 0 & x_1 \\ 0 & y & \cdots & 0 & x_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & y & x_K \\ -y & -y & \cdots & -y & 1 - \sum_{i=1}^K x_i \end{bmatrix}.$$
The determinant of the Jacobian is
$$\det(J) = y^K$$
because: 1) the determinant does not change if we add the first $K$ rows to the $(K+1)$-th row; 2) the determinant of a triangular matrix is equal to the product of its diagonal entries. The formula for the joint probability density of a one-to-one transformation gives us (on the support of $X$):
$$\begin{aligned} f_{X, Y}(x_1, \ldots, x_K, y) &= f_{Z_1, \ldots, Z_{K+1}}\!\left(x_1 y, \ldots, x_K y,\, y\Big(1 - \sum_{i=1}^K x_i\Big)\right) |\det(J)| \\ &= \frac{1}{\prod_{i=1}^{K+1}\Gamma(\alpha_i)} \exp(-y)\, \prod_{i=1}^K (x_i y)^{\alpha_i - 1} \left(y\Big(1 - \sum_{i=1}^K x_i\Big)\right)^{\alpha_{K+1}-1} y^K \\ &= \frac{1}{\prod_{i=1}^{K+1}\Gamma(\alpha_i)} \exp(-y)\, y^{\sum_{i=1}^{K+1}\alpha_i - 1} \prod_{i=1}^K x_i^{\alpha_i - 1} \left(1 - \sum_{i=1}^K x_i\right)^{\alpha_{K+1}-1}. \end{aligned}$$
By integrating out $y$, we obtain
$$\begin{aligned} f_X(x_1, \ldots, x_K) &= \int_0^{\infty} f_{X, Y}(x_1, \ldots, x_K, y)\, dy \\ &= \frac{1}{\prod_{i=1}^{K+1}\Gamma(\alpha_i)} \prod_{i=1}^K x_i^{\alpha_i - 1} \left(1 - \sum_{i=1}^K x_i\right)^{\alpha_{K+1}-1} \int_0^{\infty} \exp(-y)\, y^{\sum_{i=1}^{K+1}\alpha_i - 1}\, dy \\ &\overset{(A)}{=} \frac{\Gamma\left(\sum_{i=1}^{K+1}\alpha_i\right)}{\prod_{i=1}^{K+1}\Gamma(\alpha_i)} \prod_{i=1}^K x_i^{\alpha_i - 1} \left(1 - \sum_{i=1}^K x_i\right)^{\alpha_{K+1}-1}, \end{aligned}$$
where in step $(A)$ we have used the definition of the Gamma function. The latter expression is the density of the Dirichlet distribution with parameters $\alpha_1, \ldots, \alpha_{K+1}$.
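The construction in the proposition is easy to verify by simulation. Below is a sketch (assuming NumPy, with illustrative parameter values) that draws independent Gamma variables with shape $\alpha_i$ and unit scale (so that each has mean $\alpha_i$), normalizes them by their sum, and compares the resulting sample means with those of NumPy's built-in Dirichlet sampler and with the theoretical value $\alpha_i / \sum_j \alpha_j$:

```python
import numpy as np

rng = np.random.default_rng(42)
alpha = np.array([2.0, 3.0, 4.0])   # alpha_1, ..., alpha_{K+1} (illustrative values)
n = 200_000

# Independent Gamma draws: shape alpha_i, scale 1, so that E[Z_i] = alpha_i
Z = rng.gamma(shape=alpha, scale=1.0, size=(n, alpha.size))
X_from_gamma = Z / Z.sum(axis=1, keepdims=True)   # divide each Z_i by Z = sum of all Z_j

X_direct = rng.dirichlet(alpha, size=n)           # NumPy's Dirichlet sampler

print(X_from_gamma.mean(axis=0))   # close to alpha / alpha.sum()
print(X_direct.mean(axis=0))       # same
print(alpha / alpha.sum())
```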
The Beta distribution is a special case of the Dirichlet distribution.
If we set the dimension $K = 1$ in the definition above, the support becomes
$$R_X = [0, 1]$$
and the probability density function becomes
$$f_X(x) = \begin{cases} \dfrac{\Gamma(\alpha_1 + \alpha_2)}{\Gamma(\alpha_1)\Gamma(\alpha_2)}\, x^{\alpha_1 - 1}(1 - x)^{\alpha_2 - 1} & \text{if } x \in [0, 1] \\ 0 & \text{otherwise.} \end{cases}$$
By using the definition of the Beta function
$$B(\alpha_1, \alpha_2) = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{\Gamma(\alpha_1 + \alpha_2)},$$
we can re-write the density as
$$f_X(x) = \frac{1}{B(\alpha_1, \alpha_2)}\, x^{\alpha_1 - 1}(1 - x)^{\alpha_2 - 1} \quad \text{for } x \in [0, 1].$$
But this is the density of a Beta random variable with parameters $\alpha_1$ and $\alpha_2$.
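A quick numerical check of this special case (a sketch assuming SciPy, with illustrative values): the one-dimensional Dirichlet density with parameters $\alpha_1, \alpha_2$ coincides with the Beta density:

```python
from scipy.stats import beta, dirichlet

a1, a2 = 2.5, 4.0   # illustrative parameter values
x = 0.3

print(dirichlet.pdf([x], [a1, a2]))   # Dirichlet density with K = 1
print(beta.pdf(x, a1, a2))            # Beta density with the same parameters
```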
The following proposition is often used to prove interesting results about the Dirichlet distribution.
Proposition Let $X$ be a Dirichlet random vector with parameters $\alpha_1, \ldots, \alpha_{K+1}$. Let $J$ be any integer such that $1 \le J < K$. Then, the marginal distribution of the subvector
$$(X_1, \ldots, X_J)$$
is a Dirichlet distribution with parameters $\alpha_1, \ldots, \alpha_J, \sum_{i=J+1}^{K+1}\alpha_i$.
First of all, notice that if the proposition holds for $J = K - 1$, then we can use it recursively to show that it holds for all the other possible values of $J$. So, we assume $J = K - 1$. In order to derive the marginal distribution, we need to integrate $x_K$ out of the joint density of $X$:
$$f_{X_1, \ldots, X_{K-1}}(x_1, \ldots, x_{K-1}) = \int_{-\infty}^{\infty} f_X(x_1, \ldots, x_{K-1}, x_K)\, dx_K$$
where
$$f_X(x_1, \ldots, x_K) = c \prod_{i=1}^K x_i^{\alpha_i - 1}\left(1 - \sum_{i=1}^K x_i\right)^{\alpha_{K+1}-1} 1_{\{x_1 \ge 0,\, \ldots,\, x_K \ge 0\}}\, 1_{\left\{\sum_{i=1}^K x_i \le 1\right\}}$$
and we have used indicator functions to specify the support of $X$; for example, $1_{\left\{\sum_{i=1}^K x_i \le 1\right\}}$ is equal to 1 if $\sum_{i=1}^K x_i \le 1$ and to 0 otherwise. We can re-write the marginal density as
$$f_{X_1, \ldots, X_{K-1}}(x_1, \ldots, x_{K-1}) = c \prod_{i=1}^{K-1} x_i^{\alpha_i - 1}\, 1_{\{x_1 \ge 0,\, \ldots,\, x_{K-1} \ge 0\}}\, 1_{\left\{\sum_{i=1}^{K-1} x_i \le 1\right\}} \int_0^{1 - \sum_{i=1}^{K-1} x_i} x_K^{\alpha_K - 1}\left(1 - \sum_{i=1}^{K-1} x_i - x_K\right)^{\alpha_{K+1}-1} dx_K.$$
After defining
$$s = 1 - \sum_{i=1}^{K-1} x_i,$$
we can solve the integral as follows:
$$\begin{aligned} \int_0^{s} x_K^{\alpha_K - 1}(s - x_K)^{\alpha_{K+1}-1}\, dx_K &\overset{(A)}{=} \int_0^1 (ts)^{\alpha_K - 1}(s - ts)^{\alpha_{K+1}-1} s\, dt \\ &= s^{\alpha_K + \alpha_{K+1} - 1} \int_0^1 t^{\alpha_K - 1}(1 - t)^{\alpha_{K+1}-1}\, dt \\ &\overset{(B)}{=} s^{\alpha_K + \alpha_{K+1} - 1}\, B(\alpha_K, \alpha_{K+1}) \\ &\overset{(C)}{=} s^{\alpha_K + \alpha_{K+1} - 1}\, \frac{\Gamma(\alpha_K)\Gamma(\alpha_{K+1})}{\Gamma(\alpha_K + \alpha_{K+1})}, \end{aligned}$$
where: in step $(A)$ we made the change of variable $x_K = t s$; in step $(B)$ we used the integral representation of the Beta function; in step $(C)$ we used the relation between the Beta and Gamma functions. Thus, we have
$$f_{X_1, \ldots, X_{K-1}}(x_1, \ldots, x_{K-1}) = \frac{\Gamma\left(\sum_{i=1}^{K+1}\alpha_i\right)}{\prod_{i=1}^{K-1}\Gamma(\alpha_i)\, \Gamma(\alpha_K + \alpha_{K+1})} \prod_{i=1}^{K-1} x_i^{\alpha_i - 1}\left(1 - \sum_{i=1}^{K-1} x_i\right)^{\alpha_K + \alpha_{K+1} - 1}$$
on the support, which is the density of a $(K-1)$-dimensional Dirichlet distribution with parameters $\alpha_1, \ldots, \alpha_{K-1}, \alpha_K + \alpha_{K+1}$.
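The marginalization result can also be checked by simulation. A sketch (assuming NumPy, with illustrative parameter values): take the first $J$ entries of Dirichlet draws with parameters $\alpha_1, \ldots, \alpha_{K+1}$ and compare their sample means and variances with those implied by a Dirichlet distribution whose parameters are $\alpha_1, \ldots, \alpha_J$ and the sum of the remaining ones (the moment formulae used here are derived in the sections below):

```python
import numpy as np

rng = np.random.default_rng(7)
alpha = np.array([1.5, 2.0, 2.5, 3.0])   # K = 3 entries, K + 1 = 4 parameters (illustrative)
J = 2                                    # keep the first J entries

samples = rng.dirichlet(alpha, size=500_000)[:, :J]   # marginal subvector (X_1, ..., X_J)

# Parameters of the claimed marginal Dirichlet: alpha_1, ..., alpha_J plus the sum of the rest
alpha_marg = np.append(alpha[:J], alpha[J:].sum())
a0 = alpha_marg.sum()                    # equals alpha.sum()

print(samples.mean(axis=0), alpha_marg[:J] / a0)                      # sample vs theoretical means
print(samples.var(axis=0),
      alpha_marg[:J] * (a0 - alpha_marg[:J]) / (a0**2 * (a0 + 1)))    # sample vs theoretical variances
```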
A corollary of the previous two propositions follows.
Proposition Let $X$ be a Dirichlet random vector with parameters $\alpha_1, \ldots, \alpha_{K+1}$. Then, the marginal distribution of the $i$-th entry of $X$ (for $i = 1, \ldots, K$) is a Beta distribution with parameters $\alpha_i$ and $\alpha_0 - \alpha_i$, where
$$\alpha_0 = \sum_{j=1}^{K+1} \alpha_j.$$
The expected value of a Dirichlet random vector $X$ is
$$\mathrm{E}[X_i] = \frac{\alpha_i}{\alpha_0} \quad \text{for } i = 1, \ldots, K,$$
where $\alpha_0 = \sum_{j=1}^{K+1} \alpha_j$.
We know that the marginal distribution of each entry $X_i$ of $X$ is a Beta distribution with parameters $\alpha_i$ and $\alpha_0 - \alpha_i$. Therefore, we can use, for each entry, the formula for the expected value of a Beta random variable:
$$\mathrm{E}[X_i] = \frac{\alpha_i}{\alpha_i + (\alpha_0 - \alpha_i)} = \frac{\alpha_i}{\alpha_0}.$$
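The same argument can be mirrored numerically. A small sketch (assuming SciPy, with illustrative parameter values): the mean of the Beta marginal with parameters $\alpha_i$ and $\alpha_0 - \alpha_i$ reproduces $\alpha_i / \alpha_0$:

```python
import numpy as np
from scipy.stats import beta

alpha = np.array([2.0, 3.0, 4.0])   # alpha_1, ..., alpha_{K+1} (illustrative values)
a0 = alpha.sum()

for i, a_i in enumerate(alpha[:-1], start=1):
    # Mean of the Beta marginal of X_i versus the closed-form expression alpha_i / alpha_0
    print(i, beta.mean(a_i, a0 - a_i), a_i / a0)
```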
The cross-moments of a Dirichlet random vector $X$ are
$$\mathrm{E}\!\left[\prod_{i=1}^K X_i^{n_i}\right] = \frac{\Gamma(\alpha_0)}{\Gamma\!\left(\alpha_0 + \sum_{i=1}^K n_i\right)} \prod_{i=1}^K \frac{\Gamma(\alpha_i + n_i)}{\Gamma(\alpha_i)},$$
where $n_1, \ldots, n_K$ are non-negative integers and $\alpha_0 = \sum_{j=1}^{K+1} \alpha_j$.
The formula is derived as follows:
$$\begin{aligned} \mathrm{E}\!\left[\prod_{i=1}^K X_i^{n_i}\right] &= \int_{R_X} \left(\prod_{i=1}^K x_i^{n_i}\right) \frac{\Gamma(\alpha_0)}{\prod_{i=1}^{K+1}\Gamma(\alpha_i)} \prod_{i=1}^K x_i^{\alpha_i - 1}\left(1 - \sum_{i=1}^K x_i\right)^{\alpha_{K+1}-1} dx_1 \cdots dx_K \\ &= \frac{\Gamma(\alpha_0)}{\Gamma\!\left(\alpha_0 + \sum_{i=1}^K n_i\right)} \prod_{i=1}^K \frac{\Gamma(\alpha_i + n_i)}{\Gamma(\alpha_i)} \int_{R_X} \frac{\Gamma\!\left(\alpha_0 + \sum_{i=1}^K n_i\right)}{\prod_{i=1}^K \Gamma(\alpha_i + n_i)\, \Gamma(\alpha_{K+1})} \prod_{i=1}^K x_i^{\alpha_i + n_i - 1}\left(1 - \sum_{i=1}^K x_i\right)^{\alpha_{K+1}-1} dx_1 \cdots dx_K \\ &= \frac{\Gamma(\alpha_0)}{\Gamma\!\left(\alpha_0 + \sum_{i=1}^K n_i\right)} \prod_{i=1}^K \frac{\Gamma(\alpha_i + n_i)}{\Gamma(\alpha_i)}. \end{aligned}$$
In the last step we have used the fact that the expression inside the integral is the joint probability density of a Dirichlet distribution with parameters $\alpha_1 + n_1, \ldots, \alpha_K + n_K, \alpha_{K+1}$, and therefore it integrates to 1.
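The cross-moment formula can be coded directly with log-Gamma functions and compared with a Monte Carlo estimate. A sketch (assuming NumPy and SciPy, with illustrative parameter values and exponents):

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_cross_moment(alpha, n):
    """E[prod_i X_i^{n_i}] for the first K entries; alpha has length K+1, n has length K."""
    alpha = np.asarray(alpha, dtype=float)
    n = np.asarray(n, dtype=float)
    a0 = alpha.sum()
    log_m = (gammaln(a0) - gammaln(a0 + n.sum())
             + np.sum(gammaln(alpha[:-1] + n) - gammaln(alpha[:-1])))
    return np.exp(log_m)

rng = np.random.default_rng(3)
alpha = np.array([2.0, 3.0, 4.0])   # illustrative parameter values
n = np.array([1, 2])                # exponents n_1, n_2

samples = rng.dirichlet(alpha, size=500_000)[:, :-1]   # draws of (X_1, ..., X_K)
print(dirichlet_cross_moment(alpha, n))                # exact formula
print(np.mean(np.prod(samples ** n, axis=1)))          # Monte Carlo estimate
```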
The entries of the covariance matrix of a Dirichlet random vector $X$ are
$$\mathrm{Var}[X_i] = \frac{\alpha_i(\alpha_0 - \alpha_i)}{\alpha_0^2(\alpha_0 + 1)}, \qquad \mathrm{Cov}[X_i, X_j] = -\frac{\alpha_i \alpha_j}{\alpha_0^2(\alpha_0 + 1)} \quad (i \ne j),$$
where
$$\alpha_0 = \sum_{j=1}^{K+1} \alpha_j.$$
We can use the covariance formula
$$\mathrm{Cov}[X_i, X_j] = \mathrm{E}[X_i X_j] - \mathrm{E}[X_i]\,\mathrm{E}[X_j]$$
and its special case
$$\mathrm{Var}[X_i] = \mathrm{E}[X_i^2] - (\mathrm{E}[X_i])^2,$$
together with the formulae for the expected value and the cross-moments derived previously. When $i \ne j$, we have
$$\mathrm{E}[X_i X_j] = \frac{\Gamma(\alpha_0)}{\Gamma(\alpha_0 + 2)} \cdot \frac{\Gamma(\alpha_i + 1)}{\Gamma(\alpha_i)} \cdot \frac{\Gamma(\alpha_j + 1)}{\Gamma(\alpha_j)} = \frac{\alpha_i \alpha_j}{\alpha_0(\alpha_0 + 1)},$$
where we have used the property of the Gamma function
$$\Gamma(z + 1) = z\,\Gamma(z)$$
and we have defined
$$\alpha_0 = \sum_{j=1}^{K+1} \alpha_j.$$
Therefore, for $i \ne j$, we have
$$\mathrm{Cov}[X_i, X_j] = \frac{\alpha_i \alpha_j}{\alpha_0(\alpha_0 + 1)} - \frac{\alpha_i}{\alpha_0} \cdot \frac{\alpha_j}{\alpha_0} = \frac{\alpha_i \alpha_j\, \alpha_0 - \alpha_i \alpha_j(\alpha_0 + 1)}{\alpha_0^2(\alpha_0 + 1)} = -\frac{\alpha_i \alpha_j}{\alpha_0^2(\alpha_0 + 1)}.$$
When $i = j$, we have
$$\mathrm{E}[X_i^2] = \frac{\Gamma(\alpha_0)}{\Gamma(\alpha_0 + 2)} \cdot \frac{\Gamma(\alpha_i + 2)}{\Gamma(\alpha_i)} = \frac{\alpha_i(\alpha_i + 1)}{\alpha_0(\alpha_0 + 1)}$$
and
$$\mathrm{Var}[X_i] = \frac{\alpha_i(\alpha_i + 1)}{\alpha_0(\alpha_0 + 1)} - \frac{\alpha_i^2}{\alpha_0^2} = \frac{\alpha_i \alpha_0(\alpha_i + 1) - \alpha_i^2(\alpha_0 + 1)}{\alpha_0^2(\alpha_0 + 1)} = \frac{\alpha_i(\alpha_0 - \alpha_i)}{\alpha_0^2(\alpha_0 + 1)}.$$
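These expressions can be checked by simulation as well. A sketch (assuming NumPy, with illustrative parameter values) that builds the theoretical covariance matrix from the formulae above and compares it with the sample covariance matrix of Dirichlet draws:

```python
import numpy as np

rng = np.random.default_rng(11)
alpha = np.array([2.0, 3.0, 4.0])   # alpha_1, ..., alpha_{K+1} (illustrative values)
a0 = alpha.sum()
a = alpha[:-1]                      # parameters attached to the K modelled entries

# Theoretical covariance matrix: diagonal a_i (a0 - a_i), off-diagonal -a_i a_j,
# both divided by a0^2 (a0 + 1)
cov_theory = (np.diag(a * a0) - np.outer(a, a)) / (a0**2 * (a0 + 1))

samples = rng.dirichlet(alpha, size=500_000)[:, :-1]
cov_sample = np.cov(samples, rowvar=False)

print(np.round(cov_theory, 5))
print(np.round(cov_sample, 5))
```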
Please cite as:
Taboga, Marco (2021). "Dirichlet distribution", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/probability-distributions/Dirichlet-distribution.