The multinomial distribution is a multivariate discrete distribution that generalizes the binomial distribution.
If you perform \(n\) times a probabilistic experiment that can have only two outcomes, then the number of times you obtain one of the two outcomes is a binomial random variable.
If you perform \(n\) times an experiment that can have \(K\) outcomes (\(K\) can be any natural number) and you denote by \(X_i\) the number of times that you obtain the \(i\)-th outcome, then the random vector \(X\) defined as
\[ X = [X_1 \;\; X_2 \;\; \ldots \;\; X_K]^{\top} \]
is a multinomial random vector.
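The construction just described can be sketched as a small simulation: repeat an experiment a fixed number of times and count how often each outcome occurs. This is only an illustrative sketch; the helper name `draw_multinomial`, the seed and the outcome probabilities are assumptions made here, not part of the lecture.

```python
import random


def draw_multinomial(n, probs, rng):
    """Repeat an experiment n times and count how often each outcome occurs.

    probs[i] is the (assumed) probability of the i-th outcome.
    """
    counts = [0] * len(probs)
    # Each element of the sample is the index of the outcome of one trial.
    for outcome in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[outcome] += 1
    return counts


rng = random.Random(0)                # fixed seed, chosen arbitrarily
x = draw_multinomial(10, [0.5, 0.3, 0.2], rng)
print(x)                              # one realization of the count vector
```

Whatever the realization, the entries are non-negative integers that sum to the number of trials.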
A multinomial vector can be seen as a sum of mutually independent Multinoulli random vectors.
This connection between the multinomial and Multinoulli distributions will be illustrated in detail in the rest of this lecture and will be used to demonstrate several properties of the multinomial distribution.
For this reason, we highly recommend studying the Multinoulli distribution before reading the following sections.
Multinomial random vectors are characterized as follows.
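Definition. Let \(X\) be a \(K\times 1\) discrete random vector. Let \(n\in\mathbb{N}\). Let the support of \(X\) be the set of \(K\times 1\) vectors having non-negative integer entries summing up to \(n\):
\[ R_X = \left\{ x \in \{0,1,\ldots,n\}^K : \sum_{i=1}^{K} x_i = n \right\}. \]
Let \(p_1\), ..., \(p_K\) be \(K\) strictly positive numbers such that
\[ \sum_{i=1}^{K} p_i = 1. \]
We say that \(X\) has a multinomial distribution with probabilities \(p_1\), ..., \(p_K\) and number of trials \(n\), if its joint probability mass function is
\[ p_X(x_1,\ldots,x_K) = \begin{cases} \dbinom{n}{x_1,\ldots,x_K}\, p_1^{x_1} p_2^{x_2} \cdots p_K^{x_K} & \text{if } (x_1,\ldots,x_K)\in R_X \\ 0 & \text{otherwise} \end{cases} \]
where
\[ \binom{n}{x_1,\ldots,x_K} = \frac{n!}{x_1!\,x_2!\cdots x_K!} \]
is the multinomial coefficient.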
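The definition translates fairly directly into code. The following is a minimal sketch, assuming probabilities and a number of trials picked purely for illustration; the function names `multinomial_coefficient`, `multinomial_pmf` and `support` are hypothetical helpers introduced here.

```python
from itertools import product
from math import factorial, prod


def multinomial_coefficient(counts):
    """n! / (x_1! * ... * x_K!), where n is the sum of the counts."""
    result = factorial(sum(counts))
    for x in counts:
        result //= factorial(x)   # every intermediate quotient is an integer
    return result


def multinomial_pmf(counts, probs, n):
    """Joint pmf of the definition above; zero outside the support."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if any(x < 0 for x in counts) or sum(counts) != n:
        return 0.0
    return multinomial_coefficient(counts) * prod(
        p ** x for p, x in zip(probs, counts)
    )


def support(n, K):
    """All K-tuples of non-negative integers summing to n."""
    return [x for x in product(range(n + 1), repeat=K) if sum(x) == n]


probs = [0.5, 0.25, 0.25]   # illustrative values (assumption)
n = 4
total = sum(multinomial_pmf(x, probs, n) for x in support(n, len(probs)))
print(total)                # should be 1 up to rounding error
```

Summing the pmf over the whole support is a quick sanity check that the probabilities add up to one.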
The connection between the multinomial and the Multinoulli distribution is illustrated by the following propositions.
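Proposition. If a random vector \(X\) has a multinomial distribution with probabilities \(p_1\), ..., \(p_K\) and number of trials \(n=1\), then it has a Multinoulli distribution with probabilities \(p_1\), ..., \(p_K\).
The support of \(X\) is
\[ R_X = \left\{ x \in \{0,1\}^K : \sum_{j=1}^{K} x_j = 1 \right\} \]
and its joint probability mass function is
\[ p_X(x_1,\ldots,x_K) = \begin{cases} \dbinom{1}{x_1,\ldots,x_K}\, p_1^{x_1} \cdots p_K^{x_K} & \text{if } (x_1,\ldots,x_K)\in R_X \\ 0 & \text{otherwise.} \end{cases} \]
But
\[ \binom{1}{x_1,\ldots,x_K} = \frac{1!}{x_1!\,x_2!\cdots x_K!} = 1 \]
because, for each \(j\), either \(x_j=0\) or \(x_j=1\), and \(0!=1!=1\). As a consequence,
\[ p_X(x_1,\ldots,x_K) = \begin{cases} \prod_{j=1}^{K} p_j^{x_j} & \text{if } (x_1,\ldots,x_K)\in R_X \\ 0 & \text{otherwise} \end{cases} \]
which is the joint probability mass function of a Multinoulli distribution.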
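Proposition. A random vector \(X\) having a multinomial distribution with parameters \(p_1,\ldots,p_K\) and \(n\) can be written as
\[ X = Y_1 + Y_2 + \ldots + Y_n \]
where \(Y_1,\ldots,Y_n\) are \(n\) independent random vectors all having a Multinoulli distribution with parameters \(p_1,\ldots,p_K\).
The sum \(Y_1+\ldots+Y_n\) is equal to the vector \((x_1,\ldots,x_K)\) when, for each \(i=1,\ldots,K\), exactly \(x_i\) of the vectors \(Y_1,\ldots,Y_n\) have their \(i\)-th entry equal to 1. Provided \(x_i\geq 0\) for each \(i\) and \(\sum_{i=1}^{K} x_i = n\), there are several different realizations of the vector \((Y_1,\ldots,Y_n)\) satisfying these conditions. Since \(Y_1,\ldots,Y_n\) are independent Multinoulli variables, each of these realizations has probability
\[ p_1^{x_1} p_2^{x_2} \cdots p_K^{x_K} \]
(see also the proof of the previous proposition). Furthermore, the number of the realizations satisfying the above conditions is equal to the number of partitions of \(n\) objects into \(K\) groups having numerosities \(x_1,\ldots,x_K\) (see the lecture entitled Partitions), which in turn is equal to the multinomial coefficient
\[ \binom{n}{x_1,\ldots,x_K}. \]
Therefore,
\[ P\left(Y_1+\ldots+Y_n = (x_1,\ldots,x_K)\right) = \binom{n}{x_1,\ldots,x_K}\, p_1^{x_1} p_2^{x_2} \cdots p_K^{x_K} \]
which proves that \(X\) and \(Y_1+\ldots+Y_n\) have the same distribution.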
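The counting argument in this proof can be checked by brute force for a small case. The sketch below enumerates every possible sequence of independent Multinoulli outcomes, accumulates the probability of each resulting count vector, and compares it with the multinomial pmf. The probabilities and the number of trials are assumptions chosen for illustration, and exact fractions are used so that the comparison involves no rounding.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import factorial

probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # assumed values
n = 3
K = len(probs)

# A Multinoulli draw is encoded by the index of its entry equal to 1.
dist_of_sum = defaultdict(Fraction)
for outcomes in product(range(K), repeat=n):
    prob = Fraction(1)
    counts = [0] * K
    for k in outcomes:
        prob *= probs[k]      # independence: probabilities multiply
        counts[k] += 1        # adding the draw to the running sum
    dist_of_sum[tuple(counts)] += prob


def multinomial_pmf(counts, probs):
    """Multinomial coefficient times the product of p_i ** x_i."""
    coeff = factorial(sum(counts))
    for x in counts:
        coeff //= factorial(x)
    value = Fraction(coeff)
    for p, x in zip(probs, counts):
        value *= p ** x
    return value


matches = all(multinomial_pmf(x, probs) == q for x, q in dist_of_sum.items())
print(matches)
```

The enumeration grows like the number of outcomes raised to the number of trials, so this check is only practical for very small cases.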
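The expected value of a multinomial random vector \(X\) is
\[ \mathrm{E}[X] = np \]
where the \(K\times 1\) vector \(p\) is defined as follows:
\[ p = [p_1 \;\; p_2 \;\; \ldots \;\; p_K]^{\top}. \]
Using the fact that \(X\) can be written as a sum of \(n\) Multinoulli variables \(Y_1,\ldots,Y_n\) with parameters \(p_1,\ldots,p_K\), we obtain
\[ \mathrm{E}[X] = \mathrm{E}\left[\sum_{j=1}^{n} Y_j\right] = \sum_{j=1}^{n} \mathrm{E}[Y_j] = \sum_{j=1}^{n} p = np \]
where \(p\) is the expected value of a Multinoulli random variable.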
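The covariance matrix of a multinomial random vector \(X\) is
\[ \mathrm{Var}[X] = n\Sigma \]
where \(\Sigma\) is a \(K\times K\) matrix whose generic entry is
\[ \Sigma_{ij} = \begin{cases} p_i(1-p_i) & \text{if } i=j \\ -p_i p_j & \text{if } i\neq j. \end{cases} \]
Since \(X\) can be represented as a sum of \(n\) independent Multinoulli random variables \(Y_1,\ldots,Y_n\) with parameters \(p_1,\ldots,p_K\), we obtain
\[ \mathrm{Var}[X] = \mathrm{Var}\left[\sum_{j=1}^{n} Y_j\right] = \sum_{j=1}^{n} \mathrm{Var}[Y_j] = \sum_{j=1}^{n} \Sigma = n\Sigma. \]
The second equality holds because the summands are mutually independent, and the third because \(\Sigma\) is the covariance matrix of a Multinoulli random vector.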
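As a sanity check on the formulas for the expected value and the covariance matrix, the following sketch compares them with moments computed directly from the joint pmf by enumerating the support. The parameter values and the function names `moments_from_formulas` and `moments_by_enumeration` are assumptions introduced for this example.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def moments_from_formulas(probs, n):
    """Mean n*p and covariance n*Sigma from the closed-form expressions."""
    K = len(probs)
    mean = [n * p for p in probs]
    cov = [
        [
            n * (probs[i] * (1 - probs[i]) if i == j else -probs[i] * probs[j])
            for j in range(K)
        ]
        for i in range(K)
    ]
    return mean, cov


def moments_by_enumeration(probs, n):
    """Mean and covariance computed directly from the joint pmf."""
    K = len(probs)
    pmf = {}
    for x in product(range(n + 1), repeat=K):
        if sum(x) != n:
            continue                      # outside the support
        coeff = factorial(n)
        for xi in x:
            coeff //= factorial(xi)
        value = Fraction(coeff)
        for p, xi in zip(probs, x):
            value *= p ** xi
        pmf[x] = value
    mean = [sum(q * x[i] for x, q in pmf.items()) for i in range(K)]
    cov = [
        [
            sum(q * (x[i] - mean[i]) * (x[j] - mean[j]) for x, q in pmf.items())
            for j in range(K)
        ]
        for i in range(K)
    ]
    return mean, cov


probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # assumed values
n = 4
agree = moments_from_formulas(probs, n) == moments_by_enumeration(probs, n)
print(agree)
```

Exact fractions are used so that the two results can be compared with `==` rather than with a tolerance.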
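The joint moment generating function of a multinomial random vector \(X\) is defined for any \(t\in\mathbb{R}^K\):
\[ M_X(t) = \left(\sum_{i=1}^{K} p_i \exp(t_i)\right)^{n}. \]
Since \(X\) can be written as a sum of \(n\) independent Multinoulli random vectors \(Y_1,\ldots,Y_n\) with parameters \(p_1,\ldots,p_K\), the joint moment generating function of \(X\) is derived from that of the summands:
\[ M_X(t) = \prod_{j=1}^{n} M_{Y_j}(t) = \left(\sum_{i=1}^{K} p_i \exp(t_i)\right)^{n} \]
where \(M_{Y_j}(t) = \sum_{i=1}^{K} p_i \exp(t_i)\) is the joint moment generating function of a Multinoulli random vector.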
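The closed form of the joint moment generating function can be compared numerically with its definition as the expected value of the exponential of the inner product between the argument and the random vector. The sketch below does this for one arbitrarily chosen argument; the parameter values and function names are assumptions made for illustration.

```python
from itertools import product
from math import exp, factorial, isclose


def mgf_closed_form(t, probs, n):
    """(sum_i p_i * exp(t_i)) ** n."""
    return sum(p * exp(ti) for p, ti in zip(probs, t)) ** n


def mgf_by_enumeration(t, probs, n):
    """Expected value of exp(t . X), summing over the support of X."""
    K = len(probs)
    total = 0.0
    for x in product(range(n + 1), repeat=K):
        if sum(x) != n:
            continue                      # outside the support
        coeff = factorial(n)
        for xi in x:
            coeff //= factorial(xi)
        pmf = float(coeff)
        for p, xi in zip(probs, x):
            pmf *= p ** xi
        total += pmf * exp(sum(ti * xi for ti, xi in zip(t, x)))
    return total


probs = [0.5, 0.3, 0.2]      # assumed values
n = 5
t = [0.1, -0.4, 0.25]        # arbitrary evaluation point
print(isclose(mgf_closed_form(t, probs, n), mgf_by_enumeration(t, probs, n)))
```

The two values agree only up to floating-point rounding, which is why the comparison uses `isclose`.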
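The joint characteristic function of \(X\) is
\[ \varphi_X(t) = \left(\sum_{k=1}^{K} p_k \exp(\mathrm{i}\,t_k)\right)^{n} \]
where \(\mathrm{i}\) is the imaginary unit.
The derivation is similar to the derivation of the joint moment generating function (see above):
\[ \varphi_X(t) = \prod_{j=1}^{n} \varphi_{Y_j}(t) = \left(\sum_{k=1}^{K} p_k \exp(\mathrm{i}\,t_k)\right)^{n} \]
where \(Y_1,\ldots,Y_n\) are the \(n\) independent Multinoulli random vectors whose sum is \(X\).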
Below you can find some exercises with explained solutions.
A shop selling two items, labeled A and B, needs to construct a probabilistic model of the sales that will be generated by its next 10 customers. Each time a customer arrives, only three outcomes are possible: 1) nothing is sold; 2) one unit of item A is sold; 3) one unit of item B is sold. It has been estimated that the probabilities of these three outcomes are 0.50, 0.25 and 0.25 respectively. Furthermore, the shopping behavior of a customer is independent of the shopping behavior of all other customers. Denote by \(X\) a \(3\times 1\) vector whose entries \(X_1\), \(X_2\) and \(X_3\) are equal to the number of times each of the three outcomes occurs. Derive the expected value and the covariance matrix of \(X\).
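The vector \(X\) has a multinomial distribution with parameters
\[ p = \left[\tfrac{1}{2} \;\; \tfrac{1}{4} \;\; \tfrac{1}{4}\right]^{\top} \]
and \(n=10\). Therefore, its expected value is
\[ \mathrm{E}[X] = np = 10\left[\tfrac{1}{2} \;\; \tfrac{1}{4} \;\; \tfrac{1}{4}\right]^{\top} = \left[5 \;\; \tfrac{5}{2} \;\; \tfrac{5}{2}\right]^{\top} \]
and its covariance matrix is
\[ \mathrm{Var}[X] = n\Sigma = 10\begin{bmatrix} \tfrac{1}{4} & -\tfrac{1}{8} & -\tfrac{1}{8} \\ -\tfrac{1}{8} & \tfrac{3}{16} & -\tfrac{1}{16} \\ -\tfrac{1}{8} & -\tfrac{1}{16} & \tfrac{3}{16} \end{bmatrix} = \begin{bmatrix} \tfrac{5}{2} & -\tfrac{5}{4} & -\tfrac{5}{4} \\ -\tfrac{5}{4} & \tfrac{15}{8} & -\tfrac{5}{8} \\ -\tfrac{5}{4} & -\tfrac{5}{8} & \tfrac{15}{8} \end{bmatrix}. \]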
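The arithmetic of this solution can be reproduced with a few lines of code. This is a minimal sketch that plugs the exercise's probabilities and number of customers into the formulas for the expected value and the covariance matrix; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Data of the exercise: outcome probabilities and number of customers.
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
n = 10
K = len(p)

# Expected value: n * p.
expected_value = [n * pi for pi in p]

# Covariance matrix: n * Sigma, with Sigma_ii = p_i (1 - p_i)
# and Sigma_ij = -p_i p_j for i != j.
covariance = [
    [n * (p[i] * (1 - p[i]) if i == j else -p[i] * p[j]) for j in range(K)]
    for i in range(K)
]

print([str(v) for v in expected_value])
for row in covariance:
    print([str(v) for v in row])
```

Each row of the covariance matrix sums to zero, which reflects the fact that the three counts always add up to the number of customers.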
Given the assumptions made in the previous exercise, suppose that item A costs $1,000 and item B costs $2,000. Derive the expected value and the variance of the total revenue generated by the 10 customers.
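The total revenue \(R\) can be written as a linear transformation of the vector \(X\):
\[ R = AX \]
where
\[ A = [0 \;\; 1000 \;\; 2000]. \]
By the linearity of the expected value operator, we obtain
\[ \mathrm{E}[R] = A\,\mathrm{E}[X] = 0\cdot 5 + 1000\cdot\tfrac{5}{2} + 2000\cdot\tfrac{5}{2} = 7500. \]
By using the formula for the covariance matrix of a linear transformation, we obtain
\[ \mathrm{Var}[R] = A\,\mathrm{Var}[X]\,A^{\top} = [0 \;\; 1000 \;\; 2000]\begin{bmatrix} \tfrac{5}{2} & -\tfrac{5}{4} & -\tfrac{5}{4} \\ -\tfrac{5}{4} & \tfrac{15}{8} & -\tfrac{5}{8} \\ -\tfrac{5}{4} & -\tfrac{5}{8} & \tfrac{15}{8} \end{bmatrix}\begin{bmatrix} 0 \\ 1000 \\ 2000 \end{bmatrix} = [0 \;\; 1000 \;\; 2000]\begin{bmatrix} -3750 \\ 625 \\ 3125 \end{bmatrix} = 6875000. \]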
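The revenue calculation can be checked in the same way. The sketch below rebuilds the expected value and covariance matrix of the sales vector from the data of the previous exercise and then applies the linear transformation; the variable names are assumptions made for this example.

```python
from fractions import Fraction

# Data carried over from the previous exercise.
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
n = 10
K = len(p)

# Revenue per outcome: nothing sold, item A sold, item B sold.
A = [0, 1000, 2000]

mean_X = [n * pi for pi in p]
cov_X = [
    [n * (p[i] * (1 - p[i]) if i == j else -p[i] * p[j]) for j in range(K)]
    for i in range(K)
]

# E[R] = A E[X]
expected_revenue = sum(a * m for a, m in zip(A, mean_X))

# Var[R] = A Var[X] A^T, computed as a double sum over the matrix entries.
variance_revenue = sum(
    A[i] * cov_X[i][j] * A[j] for i in range(K) for j in range(K)
)

print(expected_revenue, variance_revenue)
```

Because the revenue is a scalar, the product of the three matrices reduces to a single number, the variance of the total revenue.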
Please cite as:
Taboga, Marco (2021). "Multinomial distribution", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/probability-distributions/multinomial-distribution.