In probability theory, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each of which ends in either success (with probability p) or failure (with probability q = 1 − p). A single success/failure experiment is also called a Bernoulli trial, so for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution.
In general, if the random variable X follows the binomial distribution with parameters n ∈ ℕ and p ∈ [0,1], we write X ~ B(n, p). The probability of getting exactly k successes in n trials is given by the probability mass function

$$\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

for k = 0, 1, 2, …, n,
where $\binom{n}{k}$ is the so-called binomial coefficient. A binomial coefficient is indexed by a pair of integers n ≥ k ≥ 0 and is written as

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$

It is the coefficient of the $x^k$ term in the polynomial expansion of the binomial power $(1 + x)^n$, and it is given by the formula shown above.
The binomial coefficients occur in many areas of mathematics, especially in the field of combinatorics. $\binom{n}{k}$ is often read aloud as “n choose k”, because there are $\binom{n}{k}$ ways to choose a subset of k elements, disregarding their order, from a set of n elements. The properties of binomial coefficients have led to extending the definition beyond the common case of integers n ≥ k ≥ 0.
The formula can be understood as follows: k successes occur with probability $p^k$ and n − k failures occur with probability $(1-p)^{n-k}$. However, the k successes can occur anywhere among the n trials; in other words, there are $\binom{n}{k}$ different ways of distributing k successes in a sequence of n trials.
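As a concrete illustration, the probability mass function above can be computed directly with Python's standard library — a minimal sketch, where `math.comb` supplies the binomial coefficient:

```python
# Binomial PMF computed directly from the formula above.
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The probabilities over all k sum to 1, as a distribution must.
n, p = 6, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(pmf[3])   # Pr(X = 3) = C(6,3) / 2^6 = 20/64 = 0.3125
print(sum(pmf)) # 1.0
```

Library functions such as `scipy.stats.binom.pmf` compute the same quantity; the hand-rolled version only serves to make the formula tangible.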

If X ~ B(n, p), that is, X is a binomially distributed random variable, n being the total number of experiments and p the probability of each experiment yielding a successful result, then the expected value of X is

$$\mathrm{E}[X] = np.$$

For example, if n = 100 and p = 1/4, then the average number of successful results will be 25.
Proof: We calculate the mean, μ, directly from its definition,

$$\mu = \mathrm{E}[X] = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k},$$

so, given the definition, the proof is as follows:

$$\mu = \sum_{k=1}^{n} k\,\frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k} = np \sum_{k=1}^{n} \frac{(n-1)!}{(k-1)!\,(n-k)!}\, p^{k-1} (1-p)^{n-k} = np \sum_{j=0}^{n-1} \binom{n-1}{j} p^{j} (1-p)^{(n-1)-j} = np\,\bigl(p + (1-p)\bigr)^{n-1} = np.$$

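The identity E[X] = np can also be checked numerically by summing k · Pr(X = k) over all k — a small sanity check reusing the PMF formula above, with the n = 100, p = 1/4 example:

```python
# Numerical check that the mean of B(n, p) equals n * p.
from math import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.25
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(mean)  # 25.0, up to floating-point rounding
```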
This is possible thanks to the binomial theorem. According to the theorem, it is possible to expand the polynomial $(x + y)^n$ into a sum involving terms of the form $a\,x^b y^c$, where the exponents b and c are nonnegative integers with b + c = n, and the coefficient a of each term is a specific positive integer depending on n and b. For example,

$$(x + y)^4 = x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4.$$
The coefficient a in the term $a\,x^b y^c$ is known as the binomial coefficient $\binom{n}{b}$ or $\binom{n}{c}$ (the two have the same value). These numbers also arise in combinatorics, where $\binom{n}{b}$ gives the number of different combinations of b elements that can be chosen from an n-element set.
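To see the connection concretely, the following sketch expands (1 + x)⁴ by repeated polynomial multiplication and compares the resulting coefficients against `math.comb` — the coefficients are exactly a row of Pascal's triangle:

```python
# Verify that the coefficients of (1 + x)^4 are the binomial coefficients.
from math import comb

def poly_mult(a, b):
    # Multiply two polynomials given as coefficient lists (lowest degree first).
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

coeffs = [1]
for _ in range(4):            # expand (1 + x)^4 step by step
    coeffs = poly_mult(coeffs, [1, 1])

print(coeffs)                          # [1, 4, 6, 4, 1]
print([comb(4, k) for k in range(5)])  # [1, 4, 6, 4, 1] — the same
```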
It is also possible to deduce the mean from the equation $X = X_1 + \cdots + X_n$, whereby all $X_i$ are Bernoulli distributed random variables with $\mathrm{E}[X_i] = p$. We get

$$\mathrm{E}[X] = \mathrm{E}[X_1] + \cdots + \mathrm{E}[X_n] = np.$$
As with the mean, it is easy to show how the binomial variance can be calculated. The variance is

$$\operatorname{Var}(X) = np(1-p).$$
Proof:
Let $X = X_1 + \cdots + X_n$, where all $X_i$ are independently Bernoulli distributed random variables. Since $\operatorname{Var}(X_i) = p(1-p)$, we get

$$\operatorname{Var}(X) = \operatorname{Var}(X_1) + \cdots + \operatorname{Var}(X_n) = np(1-p).$$
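Both results can be checked by simulation. The sketch below draws binomial samples exactly as in the proofs — as sums of n Bernoulli trials — and compares the empirical mean and variance with np and np(1 − p); the seed and parameters are arbitrary choices:

```python
# Monte Carlo check of the binomial mean and variance.
import random

random.seed(42)
n, p, trials = 20, 0.3, 100_000

# Each sample is a sum of n Bernoulli(p) trials, mirroring X = X_1 + ... + X_n.
samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(mean)  # close to n * p       = 6.0
print(var)   # close to n * p * (1-p) = 4.2
```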
Given the binomial distribution, there are several related distributions and derivations that are interesting to discuss:
- Sums of binomials
If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables with the same probability p, then X + Y is again a binomial variable; its distribution is Z = X + Y ~ B(n + m, p):

$$\Pr(Z = k) = \sum_{i=0}^{k} \binom{n}{i}\binom{m}{k-i}\, p^{k} (1-p)^{n+m-k} = \binom{n+m}{k} p^{k} (1-p)^{n+m-k}.$$
However, if X and Y do not have the same probability p, then the variance of the sum will be smaller than the variance of a binomial variable distributed as B(n + m, p̄), where p̄ denotes the average of the two success probabilities.
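The equal-p case can be verified numerically: convolving the PMFs of B(n, p) and B(m, p) reproduces the PMF of B(n + m, p). A small sketch with arbitrary parameters:

```python
# Check that the sum of two independent binomials with the same p is binomial.
from math import comb

def pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, m, p = 5, 7, 0.4

# Distribution of Z = X + Y by convolving the two PMFs.
conv = [sum(pmf(i, n, p) * pmf(k - i, m, p)
            for i in range(max(0, k - m), min(n, k) + 1))
        for k in range(n + m + 1)]

# Direct PMF of B(n + m, p).
direct = [pmf(k, n + m, p) for k in range(n + m + 1)]

print(max(abs(a - b) for a, b in zip(conv, direct)))  # ~0: the two agree
```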
- Bernoulli distribution
As we said before, the Bernoulli distribution is a special case of the binomial distribution, where n = 1. Symbolically, X ~ B(1, p) has the same meaning as X ~ B(p). Conversely, any binomial distribution, B(n, p), is the distribution of the sum of n Bernoulli trials, B(p), each with the same probability p.
- Poisson binomial distribution
The binomial distribution is a special case of the Poisson binomial distribution, or general binomial distribution, which is the distribution of a sum of n independent non-identical Bernoulli trials B(pi).
- Normal approximation

[Figure: binomial probability mass function and normal probability density function approximation for n = 6 and p = 0.5.]

If n is large enough, then the skew of the distribution is not too great. In this case a reasonable approximation to B(n, p) is given by the normal distribution

$$\mathcal{N}\bigl(np,\; np(1-p)\bigr),$$
and this basic approximation can be improved using a suitable continuity correction. The basic approximation generally improves as n increases (say, n ≥ 20) and is better when p is not near 0 or 1.
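The approximation and its continuity correction can be illustrated with the standard library alone; `math.erf` gives the normal CDF, and the parameters here are arbitrary:

```python
# Compare an exact binomial tail probability with its normal approximation.
from math import comb, erf, sqrt

n, p = 50, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

def binom_cdf(k):
    """Exact Pr(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x):
    """CDF of N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Continuity correction: Pr(X <= k) is approximated at k + 0.5, not k.
k = 28
print(binom_cdf(k))         # exact binomial probability
print(normal_cdf(k + 0.5))  # normal approximation, very close for n = 50
```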
Another interesting aspect of the binomial distribution is its limiting distributions:
- Poisson limit theorem: As n approaches ∞ and p approaches 0, then the Binomial(n, p) distribution approaches the Poisson distribution with expected value λ = np.
- De Moivre–Laplace theorem: As n approaches ∞ while p remains fixed, the distribution of

$$\frac{X - np}{\sqrt{np(1-p)}}$$

approaches the normal distribution with expected value 0 and variance 1. This result is sometimes loosely stated by saying that the distribution of X is asymptotically normal with expected value np and variance np(1 − p). It is a specific case of the central limit theorem, which we will talk about in the next piece of research.
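The Poisson limit above can be observed numerically: holding λ = np fixed while n grows, the binomial PMF converges to the Poisson PMF. A small sketch with λ = 3, an arbitrary choice:

```python
# Convergence of Binomial(n, lam/n) to Poisson(lam) as n grows.
from math import comb, exp, factorial

lam = 3.0

def poisson_pmf(k):
    return lam**k * exp(-lam) / factorial(k)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

for n in (10, 100, 1000):
    p = lam / n
    # Largest pointwise gap between the two PMFs over the bulk of the support.
    err = max(abs(binom_pmf(k, n, p) - poisson_pmf(k)) for k in range(15))
    print(n, err)  # the gap shrinks as n grows
```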
