In probability theory, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each of which ends in either success (with probability p) or failure (with probability q = 1 − p). A single success/failure experiment is also called a Bernoulli trial, so for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution.
In general, if the random variable X follows the binomial distribution with parameters n ∈ ℕ and p ∈ [0,1], we write X ~ B(n, p). The probability of getting exactly k successes in n trials is given by the probability mass function

$$\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

for k = 0, 1, 2, …, n,
where $\binom{n}{k}$ is the so-called binomial coefficient. A binomial coefficient is indexed by a pair of integers n ≥ k ≥ 0 and is written as

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$

It is the coefficient of the $x^k$ term in the polynomial expansion of the binomial power $(1 + x)^n$, and it is given by the formula shown above.
The binomial coefficients occur in many areas of mathematics, especially in the field of combinatorics. $\binom{n}{k}$ is often read aloud as “n choose k”, because there are $\binom{n}{k}$ ways to choose a subset of k elements, disregarding their order, from a set of n elements. The properties of binomial coefficients have led to extending the definition beyond the common case of integers n ≥ k ≥ 0.
The formula can be understood as follows: k successes occur with probability $p^k$ and n − k failures occur with probability $(1-p)^{n-k}$. However, the k successes can occur anywhere among the n trials; in other words, there are $\binom{n}{k}$ different ways of distributing k successes in a sequence of n trials.
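As a concrete illustration, the probability mass function above can be computed directly with Python's standard library — a minimal sketch, where `math.comb` supplies the binomial coefficient:

```python
# Binomial PMF computed directly from the formula above.
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The probabilities over all k sum to 1, as a distribution must.
n, p = 6, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(pmf[3])   # Pr(X = 3) = C(6,3) / 2^6 = 20/64 = 0.3125
print(sum(pmf)) # 1.0
```

Library functions such as `scipy.stats.binom.pmf` compute the same quantity; the hand-rolled version only serves to make the formula tangible.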

If X ~ B(n, p), that is, X is a binomially distributed random variable, n being the total number of experiments and p the probability of each experiment yielding a successful result, then the expected value of X is

$$\mathrm{E}[X] = np.$$

For example, if n = 100 and p = 1/4, then the average number of successful results will be 25.
Proof: We calculate the mean, μ, directly from its definition,

$$\mu = \mathrm{E}[X] = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k},$$

so, given the definition, the proof is as follows:

$$\mu = \sum_{k=1}^{n} k\,\frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k} = np \sum_{k=1}^{n} \frac{(n-1)!}{(k-1)!\,(n-k)!}\, p^{k-1} (1-p)^{n-k} = np \sum_{j=0}^{n-1} \binom{n-1}{j} p^{j} (1-p)^{(n-1)-j} = np\,\bigl(p + (1-p)\bigr)^{n-1} = np.$$

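The identity E[X] = np can also be checked numerically by summing k · Pr(X = k) over all k — a small sanity check reusing the PMF formula above, with the n = 100, p = 1/4 example:

```python
# Numerical check that the mean of B(n, p) equals n * p.
from math import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.25
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(mean)  # 25.0, up to floating-point rounding
```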
This is possible thanks to the binomial theorem. According to the theorem, it is possible to expand the polynomial $(x + y)^n$ into a sum involving terms of the form $a\,x^b y^c$, where the exponents b and c are nonnegative integers with b + c = n, and the coefficient a of each term is a specific positive integer depending on n and b. For example,

$$(x + y)^4 = x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4.$$
The coefficient a in the term $a\,x^b y^c$ is known as the binomial coefficient $\binom{n}{b}$ or $\binom{n}{c}$ (the two have the same value). These numbers also arise in combinatorics, where $\binom{n}{b}$ gives the number of different combinations of b elements that can be chosen from an n-element set.
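To see the connection concretely, the following sketch expands (1 + x)⁴ by repeated polynomial multiplication and compares the resulting coefficients against `math.comb` — the coefficients are exactly a row of Pascal's triangle:

```python
# Verify that the coefficients of (1 + x)^4 are the binomial coefficients.
from math import comb

def poly_mult(a, b):
    # Multiply two polynomials given as coefficient lists (lowest degree first).
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

coeffs = [1]
for _ in range(4):            # expand (1 + x)^4 step by step
    coeffs = poly_mult(coeffs, [1, 1])

print(coeffs)                          # [1, 4, 6, 4, 1]
print([comb(4, k) for k in range(5)])  # [1, 4, 6, 4, 1] — the same
```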
It is also possible to deduce the mean from the equation $X = X_1 + \cdots + X_n$, whereby all $X_i$ are Bernoulli distributed random variables with $\mathrm{E}[X_i] = p$. We get

$$\mathrm{E}[X] = \mathrm{E}[X_1] + \cdots + \mathrm{E}[X_n] = np.$$
As with the mean, it is easy to show how the binomial variance can be calculated. The variance is

$$\operatorname{Var}(X) = np(1-p).$$
Proof:
Let $X = X_1 + \cdots + X_n$, where all $X_i$ are independently Bernoulli distributed random variables. Since $\operatorname{Var}(X_i) = p(1-p)$, we get

$$\operatorname{Var}(X) = \operatorname{Var}(X_1) + \cdots + \operatorname{Var}(X_n) = np(1-p).$$
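Both results can be checked by simulation. The sketch below draws binomial samples exactly as in the proofs — as sums of n Bernoulli trials — and compares the empirical mean and variance with np and np(1 − p); the seed and parameters are arbitrary choices:

```python
# Monte Carlo check of the binomial mean and variance.
import random

random.seed(42)
n, p, trials = 20, 0.3, 100_000

# Each sample is a sum of n Bernoulli(p) trials, mirroring X = X_1 + ... + X_n.
samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(mean)  # close to n * p       = 6.0
print(var)   # close to n * p * (1-p) = 4.2
```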
Given the binomial distribution, there are several related distributions and derivations that are interesting to discuss:
- Sums of binomials
If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables with the same probability p, then X + Y is again a binomial variable; its distribution is Z = X + Y ~ B(n + m, p):

$$\Pr(Z = k) = \sum_{i=0}^{k} \binom{n}{i}\binom{m}{k-i}\, p^{k} (1-p)^{n+m-k} = \binom{n+m}{k} p^{k} (1-p)^{n+m-k}.$$
However, if X and Y do not have the same probability p, then the variance of the sum will be smaller than the variance of a binomial variable distributed as B(n + m, p̄), where p̄ denotes the average of the two success probabilities.
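The equal-p case can be verified numerically: convolving the PMFs of B(n, p) and B(m, p) reproduces the PMF of B(n + m, p). A small sketch with arbitrary parameters:

```python
# Check that the sum of two independent binomials with the same p is binomial.
from math import comb

def pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, m, p = 5, 7, 0.4

# Distribution of Z = X + Y by convolving the two PMFs.
conv = [sum(pmf(i, n, p) * pmf(k - i, m, p)
            for i in range(max(0, k - m), min(n, k) + 1))
        for k in range(n + m + 1)]

# Direct PMF of B(n + m, p).
direct = [pmf(k, n + m, p) for k in range(n + m + 1)]

print(max(abs(a - b) for a, b in zip(conv, direct)))  # ~0: the two agree
```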
- Bernoulli distribution
As we said before, the Bernoulli distribution is a special case of the binomial distribution, where n = 1. Symbolically, X ~ B(1, p) has the same meaning as X ~ B(p). Conversely, any binomial distribution, B(n, p), is the distribution of the sum of n Bernoulli trials, B(p), each with the same probability p.
- Poisson binomial distribution
The binomial distribution is a special case of the Poisson binomial distribution, or general binomial distribution, which is the distribution of a sum of n independent non-identical Bernoulli trials B(pi).
- Normal approximation

[Figure: binomial probability mass function and normal probability density function approximation for n = 6 and p = 0.5.]

If n is large enough, then the skew of the distribution is not too great. In this case a reasonable approximation to B(n, p) is given by the normal distribution

$$\mathcal{N}\bigl(np,\; np(1-p)\bigr),$$
and this basic approximation can be improved using a suitable continuity correction. The basic approximation generally improves as n increases (say, n ≥ 20) and is better when p is not near 0 or 1.
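The approximation and its continuity correction can be illustrated with the standard library alone; `math.erf` gives the normal CDF, and the parameters here are arbitrary:

```python
# Compare an exact binomial tail probability with its normal approximation.
from math import comb, erf, sqrt

n, p = 50, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

def binom_cdf(k):
    """Exact Pr(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x):
    """CDF of N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Continuity correction: Pr(X <= k) is approximated at k + 0.5, not k.
k = 28
print(binom_cdf(k))         # exact binomial probability
print(normal_cdf(k + 0.5))  # normal approximation, very close for n = 50
```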
Another interesting aspect of the binomial distribution is its limiting distributions:
- Poisson limit theorem: As n approaches ∞ and p approaches 0, then the Binomial(n, p) distribution approaches the Poisson distribution with expected value λ = np.
- De Moivre–Laplace theorem: As n approaches ∞ while p remains fixed, the distribution of

$$\frac{X - np}{\sqrt{np(1-p)}}$$

approaches the normal distribution with expected value 0 and variance 1. This result is sometimes loosely stated by saying that the distribution of X is asymptotically normal with expected value np and variance np(1 − p). It is a specific case of the central limit theorem, which we will talk about in the next piece of research.
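The Poisson limit above can be observed numerically: holding λ = np fixed while n grows, the binomial PMF converges to the Poisson PMF. A small sketch with λ = 3, an arbitrary choice:

```python
# Convergence of Binomial(n, lam/n) to Poisson(lam) as n grows.
from math import comb, exp, factorial

lam = 3.0

def poisson_pmf(k):
    return lam**k * exp(-lam) / factorial(k)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

for n in (10, 100, 1000):
    p = lam / n
    # Largest pointwise gap between the two PMFs over the bulk of the support.
    err = max(abs(binom_pmf(k, n, p) - poisson_pmf(k)) for k in range(15))
    print(n, err)  # the gap shrinks as n grows
```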
