Discrete random variables

STAT1003 – Statistical Techniques

Dr. Emi Tanaka

Australian National University

These slides are best viewed on a modern browser like Google Chrome on a desktop or laptop. Some interactive components may require some time to fully load.

Probability mass function

  • Suppose that \(X\) is a discrete random variable with \(k\) distinct possible values: \(x_1, x_2, \ldots, x_k\).
  • The pmf, denoted as \(p_X(x) = P(X = x)\), gives the probability that the random variable \(X\) is exactly equal to some value \(x\).

Properties of a pmf:

  • \(0 \leq p_X(x) \leq 1\) for all \(x\)
  • \(\sum_{i = 1}^k p_X(x_i) = 1\)

If \(X\) is the number of heads in 2 tosses of a fair coin, then the pmf is given by:

\(x\) \(p_X(x)\)
\(0\) \(0.25\)
\(1\) \(0.50\)
\(2\) \(0.25\)

Note: this is similar to a relative frequency table, but the probabilities are theoretical values based on the assumption of a fair coin, rather than empirical estimates from data.
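This pmf can be sketched in Python as a plain dictionary (an illustrative representation, not part of the slides), with its two defining properties checked directly:

```python
# pmf of X = number of heads in 2 tosses of a fair coin
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# Property 1: every probability lies in [0, 1]
assert all(0 <= p <= 1 for p in pmf.values())

# Property 2: the probabilities sum to 1 (tolerance for floating point)
assert abs(sum(pmf.values()) - 1) < 1e-12
```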

Expected value of a discrete random variable

The expected value (or mean) of a discrete random variable \(X\) is the long-run average value of repetitions of the experiment:

\[E(X) = \mu = \sum_{i=1}^k x_i \, p_X(x_i).\]

The expected value of \(X\), the number of heads in 2 tosses of a fair coin, is:

\[E(X) = 0 \times 0.25 + 1 \times 0.50 + 2 \times 0.25 = 1.\]
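The same calculation as a short Python sketch (the dictionary representation is illustrative):

```python
# pmf of X = number of heads in 2 tosses of a fair coin
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# E(X) = sum over x of x * p_X(x)
expected = sum(x * p for x, p in pmf.items())
print(expected)  # 1.0
```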

Variance of a discrete random variable

The variance measures how spread out the values of \(X\) are around the mean:

\[\text{Var}(X) = \sigma^2 = \sum_{i=1}^k (x_i - \mu)^2 \,p_X(x_i)\]

The variance of \(X\), the number of heads in 2 tosses of a fair coin, is:

\[\begin{align*} \text{Var}(X) &= (0 - 1)^2 \times 0.25 +\\&\qquad (1 - 1)^2 \times 0.50 +\\&\qquad (2 - 1)^2 \times 0.25\\ &= 0.5. \end{align*}\]
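A Python sketch of this variance calculation, reusing the coin-toss pmf:

```python
# pmf of X = number of heads in 2 tosses of a fair coin
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# mean: E(X) = sum of x * p_X(x)
mu = sum(x * p for x, p in pmf.items())

# variance: sum of (x - mu)^2 * p_X(x)
var = sum((x - mu) ** 2 * p for x, p in pmf.items())
print(var)  # 0.5
```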

Expected value of a function

  • Sometimes we are interested in a function of a random variable, such as \(Y = g(X)\).

The expected value of a transformation is given by: \[E\left(g(X)\right) = \sum_{i = 1}^k g(x_i) \cdot p_X(x_i)\]

for a discrete random variable \(X\).

If \(X\) is the number of heads in 2 tosses of a fair coin, and we define \(Y = X^2\), then the expected value of \(Y\) is:

\[\begin{align*} E(Y) &= E(X^2)\\ &= 0^2 \times 0.25 + 1^2 \times 0.50 + 2^2 \times 0.25\\ &= 1.5. \end{align*}\]
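This generalises naturally to a small helper that takes any transformation \(g\) (the function name `expected_value` is illustrative, not from the slides):

```python
# pmf of X = number of heads in 2 tosses of a fair coin
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

def expected_value(g, pmf):
    """E(g(X)) = sum of g(x) * p_X(x) over the support of X."""
    return sum(g(x) * p for x, p in pmf.items())

print(expected_value(lambda x: x ** 2, pmf))  # 1.5
```

Note that \(E(X^2) = 1.5\) differs from \(E(X)^2 = 1\); in general \(E(g(X)) \neq g(E(X))\).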

Bivariate distributions

  • A bivariate distribution describes the probability behavior of two random variables, say \(X\) and \(Y\), simultaneously.

  • The joint probability distribution specifies \(P(X = x, Y = y)\) for all possible \((x, y)\) combinations and must satisfy:

    • \(0 \le p_{X,Y}(x,y) \le 1\) for all \(x, y\)
    • \(\sum_x \sum_y p_{X,Y}(x,y) = 1\)

Joint probability table:

\(X \backslash Y\) \(1\) \(2\)
\(0\) \(0.1\) \(0.2\)
\(1\) \(0.3\) \(0.4\)
  • Each entry shows \[P(X = x, Y = y).\]
  • The total of all probabilities in the table must be \(1\).
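A joint pmf can be sketched as a dictionary keyed by \((x, y)\) pairs (an illustrative representation of the table above):

```python
# joint pmf p_{X,Y}(x, y) from the table, keyed by (x, y)
joint = {(0, 1): 0.1, (0, 2): 0.2,
         (1, 1): 0.3, (1, 2): 0.4}

# all entries in [0, 1], and the table totals 1
assert all(0 <= p <= 1 for p in joint.values())
assert abs(sum(joint.values()) - 1) < 1e-12
```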

Marginal distributions

The marginal distribution describes the probability distribution of one variable in a bivariate (joint) distribution, regardless of the value of the other variable.

  • For random variables \(X\) and \(Y\) with joint probabilities \(p_{X,Y}(x, y)\):

    • The marginal distribution of \(X\): \[p_X(x) = \sum_y p_{X,Y}(x, y).\]
    • The marginal distribution of \(Y\): \[p_Y(y) = \sum_x p_{X,Y}(x, y).\]

If the joint probability table is:

\(X \backslash Y\) \(1\) \(2\)
\(0\) \(0.1\) \(0.2\)
\(1\) \(0.3\) \(0.4\)
  • Marginal for \(X = 0\): \(P(X=0) = 0.1 + 0.2 = 0.3\)
  • Marginal for \(Y = 2\): \(P(Y=2) = 0.2 + 0.4 = 0.6\)
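Both marginals can be sketched in Python by summing the joint pmf over the other variable:

```python
joint = {(0, 1): 0.1, (0, 2): 0.2,
         (1, 1): 0.3, (1, 2): 0.4}

# marginal of X: sum the joint pmf over y
p_X = {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, 0) + p

# marginal of Y: sum the joint pmf over x
p_Y = {}
for (x, y), p in joint.items():
    p_Y[y] = p_Y.get(y, 0) + p

print(round(p_X[0], 10), round(p_Y[2], 10))  # 0.3 0.6
```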

Independence of random variables

  • Two random variables \(X\) and \(Y\) are independent if knowing the value of one does not provide any information about the other.
  • Mathematically, \(X\) and \(Y\) are independent if, for all values of \(x\) and \(y\): \[p_{X,Y}(x,y) = p_X(x) \cdot p_Y(y).\]
  • Independence is often assumed because it greatly simplifies analysis, though the assumption is not always realistic and should be justified by the context.

If the joint probability table is:

\(X \backslash Y\) \(1\) \(2\)
\(0\) \(0.1\) \(0.2\)
\(1\) \(0.3\) \(0.4\)
  • \(p_{X,Y}(0, 1) = 0.1\)
  • \(p_X(0) = 0.1 + 0.2 = 0.3\)
  • \(p_Y(1) = 0.1 + 0.3 = 0.4\)
  • So \(p_X(0) \cdot p_Y(1) = 0.3 \cdot 0.4 = 0.12\).
  • Since \(p_{X,Y}(0, 1) \neq p_X(0) \cdot p_Y(1)\),
    \(X\) and \(Y\) are not independent.
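The same check can be sketched in code, comparing each joint probability against the product of the corresponding marginals (with a small tolerance for floating-point arithmetic):

```python
joint = {(0, 1): 0.1, (0, 2): 0.2,
         (1, 1): 0.3, (1, 2): 0.4}

# marginals, computed by summing rows and columns of the table
p_X = {0: 0.3, 1: 0.7}
p_Y = {1: 0.4, 2: 0.6}

# X and Y are independent iff p(x, y) == p_X(x) * p_Y(y) for every cell
independent = all(abs(joint[(x, y)] - p_X[x] * p_Y[y]) < 1e-12
                  for (x, y) in joint)
print(independent)  # False: e.g. 0.1 != 0.3 * 0.4 = 0.12
```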

Covariance

The covariance is defined as: \[\mathrm{Cov}(X, Y) = E\left((X - E(X)) (Y - E(Y))\right)\]

Alternatively, \[\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)\]

  • If \(X\) and \(Y\) are independent, then \(\mathrm{Cov}(X, Y) = 0\).
  • However, \(\mathrm{Cov}(X, Y) = 0\) does not necessarily imply that \(X\) and \(Y\) are independent!
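For the joint table from the previous slides, the covariance can be sketched using the shortcut formula \(\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)\):

```python
joint = {(0, 1): 0.1, (0, 2): 0.2,
         (1, 1): 0.3, (1, 2): 0.4}

# expectations computed directly from the joint pmf
E_X  = sum(x * p for (x, y), p in joint.items())      # E(X)  = 0.7
E_Y  = sum(y * p for (x, y), p in joint.items())      # E(Y)  = 1.6
E_XY = sum(x * y * p for (x, y), p in joint.items())  # E(XY) = 1.1

cov = E_XY - E_X * E_Y
print(round(cov, 10))  # -0.02
```

The negative covariance is consistent with the earlier finding that \(X\) and \(Y\) are not independent.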

Laws of expected value and variance

For random variables \(X\) and \(Y\), and constants \(a\) and \(b\), the following properties hold:

  • \(E(a) = a\)
  • \(E(aX) = aE(X)\)
  • \(E(aX + bY) = aE(X) + bE(Y)\)
  • If \(X\) and \(Y\) are independent:
    \(E(XY) = E(X)E(Y)\)
  • \(\mathrm{Var}(a) = 0\)
  • \(\mathrm{Var}(X) = E(X^2) - E(X)^2\)
  • \(\mathrm{Var}(aX) = a^2\mathrm{Var}(X)\)
  • \(\mathrm{Var}(X + a) = \mathrm{Var}(X)\)
  • \(\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y) + 2ab\mathrm{Cov}(X, Y)\)
  • If \(X\) and \(Y\) are independent:
    \(\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y)\)
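As a numerical sanity check (a sketch with \(a = b = 1\)), the last identity can be verified on the joint table from the earlier slides:

```python
joint = {(0, 1): 0.1, (0, 2): 0.2,
         (1, 1): 0.3, (1, 2): 0.4}

def E(f):
    """Expectation of f(X, Y) under the joint pmf."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

var_X = E(lambda x, y: x ** 2) - E(lambda x, y: x) ** 2
var_Y = E(lambda x, y: y ** 2) - E(lambda x, y: y) ** 2
cov   = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# left-hand side: Var(X + Y) computed directly from the joint pmf
var_sum = E(lambda x, y: (x + y) ** 2) - E(lambda x, y: x + y) ** 2

# right-hand side: Var(X) + Var(Y) + 2 Cov(X, Y)
assert abs(var_sum - (var_X + var_Y + 2 * cov)) < 1e-12
```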

Summary

  • A discrete random variable \(X\) takes on a countable number of distinct values.

  • The probability mass function (pmf), \(p_X(x)\), gives the probability that a discrete random variable is exactly equal to \(x\).

  • The expected value \(E(X) = \sum_x x\cdot p_X(x)\).

  • The variance \(\mathrm{Var}(X) = E\left[(X - E(X))^2\right] = E(X^2) - E(X)^2\).

  • The expected value of a transformation \(E(g(X)) = \sum_x g(x) \cdot p_X(x)\).

  • The random variables \(X\) and \(Y\) are independent if \(p_{X,Y}(x,y) = p_X(x) \cdot p_Y(y)\) for all \(x\) and \(y\).

  • For random variables \(X\) and \(Y\) and constants \(a\) and \(b\):

    • \(E(aX + bY) = aE(X) + bE(Y)\) and
    • \(\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y) + 2 a b \mathrm{Cov}(X, Y)\).