
STAT1003 – Statistical Techniques
Dr. Emi Tanaka
Australian National University




There are a number of special continuous distributions like:
Uniform distribution
Normal distribution \(N(0, 1)\)
t distribution \(t(\nu)\)
F distribution
A continuous random variable \(X\) is said to have a uniform distribution over the interval \([a,b]\) if its pdf is given by \[ f_X(x)=\left\{\begin{array}{ll} \dfrac{1}{b-a}, & a \leq x\leq b \\ 0, & x < a \text{ or } x>b \end{array}\right. \]
We use the notation \(X\sim U(a,b)\) and
\(E(X) = \frac{a+b}{2}\)
\(\text{Var}(X) = \frac{(b-a)^2}{12}\)
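As a rough check of these two formulas, the sketch below compares them with numerical integration of the pdf in R. The endpoints a = 2 and b = 5 are hypothetical values chosen only for illustration.

```r
# Hypothetical endpoints, chosen for illustration only
a <- 2
b <- 5

# Closed-form mean and variance of U(a, b)
mean_formula <- (a + b) / 2
var_formula  <- (b - a)^2 / 12

# Numerical check: integrate x * f(x) and (x - mean)^2 * f(x) over [a, b]
mean_numeric <- integrate(function(x) x * dunif(x, a, b), a, b)$value
var_numeric  <- integrate(function(x) (x - mean_formula)^2 * dunif(x, a, b), a, b)$value

c(formula = mean_formula, numeric = mean_numeric)
c(formula = var_formula, numeric = var_numeric)

# P(3 <= X <= 4) is the interval length divided by (b - a)
p_3_to_4 <- punif(4, a, b) - punif(3, a, b)
p_3_to_4
```

Because the pdf is flat, any probability reduces to a length ratio, here (4 - 3) / (5 - 2).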
A continuous random variable \(X\) has a normal distribution if its pdf is: \[f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\text{exp}\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)\]
Written as \(X \sim N(\mu, \sigma^2)\), where \(\mu\) is the mean and \(\sigma^2\) is the variance.
The Z-score is defined as the number of standard deviations a value is from the mean, i.e. \[Z = \frac{X - \mu}{\sigma}.\]
Z-scores are used to standardise values from different normal distributions, allowing us to compare them on a common scale.

\(z_\text{Ann} = \frac{1300 - 1100}{200} = 1\) and \(z_\text{Tom} = \frac{24 - 21}{6} = 0.5\).
\(z_\text{Ann} > z_\text{Tom}\), so Ann performed better relative to her peers than Tom!
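The same comparison can be sketched in R. The numbers are taken from the two z-score calculations above; converting a z-score to a percentile with pnorm() is an extra step that assumes both sets of scores are normally distributed.

```r
# Scores, means and standard deviations from the worked example above
z_ann <- (1300 - 1100) / 200
z_tom <- (24 - 21) / 6
c(Ann = z_ann, Tom = z_tom)

# Assuming normality, the standard normal cdf turns each z-score into
# the proportion of peers scoring below that person
pct_ann <- pnorm(z_ann)
pct_tom <- pnorm(z_tom)
c(Ann = pct_ann, Tom = pct_tom)
```

Under that assumption Ann sits at roughly the 84th percentile and Tom at roughly the 69th.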
dnorm(x, mean, sd) evaluates the pdf: \(f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\text{exp}\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)\) where \(X \sim N(\mu, \sigma^2)\)
pnorm(q, mean, sd) evaluates the cdf: \(F_X(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}}\text{exp}\left(-\frac{(t - \mu)^2}{2\sigma^2}\right) \, dt\) where \(X \sim N(\mu, \sigma^2)\)
qnorm(p, mean, sd) evaluates the quantile function: find \(q\) such that \(P(X < q) = p\) where \(X \sim N(\mu, \sigma^2)\)
rnorm(n, mean, sd) simulates \(n\) draws from \(X \sim N(\mu, \sigma^2)\)
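A minimal sketch of these four operations using R's dnorm(), pnorm(), qnorm() and rnorm(). The parameter values (mean 10, standard deviation 2) and the point x = 11 are hypothetical. Note that R's sd argument is \(\sigma\), not \(\sigma^2\).

```r
# Hypothetical parameters for illustration: X ~ N(10, 2^2)
mu    <- 10
sigma <- 2
x     <- 11

# Density: the pdf formula written out by hand agrees with dnorm()
by_hand       <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
density_match <- isTRUE(all.equal(by_hand, dnorm(x, mean = mu, sd = sigma)))

# cdf: P(X < x)
p <- pnorm(x, mean = mu, sd = sigma)

# Quantile: qnorm() inverts pnorm(), so we recover x from p
q <- qnorm(p, mean = mu, sd = sigma)

# Simulation: fix the seed so the draws are reproducible
set.seed(1)
draws <- rnorm(5, mean = mu, sd = sigma)

density_match
c(p = p, q = q)
draws
```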
Recall: the probability of a continuous random variable falling in a specific range is the area under the curve.

\(P(X < 2)\) where \(X \sim N(0, 1)\)

\(P(X > 2)\) where \(X \sim N(1, 2)\)

\(P(0 < X < 2)\) where \(X \sim N(0, 1)\)
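These three areas can be computed with pnorm(). The sketch below reads \(N(1, 2)\) as mean 1 and variance 2, following the \(N(\mu, \sigma^2)\) notation, so the sd argument is sqrt(2).

```r
# P(X < 2) for X ~ N(0, 1): area to the left of 2
p_left <- pnorm(2)

# P(X > 2) for X ~ N(1, 2): area to the right of 2
# The second parameter is read as the variance, so sd = sqrt(2)
p_right <- pnorm(2, mean = 1, sd = sqrt(2), lower.tail = FALSE)

# P(0 < X < 2) for X ~ N(0, 1): difference of two left-tail areas
p_between <- pnorm(2) - pnorm(0)

c(p_left = p_left, p_right = p_right, p_between = p_between)
```

Using lower.tail = FALSE is equivalent to 1 - pnorm(2, mean = 1, sd = sqrt(2)).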
Suppose that you are given the following R output:
[1] 0.5000000 0.6914625 0.8413447
[4] 0.9331928 0.9772499 0.9937903
[7] 0.9986501
Using the above information only, calculate the following probabilities:
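The seven values above match the standard normal cdf \(\Phi(z)\) at \(z = 0, 0.5, \ldots, 3\). This is an inference from the numbers, since the generating call is not shown. The sketch below shows how such a table can be combined with symmetry and the complement rule; the three probabilities are illustrative choices.

```r
# Values copied from the output above, labelled by the z they appear to match
z   <- seq(0, 3, by = 0.5)
phi <- c(0.5000000, 0.6914625, 0.8413447, 0.9331928,
         0.9772499, 0.9937903, 0.9986501)
names(phi) <- z

# Check the assumption that these are pnorm(0), pnorm(0.5), ..., pnorm(3)
table_matches <- isTRUE(all.equal(unname(phi), pnorm(z), tolerance = 1e-6))

# Complement rule: P(Z > 1) = 1 - Phi(1)
p_gt_1 <- unname(1 - phi["1"])

# Symmetry: P(Z < -2) = P(Z > 2) = 1 - Phi(2)
p_lt_neg2 <- unname(1 - phi["2"])

# Interval: P(-1.5 < Z < 0.5) = Phi(0.5) - Phi(-1.5) = Phi(0.5) - (1 - Phi(1.5))
p_interval <- unname(phi["0.5"] - (1 - phi["1.5"]))

table_matches
c(p_gt_1 = p_gt_1, p_lt_neg2 = p_lt_neg2, p_interval = p_interval)
```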

E.g. the average adult height is about 167cm, and the standard deviation is about 10cm, based on US National Health and Nutrition Examination Survey (2017-2018) data. So assuming the distribution of adult heights is normal, approximately 99.7% of adults have a height within three standard deviations of the mean, i.e. between 137cm and 197cm.
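The 99.7% figure can be checked with pnorm(). The sketch below computes the probability of falling within 1, 2 and 3 standard deviations of the mean for any normal distribution, then repeats the height calculation using the mean and standard deviation quoted above.

```r
# Probability of being within k standard deviations of the mean, k = 1, 2, 3
k        <- 1:3
coverage <- pnorm(k) - pnorm(-k)
round(coverage, 4)

# Height example: mean 167cm, sd 10cm, assuming heights are normal
mu    <- 167
sigma <- 10
p_height <- pnorm(197, mean = mu, sd = sigma) - pnorm(137, mean = mu, sd = sigma)
p_height
```

The three coverage values are roughly 0.68, 0.95 and 0.997, and the height probability equals the third of these.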
The t-distribution is implemented in R using the dt(), pt(), qt(), and rt() functions, which are analogous to the dnorm(), pnorm(), qnorm(), and rnorm() functions for the normal distribution.
When the sample size (and hence the degrees of freedom) is large, there is not much difference between the t-distribution and the standard normal distribution.
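One way to see this is to compare the 97.5% quantile of the t-distribution with that of the standard normal as the degrees of freedom grow. The df values below are hypothetical choices for illustration; in the one-sample setting the degrees of freedom are the sample size minus one.

```r
# Hypothetical degrees of freedom, from very small to large
df <- c(2, 5, 30, 100, 1000)

# 97.5% quantiles of t(df) and of N(0, 1)
t_q <- qt(0.975, df)
z_q <- qnorm(0.975)

# The gap between the two shrinks towards zero as df increases
comparison <- data.frame(df = df, t_quantile = t_q, gap = t_q - z_q)
comparison
```

The t quantiles are always larger than the normal one because the t-distribution has heavier tails, but by df = 1000 the gap is below 0.01.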
| Distribution | Support | Mean | Variance |
|---|---|---|---|
| \(X \sim U(a, b)\) | \([a, b]\) | \(\dfrac{a + b}{2}\) | \(\dfrac{(b - a)^2}{12}\) |
| \(X \sim N(\mu, \sigma^2)\) | \((-\infty, \infty)\) | \(\mu\) | \(\sigma^2\) |
| \(X \sim t(\nu)\) | \((-\infty, \infty)\) | \(0\) (for \(\nu > 1\)) | \(\frac{\nu}{\nu - 2}\) (for \(\nu > 2\)) |
| Distribution | pdf | cdf | quantile function | random generation |
|---|---|---|---|---|
| Uniform | dunif(x, a, b) | punif(q, a, b) | qunif(p, a, b) | runif(n, a, b) |
| Normal | dnorm(x, mean, sd) | pnorm(q, mean, sd) | qnorm(p, mean, sd) | rnorm(n, mean, sd) |
| t-distribution | dt(x, df) | pt(q, df) | qt(p, df) | rt(n, df) |
