
STAT1003 – Statistical Techniques
Dr. Emi Tanaka
Australian National University
This lecture was partially adapted from the previous STAT1003 lecturers. Thank you folks!
Let \(\theta\) be some population parameter and let \(\hat{\theta}\) denote a point estimator of \(\theta\).
Consider three different point estimators for \(\mu\):
The error of an estimate is the difference between the estimated value and the true parameter value, and is composed of two parts: bias and sampling error.
The formal definition of the bias of a point estimator \(\hat{\theta}\) of a parameter \(\theta\) is
\[\text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta\]
A point estimator is unbiased if \(\text{Bias}(\hat{\theta}) = 0\), i.e. if \(E(\hat{\theta}) = \theta\).
If the bias is positive, the estimator tends to over-estimate the parameter.
If the bias is negative, the estimator tends to under-estimate the parameter.
We have shown before that \(E(\bar{X}) = \mu\), so the sample mean \(\bar{X}\) is an unbiased estimator of \(\mu\).
The sample median, \(\tilde{X}\), would also be an unbiased estimator of \(\mu\) when the population distribution is symmetric (e.g. normal), since the population median then coincides with the mean.
If we use the average of the first two observations in the sample as an estimator, this would also be an unbiased estimator because
\[ E\left[\frac{1}{2}(X_{1} + X_{2})\right] = \frac{1}{2}\left[E(X_{1}) + E(X_{2})\right] = \frac{1}{2}(\mu + \mu) = \mu. \]
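We can check the unbiasedness of all three estimators empirically. The sketch below (not from the lecture; the population \(N(10, 2^2)\), \(n = 25\) and the number of replications are assumptions for illustration) draws many samples and averages each estimator's value; all three averages should land close to \(\mu = 10\).

```python
import random
import statistics

# Illustrative bias check: average each estimator over many samples
# from an assumed N(mu = 10, sigma = 2) population with n = 25.
random.seed(1)
mu, sigma, n, reps = 10.0, 2.0, 25, 20_000

means, medians, first_two = [], [], []
for _ in range(reps):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.mean(x))       # sample mean
    medians.append(statistics.median(x))   # sample median
    first_two.append((x[0] + x[1]) / 2)    # average of first two obs.

# Each average should be close to mu = 10, consistent with all three
# estimators being unbiased for a symmetric (normal) population.
print(round(statistics.mean(means), 2))
print(round(statistics.mean(medians), 2))
print(round(statistics.mean(first_two), 2))
```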
Sampling error, sometimes called sampling uncertainty, describes how much an estimate will tend to vary from one sample to the next.
It is measured by the variance of an estimator.
If \(\hat{\theta}\) is a point estimator of \(\theta\), the variance of \(\hat{\theta}\) is
\[\text{Var}(\hat{\theta}) = E\left((\hat{\theta} - E(\hat{\theta}))^2\right) = E\left(\hat{\theta}^2\right) - \left(E(\hat{\theta})\right)^2\]
\(\text{Var}(\bar{X}) = \dfrac{\sigma^2}{n}\)
The variance decreases as the sample size increases, so a larger sample gives a more precise estimate of the population mean.
\(\text{Var}(\tilde{X}) \approx \dfrac{\pi\sigma^2}{2n}\) if the population distribution is normal (proof out of scope).
\(\text{Var}\left[\frac{1}{2}(X_1 + X_2)\right] = \frac{1}{4}\left[\text{Var}(X_1) + \text{Var}(X_2)\right] = \frac{1}{4}\left[\sigma^2 + \sigma^2\right] = \dfrac{\sigma^2}{2}\)
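The three variance formulas above can be verified by simulation. This sketch (assumed standard normal population and \(n = 25\), chosen for illustration only) estimates each estimator's sampling variance from repeated samples and compares it to the theoretical value.

```python
import random
import statistics

# Estimate the sampling variance of each estimator by simulation,
# for an assumed N(0, 1) population with n = 25.
random.seed(2)
sigma, n, reps = 1.0, 25, 40_000

means, medians, first_two = [], [], []
for _ in range(reps):
    x = [random.gauss(0.0, sigma) for _ in range(n)]
    means.append(statistics.mean(x))
    medians.append(statistics.median(x))
    first_two.append((x[0] + x[1]) / 2)

print(statistics.variance(means))      # theory: sigma^2 / n       = 0.04
print(statistics.variance(medians))    # theory: pi sigma^2 / (2n) ~ 0.063
print(statistics.variance(first_two))  # theory: sigma^2 / 2       = 0.5
```

The ordering matches the theory: the sample mean has the smallest variance of the three, and the two-observation average does not improve with \(n\) at all.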
The mean squared error (MSE) of a point estimator takes both bias and sampling error into account. It is defined as
\[\text{MSE}(\hat{\theta}) = \text{E}\left[(\hat{\theta} - \theta)^2\right] = \left(\text{E}(\hat{\theta}) - \theta\right)^2 + E\left((\hat{\theta} - E(\hat{\theta}))^2\right) = \text{Bias}(\hat{\theta})^2 + \text{Var}(\hat{\theta})\]
A point estimator is said to be consistent if the MSE of the estimator goes to zero as sample size \(n\) increases:
\[\lim_{n \to \infty} \text{MSE}(\hat{\theta}) = 0\]
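Consistency can be seen directly in simulation. In the sketch below (population \(N(5, 3^2)\) is an assumption for illustration), the empirical MSE of the sample mean shrinks toward zero as \(n\) grows; since \(\bar{X}\) is unbiased, the MSE is just its variance \(\sigma^2/n\).

```python
import random
import statistics

# Empirical MSE of the sample mean for increasing n, assuming an
# N(mu = 5, sigma = 3) population. MSE should track sigma^2 / n.
random.seed(3)
mu, sigma, reps = 5.0, 3.0, 10_000

results = {}
for n in (5, 50, 500):
    sq_err = [
        (statistics.mean(random.gauss(mu, sigma) for _ in range(n)) - mu) ** 2
        for _ in range(reps)
    ]
    results[n] = statistics.mean(sq_err)  # empirical MSE
    print(n, round(results[n], 3))        # compare with 9/n
```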


\[\begin{align*} 0.95 &\approx P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right)\\ &= P\left(-1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}}\right) \\ &= P\left(\bar{X}-1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X}+1.96\frac{\sigma}{\sqrt{n}}\right) \end{align*}\]
Therefore, a 95% confidence interval for \(\mu\) is \[\left(\bar{X}-1.96\frac{\sigma}{\sqrt{n}},\;\bar{X}+1.96\frac{\sigma}{\sqrt{n}}\right).\]
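As a minimal sketch, the interval above can be computed directly from data. The sample values and \(\sigma = 4\) below are made up purely for illustration; the only ingredients are the sample mean, the known \(\sigma\), and the critical value 1.96.

```python
import math
import statistics

# 95% z-interval for mu, assuming sigma is known.
def z_interval_95(sample, sigma):
    xbar = statistics.mean(sample)
    half_width = 1.96 * sigma / math.sqrt(len(sample))
    return (xbar - half_width, xbar + half_width)

# Hypothetical data (n = 10) with an assumed known sigma = 4.
sample = [12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5, 12.4, 10.8]
lower, upper = z_interval_95(sample, sigma=4.0)
print(round(lower, 2), round(upper, 2))  # 8.69 13.65
```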

\[\left(\bar{X}-z^*_{\alpha/2}\frac{\sigma}{\sqrt{n}},\;\bar{X}+z^*_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)\]
where \(z^*_{\alpha/2}\) is the critical value such that \[P(Z < z^*_{\alpha/2}) = 1 - \alpha/2\] for \(Z \sim N(0,1)\).
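The critical value \(z^*_{\alpha/2}\) is just the \(1 - \alpha/2\) quantile of the standard normal distribution, which Python's standard library exposes via `statistics.NormalDist.inv_cdf`; a quick sketch:

```python
from statistics import NormalDist

# z*_{alpha/2}: the (1 - alpha/2) quantile of N(0, 1).
def z_crit(alpha):
    return NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_crit(0.05), 3))  # 1.96  (95% confidence)
print(round(z_crit(0.10), 3))  # 1.645 (90% confidence)
```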
Suppose we repeat the experiment many times and construct a \(100(1 - \alpha)\%\) confidence interval from each sample. We expect that approximately \(100(1 - \alpha)\%\) of those intervals will contain the true parameter value.
\[\left(\bar{X}-z^*_{\alpha/2}\frac{\sigma}{\sqrt{n}},\;\bar{X}+z^*_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)\]

When \(\sigma\) is unknown and the sample comes from a normal population, we replace \(\sigma\) with the sample standard deviation \(s\). The resulting statistic follows a \(t\)-distribution with \(n - 1\) degrees of freedom:
\[ \frac{\bar{X}-\mu}{s/\sqrt{n}} \sim t_{n-1} \]
A \(100(1 - \alpha)\%\) confidence interval for \(\mu\) is then
\[ \left( \bar{X}-t^*_{n-1,\alpha/2}\frac{s}{\sqrt{n}},\; \bar{X}+t^*_{n-1,\alpha/2}\frac{s}{\sqrt{n}} \right) \]
where \(t^*_{n-1,\alpha/2}\) is the critical value such that \[P(T < t^*_{n-1,\alpha/2}) = 1 - \alpha/2\] for \(T \sim t_{n-1}\).
Does it really make a difference if you use the t-distribution instead of the normal distribution when \(\sigma\) is unknown? Let’s find out by simulating confidence intervals using both methods.
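One way to run that simulation is sketched below. The population \(N(0,1)\), the sample size \(n = 10\), and the number of replications are assumptions for illustration; the t critical value \(t^*_{9,\,0.025} \approx 2.262\) is taken from standard t tables. Both intervals use \(s\) in place of \(\sigma\); only the critical value differs.

```python
import math
import random
import statistics

# Coverage comparison when sigma is unknown: interval built with the
# normal critical value (1.96) vs. the t critical value (2.262, i.e.
# t*_{9, 0.025} for n = 10). True mean is 0.
random.seed(4)
n, reps = 10, 20_000
T_CRIT = 2.262  # t*_{9, 0.025} from standard t tables

cover_z = cover_t = 0
for _ in range(reps):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    xbar, s = statistics.mean(x), statistics.stdev(x)
    se = s / math.sqrt(n)
    cover_z += (xbar - 1.96 * se < 0.0 < xbar + 1.96 * se)
    cover_t += (xbar - T_CRIT * se < 0.0 < xbar + T_CRIT * se)

print(cover_z / reps)  # noticeably below 0.95 for small n
print(cover_t / reps)  # close to 0.95
```

With \(n = 10\) the z-based interval under-covers (roughly 92% rather than 95%), so the difference is real for small samples; it fades as \(n\) grows.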
A random sample of 30 households was selected as part of a study on electricity usage, and the number of kilowatt-hours (kWh) was recorded for each household in the sample for the March quarter of 2006. The average usage was found to be 375 kWh and the sample standard deviation was 91.5 kWh. Find a 99% confidence interval for the mean usage in the March quarter of 2006.
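A worked sketch of this example: \(\sigma\) is unknown, so we use the t interval with \(n - 1 = 29\) degrees of freedom; the critical value \(t^*_{29,\,0.005} \approx 2.756\) is taken from standard t tables.

```python
import math

# Electricity-usage example: n = 30, xbar = 375 kWh, s = 91.5 kWh,
# 99% confidence => alpha = 0.01, t*_{29, 0.005} ~ 2.756 (from tables).
n, xbar, s = 30, 375.0, 91.5
t_crit = 2.756
half_width = t_crit * s / math.sqrt(n)  # ~ 46.0 kWh
print(round(xbar - half_width, 1), round(xbar + half_width, 1))
# 99% CI is roughly (329, 421) kWh
```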
A \(100(1 - \alpha)\%\) confidence interval for \(p\) is given by
\[\left(\hat{p} - z^*_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\;\hat{p} + z^*_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)\]
A random sample of 100 preschool children in Bruce revealed that only 62 had been vaccinated. Provide an approximate 90% confidence interval for the proportion vaccinated in that suburb.
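A worked sketch of this example: \(\hat{p} = 62/100 = 0.62\), and for 90% confidence the critical value is \(z^*_{0.05} = 1.645\) (from standard normal tables).

```python
import math

# Vaccination example: n = 100, 62 successes, 90% confidence.
n, successes = 100, 62
p_hat = successes / n               # 0.62
z_crit = 1.645                      # z*_{0.05} from normal tables
se = math.sqrt(p_hat * (1 - p_hat) / n)
lower = p_hat - z_crit * se
upper = p_hat + z_crit * se
print(round(lower, 3), round(upper, 3))
# interval is approximately (0.54, 0.70)
```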
A \(100(1 - \alpha)\%\) confidence interval for the mean \(\mu\) is of the form:
\[\text{Point Estimate} \pm \text{Critical Value} \times \text{Standard Error of Point Estimate}.\]
