
STAT1003 – Statistical Techniques
Dr. Emi Tanaka
Australian National University
This lecture was partially adapted from material by previous STAT1003 lecturers. Thank you folks!
Estimation (last topic)
Draw inferences about a population by estimating population parameters from a sample (point and interval estimators).
Hypothesis testing (this topic)
Draw inferences about a population by making a claim or hypothesis about a population parameter and testing whether the hypothesis is supported by the sample.


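Two types of errors can occur in this process: a Type I error (rejecting \(H_0\) when it is true) and a Type II error (failing to reject \(H_0\) when it is false).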
Determine the null and alternative hypotheses based on the research question.
The packaging on a light bulb states that the bulb will last 500 hours under normal use. A consumer advocate would like to know if the mean lifetime of a bulb is less than 500 hours.
According to an Australian Bureau of Statistics survey conducted in 2010, 77% of Australian adults were mostly or extremely satisfied with their life as a whole. A researcher wonders if the percentage of satisfied Australians is the same today.
A sports dietician would like to see whether the intake of caffeine prior to a race can improve the performance of ultra marathon runners.
Suppose I have a coin that I’m going to flip.
I have some suspicion that the coin has been tweaked to be biased towards heads.
Let \(p\) be the probability of getting a head.
If the coin is biased towards heads, then \(p > 0.5\).
So how would we test whether a coin is biased towards heads or not?
We’ll collect some data.
I flipped the coin 10 times and observed 7 heads and 3 tails.
Suppose now I flip the coin 200 times and observe 140 heads and 60 tails.
\(H_0: p = 0.5\) vs. \(H_A: p > 0.5\)
We observed \(x = 7\) heads out of \(n = 10\) tosses.
The p-value is \(P(X \geq x) = 1 - P(X \leq x - 1) = 1 - P(X \leq 6)\).
We observed \(x = 140\) heads out of \(n = 200\) tosses.
The p-value is \(P(X \geq x) = 1 - P(X \leq x - 1) = 1 - P(X \leq 139)\).
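The two upper-tail p-values above can be evaluated directly from the binomial probability mass function. The following is a minimal sketch using only the Python standard library; the helper name `binom_upper_tail` is made up for illustration and is not part of the lecture material.

```python
from math import comb

def binom_upper_tail(x, n, p=0.5):
    """Return P(X >= x) for X ~ Binomial(n, p), summing the pmf from x to n."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

# Upper-tail test of H0: p = 0.5 vs HA: p > 0.5
p_small = binom_upper_tail(7, 10)     # 7 heads out of 10 tosses
p_large = binom_upper_tail(140, 200)  # 140 heads out of 200 tosses

print(f"n = 10:  p-value = {p_small:.4f}")
print(f"n = 200: p-value = {p_large:.2e}")
```

With 10 tosses the p-value is about 0.17, so 7 heads is not unusual for a fair coin. With 200 tosses the same proportion of heads gives a p-value far below any conventional significance level.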

\(H_0: p = 0.5\) vs. \(H_A: p \neq 0.5\)
We observed \(x = 7\) heads out of \(n = 10\) tosses.
The p-value is \(P(|X - E(X)| \geq |x - E(X)|) = P(|X - 5| \geq 2) = P(X \leq 3) + P(X \geq 7)\).
We observed \(x = 140\) heads out of \(n = 200\) tosses.
The p-value is \(P(|X - 100| \geq 40) = P(X \leq 60) + P(X \geq 140)\).
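The two-sided p-values can be sketched in the same way, by adding up the probability of every outcome that is at least as far from \(E(X) = np\) as the observed count. The function names below are illustrative assumptions, and the sketch is only checked for \(p = 0.5\), where the comparison of distances is exact.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_two_sided(x, n, p=0.5):
    """Return P(|X - E(X)| >= |x - E(X)|) for X ~ Binomial(n, p)."""
    mean = n * p
    observed_distance = abs(x - mean)
    return sum(
        binom_pmf(k, n, p)
        for k in range(n + 1)
        if abs(k - mean) >= observed_distance
    )

# Two-sided test of H0: p = 0.5 vs HA: p != 0.5
two_sided_small = binom_two_sided(7, 10)     # P(X <= 3) + P(X >= 7)
two_sided_large = binom_two_sided(140, 200)  # P(X <= 60) + P(X >= 140)

print(f"n = 10:  p-value = {two_sided_small:.4f}")
print(f"n = 200: p-value = {two_sided_large:.2e}")
```

Because the Binomial(n, 0.5) distribution is symmetric, each two-sided p-value here is twice the corresponding upper-tail p-value.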
Two-sided test: \(H_0: p = 0.5\) vs. \(H_A: p \neq 0.5\)
Note: if performing a one-sided test, the direction should be specified in advance of the experiment.
Upper-tail test: \(H_0: p = 0.5\) vs. \(H_A: p > 0.5\)
Lower-tail test: \(H_0: p = 0.5\) vs. \(H_A: p < 0.5\)
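The ASA statement on p-values highlights the following six principles:
P-values can indicate how incompatible the data are with a specified statistical model.
P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
Proper inference requires full reporting and transparency.
A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.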
In light of misuses of and misconceptions concerning p-values, the statement notes that statisticians often supplement or even replace p-values with other approaches. These include methods “that emphasize estimation over testing such as confidence … intervals”
— ASA Statement on p-values
If \(H_0: p = p_0\) vs. \(H_A: p \neq p_0\) and the \(100(1-\alpha)\)% confidence interval contains \(p_0\), then we fail to reject the null hypothesis at the \(\alpha\) significance level.
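As an illustration of this link between confidence intervals and two-sided tests, the sketch below applies it to the two coin experiments. It assumes the large-sample (Wald) interval \(\hat{p} \pm z^*_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\), which the slides do not specify, and the function names are made up for this example. With only 10 tosses the normal approximation is rough, so treat that case as a demonstration of the rule rather than a recommended analysis.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(x, n, alpha=0.05):
    """Approximate 100(1 - alpha)% confidence interval for a proportion.

    Assumption: the Wald (normal approximation) interval is used.
    """
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def rejects_null(x, n, p0, alpha=0.05):
    """Reject H0: p = p0 (two-sided) exactly when p0 lies outside the interval."""
    lower, upper = wald_interval(x, n, alpha)
    return not (lower <= p0 <= upper)

for heads, tosses in [(7, 10), (140, 200)]:
    lower, upper = wald_interval(heads, tosses)
    decision = "reject H0" if rejects_null(heads, tosses, 0.5) else "fail to reject H0"
    print(f"{heads}/{tosses}: 95% CI = ({lower:.3f}, {upper:.3f}) -> {decision}")
```

The interval for 7 heads in 10 tosses contains 0.5, while the interval for 140 heads in 200 tosses does not.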
Hypothesis testing is a systematic way to evaluate evidence against a null hypothesis. It involves:
Binomial test \(H_0: p = p_0\)
I am 160 cm tall.
Am I significantly shorter than the average adult woman in Australia?


\(H_0: \mu = \mu_0\) vs. \(H_A: \mu \neq \mu_0\) or \(H_A: \mu > \mu_0\) or \(H_A: \mu < \mu_0\)
We observe a sample of size \(n\) from a population with mean \(\mu\) and standard deviation \(\sigma\).
The rejection region is the set of values of the test statistic that leads to rejection of \(H_0\).
| Alternative Hypothesis | Rejection Region for \(\sigma\) known | Rejection Region for \(\sigma\) unknown |
|---|---|---|
| \(H_A: \mu > \mu_0\) | \((z^*_{\alpha}, \infty)\) | \((t^*_{n - 1, \alpha}, \infty)\) |
| \(H_A: \mu < \mu_0\) | \((-\infty, z^*_{\alpha})\) | \((-\infty, t^*_{n - 1, \alpha})\) |
| \(H_A: \mu \neq \mu_0\) | \((-\infty, -|z^*_{\alpha/2}|) \cup (|z^*_{\alpha/2}|, \infty)\) | \((-\infty, -|t^*_{n - 1, \alpha/2}|) \cup (|t^*_{n - 1, \alpha/2}|, \infty)\) |
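where, in each row, the critical value is chosen so that the test statistic falls in the rejection region with probability \(\alpha\) when \(H_0\) is true: \(P(Z > z^*_{\alpha}) = \alpha\) for the upper-tail test, \(P(Z < z^*_{\alpha}) = \alpha\) for the lower-tail test and \(P(|Z| > |z^*_{\alpha/2}|) = \alpha\) for the two-sided test, with \(Z \sim N(0, 1)\). The \(t\) critical values are defined in the same way with \(T \sim t_{n-1}\).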
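The rejection-region rule for the \(\sigma\) known case can be sketched as follows, assuming the usual test statistic \(z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})\). The function name, the `alternative` labels and the numbers in the usage lines are all made up for illustration. The \(\sigma\) unknown case would use \(t_{n-1}\) quantiles instead, which are not available in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

def z_test_rejects(xbar, mu0, sigma, n, alternative, alpha=0.05):
    """Return True if the z statistic falls in the rejection region.

    alternative is one of "greater", "less" or "two-sided"
    (illustrative labels, not notation from the slides).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":
        return z > std_normal.inv_cdf(1 - alpha)          # upper tail
    if alternative == "less":
        return z < std_normal.inv_cdf(alpha)              # lower tail
    if alternative == "two-sided":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)  # both tails
    raise ValueError(f"unknown alternative: {alternative!r}")

# Hypothetical sample: mean 166 from n = 35 observations, sigma = 15, mu0 = 160
for alt in ("greater", "less", "two-sided"):
    print(alt, z_test_rejects(166, 160, 15, 35, alt))
```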
I am 160 cm tall. Am I significantly shorter than the average adult woman in Australia?
\(H_0: \mu = 160\) (I’m average height) vs. \(H_A: \mu > 160\) (I am shorter than the population average)
\(H_0: \mu_1 = \mu_2\) vs. \(H_A: \mu_1 \neq \mu_2\)
The power of a test is defined as \(1 - \beta\), where \(\beta\) is the probability of a Type II error. Power is therefore the probability of correctly rejecting \(H_0\) when \(H_A\) is true.
The population mean height of adult women is 165 cm and the population standard deviation is 15 cm. What is the probability that we will make a Type II error if we collect a new sample of size 35 and conduct a hypothesis test at significance level 0.05?
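One way to work through this question is sketched below. It assumes the upper-tail z-test of \(H_0: \mu = 160\) vs. \(H_A: \mu > 160\) from the height example, with \(\sigma = 15\) treated as known. The question as stated does not name the hypotheses, so that pairing is an assumption. Under it, \(H_0\) is rejected when \(\bar{x}\) exceeds a cutoff, and \(\beta\) is the probability that \(\bar{x}\) stays below that cutoff when the true mean is 165.

```python
from math import sqrt
from statistics import NormalDist

# Assumed setup: H0: mu = 160 vs HA: mu > 160, sigma known
mu0 = 160      # mean under H0 (assumption taken from the height example)
mu_true = 165  # true population mean given in the question
sigma = 15     # population standard deviation
n = 35         # sample size
alpha = 0.05   # significance level

se = sigma / sqrt(n)  # standard error of the sample mean

# Reject H0 when the sample mean exceeds this cutoff
cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se

# Type II error: the sample mean falls at or below the cutoff although mu = mu_true
beta = NormalDist(mu_true, se).cdf(cutoff)
power = 1 - beta

print(f"cutoff = {cutoff:.2f} cm")
print(f"beta   = {beta:.3f}")
print(f"power  = {power:.3f}")
```

Under these assumptions \(\beta\) is roughly 0.37, so the test would detect the 5 cm difference a little more than 60% of the time.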
Hypothesis testing for a single population mean:
| \(H_A\) | P-value | Confidence Interval | Rejection Region |
|---|---|---|---|
| \(\mu > \mu_0\) | \(P(Z \geq z^*)\) or \(P(T \geq t^*)\) | - | \((z^*_{\alpha}, \infty)\) or \((t^*_{n - 1, \alpha}, \infty)\) |
| \(\mu < \mu_0\) | \(P(Z \leq z^*)\) or \(P(T \leq t^*)\) | - | \((-\infty, z^*_{\alpha})\) or \((-\infty, t^*_{n - 1, \alpha})\) |
| \(\mu \neq \mu_0\) | \(P(|Z| \geq |z^*|)\) or \(P(|T| \geq |t^*|)\) | \(\bar{x} \pm z^*_{\alpha/2} \frac{\sigma}{\sqrt{n}}\) or \(\bar{x} \pm t^*_{n-1, \alpha/2} \frac{s}{\sqrt{n}}\) | \((-\infty, -|z^*_{\alpha/2}|) \cup (|z^*_{\alpha/2}|, \infty)\) or \((-\infty, -|t^*_{n - 1, \alpha/2}|) \cup (|t^*_{n - 1, \alpha/2}|, \infty)\) |
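The p-value column of this table, for the \(\sigma\) known case, can be sketched as below. Here `z` plays the role of the observed test statistic written \(z^*\) in the table. The function name, the `alternative` labels and the sample numbers are illustrative assumptions rather than values from the lecture.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(xbar, mu0, sigma, n, alternative):
    """Return the p-value of a one-sample z-test (sigma known)."""
    z = (xbar - mu0) / (sigma / sqrt(n))  # observed test statistic
    cdf = NormalDist().cdf
    if alternative == "greater":
        return 1 - cdf(z)             # P(Z >= z)
    if alternative == "less":
        return cdf(z)                 # P(Z <= z)
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z)))  # P(|Z| >= |z|)
    raise ValueError(f"unknown alternative: {alternative!r}")

# Hypothetical sample: mean 166 from n = 35 observations, sigma = 15, mu0 = 160
p_greater = z_test_p_value(166, 160, 15, 35, "greater")
p_less = z_test_p_value(166, 160, 15, 35, "less")
p_two_sided = z_test_p_value(166, 160, 15, 35, "two-sided")

print(f"greater:   {p_greater:.4f}")
print(f"less:      {p_less:.4f}")
print(f"two-sided: {p_two_sided:.4f}")
```

For a continuous test statistic the two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them.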
Always interpret the results of hypothesis testing in the context of the data and the research question.
