ETC5521: Exploratory Data Analysis


Extending beyond the data, what can and cannot be inferred more generally, given the data collection

Lecturer: Emi Tanaka

ETC5521.Clayton-x@monash.edu

Week 12 - Session 2


Sample size calculation

How many people should you survey?

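library(tidyverse)  # tibble, dplyr, ggplot2
library(nullabor)   # lineup(), null_dist()

# Data: 200 draws from a Gamma(shape = 3, rate = 4) distribution
set.seed(1)
df <- tibble(id = 1:200) %>%
  mutate(y = rgamma(n(), shape = 3, rate = 4))

# Lineup of 20 plots: 19 null plots simulated from Exp(rate = 1 / mean(y)),
# with the data plot placed at position 15
set.seed(1)
g <- lineup(null_dist("y", dist = "exp", params = list(rate = 1 / mean(df$y))),
            true = df, n = 20, pos = 15) %>%
  ggplot(aes(y)) +
  geom_histogram(color = "white") +
  facet_wrap(~.sample) +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks.length = unit(0, "pt"))
g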
  • Here we are testing H_0: Y \sim \text{Exp}(\lambda).
  • Suppose we only have one person to assess the lineup.
  • If there is only a single response, then there are only two scenarios possible:
    • Scenario 1: the person detects the data plot
    • Scenario 2: the person does not detect the data plot
  • The visual inference p-value under:
    • Scenario 1 is 0.05
    • Scenario 2 is 1
  • Neither scenario yields a p-value < 0.05!
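The two p-values above can be checked with base R. This is a minimal sketch, assuming m = 20 plots and a single observer (n = 1); the variable names are illustrative.

```r
m <- 20                                  # number of plots in the lineup
x <- c(detected = 1, not_detected = 0)   # the two possible outcomes for one observer

# Visual inference p-value P(X >= x) with n = 1 and detection probability 1/m under H_0
pval <- setNames(1 - pbinom(x - 1, size = 1, prob = 1 / m), names(x))
pval
```

With one observer the smallest attainable p-value is 1/m = 0.05, which is not strictly below \alpha = 0.05, so H_0 can never be rejected.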

Power of a binary hypothesis test

The statistical power of a binary hypothesis test is the probability that the test correctly rejects the null hypothesis (H_0) when a specific alternative hypothesis (H_1) is true.

  • Since m \geq 2, under H_0 we have 0 < p = 1/m \leq 0.5.
  • Recall that the visual inference p-value is P(X \geq x) = \sum_{k = x}^n {n\choose k} (1/m)^k(1 - 1/m)^{n-k}.
  • So for m = 20 and n = 10, P(X \geq 1) \approx 0.401, P(X \geq 2) \approx 0.086 and P(X \geq 3) \approx 0.012.
  • So if we have X > 2, then the p-value < 0.05.
  • Suppose the true detection probability is 0.9, so H_1 is true.
  • Under p = 0.9, P(X > 2) = \sum_{k = 3}^{10} {10\choose k} 0.9^k 0.1^{10 - k} = 0.9999996
  • Therefore the power of the test is 0.9999996 if p = 0.9.
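The calculation on this slide can be reproduced in base R. The sketch below assumes the same setting (m = 20 plots, n = 10 observers, \alpha = 0.05); the variable names are illustrative.

```r
m <- 20        # number of plots in the lineup
n <- 10        # number of observers
alpha <- 0.05  # significance level

# Visual inference p-value P(X >= x) under H_0 for x = 0, 1, ..., n
x <- 0:n
pval <- 1 - pbinom(x - 1, size = n, prob = 1 / m)
round(setNames(pval, x), 4)

# Smallest number of detections that gives a p-value below alpha
xmin <- min(x[pval < alpha])
xmin

# Power when the true detection probability is 0.9: P(X >= xmin)
power <- 1 - pbinom(xmin - 1, size = n, prob = 0.9)
power
```

Here xmin is 3, which matches the rejection rule X > 2, and the power agrees with the value 0.9999996 above.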

Power analysis

  • Let's suppose H_1 is true and that specifically p = 0.9.
  • Let's fix m = 20 and reject H_0 if p-value < \alpha = 0.05.
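To see how the power changes with the number of observers n in this setting, a short base R sketch follows. The helper power_for_n() is a hypothetical name introduced here for illustration. It starts at n = 2 because a single observer can never give a p-value below 0.05 when m = 20.

```r
m <- 20        # number of plots in the lineup
alpha <- 0.05  # significance level
p <- 0.9       # assumed true detection probability under H_1

# Power of the lineup test with n observers (hypothetical helper)
power_for_n <- function(n, p, m, alpha) {
  x <- 0:n
  pval <- 1 - pbinom(x - 1, size = n, prob = 1 / m)  # P(X >= x) under H_0
  if (!any(pval < alpha)) return(0)                  # no outcome can reject H_0
  xmin <- min(x[pval < alpha])                       # smallest rejecting count
  1 - pbinom(xmin - 1, size = n, prob = p)           # P(X >= xmin) under H_1
}

n <- 2:10
power <- sapply(n, power_for_n, p = p, m = m, alpha = alpha)
data.frame(n, power = round(power, 4))
```

With p = 0.9 the power is already 0.81 at n = 2 and 0.972 at n = 3, so very few observers are needed when the data plot is easy to detect.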

Estimating the detection probability p

  • For a fixed power (1-\beta), the minimum sample size n we need depends on the detection probability p

  • Generally, if p is larger, a smaller n is sufficient to achieve the same or greater power.

  • But we don't know what the true p is! (If we did, we wouldn't need to test for it.)

  • You will need to either make a guess from past experience or run a pilot test.

  • If, in the pilot test, x_p out of n_p participants detect the data plot, then an estimate of p is \hat{p} = x_p / n_p.
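As a small illustration of the pilot estimate, the sketch below uses made-up pilot numbers (6 detections out of 15 participants). The confidence interval is an addition not on the slide: it is one way to gauge how uncertain \hat{p} is with a small pilot.

```r
# Hypothetical pilot study: 6 of 15 participants detected the data plot
x_p <- 6
n_p <- 15

# Point estimate of the detection probability
p_hat <- x_p / n_p
p_hat

# Exact (Clopper-Pearson) 95% confidence interval for p from stats::binom.test()
ci <- binom.test(x_p, n_p)$conf.int
ci
```

A cautious choice might be to plan the sample size with a value of p nearer the lower end of the interval, since a smaller p calls for a larger n.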


Sample size calculation

  • The sample size calculation depends on:
    • the selected false positive rate (\alpha)
    • the detection probability p
    • the number of plots in the lineup m
    • the minimum power (1 - \beta) desired
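    • the expected proportion d of usable samples (i.e. one minus the dropout rate, where dropouts are samples that cannot be used due to incomplete results or other quality issues)
  • For example, if \alpha = 0.05, p = 0.1, m = 20, d = 0.95 and at least 80\% power is desired, then at least 178 samples are required.
library(tidyverse)

alpha <- 0.05  # false positive rate
p <- 0.1       # detection probability under H_1
m <- 20        # number of plots in the lineup
d <- 0.95      # expected proportion of usable samples

power_df <- tibble(n = 2:200) %>%
  mutate(power = map_dbl(n, function(n) {
    x <- 1:n
    # p-value P(X >= x) under H_0 for each possible number of detections x
    pval <- map_dbl(x, ~1 - pbinom(.x - 1, n, 1/m))
    # smallest number of detections with p-value < alpha
    xmin <- x[which.max(pval < alpha)]
    # power: P(X >= xmin) when the detection probability is p
    1 - pbinom(xmin - 1, n, p)
  }))

# smallest n with power > 0.8, inflated to allow for unusable samples
power_df %>%
  filter(power > 0.8) %>%
  pull(n) %>%
  min() %>%
  magrittr::divide_by(d) %>%
  ceiling()
## [1] 178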
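The same calculation can be wrapped in a function to explore how the required sample size changes with p. This is a base R sketch of the approach above, not a definitive implementation. The function name min_sample_size() and its defaults (including d = 1, i.e. all samples usable, and the search limit n_max) are assumptions made for illustration.

```r
# Smallest number of samples to recruit so that the lineup test has more than
# the desired power (hypothetical helper; returns NA if n_max is too small)
min_sample_size <- function(p, m = 20, alpha = 0.05, power = 0.8,
                            d = 1, n_max = 200) {
  for (n in 2:n_max) {
    x <- 1:n
    pval <- 1 - pbinom(x - 1, size = n, prob = 1 / m)  # P(X >= x) under H_0
    if (!any(pval < alpha)) next                       # H_0 cannot be rejected
    xmin <- min(x[pval < alpha])                       # smallest rejecting count
    if (1 - pbinom(xmin - 1, size = n, prob = p) > power) {
      return(ceiling(n / d))                           # allow for unusable samples
    }
  }
  NA_real_
}

# Required n for a few assumed detection probabilities
sapply(c(0.2, 0.5, 0.9), min_sample_size)
```

The required n shrinks as p grows: with these defaults, p = 0.5 needs 5 observers and p = 0.9 needs only 2.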

Simulating from the null distribution


Recap: Simulating data from parametric models

  • Recall in lecture 8, we studied how to simulate data from parametric models.
library(tidyverse)

set.seed(1)
# Simulate y = 2x + 1 + e, with x ~ Uniform(0, 5) and e ~ N(0, 1)
df1 <- tibble(id = 1:200) %>%
  mutate(x = runif(n(), 0, 5),
         y = 2 * x + 1 + rnorm(n()))
ggplot(df1, aes(x, y)) + geom_point()

  • We also briefly discussed how to simulate data from the null distribution in lecture 11.

Case study 1 Testing for normality

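library(tidyverse)
library(nullabor)

# Data: 200 draws from Uniform(-4, 4)
set.seed(1)
df <- tibble(id = 1:200) %>%
  mutate(y = runif(n(), -4, 4))

# Lineup of 20 plots: nulls simulated from a normal distribution with the
# sample mean and standard deviation, data plot at position 4
set.seed(1)
ldf <- lineup(null_dist("y", dist = "norm",
                        params = list(mean = mean(df$y), sd = sd(df$y))),
              true = df, n = 20, pos = 4)
ggplot(ldf, aes(y)) +
  geom_histogram(color = "white") +
  facet_wrap(~.sample) +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks.length = unit(0, "pt"))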
  • We are testing H_0: Y \sim N(\mu, \sigma^2).

  • \mu is estimated by the sample mean, \hat{\mu} = \bar{Y}.

  • \sigma is estimated by the sample standard deviation, \hat{\sigma} = sd(Y).

  • Each null data set here is simply simulated from N(\hat{\mu}, \hat{\sigma}^2).
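Without nullabor, the null data sets for this test could be generated by hand along the following lines. This is a base R sketch under the same assumptions (200 observations, a lineup of 20 plots); the object names are illustrative.

```r
set.seed(1)
y <- runif(200, -4, 4)   # observed data (uniform, as in the case study)

# Estimate the parameters of the null distribution from the data
mu_hat <- mean(y)
sigma_hat <- sd(y)

# 19 null data sets (one per column), each the same size as the observed data
m <- 20
nulls <- replicate(m - 1, rnorm(length(y), mean = mu_hat, sd = sigma_hat))
dim(nulls)
```

Each column of nulls would be drawn as one null plot, with the data plot placed at a random position among them.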


Case study 2 Testing for a distribution

  • It is easier to compare distributions using a Q-Q plot.

  • Plot 4 is indeed the data plot.

  • In fact, the data were generated from a uniform distribution.
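A normal Q-Q plot of the data next to one of a null sample shows the kind of difference an observer would look for. The sketch below uses base R graphics rather than the lineup code from the slides, and the object names are illustrative.

```r
set.seed(1)
y <- runif(200, -4, 4)                                   # observed (uniform) data
null_y <- rnorm(length(y), mean = mean(y), sd = sd(y))   # one null sample

# Theoretical normal quantiles (x) against sample quantiles (y)
qq_data <- qqnorm(y, plot.it = FALSE)
qq_null <- qqnorm(null_y, plot.it = FALSE)

op <- par(mfrow = c(1, 2))
plot(qq_data, main = "Data", xlab = "Theoretical quantiles",
     ylab = "Sample quantiles")
qqline(y)
plot(qq_null, main = "Null sample", xlab = "Theoretical quantiles",
     ylab = "Sample quantiles")
qqline(null_y)
par(op)
```

The uniform data bend away from the reference line in both tails, whereas the null sample should follow it closely.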

Case study 3 Checking if there is a pattern in residual plot

  • In the left lineup, we are testing the data plot to see if there is any pattern.
  • When the null distribution is imprecise, for example when searching for a pattern in a residual plot, you need to choose a null generation method that mimics an appropriate distribution under the null.
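One possible null generation method for a residual plot is sketched below in base R. It assumes that, under the null, the fitted linear model is correct with independent normal errors. The data and the helper null_residuals() are hypothetical and not taken from the slides. The nullabor package provides null_lm() for this kind of task.

```r
set.seed(1)
# Hypothetical data with a quadratic trend that a straight-line fit misses
x <- runif(200, 0, 5)
y <- 1 + 2 * x + 0.5 * x^2 + rnorm(200)
fit <- lm(y ~ x)

# Simulate a response from the fitted model with normal errors, refit the
# same model and return its residuals (hypothetical helper)
null_residuals <- function(fit) {
  sigma_hat <- summary(fit)$sigma
  y_null <- fitted(fit) + rnorm(length(fitted(fit)), sd = sigma_hat)
  lm.fit(model.matrix(fit), y_null)$residuals
}

# 19 sets of null residuals, one per column, for a lineup of 20 plots
nulls <- replicate(19, null_residuals(fit))
dim(nulls)
```

Plotting each column of nulls against x gives residual plots with no pattern by construction, against which the observed residuals(fit) can be compared.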

Selecting an appropriate null generation method


Mis-specifying the null distribution

  • If the null distribution is mis-specified, this can make the detection probability larger.
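  • This, however, can result in an incorrect conclusion: the data plot may be detected because the null plots were generated wrongly, not because H_0 is false.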
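A hypothetical illustration in base R: the data below really are normal, so a test for normality should not reject, but the null samples are generated with the wrong standard deviation. All values and names here are made up for the example.

```r
set.seed(1)
y <- rnorm(200, mean = 0, sd = 2)   # the data are normal, so H_0 holds

# Mis-specified null: standard deviation fixed at 1 instead of estimated
bad_sds <- replicate(19, sd(rnorm(length(y), mean = mean(y), sd = 1)))

# Appropriate null: standard deviation estimated from the data
good_sds <- replicate(19, sd(rnorm(length(y), mean = mean(y), sd = sd(y))))

sd(y)            # spread of the data
range(bad_sds)   # spread of the mis-specified null samples
range(good_sds)  # spread of the appropriately specified null samples
```

With the mis-specified null, the data are visibly more spread out than every null sample, so the data plot would be easy to pick even though the data are normal.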

While today's focus was on data collection from visual inference surveys, concepts such as data quality checks and a sufficient sample size to draw inference are applicable to other forms of data collection.

There's always more to learn, but stay curious and make sure you plot your data before rushing off to fit models!


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
