
ETC5521: Exploratory Data Analysis


Using computational tools to determine whether what is seen in the data can be assumed to apply more broadly

Lecturer: Emi Tanaka

ETC5521.Clayton-x@monash.edu

Week 11 - Session 1


1/26

Revisiting hypothesis testing

2/26

Testing coin bias Part 1/2

  • Suppose I have a coin that I'm going to flip.
  • If the coin is unbiased, what is the probability it will show heads?
  • Yup, the probability should be 0.5.
  • So how would I test if a coin is biased or unbiased?
  • We'll collect some data.
  • Experiment 1: I flipped the coin 10 times and this is the result:
  • The result is 7 heads and 3 tails. So 70% are heads.
  • Do you believe the coin is biased based on this data?
3/26

Testing coin bias Part 2/2

  • Experiment 2: Suppose now I flip the coin 100 times and this is the outcome:

  • We observe 70 heads and 30 tails. So again 70% are heads.
  • Based on this data, do you think the coin is biased?
4/26

(Frequentist) hypothesis testing framework

  • Suppose X is the number of heads out of n independent tosses.
  • Let p be the probability of getting a head for this coin.
Hypotheses H0: p = 0.5 vs. H1: p ≠ 0.5
Assumptions Each toss is independent with equal chance of getting a head.
Test statistic X ∼ B(n, p). Recall E(X) = np.
The observed test statistic is denoted x.
P-value
(or critical value or confidence interval)
P(|X − np| ≥ |x − np|)
Conclusion Reject the null hypothesis when the p-value is less than
some significance level α. Usually α = 0.05.
  • The p-value for experiment 1 is P(|X − 5| ≥ 2) ≈ 0.34.
  • The p-value for experiment 2 is P(|X − 50| ≥ 20) ≈ 0.00008.
5/26
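The two p-values quoted above can be checked directly from the binomial distribution in R; a minimal sketch (the helper name `coin_pvalue` is ours, not from the slides):

```r
# Two-sided binomial p-value P(|X - np| >= |x - np|) for p = 0.5.
# By the symmetry of B(n, 0.5), this equals 2 * P(X <= min(x, n - x)),
# capped at 1 for the case x = n/2.
coin_pvalue <- function(x, n) {
  min(1, 2 * pbinom(min(x, n - x), n, 0.5))
}

p1 <- coin_pvalue(7, 10)    # Experiment 1: 7 heads in 10 tosses, ~0.34
p2 <- coin_pvalue(70, 100)  # Experiment 2: 70 heads in 100 tosses, ~0.00008
```

The same 70% heads is weak evidence of bias with 10 tosses but very strong evidence with 100 tosses.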

Judicial system

  • Evidence by test statistic
  • Judgement by p-value, critical value or confidence interval

Does the test statistic have to be a numerical summary statistic?

6/26

Visual inference

7/26

Visual inference

  • Hypothesis testing in the visual inference framework is where:

    • the test statistic is a plot, and
    • judgement is by human perception.
  • You (and many other people) actually do visual inference many times, but generally in an informal fashion.

  • Here, we are making an inference on whether the residual plot has any patterns based on a single data plot.

From Exercise 4 in week 9 tutorial: a residual plot after modelling high-density lipoprotein in human blood.

8/26

Data plots tend to be over-interpreted


Reading data plots requires calibration

9/26

Visual inference more formally

  1. State your null and alternate hypotheses.
  2. Define a visual test statistic, V(.), i.e. a function from a sample to a plot.
  3. Define a method to generate null data, \boldsymbol{y}_0.
  4. V(\boldsymbol{y}) maps the actual data, \boldsymbol{y}, to the plot. We call this the data plot.
  5. V(\boldsymbol{y}_0) maps null data to a plot of the same form. We call this a null plot. We repeat this m - 1 times to generate m - 1 null plots.
  6. A lineup displays these m plots in a random order.
  7. Ask n human viewers to select the plot in the lineup that looks different from the others, without any context given.

Suppose x out of n people detected the data plot from a lineup, then

  • the visual inference p-value is given as P(X \geq x) where X \sim B(n, 1/m), and
  • the power of a lineup is estimated as x/n.
10/26

Lineup 1 In which plot is the pink group higher than the blue group?

  • Note: there is no correct answer here.

11/26

Visual inference p-value (or "see"-value)

  • So x out of n of you chose the data plot.
  • So the visual inference p-value is P(X \geq x) where X \sim B(n, 1/10).
  • In R, this is
    1 - pbinom(x - 1, n, 1/10)
    # OR
    nullabor::pvisual(x, n, 10)
12/26
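As a worked example with hypothetical counts (not the actual class responses): suppose 6 of 20 viewers picked the data plot from a 10-plot lineup. Under random guessing each viewer has a 1/10 chance, so that many detections would be surprising:

```r
# Hypothetical counts: x of n viewers detect the data plot in an m-plot lineup.
x <- 6; n <- 20; m <- 10
pval <- 1 - pbinom(x - 1, n, 1/m)  # P(X >= x), X ~ B(n, 1/m)
pval  # approximately 0.011
```

`nullabor::pvisual(6, 20, 10)` gives the same quantity.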

Case study 1 Weight loss by diet

This is actually Plot 4 in the previous lineup.

data("WeightLoss", package = "carData")
# purposefully make it 2 groups
df <- filter(WeightLoss, group != "DietEx")
skimr::skim(df)
## ── Data Summary ────────────────────────
## Values
## Name df
## Number of rows 24
## Number of columns 7
## _______________________
## Column type frequency:
## factor 1
## numeric 6
## ________________________
## Group variables None
##
## ── Variable type: factor ───────────────────────────────
## skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 group 0 1 FALSE 2 Con: 12, Die: 12, Die: 0
##
## ── Variable type: numeric ──────────────────────────────
## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
## 1 wl1 0 1 4.92 1.25 3 4 5 6 7 ▃▇▇▆▃
## 2 wl2 0 1 3.62 1.24 2 3 3.5 4.25 6 ▆▇▇▅▂
## 3 wl3 0 1 2.17 1.13 1 1 2 3 4 ▇▅▁▅▃
## 4 se1 0 1 14.8 2.11 11 13 15 16.2 19 ▃▆▂▇▁
## 5 se2 0 1 14.0 2.35 11 11.8 14 15.2 19 ▇▇▆▅▂
## 6 se3 0 1 15.6 2.45 11 14 15 18 19 ▃▃▆▃▇
gweight <- ggplot(df, aes(group, wl1, color = group)) +
  ggbeeswarm::geom_quasirandom() +
  labs(x = "", y = "Weight loss at 1 month") +
  theme(text = element_text(size = 22)) +
  guides(color = "none") +
  scale_color_manual(values = c("#006DAE", "#ee64a4"))
gweight
  • Is weight loss greater with diet after 1 month?
Group N Mean Std. Dev
Control 12 4.50 1.00
Diet 12 5.33 1.37
with(df,
     t.test(wl1[group == "Diet"], wl1[group == "Control"],
            alternative = "greater"))
##
## Welch Two Sample t-test
##
## data: wl1[group == "Diet"] and wl1[group == "Control"]
## t = 1.7014, df = 20.125, p-value = 0.05213
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## -0.01117097 Inf
## sample estimates:
## mean of x mean of y
## 5.333333 4.500000
13/26

Null data generation method

  • We are testing H_0: \mu_{diet} = \mu_{control} vs. H_1: \mu_{diet} > \mu_{control} where \mu_{diet} and \mu_{control} are the average weight loss for the population on diet and no diet, respectively.

  • There are a number of ways to generate null data under H_0, e.g.

    • we could assume a parametric distribution of the data and estimate the parameters from the data, or

    • we could permute the labels for the diet and control group.

14/26
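The permutation approach can be sketched in a few lines: under H_0 the group label carries no information, so shuffling it produces data with no real group effect (a minimal sketch; the helper name `null_data` is ours):

```r
# Null data by permuting group labels for the WeightLoss case study.
data("WeightLoss", package = "carData")
df <- subset(WeightLoss, group != "DietEx")  # keep Control vs. Diet

null_data <- function(d) {
  d$group <- sample(d$group)  # shuffle labels: no group effect under H0
  d
}

# e.g. m - 1 = 19 null datasets; with the data plot, this makes a 20-plot lineup
nulls <- lapply(1:19, function(i) null_data(df))
```

The nullabor package automates this pattern: `nullabor::lineup(nullabor::null_permute("group"), df)` generates the permuted null datasets and randomises the position of the data plot for you.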

Lineup 2 In which plot is the pink group higher than the blue group?

15/26

Case study 1 Weight loss by diet and exercise

This is actually Plot 10 in the previous lineup.

df2 <- filter(WeightLoss, group != "Diet")
skimr::skim(df2)
## ── Data Summary ────────────────────────
## Values
## Name df2
## Number of rows 22
## Number of columns 7
## _______________________
## Column type frequency:
## factor 1
## numeric 6
## ________________________
## Group variables None
##
## ── Variable type: factor ───────────────────────────────
## skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 group 0 1 FALSE 2 Con: 12, Die: 10, Die: 0
##
## ── Variable type: numeric ──────────────────────────────
## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
## 1 wl1 0 1 5.27 1.86 3 4 5 6 9 ▇▃▃▂▂
## 2 wl2 0 1 4.59 1.87 2 3 4.5 5.75 9 ▇▅▇▃▁
## 3 wl3 0 1 2.14 1.17 1 1 2 3 4 ▇▅▁▃▃
## 4 se1 0 1 15.0 1.65 11 14 15 16 17 ▁▃▂▃▇
## 5 se2 0 1 13.9 1.93 11 12.2 13.5 15 18 ▇▇▇▃▂
## 6 se3 0 1 16.2 2.22 11 15 17 18 19 ▂▂▃▇▇
gweight %+% df2 +
  aes(y = wl2) +
  labs(y = "Weight loss at 2 months")
  • Is weight loss greater with diet and exercise after 2 months?
Group N Mean Std. Dev
Control 12 3.33 1.07
DietEx 10 6.10 1.45
with(df2,
     t.test(wl2[group == "DietEx"], wl2[group == "Control"],
            alternative = "greater"))
##
## Welch Two Sample t-test
##
## data: wl2[group == "DietEx"] and wl2[group == "Control"]
## t = 5.0018, df = 16.317, p-value = 6.155e-05
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 1.802104 Inf
## sample estimates:
## mean of x mean of y
## 6.100000 3.333333
16/26

What about if we change the visual test statistic?

17/26

Lineup 3 In which plot is the pink group higher than the blue group?

geom_point() Data plot is Plot 3

18/26

Lineup 4 In which plot is the pink group higher than the blue group?

geom_boxplot() Data plot is Plot 2

19/26

Lineup 5 In which plot is the pink group higher than the blue group?

geom_violin() Data plot is Plot 4

20/26

Lineup 6 In which plot is the pink group higher than the blue group?

ggbeeswarm::geom_quasirandom() Data plot is Plot 10

21/26

Case study 1 Weight loss by exercise

df3 <- filter(WeightLoss, group != "Control")
gweight %+% df3 +
  aes(y = wl2) +
  labs(y = "Weight loss at 2 months")
  • Is weight loss greater with exercise after 2 months?
Group N Mean Std. Dev
Diet 12 3.92 1.38
DietEx 10 6.10 1.45
with(df3,
     t.test(wl2[group == "DietEx"], wl2[group == "Diet"],
            alternative = "greater"))
##
## Welch Two Sample t-test
##
## data: wl2[group == "DietEx"] and wl2[group == "Diet"]
## t = 3.5969, df = 18.901, p-value = 0.0009675
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 1.133454 Inf
## sample estimates:
## mean of x mean of y
## 6.100000 3.916667
22/26

Power of a lineup

  • The power of a lineup is calculated as x/n where x is the number of people who detected the data plot out of n people


Plot type x n Power
geom_point x_1 n_1 x_1 / n_1
geom_boxplot x_2 n_2 x_2 / n_2
geom_violin x_3 n_3 x_3 / n_3
ggbeeswarm::geom_quasirandom x_4 n_4 x_4 / n_4


Hofmann, H., L. Follett, M. Majumder, and D. Cook. 2012. “Graphical Tests for Power Comparison of Competing Designs.” IEEE Transactions on Visualization and Computer Graphics 18 (12): 2441–48.

  • The plot type with a higher power is preferable

  • You can use this framework to find the optimal plot design

23/26
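The comparison in the table above can be sketched with made-up detection counts (the x and n values here are hypothetical illustrations, not results from an actual study):

```r
# Hypothetical lineup results for four plot designs; estimated power = x / n.
designs <- data.frame(
  design = c("geom_point", "geom_boxplot", "geom_violin",
             "ggbeeswarm::geom_quasirandom"),
  x = c(12, 18, 15, 20),  # hypothetical viewers who detected the data plot
  n = rep(30, 4)          # hypothetical viewers shown each lineup
)
designs$power <- designs$x / designs$n
designs[order(-designs$power), ]  # rank designs by estimated power
```

With these counts the quasirandom (beeswarm) display would be preferred, since the data plot was detected most often.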

Some considerations in visual inference

  • In practice you don't want to bias the judgement of the human viewers, so for a proper visual inference:
    • you should not show the data plot before the lineup,
    • you should not give the context of the data, and
    • you should remove labels in plots.
  • You can crowdsource these by paying for services like:
  • If the data is for research purposes, then you may need ethics approval for publication.
24/26

Resources and Acknowledgement

  • Buja, Andreas, Dianne Cook, Heike Hofmann, Michael Lawrence, Eun-Kyung Lee, Deborah F. Swayne, and Hadley Wickham. 2009. “Statistical Inference for Exploratory Data Analysis and Model Diagnostics.” Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 367 (1906): 4361–83.
  • Wickham, Hadley, Dianne Cook, Heike Hofmann, and Andreas Buja. 2010. “Graphical Inference for Infovis.” IEEE Transactions on Visualization and Computer Graphics 16 (6): 973–79.
  • Hofmann, H., L. Follett, M. Majumder, and D. Cook. 2012. “Graphical Tests for Power Comparison of Competing Designs.” IEEE Transactions on Visualization and Computer Graphics 18 (12): 2441–48.
  • Majumder, M., Heike Hofmann, and Dianne Cook. 2013. “Validation of Visual Statistical Inference, Applied to Linear Models.” Journal of the American Statistical Association 108 (503): 942–56.
  • Data coding using tidyverse suite of R packages
  • Slides constructed with xaringan, remark.js, knitr, and R Markdown.
25/26

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

