
ETC5521: Exploratory Data Analysis


Using computational tools to determine whether what is seen in the data can be assumed to apply more broadly

Lecturer: Emi Tanaka

ETC5521.Clayton-x@monash.edu

Week 11 - Session 1


1/26

Revisiting hypothesis testing

2/26

Testing coin bias Part 1/2

  • Suppose I have a coin that I'm going to flip.
  • If the coin is unbiased, what is the probability it will show heads?
  • Yup, the probability should be 0.5.
  • So how would I test if a coin is biased or unbiased?
  • We'll collect some data.
  • Experiment 1: I flipped the coin 10 times and this is the result:
  • The result is 7 heads and 3 tails. So 70% are heads.
  • Do you believe the coin is biased based on this data?
3/26

Testing coin bias Part 2/2

  • Experiment 2: Suppose now I flip the coin 100 times and this is the outcome:

  • We observe 70 heads and 30 tails. So again 70% are heads.
  • Based on this data, do you think the coin is biased?
4/26

(Frequentist) hypothesis testing framework

  • Suppose X is the number of heads out of n independent tosses.
  • Let p be the probability of getting a head for this coin.
Hypotheses H0: p = 0.5 vs. H1: p ≠ 0.5
Assumptions Each toss is independent with equal chance of getting a head.
Test statistic X ∼ B(n, p). Recall E(X) = np.
The observed test statistic is denoted x.
P-value
(or critical value or confidence interval)
P(|X − np| ≥ |x − np|)
Conclusion Reject the null hypothesis when the p-value is less than
some significance level α. Usually α = 0.05.
  • The p-value for experiment 1 is P(|X − 5| ≥ 2) ≈ 0.34.
  • The p-value for experiment 2 is P(|X − 50| ≥ 20) ≈ 0.00008.
5/26
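The two p-values quoted above can be checked directly from the binomial distribution in R; a minimal sketch (the helper name `coin_pvalue` is ours, not from the slides):

```r
# Two-sided binomial p-value P(|X - np| >= |x - np|) for p = 0.5.
# By the symmetry of B(n, 0.5), this equals 2 * P(X <= min(x, n - x)),
# capped at 1 for the case x = n/2.
coin_pvalue <- function(x, n) {
  min(1, 2 * pbinom(min(x, n - x), n, 0.5))
}

p1 <- coin_pvalue(7, 10)    # Experiment 1: 7 heads in 10 tosses, ~0.34
p2 <- coin_pvalue(70, 100)  # Experiment 2: 70 heads in 100 tosses, ~0.00008
```

The same 70% heads is weak evidence of bias with 10 tosses but very strong evidence with 100 tosses.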

Judicial system

  • Evidence by test statistic
  • Judgement by p-value, critical value or confidence interval

Does the test statistic have to be a numerical summary statistic?

6/26

Visual inference

7/26

Visual inference

  • Hypothesis testing in the visual inference framework is where:

    • the test statistic is a plot, and
    • judgement is by human perception.
  • You (and many other people) actually do visual inference many times, but generally in an informal fashion.

  • Here, we are making an inference on whether the residual plot has any patterns based on a single data plot.

From Exercise 4 in week 9 tutorial: a residual plot after modelling high-density lipoprotein in human blood.

8/26

Data plots tend to be over-interpreted


Reading data plots requires calibration

9/26

Visual inference more formally

  1. State your null and alternate hypotheses.
  2. Define a visual test statistic, V(.), i.e. a function from a sample to a plot.
  3. Define a method to generate null data, \boldsymbol{y}_0.
  4. V(\boldsymbol{y}) maps the actual data, \boldsymbol{y}, to the plot. We call this the data plot.
  5. V(\boldsymbol{y}_0) maps null data to a plot of the same form. We call this a null plot. We repeat this m - 1 times to generate m - 1 null plots.
  6. A lineup displays these m plots in a random order.
  7. Ask n human viewers to select the plot in the lineup that looks different from the others, without any context given.

Suppose x out of n people detected the data plot from a lineup, then

  • the visual inference p-value is given as P(X \geq x) where X \sim B(n, 1/m), and
  • the power of a lineup is estimated as x/n.
10/26

Lineup 1 In which plot is the pink group higher than the blue group?

  • Note: there is no correct answer here.

11/26

Visual inference p-value (or "see"-value)

  • So x out of n of you chose the data plot.
  • So the visual inference p-value is P(X \geq x) where X \sim B(n, 1/10).
  • In R, this is
    1 - pbinom(x - 1, n, 1/10)
    # OR
    nullabor::pvisual(x, n, 10)
12/26
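As a worked example with hypothetical counts (not the actual class responses): suppose 6 of 20 viewers picked the data plot from a 10-plot lineup. Under random guessing each viewer has a 1/10 chance, so that many detections would be surprising:

```r
# Hypothetical counts: x of n viewers detect the data plot in an m-plot lineup.
x <- 6; n <- 20; m <- 10
pval <- 1 - pbinom(x - 1, n, 1/m)  # P(X >= x), X ~ B(n, 1/m)
pval  # approximately 0.011
```

`nullabor::pvisual(6, 20, 10)` gives the same quantity.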

Case study 1 Weight loss by diet

This is actually Plot 4 in the previous lineup.

data("WeightLoss", package = "carData")
# purposefully make it 2 groups
df <- filter(WeightLoss, group != "DietEx")
skimr::skim(df)
## ── Data Summary ────────────────────────
## Values
## Name df
## Number of rows 24
## Number of columns 7
## _______________________
## Column type frequency:
## factor 1
## numeric 6
## ________________________
## Group variables None
##
## ── Variable type: factor ───────────────────────────────
## skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 group 0 1 FALSE 2 Con: 12, Die: 12, Die: 0
##
## ── Variable type: numeric ──────────────────────────────
## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
## 1 wl1 0 1 4.92 1.25 3 4 5 6 7 ▃▇▇▆▃
## 2 wl2 0 1 3.62 1.24 2 3 3.5 4.25 6 ▆▇▇▅▂
## 3 wl3 0 1 2.17 1.13 1 1 2 3 4 ▇▅▁▅▃
## 4 se1 0 1 14.8 2.11 11 13 15 16.2 19 ▃▆▂▇▁
## 5 se2 0 1 14.0 2.35 11 11.8 14 15.2 19 ▇▇▆▅▂
## 6 se3 0 1 15.6 2.45 11 14 15 18 19 ▃▃▆▃▇
gweight <- ggplot(df, aes(group, wl1, color = group)) +
  ggbeeswarm::geom_quasirandom() +
  labs(x = "", y = "Weight loss at 1 month") +
  theme(text = element_text(size = 22)) +
  guides(color = "none") +
  scale_color_manual(values = c("#006DAE", "#ee64a4"))
gweight
  • Is weight loss greater with diet after 1 month?
Group N Mean Std. Dev
Control 12 4.50 1.00
Diet 12 5.33 1.37
with(df,
     t.test(wl1[group == "Diet"], wl1[group == "Control"],
            alternative = "greater"))
##
## Welch Two Sample t-test
##
## data: wl1[group == "Diet"] and wl1[group == "Control"]
## t = 1.7014, df = 20.125, p-value = 0.05213
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## -0.01117097 Inf
## sample estimates:
## mean of x mean of y
## 5.333333 4.500000
13/26

Null data generation method

  • We are testing H_0: \mu_{diet} = \mu_{control} vs. H_1: \mu_{diet} > \mu_{control} where \mu_{diet} and \mu_{control} are the average weight loss for the population on diet and no diet, respectively.

  • There are a number of ways to generate null data under H_0, e.g.

    • we could assume a parametric distribution of the data and estimate the parameters from the data, or

    • we could permute the labels for the diet and control group.

14/26
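The permutation approach can be sketched in a few lines: under H_0 the group label carries no information, so shuffling it produces data with no real group effect (a minimal sketch; the helper name `null_data` is ours):

```r
# Null data by permuting group labels for the WeightLoss case study.
data("WeightLoss", package = "carData")
df <- subset(WeightLoss, group != "DietEx")  # keep Control vs. Diet

null_data <- function(d) {
  d$group <- sample(d$group)  # shuffle labels: no group effect under H0
  d
}

# e.g. m - 1 = 19 null datasets; with the data plot, this makes a 20-plot lineup
nulls <- lapply(1:19, function(i) null_data(df))
```

The nullabor package automates this pattern: `nullabor::lineup(nullabor::null_permute("group"), df)` generates the permuted null datasets and randomises the position of the data plot for you.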

Lineup 2 In which plot is the pink group higher than the blue group?

15/26

Case study 1 Weight loss by diet and exercise

This is actually Plot 10 in the previous lineup.

df2 <- filter(WeightLoss, group != "Diet")
skimr::skim(df2)
## ── Data Summary ────────────────────────
## Values
## Name df2
## Number of rows 22
## Number of columns 7
## _______________________
## Column type frequency:
## factor 1
## numeric 6
## ________________________
## Group variables None
##
## ── Variable type: factor ───────────────────────────────
## skim_variable n_missing complete_rate ordered n_unique top_counts
## 1 group 0 1 FALSE 2 Con: 12, Die: 10, Die: 0
##
## ── Variable type: numeric ──────────────────────────────
## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
## 1 wl1 0 1 5.27 1.86 3 4 5 6 9 ▇▃▃▂▂
## 2 wl2 0 1 4.59 1.87 2 3 4.5 5.75 9 ▇▅▇▃▁
## 3 wl3 0 1 2.14 1.17 1 1 2 3 4 ▇▅▁▃▃
## 4 se1 0 1 15.0 1.65 11 14 15 16 17 ▁▃▂▃▇
## 5 se2 0 1 13.9 1.93 11 12.2 13.5 15 18 ▇▇▇▃▂
## 6 se3 0 1 16.2 2.22 11 15 17 18 19 ▂▂▃▇▇
gweight %+% df2 +
  aes(y = wl2) +
  labs(y = "Weight loss at 2 months")
  • Is weight loss greater with diet and exercise after 2 months?
Group N Mean Std. Dev
Control 12 3.33 1.07
DietEx 10 6.10 1.45
with(df2,
     t.test(wl2[group == "DietEx"], wl2[group == "Control"],
            alternative = "greater"))
##
## Welch Two Sample t-test
##
## data: wl2[group == "DietEx"] and wl2[group == "Control"]
## t = 5.0018, df = 16.317, p-value = 6.155e-05
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 1.802104 Inf
## sample estimates:
## mean of x mean of y
## 6.100000 3.333333
16/26

What about if we change the visual test statistic?

17/26

Lineup 3 In which plot is the pink group higher than the blue group?

geom_point() Data plot is Plot 3

18/26

Lineup 4 In which plot is the pink group higher than the blue group?

geom_boxplot() Data plot is Plot 2

19/26

Lineup 5 In which plot is the pink group higher than the blue group?

geom_violin() Data plot is Plot 4

20/26

Lineup 6 In which plot is the pink group higher than the blue group?

ggbeeswarm::geom_quasirandom() Data plot is Plot 10

21/26

Case study 1 Weight loss by exercise

df3 <- filter(WeightLoss, group != "Control")
gweight %+% df3 +
  aes(y = wl2) +
  labs(y = "Weight loss at 2 months")
  • Is weight loss greater with exercise after 2 months?
Group N Mean Std. Dev
Diet 12 3.92 1.38
DietEx 10 6.10 1.45
with(df3,
     t.test(wl2[group == "DietEx"], wl2[group == "Diet"],
            alternative = "greater"))
##
## Welch Two Sample t-test
##
## data: wl2[group == "DietEx"] and wl2[group == "Diet"]
## t = 3.5969, df = 18.901, p-value = 0.0009675
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 1.133454 Inf
## sample estimates:
## mean of x mean of y
## 6.100000 3.916667
22/26

Power of a lineup

  • The power of a lineup is calculated as x/n where x is the number of people who detected the data plot out of n people


Plot type x n Power
geom_point x_1 n_1 x_1 / n_1
geom_boxplot x_2 n_2 x_2 / n_2
geom_violin x_3 n_3 x_3 / n_3
ggbeeswarm::geom_quasirandom x_4 n_4 x_4 / n_4


Hofmann, H., L. Follett, M. Majumder, and D. Cook. 2012. “Graphical Tests for Power Comparison of Competing Designs.” IEEE Transactions on Visualization and Computer Graphics 18 (12): 2441–48.

  • The plot type with a higher power is preferable

  • You can use this framework to find the optimal plot design

23/26
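The comparison in the table above can be sketched with made-up detection counts (the x and n values here are hypothetical illustrations, not results from an actual study):

```r
# Hypothetical lineup results for four plot designs; estimated power = x / n.
designs <- data.frame(
  design = c("geom_point", "geom_boxplot", "geom_violin",
             "ggbeeswarm::geom_quasirandom"),
  x = c(12, 18, 15, 20),  # hypothetical viewers who detected the data plot
  n = rep(30, 4)          # hypothetical viewers shown each lineup
)
designs$power <- designs$x / designs$n
designs[order(-designs$power), ]  # rank designs by estimated power
```

With these counts the quasirandom (beeswarm) display would be preferred, since the data plot was detected most often.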

Some considerations in visual inference

  • In practice you don't want to bias the judgement of the human viewers, so for a proper visual inference:
    • you should not show the data plot before the lineup,
    • you should not give the context of the data, and
    • you should remove labels in plots.
  • You can crowdsource these by paying for services like:
  • If the data is for research purposes, then you may need ethics approval for publication.
24/26

Resources and Acknowledgement

  • Buja, Andreas, Dianne Cook, Heike Hofmann, Michael Lawrence, Eun-Kyung Lee, Deborah F. Swayne, and Hadley Wickham. 2009. “Statistical Inference for Exploratory Data Analysis and Model Diagnostics.” Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 367 (1906): 4361–83.
  • Wickham, Hadley, Dianne Cook, Heike Hofmann, and Andreas Buja. 2010. “Graphical Inference for Infovis.” IEEE Transactions on Visualization and Computer Graphics 16 (6): 973–79.
  • Hofmann, H., L. Follett, M. Majumder, and D. Cook. 2012. “Graphical Tests for Power Comparison of Competing Designs.” IEEE Transactions on Visualization and Computer Graphics 18 (12): 2441–48.
  • Majumder, M., Heike Hofmann, and Dianne Cook. 2013. “Validation of Visual Statistical Inference, Applied to Linear Models.” Journal of the American Statistical Association 108 (503): 942–56.
  • Data coding using tidyverse suite of R packages
  • Slides constructed with xaringan, remark.js, knitr, and R Markdown.
25/26

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

