Advent of “Grammar”

background-color: #006DAE
class: middle center hide-slide-number

<div class="shade_black"  style="width:60%;right:0;bottom:0;padding:10px;border: dashed 4px white;margin: auto;">
<i class="fas fa-exclamation-circle"></i> These slides are best viewed in Chrome and occasionally need to be refreshed if elements do not load properly. See <a href="edibble-slides.pdf">here for the PDF <i class="fas fa-file-pdf"></i></a>. 
</div>

<br>

---

background-image: url(images/bg1.jpg)
background-size: cover
class: hide-slide-number split-70 title-slide
count: false

.column.shade_black[.content[

<br>

# .monash-blue.outline-text[Advent of "Grammar"]

<br>

<h2 style="font-weight:900!important;">Bridging Statistics and Data Science for Experimental Design</h2>

.bottom_abs.width100[

**Emi Tanaka**

Department of Econometrics and Business Statistics

<span><i class="fas  fa-envelope faa-float animated "></i></span>  emi.tanaka@monash.edu
<span><i class="fab  fa-twitter faa-float animated "></i></span>  @statsgen

15th July 2020 @ Monash Informal Bioinfo Seminar

<br>
]

]]

---

# .circle-big[1]

# Grammar of Graphics

<i class="fab fa-r-project blue"></i> `ggplot2` 📦

.footnote.monash-bg-blue[
Wickham (2016) ggplot2: Elegant Graphics for Data Analysis. *Springer-Verlag New York*
]

---

# What are the differences between these plots?

]
.item[

![](images/unnamed-chunk-2-1.png)

]
]

<div style="position:absolute; bottom:10%;left:30%;">
How do you construct these plots?
</div>

---

# Using R: `base`

```r
df
```

```
##       duty perc
## 1 Teaching   40
## 2 Research   40
## 3    Admin   20
```
]
.item.border-right[

```r
barplot(as.matrix(df$perc),
        legend = df$duty)
```

![](images/barplot-1.png)

]
.item[

```r
pie(df$perc, labels = df$duty)
```

![](images/pie-1.png)

]
]

.footnote[
R Core Team (2020) R: A Language and Environment for Statistical Computing https://www.R-project.org/
]

<div class="corner-box" style="bottom:50px;">
<ul>
<li><b>Single purpose functions</b> to generate "named plots"</li>
<li><b>Input</b> varies, here it is vector or matrix</li>
</ul>
</div>
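
The slides do not show how `df` was created. A minimal sketch, assuming it is a plain `data.frame` holding the values printed above:

```r
# Assumed construction of `df`; the values are copied from the printed output
df <- data.frame(
  duty = c("Teaching", "Research", "Admin"),
  perc = c(40, 40, 20)
)
df
```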

---

# Using R: `ggplot2`

```r
df
```

```
##       duty perc
## 1 Teaching   40
## 2 Research   40
## 3    Admin   20
```
]
.item.border-right[

```r
ggplot(df, aes(x = "", # dummy
               y = perc, 
               fill = duty)) + 
  geom_col()
```

![](images/ggbarplot-1.png)

]
.item[

]
]

.footnote[
Wilkinson (2005) The Grammar of graphics. *Statistics and Computing. Springer, 2nd edition.*

Wickham (2008) Practical Tools for Exploring Data and Models. *PhD Thesis Chapter 3: A layered grammar of graphics*.

Wickham (2010) A Layered Grammar of Graphics, *Journal of Computational and Graphical Statistics, 19:1, 3-28*

]

```r
ggplot(df, aes(x = "", # dummy
               y = perc, 
               fill = duty)) + 
  geom_col() + 
* coord_polar(theta = "y")
```

![](images/ggpie-1.png)

.corner-box[
* `ggplot2` implements the **grammar of graphics**
* the difference between a **stacked barplot** and a **pie chart** is that the coordinate system has been transformed from **Cartesian coordinates** to **polar coordinates**
]
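
One way to check this claim is to build both plots and inspect their coordinate systems. This is a sketch that assumes `df` is the small data frame printed earlier and that the plot object exposes its coordinate system as `$coordinates`:

```r
library(ggplot2)

# Assumed data, matching the printed `df`
df <- data.frame(duty = c("Teaching", "Research", "Admin"),
                 perc = c(40, 40, 20))

bar <- ggplot(df, aes(x = "", y = perc, fill = duty)) +
  geom_col()
pie <- bar + coord_polar(theta = "y")   # same data, mapping and layer

# Only the coordinate system differs between the two objects
class(bar$coordinates)[1]
class(pie$coordinates)[1]
```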

---

# Data <i class="fas fa-exchange-alt"></i> Plot

* `ggplot` takes **tidy data** as input, so constructing a plot enforces consistent thinking about the data
<details style="font-size:15pt;">
<summary>Tidy data principles</summary>
<ol>
<li>Each variable must have its own column.</li>
<li>Each observation must have its own row.</li>
<li>Each value must have its own cell.</li>
</ol>
</details>
* Variables are mapped to a plot aesthetic
* A plot is constructed from its components, as expressed by the **grammar of graphics**
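
To make the tidy data principles concrete, here is a small base R sketch. The `wide` table is a hypothetical untidy layout (one column per workload type); the reshaped `tidy` table has one row per observation and one column per variable:

```r
# Hypothetical untidy layout: workload types are spread across columns
wide <- data.frame(duty     = c("Teaching", "Research", "Admin"),
                   standard = c(40, 40, 20),
                   teaching = c(80, 0, 20))

# Stack the two percentage columns into a single `values` column
long <- stack(wide, select = c(standard, teaching))

# Tidy layout: each variable (duty, perc, type) has its own column
tidy <- data.frame(duty = rep(wide$duty, times = 2),
                   perc = long$values,
                   type = as.character(long$ind))
tidy
```

With the data in this shape, `type` can be mapped to an aesthetic just like `duty` and `perc`.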

]
.item[

]
]

]

---

# <i class="fas fa-puzzle-piece"></i> What graph will this yield?

```r
df2
```

```
## # A tibble: 6 x 3
##   duty      perc type    
##   <chr>    <dbl> <chr>   
## 1 Teaching    40 standard
## 2 Research    40 standard
## 3 Admin       20 standard
## 4 Teaching    80 teaching
## 5 Research     0 teaching
## 6 Admin       20 teaching
```

]
.item.border-right[

```r
g <- ggplot(df2, 
*       aes(x = type,
            y = perc, 
            fill = duty)) + 
        geom_col()
g
```

]
.item[

```r
*g + coord_polar("y")
```

]
]

---

# <i class="fas fa-puzzle-piece"></i> What graph will this yield?

```r
df2
```

]
.item.border-right[

```r
g <- ggplot(df2, 
*       aes(x = type,
            y = perc, 
            fill = duty)) + 
        geom_col()
g
```

![](images/barplot2-1.png)

]
.item[

```r
*g + coord_polar("y")
```

]
]

---

# <i class="fas fa-puzzle-piece"></i> What graph will this yield?

```r
df2
```

]
.item.border-right[

```r
g <- ggplot(df2, 
*       aes(x = type,
            y = perc, 
            fill = duty)) + 
        geom_col()
g
```

![](images/barplot2-1.png)

]
.item[

```r
*g + coord_polar("y")
```

![](images/pie2-1.png)

]
]

```r
g + coord_polar("x")
```

![](images/pie2x-1.png)

.corner-box[
* .yellow[**Modifiable**]: a `ggplot` object can be modified after it is created
* .yellow[**Generalisable**]: `ggplot2` uses a cohesive and complex system under the hood to make many kinds of plots
* .yellow[**Extensible**]: the system can be extended to make specialised plots or add more features if the same "grammar" is adopted
]

---

# .circle-big[2]

# Grammar of <br>Data Manipulation

<i class="fab fa-r-project blue"></i> `dplyr` 📦

.footnote.monash-bg-blue[
Wickham, François, Henry & Müller (2020) dplyr: A Grammar of Data Manipulation. R-package version 1.0.0.
]

---

# <i class="fas fa-ship"></i> Caught in the Wild: Wrangling Fisheries Data

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF")
samples <- samples[samples$GEAR_NAME=="McKenna trawl" & samples$GEAR_TYPE=="FISH TRAWL",]
## taxa data
taxa<- foreign::read.dbf("data/SEF_TAXA.DBF")
taxa <- taxa[taxa$SEF_SPCODE > 37000000 & taxa$SEF_SPCODE < 38000000 ,]
## clean sample data
for(i in 1:dim(samples)[1])
   if(!any(taxa$SEF_SPCODE==samples$SEF_SPCODE[i])) samples[i,] <- NA
samples <- samples[!is.na(samples$SEF_SPCODE),]
samp.cnt <- tapply(samples$SEF_SPCODE,samples$SEF_SPCODE,length)
samp.cnt <- samp.cnt[samp.cnt>=10]		
## taxa data
for(i in 1:dim(taxa)[1])
   if(!any(as.numeric(names(samp.cnt))==taxa$SEF_SPCODE[i])) taxa[i,] <- NA
taxa <- na.omit(taxa)
```
]

.footnote[
Bax & Williams (2000). Habitat and Fisheries Production in the South East Fishery Ecosystem. *Technical report. 94/040.*
]

.corner-box[
* Cleaning data is an important aspect of statistical work
* Providing code for reproducibility is important
* So... what are these lines of code doing?
]

---

# Data Manipulation: `base` & `dplyr`

]
.item50[

]
]

---

# Data Manipulation: `base` & `dplyr`

```r
*samples <- foreign::read.dbf("data/SEF_SAMP.DBF")
*samples <- samples[samples$GEAR_NAME=="McKenna trawl" & samples$GEAR_TYPE=="FISH TRAWL",]
## taxa data
taxa<- foreign::read.dbf("data/SEF_TAXA.DBF")
taxa <- taxa[taxa$SEF_SPCODE > 37000000 & taxa$SEF_SPCODE < 38000000 ,]
## clean sample data
for(i in 1:dim(samples)[1])
   if(!any(taxa$SEF_SPCODE==samples$SEF_SPCODE[i])) samples[i,] <- NA
samples <- samples[!is.na(samples$SEF_SPCODE),]
samp.cnt <- tapply(samples$SEF_SPCODE,samples$SEF_SPCODE,length)
samp.cnt <- samp.cnt[samp.cnt>=10]		
## taxa data
for(i in 1:dim(taxa)[1])
   if(!any(as.numeric(names(samp.cnt))==taxa$SEF_SPCODE[i])) taxa[i,] <- NA
taxa <- na.omit(taxa)
```
]

]
.item50[

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF") %>%
  filter(GEAR_NAME=="McKenna trawl" & GEAR_TYPE=="FISH TRAWL")
```

]
]

---

# Data Manipulation: `base` & `dplyr`

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF")
samples <- samples[samples$GEAR_NAME=="McKenna trawl" & samples$GEAR_TYPE=="FISH TRAWL",]
## taxa data
*taxa<- foreign::read.dbf("data/SEF_TAXA.DBF")
*taxa <- taxa[taxa$SEF_SPCODE > 37000000 & taxa$SEF_SPCODE < 38000000 ,]
## clean sample data
for(i in 1:dim(samples)[1])
   if(!any(taxa$SEF_SPCODE==samples$SEF_SPCODE[i])) samples[i,] <- NA
samples <- samples[!is.na(samples$SEF_SPCODE),]
samp.cnt <- tapply(samples$SEF_SPCODE,samples$SEF_SPCODE,length)
samp.cnt <- samp.cnt[samp.cnt>=10]		
## taxa data
for(i in 1:dim(taxa)[1])
   if(!any(as.numeric(names(samp.cnt))==taxa$SEF_SPCODE[i])) taxa[i,] <- NA
taxa <- na.omit(taxa)
```
]

]
.item50[

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF") %>%
  filter(GEAR_NAME=="McKenna trawl" & GEAR_TYPE=="FISH TRAWL")

*taxa <- foreign::read.dbf("data/SEF_TAXA.DBF") %>%
* filter(SEF_SPCODE > 37000000 & SEF_SPCODE < 38000000)
```

]
]

---

# Data Manipulation: `base` & `dplyr`

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF")
samples <- samples[samples$GEAR_NAME=="McKenna trawl" & samples$GEAR_TYPE=="FISH TRAWL",]
## taxa data
taxa<- foreign::read.dbf("data/SEF_TAXA.DBF")
taxa <- taxa[taxa$SEF_SPCODE > 37000000 & taxa$SEF_SPCODE < 38000000 ,]
## clean sample data
*for(i in 1:dim(samples)[1])
*  if(!any(taxa$SEF_SPCODE==samples$SEF_SPCODE[i])) samples[i,] <- NA
*samples <- samples[!is.na(samples$SEF_SPCODE),]
samp.cnt <- tapply(samples$SEF_SPCODE,samples$SEF_SPCODE,length)
samp.cnt <- samp.cnt[samp.cnt>=10]		
## taxa data
for(i in 1:dim(taxa)[1])
   if(!any(as.numeric(names(samp.cnt))==taxa$SEF_SPCODE[i])) taxa[i,] <- NA
taxa <- na.omit(taxa)
```
]

]
.item50[

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF") %>%
  filter(GEAR_NAME=="McKenna trawl" & GEAR_TYPE=="FISH TRAWL")

taxa <- foreign::read.dbf("data/SEF_TAXA.DBF") %>% 
  filter(SEF_SPCODE > 37000000 & SEF_SPCODE < 38000000) 
```

<br>

]

]
]

The intention here is that if a `SEF_SPCODE` is not within the `taxa` of interest, then you want to remove the corresponding sample.

<span style="font-size:18pt">(I did not make up this code. It is code used in practice; some code is simply harder for others to read.)</span>

---

# Data Manipulation: `base` & `dplyr`

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF")
samples <- samples[samples$GEAR_NAME=="McKenna trawl" & samples$GEAR_TYPE=="FISH TRAWL",]
## taxa data
taxa<- foreign::read.dbf("data/SEF_TAXA.DBF")
taxa <- taxa[taxa$SEF_SPCODE > 37000000 & taxa$SEF_SPCODE < 38000000 ,]
## clean sample data
*for(i in 1:dim(samples)[1])
*  if(!any(taxa$SEF_SPCODE==samples$SEF_SPCODE[i])) samples[i,] <- NA
*samples <- samples[!is.na(samples$SEF_SPCODE),]
samp.cnt <- tapply(samples$SEF_SPCODE,samples$SEF_SPCODE,length)
samp.cnt <- samp.cnt[samp.cnt>=10]		
## taxa data
for(i in 1:dim(taxa)[1])
   if(!any(as.numeric(names(samp.cnt))==taxa$SEF_SPCODE[i])) taxa[i,] <- NA
taxa <- na.omit(taxa)
```
]

]
.item50[

```r
taxa <- foreign::read.dbf("data/SEF_TAXA.DBF") %>% 
  filter(SEF_SPCODE > 37000000 & SEF_SPCODE < 38000000)

samples <- foreign::read.dbf("data/SEF_SAMP.DBF") %>%
  filter(GEAR_NAME=="McKenna trawl" & GEAR_TYPE=="FISH TRAWL") %>% 
* filter(SEF_SPCODE %in% taxa$SEF_SPCODE)
```

* Note: this is not to say that coding in `base` is bad!
* In this example, you could also write simpler `base` code.
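
As a sketch of what simpler `base` code could look like, the loop can be replaced by a single `%in%` subset. The two tables below are hypothetical toy data standing in for the real files, and `weight` is a made-up column:

```r
# Hypothetical toy data standing in for the SEF tables
taxa <- data.frame(SEF_SPCODE = c(37001001, 37002002))
samples <- data.frame(SEF_SPCODE = c(37001001, 99000000, 37002002, 37001001),
                      weight     = c(1.2, 0.4, 2.5, 0.9))

# Loop version, following the original script
loop_samples <- samples
for (i in seq_len(nrow(loop_samples)))
  if (!any(taxa$SEF_SPCODE == loop_samples$SEF_SPCODE[i])) loop_samples[i, ] <- NA
loop_samples <- loop_samples[!is.na(loop_samples$SEF_SPCODE), ]

# Vectorised base version: keep samples whose species code is in `taxa`
base_samples <- samples[samples$SEF_SPCODE %in% taxa$SEF_SPCODE, ]

# Both approaches keep the same species codes
all(loop_samples$SEF_SPCODE == base_samples$SEF_SPCODE)
```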

]
]

---

# Data Manipulation: `base` & `dplyr`

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF")
samples <- samples[samples$GEAR_NAME=="McKenna trawl" & samples$GEAR_TYPE=="FISH TRAWL",]
## taxa data
taxa<- foreign::read.dbf("data/SEF_TAXA.DBF")
taxa <- taxa[taxa$SEF_SPCODE > 37000000 & taxa$SEF_SPCODE < 38000000 ,]
## clean sample data
for(i in 1:dim(samples)[1]) 
   if(!any(taxa$SEF_SPCODE==samples$SEF_SPCODE[i])) samples[i,] <- NA 
samples <- samples[!is.na(samples$SEF_SPCODE),]
*samp.cnt <- tapply(samples$SEF_SPCODE,samples$SEF_SPCODE,length)
*samp.cnt <- samp.cnt[samp.cnt>=10]
## taxa data
for(i in 1:dim(taxa)[1])
   if(!any(as.numeric(names(samp.cnt))==taxa$SEF_SPCODE[i])) taxa[i,] <- NA
taxa <- na.omit(taxa)
```
]

]
.item50[

```r
taxa <- foreign::read.dbf("data/SEF_TAXA.DBF") %>% 
  filter(SEF_SPCODE > 37000000 & SEF_SPCODE < 38000000)

samples <- foreign::read.dbf("data/SEF_SAMP.DBF") %>%
  filter(GEAR_NAME=="McKenna trawl" & GEAR_TYPE=="FISH TRAWL") %>% 
  filter(SEF_SPCODE %in% taxa$SEF_SPCODE)

*spcode_ge10_samples <- samples %>%
* group_by(SEF_SPCODE) %>%
* summarise(n = n()) %>% # or just `tally()`
* filter(n >= 10) %>%
* pull(SEF_SPCODE)
```

In my opinion, `group_by` + `summarise` is the most compelling reason to use `dplyr` over its `base` counterparts.
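
A small sketch of the two approaches side by side, using hypothetical toy counts rather than the real fisheries data:

```r
library(dplyr)

# Hypothetical toy data: three species codes sampled 12, 3 and 10 times
samples <- data.frame(
  SEF_SPCODE = rep(c(37001001, 37002002, 37003003), times = c(12, 3, 10))
)

# dplyr: count samples per species, keep species with at least 10 samples
spcode_ge10 <- samples %>%
  group_by(SEF_SPCODE) %>%
  summarise(n = n()) %>%
  filter(n >= 10) %>%
  pull(SEF_SPCODE)

# base: the same count via tapply(); the codes come back as names
samp.cnt  <- tapply(samples$SEF_SPCODE, samples$SEF_SPCODE, length)
base_ge10 <- as.numeric(names(samp.cnt[samp.cnt >= 10]))

spcode_ge10
base_ge10
```

In the `base` version the species codes travel as character names and need converting back, which is the step the original script handles with `as.numeric(names(samp.cnt))`.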

]
]

---

# Data Manipulation: `base` & `dplyr`

```r
samples <- foreign::read.dbf("data/SEF_SAMP.DBF")
samples <- samples[samples$GEAR_NAME=="McKenna trawl" & samples$GEAR_TYPE=="FISH TRAWL",]
## taxa data
taxa<- foreign::read.dbf("data/SEF_TAXA.DBF")
taxa <- taxa[taxa$SEF_SPCODE > 37000000 & taxa$SEF_SPCODE < 38000000 ,]
## clean sample data
for(i in 1:dim(samples)[1]) 
   if(!any(taxa$SEF_SPCODE==samples$SEF_SPCODE[i])) samples[i,] <- NA 
samples <- samples[!is.na(samples$SEF_SPCODE),]
samp.cnt <- tapply(samples$SEF_SPCODE,samples$SEF_SPCODE,length) 
samp.cnt <- samp.cnt[samp.cnt>=10]		 
## taxa data
*for(i in 1:dim(taxa)[1])
*  if(!any(as.numeric(names(samp.cnt))==taxa$SEF_SPCODE[i])) taxa[i,] <- NA
*taxa <- na.omit(taxa)
```
]

]
.item50[

```r
taxa <- foreign::read.dbf("data/SEF_TAXA.DBF") %>% 
  filter(SEF_SPCODE > 37000000 & SEF_SPCODE < 38000000)

samples <- foreign::read.dbf("data/SEF_SAMP.DBF") %>%
  filter(GEAR_NAME=="McKenna trawl" & GEAR_TYPE=="FISH TRAWL") %>% 
  filter(SEF_SPCODE %in% taxa$SEF_SPCODE)

spcode_ge10_samples <- samples %>% 
  group_by(SEF_SPCODE) %>% 
  summarise(n = n()) %>% # or just `tally()` 
  filter(n >= 10) %>% 
  pull(SEF_SPCODE)

*taxa2 <- taxa %>%
* filter(SEF_SPCODE %in% spcode_ge10_samples)
```

]
]

---

# Reading code like English grammar

```r
taxa <- foreign::read.dbf("data/SEF_TAXA.DBF") %>% 
  filter(SEF_SPCODE > 37000000 & SEF_SPCODE < 38000000)

samples <- foreign::read.dbf("data/SEF_SAMP.DBF") %>%
  filter(GEAR_NAME=="McKenna trawl" & GEAR_TYPE=="FISH TRAWL") %>% 
  filter(SEF_SPCODE %in% taxa$SEF_SPCODE)

spcode_ge10_samples <- samples %>% 
  group_by(SEF_SPCODE) %>% 
  summarise(n = n()) %>% # or just `tally()` 
  filter(n >= 10) %>% 
  pull(SEF_SPCODE)

taxa2 <- taxa %>%  
  filter(SEF_SPCODE %in% spcode_ge10_samples) 
```

No intermediate naming to think or worry about!

]
.item[

* Think of `%>%` as "then".
* The code may read more naturally to an English speaker, even without much specialist knowledge.
* Of course this is biased in favour of people who know English!

]
]

---

# .circle-big[3]

# Grammar of <br>Genomic Data Transformation

<i class="fab fa-r-project blue"></i> `plyranges` 📦

.footnote.monash-bg-blue[
Lee, Cook & Lawrence (2019) plyranges: a grammar of genomic data transformation. *Genome Biology 20:1*.
]

---

# <i class="fas fa-dna"></i> Genomic Data from `HelloRangesData`

```r
exons
```

```
## GRanges object with 459752 ranges and 3 metadata columns:
##            seqnames            ranges strand |                               name     score       tx_id
##               <Rle>         <IRanges>  <Rle> |                        <character> <numeric> <character>
##        [1]     chr1       11874-12227      + |    NR_046018_exon_0_0_chr1_11874_f         0   NR_046018
##        [2]     chr1       12613-12721      + |    NR_046018_exon_1_0_chr1_12613_f         0   NR_046018
##        [3]     chr1       13221-14409      + |    NR_046018_exon_2_0_chr1_13221_f         0   NR_046018
##        [4]     chr1       14362-14829      - |    NR_024540_exon_0_0_chr1_14362_r         0   NR_024540
##        [5]     chr1       14970-15038      - |    NR_024540_exon_1_0_chr1_14970_r         0   NR_024540
##        ...      ...               ...    ... .                                ...       ...         ...
##   [459748]     chrY 59338754-59338859      + | NM_002186_exon_6_0_chrY_59338754_f         0   NM_002186
##   [459749]     chrY 59338754-59338859      + | NM_176786_exon_7_0_chrY_59338754_f         0   NM_176786
##   [459750]     chrY 59340194-59340278      + | NM_002186_exon_7_0_chrY_59340194_f         0   NM_002186
##   [459751]     chrY 59342487-59343488      + | NM_002186_exon_8_0_chrY_59342487_f         0   NM_002186
##   [459752]     chrY 59342487-59343488      + | NM_176786_exon_8_0_chrY_59342487_f         0   NM_176786
##   -------
##   seqinfo: 93 sequences from an unspecified genome; no seqlengths
```

]
.item50[

```r
gwas
```

```
## GRanges object with 17680 ranges and 1 metadata column:
##           seqnames    ranges strand |        name
##              <Rle> <IRanges>  <Rle> | <character>
##       [1]     chr1   1005806      * |   rs3934834
##       [2]     chr1   1079198      * |  rs11260603
##       [3]     chr1   1247494      * |     rs12103
##       [4]     chr1   2069172      * |    rs425277
##       [5]     chr1   2069681      * |   rs3753242
##       ...      ...       ...    ... .         ...
##   [17676]     chrX 154014107      * |   rs5987027
##   [17677]     chrX 154014107      * |   rs5987027
##   [17678]     chrX 154233774      * |  rs17281398
##   [17679]     chrY    940180      * |   rs4129148
##   [17680]     chrY   6477300      * |   rs5941160
##   -------
##   seqinfo: 93 sequences from an unspecified genome; no seqlengths
```

]
]

---

# <i class="fas fa-code"></i> `library(plyranges)`

```r
join_overlap_inner(gwas, exons)
```

```
## GRanges object with 3439 ranges and 4 metadata columns:
##          seqnames    ranges strand |      name.x                                 name.y     score        tx_id
##             <Rle> <IRanges>  <Rle> | <character>                            <character> <numeric>  <character>
##      [1]     chr1   1079198      * |  rs11260603      NR_038869_exon_2_0_chr1_1078119_f         0    NR_038869
##      [2]     chr1   1247494      * |     rs12103   NM_001256456_exon_1_0_chr1_1247398_r         0 NM_001256456
##      [3]     chr1   1247494      * |     rs12103   NM_001256460_exon_1_0_chr1_1247398_r         0 NM_001256460
##      [4]     chr1   1247494      * |     rs12103   NM_001256462_exon_1_0_chr1_1247398_r         0 NM_001256462
##      [5]     chr1   1247494      * |     rs12103   NM_001256463_exon_1_0_chr1_1247398_r         0 NM_001256463
##      ...      ...       ...    ... .         ...                                    ...       ...          ...
##   [3435]     chrX 153764217      * |   rs1050828 NM_001042351_exon_9_0_chrX_153764152_r         0 NM_001042351
##   [3436]     chrX 153764217      * |   rs1050828    NM_000402_exon_9_0_chrX_153764152_r         0    NM_000402
##   [3437]     chrX 153764217      * |   rs1050828 NM_001042351_exon_9_0_chrX_153764152_r         0 NM_001042351
##   [3438]     chrX 153764217      * |   rs1050828    NM_000402_exon_9_0_chrX_153764152_r         0    NM_000402
##   [3439]     chrX 153764217      * |   rs1050828 NM_001042351_exon_9_0_chrX_153764152_r         0 NM_001042351
##   -------
##   seqinfo: 93 sequences from an unspecified genome; no seqlengths
```
]
.item50[

Generate 2bp splice sites on either side of the exons:

```r
interweave(flank_left(exons, 2L),
           flank_right(exons, 2L), 
           .id = "side")
```

```
## GRanges object with 919504 ranges and 4 metadata columns:
##            seqnames            ranges strand |                               name     score       tx_id        side
##               <Rle>         <IRanges>  <Rle> |                        <character> <numeric> <character> <character>
##        [1]     chr1       11872-11873      + |    NR_046018_exon_0_0_chr1_11874_f         0   NR_046018        left
##        [2]     chr1       12228-12229      + |    NR_046018_exon_0_0_chr1_11874_f         0   NR_046018       right
##        [3]     chr1       12611-12612      + |    NR_046018_exon_1_0_chr1_12613_f         0   NR_046018        left
##        [4]     chr1       12722-12723      + |    NR_046018_exon_1_0_chr1_12613_f         0   NR_046018       right
##        [5]     chr1       13219-13220      + |    NR_046018_exon_2_0_chr1_13221_f         0   NR_046018        left
##        ...      ...               ...    ... .                                ...       ...         ...         ...
##   [919500]     chrY 59340279-59340280      + | NM_002186_exon_7_0_chrY_59340194_f         0   NM_002186       right
##   [919501]     chrY 59342485-59342486      + | NM_002186_exon_8_0_chrY_59342487_f         0   NM_002186        left
##   [919502]     chrY 59343489-59343490      + | NM_002186_exon_8_0_chrY_59342487_f         0   NM_002186       right
##   [919503]     chrY 59342485-59342486      + | NM_176786_exon_8_0_chrY_59342487_f         0   NM_176786        left
##   [919504]     chrY 59343489-59343490      + | NM_176786_exon_8_0_chrY_59342487_f         0   NM_176786       right
##   -------
##   seqinfo: 93 sequences from an unspecified genome; no seqlengths
```

]
]

---

# Data Science

is ...

]
.item[

]
]

<h1>Statistics for People?</h1>

---

# Data Science

is ...

# Statistics for People?

]
.item[

]
]

<div class="corner-box" style = "left:38%;width:57%;bottom:40px;">
<b>Reality:</b>
<ul>
<li>Most people who use statistics are <i>not</i> trained foremost as statisticians.</li>
<li>A lot of people need to use statistics.</li>
</ul>
</div>

.footnote.monash-bg-green2[
Wickham (2015) Teaching Safe-Stats, Not Statistical Abstinence. *The American Statistician, Online Discussion*
]

---

# Data Science

is ...

# Statistics for People?

]
.item[

]
]

<div class="corner-box" style = "left:38%;width:57%;bottom:40px;">
<ul>
If statistical tools <span class="yellow"><b>leverage the cognition</b></span> of an everyday person or an average statistics user and <span class="yellow"><b>provide a cohesive and consistent system</b></span>, rather than requiring users to learn new, specialised or ad-hoc methods, wouldn't that help in <span class="yellow"><b>making statistics accessible</b></span> and <span class="yellow"><b>promoting statistical literacy</b></span>?
</ul>
</div>

.footnote.monash-bg-green2[
Wickham & Grolemund (2017) R for Data Science: Import, Tidy, Transform, Visualize, and Model. *O'Reilly Media, Inc.*
]

---

# .circle-big[4]

# Grammar of <br>Experimental Design

<i class="fab fa-r-project blue"></i> `edibble` 📦

(Work-In-Progress)

---

# Typical course in experimental design <br>.font_small[(at least at University of Sydney in 2017-2019)]

Teach:

* Completely Randomised Design
* Randomised Complete Block Design
* Latin Square Design
* Balanced Incomplete Block Design
* Factorial Design
* <strike> 2</strike><sup>k</sup><strike> Factorial Design</strike> .font_small[(I removed this from 2018, I won't talk about this today)]
* Split-plot Design .font_small[(I added this from 2018 among other concepts)]

---

# Completely Randomised Design (CRD)

<br>

]

* `$t$` treatments randomised to `$n$` units

<br>

`$$\scriptsize \texttt{observation} = \texttt{mean} + \texttt{treatment} + \texttt{error}$$`
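
A minimal base R sketch of this randomisation, assuming equal replication `r` so that `$n = tr$`; the treatment labels and seed are arbitrary:

```r
set.seed(2020)                 # arbitrary seed, for reproducibility only
trt <- c("A", "B", "C")        # t = 3 treatments
r   <- 2                       # replicates per treatment, so n = 6 units

crd <- data.frame(
  unit = seq_len(length(trt) * r),
  trt  = sample(rep(trt, each = r))   # shuffle treatment labels over all units
)

table(crd$trt)                 # replication per treatment
```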

]

---

# Randomised Complete Block Design (RCBD)

<br>

]

* `$b$` blocks of size `$t$`
* `$t$` treatments randomised to `$t$` units within each block

`$$\scriptsize \texttt{observation} = \texttt{mean} + \texttt{treatment} + \texttt{block} + \texttt{error}$$`
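
A minimal base R sketch, assuming `$b = 2$` blocks and three arbitrary treatment labels. The only change from a completely randomised design is that the shuffle is done separately within each block:

```r
set.seed(2020)                 # arbitrary seed, for reproducibility only
trt <- c("A", "B", "C")        # t = 3 treatments
b   <- 2                       # number of blocks, each of size t

rcbd <- do.call(rbind, lapply(seq_len(b), function(block) {
  data.frame(block = block,
             unit  = seq_along(trt),
             trt   = sample(trt))   # independent shuffle within this block
}))

table(rcbd$block, rcbd$trt)    # block by treatment counts
```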

]

---

# Latin Square Design (LSD)

<br>

]

* two orthogonal blocking factors (rows and columns), each with `$t$` levels
* `$t$` treatments randomised to units such that every treatment appears exactly once in each row and each column

`$$\scriptsize \texttt{observation} = \texttt{mean} + \texttt{treatment} + \texttt{row} + \texttt{column} + \texttt{error}$$`
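
One simple way to generate such a layout is to start from a cyclic square and permute its rows and columns. This is only a sketch: it always yields a valid Latin square, but it does not sample uniformly from all Latin squares, and the treatment labels are arbitrary:

```r
set.seed(2020)                       # arbitrary seed, for reproducibility only
trt   <- c("A", "B", "C", "D")
n_trt <- length(trt)

# Cyclic square: cell (i, j) holds treatment index ((i + j - 2) mod t) + 1
square <- outer(seq_len(n_trt), seq_len(n_trt),
                function(i, j) (i + j - 2) %% n_trt + 1)

# Randomly permute rows and columns; the Latin property is preserved
square <- square[sample(n_trt), sample(n_trt)]

lsd <- matrix(trt[square], nrow = n_trt,
              dimnames = list(row = seq_len(n_trt), col = seq_len(n_trt)))
lsd
```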

]

---

# Balanced Incomplete Block Design (BIBD)
 
.grid[

<br>

]

* `$b$` blocks of size `$k < t$`
* `$t$` treatments randomised to units within each block such that every pair of treatments appears together in the same number of blocks

`$$\scriptsize \texttt{observation} = \texttt{mean} + \texttt{block} + \texttt{treatment} + \texttt{error}$$`
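
Taking every size-`$k$` subset of the treatments as a block always gives a balanced incomplete block design, although usually not the smallest one. A base R sketch for `$t = 3$`, `$k = 2$` with arbitrary labels:

```r
set.seed(2020)                        # arbitrary seed, for reproducibility only
trt <- c("A", "B", "C")               # t = 3 treatments
k   <- 2                              # block size, k < t

blocks <- combn(trt, k)               # one column per block: all k-subsets
blocks <- apply(blocks, 2, sample)    # randomise unit order within each block
blocks <- blocks[, sample(ncol(blocks)), drop = FALSE]  # randomise block order

bibd <- data.frame(block = rep(seq_len(ncol(blocks)), each = k),
                   trt   = as.vector(blocks))

# Concurrence matrix: diagonal = replication, off-diagonal = lambda
incidence   <- unclass(table(bibd$trt, bibd$block))
concurrence <- incidence %*% t(incidence)
concurrence
```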

]

---

# Factorial Design

<br>

]

* `$ab = t$` treatments randomised to `$n$` units
* treatment is every combination of two factors A and B

`$$\scriptsize \texttt{observation} = \texttt{mean} + \texttt{A} + \texttt{B} + \texttt{A:B} + \texttt{error}$$`
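
A base R sketch of a completely randomised `$3 \times 2$` factorial with two replicates per combination; the level names are hypothetical:

```r
set.seed(2020)                          # arbitrary seed, for reproducibility only

# Every combination of the two factors is one treatment (ab = 6)
trts <- expand.grid(A = c("a1", "a2", "a3"),
                    B = c("b1", "b2"),
                    stringsAsFactors = FALSE)
r <- 2                                  # replicates per combination, n = 12

# Shuffle the replicated treatment indices over the units
alloc <- trts[sample(rep(seq_len(nrow(trts)), each = r)), ]
fac_design <- data.frame(unit = seq_len(nrow(alloc)),
                         A    = alloc$A,
                         B    = alloc$B)

table(fac_design$A, fac_design$B)       # replication per combination
```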

<center>
<img src="images/factorial-eg1-anova-top.png" width = "850px"/>
<details style="font-size:4pt"><summary></summary>
<img src="images/factorial-eg1-anova-middle.png" width = "850px"/>
</details>
<img src="images/factorial-eg1-anova-bottom.png" width = "850px"/>
</center>
]

]

---

# Split-plot Design

<br>

]

<ul>
<li> $n_1$ whole plots consisting of $b$ sub plots</li>
<li>in total there are $n$ sub plots</li>
<li>treatment factor A is randomised to whole plots</li>
<li>treatment factor B is randomised to sub plots within each whole plot</li>
</ul>

`$$\scriptsize \texttt{observation} = \texttt{mean} + \texttt{A} + \texttt{WP} + \texttt{B} + \texttt{A:B} + \texttt{error}$$`
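
A base R sketch of the two-stage randomisation, assuming factor A has two levels replicated twice over four whole plots and factor B has four levels; labels and seed are arbitrary:

```r
set.seed(2020)                       # arbitrary seed, for reproducibility only
trt_A <- c("I", "R")                 # whole-plot factor
trt_B <- LETTERS[1:4]                # sub-plot factor
r     <- 2                           # whole plots per level of A
n_wp  <- length(trt_A) * r           # 4 whole plots in total

# Stage 1: randomise A to whole plots
wp_trt <- sample(rep(trt_A, times = r))

# Stage 2: randomise B to sub plots within each whole plot
split_plot <- do.call(rbind, lapply(seq_len(n_wp), function(w) {
  data.frame(wholeplot = w,
             subplot   = seq_along(trt_B),
             A         = wp_trt[w],
             B         = sample(trt_B))
}))

table(split_plot$wholeplot, split_plot$A)
```

Every sub plot in a whole plot shares the same level of A, which is why the model needs the whole-plot term `WP`.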

<center>
<img src="images/split-plot-eg1-anova.png" width = "850px"/>
</center>
]

</div>

]

---

# CRAN Task View of Design of Experiments

contains

# 📦 .yellow[229 R-packages ]

---

# Top 10 downloaded R-packages in 2018

![](images/top10-1.png)

# `agricolae` is the most downloaded

.font_small[Please note that `agricolae` imports `AlgDesign`, and more recent data show `AlgDesign` at the top, followed by `agricolae`. ]

---

# `agricolae::design.crd`

.blue[**Completely randomised design**] for `$t = 3$` treatments with `$2$` replicates each
<pre><code>
trt <- c("A", "B", "C")
agricolae::.bg-yellow[design.crd](trt = trt, r = 2) %>% glimpse()
</code></pre>

```
## List of 2
##  $ parameters:List of 7
##   ..$ design: chr "crd"
##   ..$ trt   : chr [1:3] "A" "B" "C"
##   ..$ r     : num [1:3] 2 2 2
##   ..$ serie : num 2
##   ..$ seed  : int 396269021
##   ..$ kinds : chr "Super-Duper"
##   ..$       : logi TRUE
##  $ book      :'data.frame':	6 obs. of  3 variables:
##   ..$ plots: num [1:6] 101 102 103 104 105 106
##   ..$ r    : int [1:6] 1 1 2 1 2 2
##   ..$ trt  : chr [1:6] "C" "A" "C" "B" ...
```
]

</div>

---

# `agricolae::design.rcbd`

<pre><code>
trt <- c("A", "B", "C")
agricolae::.bg-yellow[design.rcbd](trt = trt, r = 2) %>% glimpse()
</code></pre>

```
## List of 3
##  $ parameters:List of 7
##   ..$ design: chr "rcbd"
##   ..$ trt   : chr [1:3] "A" "B" "C"
##   ..$ r     : num 2
##   ..$ serie : num 2
##   ..$ seed  : int 973575053
##   ..$ kinds : chr "Super-Duper"
##   ..$       : logi TRUE
##  $ sketch    : chr [1:2, 1:3] "C" "A" "B" "C" ...
##  $ book      :'data.frame':	6 obs. of  3 variables:
##   ..$ plots: num [1:6] 101 102 103 201 202 203
##   ..$ block: Factor w/ 2 levels "1","2": 1 1 1 2 2 2
##   ..$ trt  : Factor w/ 3 levels "A","B","C": 3 2 1 1 3 2
```
]

<br>

</div>

---

# `agricolae::design.lsd()`

<pre><code>
trt <- c("A", "B", "C")
agricolae::.bg-yellow[design.lsd](trt = trt) %>% glimpse()
</code></pre>

```
## List of 3
##  $ parameters:List of 7
##   ..$ design: chr "lsd"
##   ..$ trt   : chr [1:3] "A" "B" "C"
##   ..$ r     : int 3
##   ..$ serie : num 2
##   ..$ seed  : int -1984440067
##   ..$ kinds : chr "Super-Duper"
##   ..$       : logi TRUE
##  $ sketch    : chr [1:3, 1:3] "B" "C" "A" "A" ...
##  $ book      :'data.frame':	9 obs. of  4 variables:
##   ..$ plots: num [1:9] 101 102 103 201 202 203 301 302 303
##   ..$ row  : Factor w/ 3 levels "1","2","3": 1 1 1 2 2 2 3 3 3
##   ..$ col  : Factor w/ 3 levels "1","2","3": 1 2 3 1 2 3 1 2 3
##   ..$ trt  : Factor w/ 3 levels "A","B","C": 2 1 3 3 2 1 1 3 2
```
]

<br>

</div>

---

# `agricolae::design.bib()`

<pre><code>
trt <- c("A", "B", "C")
agricolae::.bg-yellow[design.bib](trt = trt, k = 2) %>% glimpse()
</code></pre>

```
## [1] "No improvement over initial random design."
## 
## Parameters BIB
## ==============
## Lambda     : 1
## treatmeans : 3
## Block size : 2
## Blocks     : 3
## Replication: 2 
## 
## Efficiency factor 0.75 
## 
## <<< Book >>>
```

```
## List of 4
##  $ parameters:List of 6
##   ..$ design: chr "bib"
##   ..$ trt   : chr [1:3] "A" "B" "C"
##   ..$ k     : num 2
##   ..$ serie : num 2
##   ..$ seed  : int -509134623
##   ..$ kinds : chr "Super-Duper"
##  $ statistics:'data.frame':	1 obs. of  6 variables:
##   ..$ lambda    : num 1
##   ..$ treatmeans: int 3
##   ..$ blockSize : num 2
##   ..$ blocks    : int 3
##   ..$ r         : num 2
##   ..$ Efficiency: num 0.75
##  $ sketch    : chr [1:3, 1:2] "C" "B" "B" "A" ...
##  $ book      :'data.frame':	6 obs. of  3 variables:
##   ..$ plots: num [1:6] 101 102 201 202 301 302
##   ..$ block: Factor w/ 3 levels "1","2","3": 1 1 2 2 3 3
##   ..$ trt  : Factor w/ 3 levels "A","B","C": 3 1 2 1 2 3
```
]

<br>

</div>

---

# `agricolae::design.ab()`

.blue[**Factorial design**] for `$t = 3 \times 2$` treatments with `$2$` replicates of each treatment

<pre><code>
agricolae::.bg-yellow[design.ab](trt = c(3, 2), r = 2, design = "crd") %>% glimpse()
</code></pre>

```
## List of 2
##  $ parameters:List of 8
##   ..$ design : chr "factorial"
##   ..$ trt    : chr [1:6] "1 1" "1 2" "2 1" "2 2" ...
##   ..$ r      : num [1:6] 2 2 2 2 2 2
##   ..$ serie  : num 2
##   ..$ seed   : int 801301585
##   ..$ kinds  : chr "Super-Duper"
##   ..$        : logi TRUE
##   ..$ applied: chr "crd"
##  $ book      :'data.frame':	12 obs. of  4 variables:
##   ..$ plots: num [1:12] 101 102 103 104 105 106 107 108 109 110 ...
##   ..$ r    : int [1:12] 1 1 1 2 1 1 1 2 2 2 ...
##   ..$ A    : chr [1:12] "2" "3" "2" "2" ...
##   ..$ B    : chr [1:12] "2" "2" "1" "2" ...
```
]

Note: this is *not* A/B testing!

<br>

<br>

</div>

---

# `agricolae::design.split()`

.blue[**Split-plot design**] for `$t = 2 \times 4$` treatments with `$2$` replicates of each treatment

<pre><code>
trt1 <- c("I", "R"); trt2 <- LETTERS[1:4]
agricolae::.bg-yellow[design.split](trt1 = trt1, trt2 = trt2, r = 2, design = "crd") %>% 
    glimpse()
</code></pre>

```
## List of 2
##  $ parameters:List of 8
##   ..$ design : chr "split"
##   ..$        : logi TRUE
##   ..$ trt1   : chr [1:2] "I" "R"
##   ..$ applied: chr "crd"
##   ..$ r      : num [1:2] 2 2
##   ..$ serie  : num 2
##   ..$ seed   : int -765020087
##   ..$ kinds  : chr "Super-Duper"
##  $ book      :'data.frame':	16 obs. of  5 variables:
##   ..$ plots : num [1:16] 101 101 101 101 102 102 102 102 103 103 ...
##   ..$ splots: Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
##   ..$ r     : int [1:16] 1 1 1 1 1 1 1 1 2 2 ...
##   ..$ trt1  : chr [1:16] "I" "I" "I" "I" ...
##   ..$ trt2  : chr [1:16] "D" "A" "C" "B" ...
```
]

</div>

---

#   `library(edibble)` <i class="fas fa-wrench"></i> WIP <br>.font_small[https://github.com/emitanaka/edibble<br>(sorry code not ready yet for prime time, please enjoy prototype demo instead)]

* `tibble` R-package is a modern reimagining of the `data.frame`
* `edibble` (WIP) creates experimental design tibbles
{{content}}

--
* 🤔 "named experimental design" functions (`agricolae::design.crd`, etc.) are like "named statistical graphic" functions (`pie`, `barplot`)
{{content}}
--
* 💡 construction of experimental design needs to be made more accessible, modifiable, extensible and generalisable
{{content}}
--
* What are experimental designs composed of?

---

# Grammar of Experimental Design

* Consider a field experiment with 120 plots

]
.item[

```r
library(edibble)
edibble(seed = 2020) %>% 
  set_units(plot = 120)
```

```
## # An edibble: 120 x 1
##       plot
##     <unit>
##  1 plot001
##  2 plot002
##  3 plot003
##  4 plot004
##  5 plot005
##  6 plot006
##  7 plot007
##  8 plot008
##  9 plot009
## 10 plot010
## # … with 110 more rows
```
]
]

---

# Prototype Grammar of Experimental Design

* Consider a field experiment with 120 plots
* There are 60 wheat varieties to test

]
.item50[

```r
library(edibble)
edibble(seed = 2020) %>% 
  set_units(plot = 120) %>% 
  set_trts(var = 60)
```

```
## # An edibble: 0 x 2
## # … with 2 variables: plot <unit>, var <trt>
```

```
## Warning: `plot` and `var` have no connection with other variables.
```

]
]

---

# Prototype Grammar of Experimental Design .circle[1]

* Consider a field experiment with 120 plots.
* There are 60 wheat varieties to test.
* Completely randomise wheat varieties to plots.

Resulting design is a .blue[completely randomised design].

]
.item50[

```r
library(edibble)
edibble(seed = 2020) %>% 
  set_units(plot = 120) %>% 
  set_trts(var = 60) %>% 
  randomise_trts(var ~ plot)
```

```
## # An edibble: 120 x 2
##       plot   var
##     <unit> <trt>
##  1 plot001 var49
##  2 plot002 var28
##  3 plot003 var25
##  4 plot004 var33
##  5 plot005 var36
##  6 plot006 var54
##  7 plot007 var16
##  8 plot008 var52
##  9 plot009 var52
## 10 plot010 var10
## # … with 110 more rows
```

]
]
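
For comparison, the same kind of allocation can be sketched in base R. This is not how `edibble` works internally; it assumes each of the 60 varieties is replicated exactly twice across the 120 plots, and the object names are made up for illustration.

```r
set.seed(2020)  # arbitrary seed, for reproducibility only

plots     <- sprintf("plot%03d", 1:120)
varieties <- sprintf("var%02d", 1:60)

# Assumption: every variety appears on exactly two plots (120 / 60 = 2).
# Shuffling the replicated labels gives a completely randomised allocation.
crd <- data.frame(
  plot = plots,
  var  = sample(rep(varieties, times = 2)),
  stringsAsFactors = FALSE
)

head(crd)
table(table(crd$var))  # tabulates how often each variety occurs
```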

---

# Prototype Grammar of Experimental Design .circle[2]

* Consider a field experiment with 2 blocks each with 60 plots.
* There are 60 wheat varieties to test.
* Completely randomise wheat varieties to plots within block.

Resulting design is a .blue[randomised complete block design].

<br>

Can you see how it differs from the previous design?

]
.item50[

```r
library(edibble)
edibble(seed = 2020) %>% 
* set_units(block = c("B1", "B2"),
            plot = within(block, 60)) %>% 
  set_trts(var = 60) %>% 
  randomise_trts(var ~ plot) %>% 
* restrict_mapping(ed_nest(plot = block))
```

```
## # An edibble: 120 x 3
##     block    plot   var
##    <unit>  <unit> <trt>
##  1     B1 plot001 var34
##  2     B1 plot002 var56
##  3     B1 plot003 var50
##  4     B1 plot004 var02
##  5     B1 plot005 var07
##  6     B1 plot006 var53
##  7     B1 plot007 var44
##  8     B1 plot008 var31
##  9     B1 plot009 var39
## 10     B1 plot010 var40
## # … with 110 more rows
```

]
]
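
A base R sketch of the same idea, for comparison only: the restriction means each block receives its own independent permutation of all 60 varieties. The object names are hypothetical and this is not the `edibble` implementation.

```r
set.seed(2020)  # arbitrary seed, for reproducibility only

blocks    <- c("B1", "B2")
varieties <- sprintf("var%02d", 1:60)

rcbd <- data.frame(
  block = rep(blocks, each = 60),
  plot  = sprintf("plot%03d", 1:120),
  # one independent permutation of all 60 varieties per block
  var   = unlist(lapply(blocks, function(b) sample(varieties))),
  stringsAsFactors = FALSE
)

# block-by-variety counts for the first five varieties
table(rcbd$block, rcbd$var)[, 1:5]
```

Unlike the completely randomised design, every variety is guaranteed to occur once in each block.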

---

# Visualising the Experimental Design

```r
des1 <- edibble(seed = 2020) %>% 
  set_units(row = 6, 
            col = 6, 
            pot =~ row * col) %>% # 36 pots in total
  set_trts(irrigation = c("Y", "N"),
           variety = c("V1", "V2", "V3")) %>% 
  randomise_trts(irrigation * variety ~ pot)
des2 <- des1 %>% restrict_mapping(ed_nest(pot = c(col, row)))
des3 <- des1 %>% restrict_mapping(irrigation ~ col)
```

* `des1` is a completely randomised design
* `des2` is a Latin Square design
* `des3` is a split-plot design

]
.item[

]
]
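
As a rough illustration of what restricting `irrigation` to `col` implies, here is a base R sketch of a split-plot allocation. It assumes irrigation is balanced over columns (3 columns each) and that each variety appears twice per column; `edibble` may resolve these details differently.

```r
set.seed(2020)  # arbitrary seed, for reproducibility only

# 6 x 6 grid of pots
des <- expand.grid(row = 1:6, col = 1:6)

# Step 1: irrigation is randomised to whole columns (assumed 3 columns each).
irrigation_by_col <- sample(rep(c("Y", "N"), each = 3))
des$irrigation <- irrigation_by_col[des$col]

# Step 2: variety is randomised to pots separately within each column
# (assumed: each variety twice per column).
des$variety <- NA_character_
for (j in 1:6) {
  in_col <- des$col == j
  des$variety[in_col] <- sample(rep(c("V1", "V2", "V3"), times = 2))
}

# every column carries a single irrigation level
table(des$col, des$irrigation)
```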

.corner-box[
Small touches to help user:
* The graphical object is a `ggplot`, so the same `ggplot2` grammar can be used to customise it.
* The figure keeps a fixed width-to-height ratio for easy viewing.
* Factorial experiments: the treatment factor with the higher number of levels is mapped to the **hue of the color**, and the other treatment factor is mapped to the **shade of color**.
]
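
A sketch of these touches written directly in `ggplot2`, since the plotting function of the prototype is not shown here. The data frame `des` and its column names are made up for illustration, and transparency (`alpha`) stands in for the shade of the color.

```r
library(ggplot2)

set.seed(2020)  # arbitrary seed, for reproducibility only

# Hypothetical 6 x 6 layout with a completely randomised 2 x 3 factorial
des  <- expand.grid(row = 1:6, col = 1:6)
trts <- expand.grid(irrigation = c("Y", "N"), variety = c("V1", "V2", "V3"))
idx  <- sample(rep(seq_len(nrow(trts)), times = 6))  # 36 treatment slots
des$irrigation <- trts$irrigation[idx]
des$variety    <- trts$variety[idx]

# variety (3 levels) -> hue; irrigation (2 levels) -> shade via alpha
p <- ggplot(des, aes(x = col, y = row, fill = variety, alpha = irrigation)) +
  geom_tile(colour = "white") +
  scale_alpha_manual(values = c(Y = 1, N = 0.5)) +
  coord_fixed()  # fixed aspect ratio

# because `p` is a ggplot, the usual grammar applies for customisation
p + labs(x = "Column", y = "Row") + theme_minimal()
```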

---

# Unbalanced or non-orthogonal experiments .circle[1]

```r
edibble() %>% 
  set_units(class = c("Maths" = 2, "Stats" = 4)) # 2 Maths classes and 4 Stats classes

# OR

edibble() %>% 
  set_units(class = traits(labels = c("Maths", "Stats"), replication = c(2, 4)))

# OR

edibble() %>% 
  set_units(class = c("Maths", "Maths", "Stats", "Stats", "Stats", "Stats")) 
```

* Under the hood, the units (and treatments) are all set by `traits`.
* Shorthand inputs for `set_units` and `set_trts` are (a) a single integer, (b) an unnamed vector, (c) a named vector and (d) a formula.
* To avoid ambiguity, the user can always use `traits` instead, e.g. (a) and (b) may not be distinguishable if the vector has length 1.
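
The equivalence of the three inputs above can be checked with a small base R sketch. It only mimics how labels and replication counts expand into units; it does not call `traits`, and the variable names are hypothetical.

```r
# (c) named vector: names are labels, values are replication counts
named      <- c("Maths" = 2, "Stats" = 4)
from_named <- rep(names(named), times = named)

# explicit labels and replication, mirroring the two arguments shown for `traits`
labels      <- c("Maths", "Stats")
replication <- c(2, 4)
from_traits <- rep(labels, times = replication)

# (b) unnamed vector: every element is one unit
from_vector <- c("Maths", "Maths", "Stats", "Stats", "Stats", "Stats")

# all three describe the same six units
identical(from_named, from_traits) && identical(from_named, from_vector)
```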

---

# Unbalanced or non-orthogonal experiments .circle[2]

```r
edibble() %>% 
  set_units(class = c("A", "B", "C", "D"),
            student = within(class, 
                             "A" ~ 3,  # 3 students in class "A"
                             "B" ~ 4,  # 4 students in class "B"
                               . ~ 2)) # 2 students for rest of the classes
```

What if students are shared between classes?

```r
edibble() %>% 
  set_units(class = c("A", "B", "C", "D"),
            student = within(class, 
                             "A" ~ c("Bob", "Mary", "Helen"),  
                             "B" ~ c("Helen", "Robert", "Max", "Ana"),  
                             "C" ~ c("Helen", "Max"),
                             "D" ~ c("Max", "Ana"))) 
```
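
Both structures can be written out by hand in base R, which shows what the nesting specification has to produce. This is a sketch with made-up object names, not the output of `edibble`.

```r
# unequal numbers of students per class
n_students <- c(A = 3, B = 4, C = 2, D = 2)
nested <- data.frame(
  class   = rep(names(n_students), times = n_students),
  student = sprintf("student%02d", seq_len(sum(n_students))),
  stringsAsFactors = FALSE
)

# students shared between classes: one row per (class, student) pair
roster <- list(
  A = c("Bob", "Mary", "Helen"),
  B = c("Helen", "Robert", "Max", "Ana"),
  C = c("Helen", "Max"),
  D = c("Max", "Ana")
)
shared <- data.frame(
  class   = rep(names(roster), times = lengths(roster)),
  student = unlist(roster, use.names = FALSE),
  stringsAsFactors = FALSE
)

table(shared$student)  # number of classes each student attends
```

In the second case students are no longer nested within a class, so the unit structure is crossed rather than strictly hierarchical.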

---

# BONUS<br>Design of (made-up) single cell experiments

```r
edibble(seed = 2020) %>% 
  set_units(patient = 48, # 48 patients
            # extract 200-400 cells from each patient
            cell = within(patient, oRanges(min = 200, max = 400)), 
            batch = assemble(cell, 100)) # 100 cells per batch
```
.scroll-350[

```
## # An edibble: 9,600 x 3
##      patient    cell   batch
##       <unit>  <unit>  <unit>
##  1 patient01 cell001  batch1
##  2 patient01 cell002  batch2
##  3 patient01 cell003  batch3
##  4 patient01 cell004  batch4
##  5 patient01 cell005  batch5
##  6 patient01 cell006  batch6
##  7 patient01 cell007  batch7
##  8 patient01 cell008  batch8
##  9 patient01 cell009  batch9
## 10 patient01 cell010 batch10
## # … with 9,590 more rows
```

```
## Warning: `edibble` constructed assuming 200 cells for each patient
```
]
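
A base R sketch of the intended structure, under two loudly stated assumptions: the number of cells per patient is drawn uniformly from 200 to 400, and batches are filled in order, 100 cells at a time. The prototype output above instead assumes 200 cells for every patient.

```r
set.seed(2020)  # arbitrary seed, for reproducibility only

n_patients <- 48
# Assumption: cells per patient drawn uniformly from 200..400
n_cells <- sample(200:400, size = n_patients, replace = TRUE)

cells <- data.frame(
  patient = rep(sprintf("patient%02d", seq_len(n_patients)), times = n_cells),
  cell    = sprintf("cell%05d", seq_len(sum(n_cells))),
  stringsAsFactors = FALSE
)

# Assumption: batches are filled in order, 100 cells at a time
cells$batch <- sprintf("batch%03d", ceiling(seq_len(nrow(cells)) / 100))

range(table(cells$patient))  # cells per patient
max(table(cells$batch))      # largest batch size
```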

---

# <i class="fas fa-clock"></i> Timeline

.grid[.item[
* By end of 2020, I intend to have `edibble` be able to construct textbook (mostly orthogonal) designs.
* From 2021, I will start to deploy it in practice for plant breeding experiments. 
* Clinical trials, survey designs, adaptive designs, and other designs that require sample size calculation or have undetermined number of units won't be tackled until the foundation has been built with agricultural experiments.

]
.item.center[

# <i class="fas fa-comments"></i>

Feedback is welcome!

Slides can be found at

<a href="https://www.emitanaka.org/slides/MonashBioinfo2020/" style="font-size:20pt">emitanaka.org/slides/MonashBioinfo2020/</a>

]
]

---

# Acknowledgements

These slides were made using the `xaringan` R package and the following system setup.

```r
sessioninfo::session_info()
```

```
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 4.0.1 (2020-06-06)
##  os       macOS Catalina 10.15.5      
##  system   x86_64, darwin17.0          
##  ui       X11                         
##  language (EN)                        
##  collate  en_AU.UTF-8                 
##  ctype    en_AU.UTF-8                 
##  tz       Australia/Melbourne         
##  date     2020-07-15                  
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package              * version    date       lib
##  agricolae              1.3-3      2020-06-07 [1]
##  AlgDesign              1.2.0      2019-11-29 [1]
##  anicon                 0.1.0      2020-06-21 [1]
##  assertthat             0.2.1      2019-03-21 [2]
##  backports              1.1.8      2020-06-17 [1]
##  Biobase                2.48.0     2020-04-27 [1]
##  BiocGenerics         * 0.34.0     2020-04-27 [1]
##  BiocParallel           1.22.0     2020-04-27 [1]
##  Biostrings             2.56.0     2020-04-27 [1]
##  bitops                 1.0-6      2013-08-17 [1]
##  blob                   1.2.1      2020-01-20 [2]
##  broom                  0.7.0      2020-07-09 [1]
##  cellranger             1.1.0      2016-07-27 [2]
##  cli                    2.0.2      2020-02-28 [2]
##  cluster                2.1.0      2019-06-19 [2]
##  colorspace             1.4-1      2019-03-18 [2]
##  combinat               0.0-8      2012-10-29 [1]
##  cranlogs               2.1.1      2019-04-29 [1]
##  crayon                 1.3.4      2017-09-16 [2]
##  curl                   4.3        2019-12-02 [2]
##  DBI                    1.1.0      2019-12-15 [2]
##  dbplyr                 1.4.4      2020-05-27 [2]
##  DelayedArray           0.14.0     2020-04-27 [1]
##  digest                 0.6.25     2020-02-23 [2]
##  dplyr                * 1.0.0      2020-05-29 [2]
##  edibble              * 0.0.0.9000 2020-07-15 [1]
##  ellipsis               0.3.1      2020-05-15 [2]
##  emo                    0.0.0.9000 2020-06-26 [1]
##  evaluate               0.14       2019-05-28 [2]
##  fansi                  0.4.1      2020-01-08 [2]
##  farver                 2.0.3      2020-01-16 [2]
##  fastmap                1.0.1      2019-10-08 [2]
##  forcats              * 0.5.0      2020-03-01 [2]
##  foreign                0.8-80     2020-05-24 [1]
##  fs                     1.4.2      2020-06-30 [1]
##  generics               0.0.2      2018-11-29 [2]
##  GenomeInfoDb         * 1.24.2     2020-06-15 [1]
##  GenomeInfoDbData       1.2.3      2020-07-13 [1]
##  GenomicAlignments      1.24.0     2020-04-27 [1]
##  GenomicRanges        * 1.40.0     2020-04-27 [1]
##  ggplot2              * 3.3.2      2020-06-19 [1]
##  glue                   1.4.1      2020-05-13 [2]
##  gtable                 0.3.0      2019-03-25 [2]
##  haven                  2.3.1      2020-06-01 [2]
##  highr                  0.8        2019-03-20 [2]
##  hms                    0.5.3      2020-01-08 [2]
##  htmltools              0.5.0      2020-06-16 [1]
##  httpuv                 1.5.4      2020-06-06 [2]
##  httr                   1.4.1      2019-08-05 [2]
##  icon                   0.1.0      2020-06-21 [1]
##  IRanges              * 2.22.2     2020-05-21 [1]
##  jsonlite               1.7.0      2020-06-25 [1]
##  klaR                   0.6-15     2020-02-19 [1]
##  knitr                  1.29       2020-06-23 [1]
##  labeling               0.3        2014-08-23 [2]
##  labelled               2.5.0      2020-06-17 [1]
##  later                  1.1.0.1    2020-06-05 [2]
##  lattice                0.20-41    2020-04-02 [2]
##  lifecycle              0.2.0      2020-03-06 [1]
##  lubridate              1.7.9      2020-06-08 [2]
##  magrittr               1.5        2014-11-22 [2]
##  MASS                   7.3-51.6   2020-04-26 [2]
##  Matrix                 1.2-18     2019-11-27 [2]
##  matrixStats            0.56.0     2020-03-13 [1]
##  mime                   0.9        2020-02-04 [2]
##  miniUI                 0.1.1.1    2018-05-18 [1]
##  modelr                 0.1.8      2020-05-19 [2]
##  munsell                0.5.0      2018-06-12 [2]
##  nlme                   3.1-148    2020-05-24 [2]
##  pillar                 1.4.6      2020-07-10 [1]
##  pkgconfig              2.0.3      2019-09-22 [2]
##  plyranges            * 1.9.3      2020-07-13 [1]
##  promises               1.1.1      2020-06-09 [1]
##  purrr                * 0.3.4      2020-04-17 [2]
##  questionr              0.7.1      2020-05-26 [1]
##  R6                     2.4.1      2019-11-12 [2]
##  Rcpp                   1.0.5      2020-07-06 [1]
##  RCurl                  1.98-1.2   2020-04-18 [1]
##  readr                * 1.3.1      2018-12-21 [2]
##  readxl                 1.3.1      2019-03-13 [2]
##  reprex                 0.3.0      2019-05-16 [2]
##  rlang                  0.4.7      2020-07-09 [1]
##  rmarkdown              2.3        2020-06-18 [1]
##  Rsamtools              2.4.0      2020-04-27 [1]
##  rstudioapi             0.11       2020-02-07 [2]
##  rtracklayer            1.47.0     2020-04-07 [1]
##  rvest                  0.3.5      2019-11-08 [2]
##  S4Vectors            * 0.26.1     2020-05-16 [1]
##  scales                 1.1.1      2020-05-11 [2]
##  sessioninfo            1.1.1      2018-11-05 [2]
##  shiny                  1.5.0      2020-06-23 [1]
##  stringi                1.4.6      2020-02-17 [2]
##  stringr              * 1.4.0      2019-02-10 [2]
##  SummarizedExperiment   1.18.2     2020-07-09 [1]
##  tibble               * 3.0.3      2020-07-10 [1]
##  tidyr                * 1.1.0      2020-05-20 [2]
##  tidyselect             1.1.0      2020-05-11 [2]
##  tidyverse            * 1.3.0      2019-11-21 [2]
##  utf8                   1.1.4      2018-05-24 [2]
##  vctrs                  0.3.1.9000 2020-07-10 [1]
##  withr                  2.2.0      2020-04-20 [2]
##  xaringan               0.16       2020-03-31 [2]
##  xfun                   0.15       2020-06-21 [1]
##  XML                    3.99-0.4   2020-07-05 [1]
##  xml2                   1.3.2      2020-04-23 [2]
##  xtable                 1.8-4      2019-04-21 [2]
##  XVector                0.28.0     2020-04-27 [1]
##  yaml                   2.2.1      2020-02-01 [1]
##  zeallot                0.1.0      2018-01-28 [1]
##  zlibbioc               1.34.0     2020-04-27 [1]
##  source                           
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.1)                   
##  Github (emitanaka/anicon@0b756df)
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  Bioconductor                     
##  Bioconductor                     
##  Bioconductor                     
##  Bioconductor                     
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  Bioconductor                     
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  local                            
##  CRAN (R 4.0.0)                   
##  Github (hadley/emo@3f03b11)      
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  Bioconductor                     
##  Bioconductor                     
##  Bioconductor                     
##  Bioconductor                     
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  Github (emitanaka/icon@8458546)  
##  Bioconductor                     
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  Github (sa-lee/plyranges@0cf2e40)
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.1)                   
##  Bioconductor                     
##  CRAN (R 4.0.0)                   
##  Bioconductor                     
##  CRAN (R 4.0.0)                   
##  Bioconductor                     
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  Bioconductor                     
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  Github (r-lib/vctrs@edf507d)     
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.1)                   
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  Bioconductor                     
##  CRAN (R 4.0.0)                   
##  CRAN (R 4.0.0)                   
##  Bioconductor                     
## 
## [1] /Users/etan0038/Library/R/4.0/library
## [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
```

]

(Scroll on html slide to see all)