edibble R-package
Presenter: Emi Tanaka
Department of Econometrics and Business Statistics,
Monash University, Melbourne, Australia
emi.tanaka@monash.edu
9 Nov 2021 @ Applications of Statistical Procedures in Biological Data
Table of Contents
These slides, made using R and powered by HTML/CSS/JS, can be found at
emitanaka.org/slides/stats4bio2021/edibble
1
Essential scientific endeavors to collect data to explore, understand or verify phenomena.
Big picture terminologies
The gold standard in data collection.
Collecting data to compare the effects of different conditions under a controlled environment with the goal of drawing generalisable conclusions
... to identify data-collection schemes that achieve sensitivity and specificity requirements despite biological and technical variability, while keeping time and resource costs low.
— Krzywinski & Altman (2014)
Krzywinski, M., Altman, N. Designing comparative experiments. Nat Methods 11, 597–598 (2014). https://doi.org/10.1038/nmeth.2974
Planning the controlled environment such that there is a higher confidence that effects can be attributed to selected conditions
A treatment ($\mathcal{T}$) is the entire description of the condition applied to an experimental unit.
An experimental unit ($\Omega$) is the smallest unit that the treatment can be independently applied to.
An observational unit ($\Omega_o$) is the smallest unit on which the response is measured.
A block, also called a cluster, is a unit that groups some other units (e.g. experimental units) such that the units within the same block (cluster) are more alike (homogeneous).
A design ($D: \Omega \rightarrow \mathcal{T}$) is the allotment of treatments to a particular set of units.
A plan or layout is the design translated into actual units. Randomisation is usually involved in the translation process.
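The difference between a design and a plan can be sketched in base R. This is only an illustration under assumed inputs: the six units, the three treatment labels and all object names are hypothetical and do not come from Bailey (2008).

```r
# Hypothetical example: 6 experimental units and 3 treatments.
units <- paste0("unit", 1:6)
treatments <- c("A", "B", "C")

# The design: an allotment of treatments to abstract unit slots,
# here with each treatment replicated twice.
design <- rep(treatments, each = 2)

# The plan (layout): randomise which actual unit receives each slot.
set.seed(2021)  # fixed seed so the plan can be reproduced
plan <- data.frame(unit = units,
                   treatment = sample(design))
plan
```

Changing the seed gives a different plan for the same design.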
Bailey, R. (2008). Design of Comparative Experiments (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511611483
Unit structure means meaningful ways of dividing up experimental units ($\Omega$) and observational units ($\Omega_o$).
For example:
Treatment structure means meaningful ways of dividing up $\mathcal{T}$.
For example:
Conclusion: produces most
therefore
is the most effective supplement for higher milk yield from cows out of the three supplements tested
Conclusion: produces most
on average therefore
is the most effective supplement for higher milk yield from cows out of the three supplements tested
Experimental units are the 3 pens.
Meaning there is only one replication of each treatment.
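One way to check replication is to count experimental units, not observational units, per treatment. The sketch below assumes a hypothetical data set with 4 cows in each of the 3 pens and placeholder supplement labels S1 to S3; only the 3 pens come from the slides.

```r
# Hypothetical data: 3 pens, 4 cows per pen, one supplement per pen.
cows <- data.frame(
  pen = rep(c("pen1", "pen2", "pen3"), each = 4),
  cow = paste0("cow", 1:12),
  supplement = rep(c("S1", "S2", "S3"), each = 4)
)

# Observational units (cows) per treatment.
cows_per_trt <- table(cows$supplement)
cows_per_trt

# Experimental units (pens) per treatment: this is the replication.
pens <- unique(cows[, c("pen", "supplement")])
replication <- table(pens$supplement)
replication
```

Each supplement is measured on several cows, yet it is applied to only one pen, so the replication is one.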
The order of the experimental units may be confounded with some extraneous factor
Like say, the order of the experimental units was determined by the speed (fast to slow) of the cow to get to the feed
This means that the more active cows are given one supplement and the least active ones are given another.
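The confounding described here can be illustrated with a small simulation. It is a sketch under assumptions: 9 cows, made-up activity scores and placeholder supplement labels S1 to S3, none of which come from the slides.

```r
set.seed(1)
# Hypothetical activity scores for 9 cows, ordered from fastest to slowest.
activity <- sort(runif(9), decreasing = TRUE)
supplements <- c("S1", "S2", "S3")

# Systematic allocation: the first three cows get S1, the next three S2, and so on.
systematic <- rep(supplements, each = 3)

# Randomised allocation: the same replication, but in random order.
randomised <- sample(systematic)

# Mean activity per supplement under each allocation.
mean_systematic <- tapply(activity, systematic, mean)
mean_randomised <- tapply(activity, randomised, mean)
mean_systematic
mean_randomised
```

Under the systematic allocation the supplement is perfectly ordered by activity, so any difference in milk yield could be due to either. Randomisation breaks that link by construction.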
Treatment | Replication |
---|---|
![]() ![]() | 3 |
![]() ![]() | 3 |
![]() ![]() | 3 |
![]() ![]() | 3 |

Treatment factor | Count |
---|---|
![]() | 6 |
![]() | 6 |
![]() | 6 |
![]() | 6 |
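The numbers in these two tables follow from crossing two treatment factors with two levels each and replicating every combination three times. A minimal sketch, with hypothetical factor names and level labels standing in for the images on the slide:

```r
# Two hypothetical treatment factors, each with 2 levels.
trts <- expand.grid(factor1 = c("a1", "a2"),
                    factor2 = c("b1", "b2"))

# Replicate each of the 4 treatment combinations 3 times (12 units).
alloc <- trts[rep(seq_len(nrow(trts)), each = 3), ]

# Replication of each treatment combination.
trt_reps <- table(alloc$factor1, alloc$factor2)
trt_reps

# Count of each level of each treatment factor.
table(alloc$factor1)
table(alloc$factor2)
```

Each level of a factor appears in two combinations, which is why the count per level is twice the replication per treatment.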
A Completely Randomised Design
B Randomised Complete Block Design
C Factorial Design
D Split-Plot Design
2
contains
based on the ctv
package version 0.8.5
In contrast, only a handful of libraries exist in Python (namely pyDOE, pyDOE2, dexpy, experimenter and GPdoemd).
Thanks to Dewi Lestari Amaliah for the graph!
Topic | # of packages | % of packages connected within topic | Average # of authors | Standard dev. # of authors |
---|---|---|---|---|
Analysis of Pharmacokinetic Data | 18 | 16.67 | 3.28 | 2.74 |
Hydrological Data and Modeling | 96 | 21.88 | 3.01 | 2.76 |
Design of Experiments (DoE) & Analysis of Experimental Data | 109 | 24.77 | 2.27 | 1.53 |
Chemometrics and Computational Physics | 82 | 29.27 | 2.59 | 2.24 |
Optimization and Mathematical Programming | 135 | 29.63 | 3.00 | 2.31 |
Clinical Trial Design, Monitoring, and Analysis | 60 | 30.00 | 2.70 | 2.10 |
Medical Image Analysis | 23 | 30.43 | 2.74 | 2.18 |
Extreme Value Analysis | 24 | 33.33 | 2.71 | 1.99 |
Statistical Genetics | 22 | 36.36 | 5.86 | 7.78 |
Missing Data | 160 | 37.50 | 4.11 | 9.88 |
Cluster Analysis & Finite Mixture Models | 110 | 39.09 | 3.45 | 3.54 |
Meta-Analysis | 154 | 43.51 | 2.57 | 2.38 |
Official Statistics & Survey Statistics | 132 | 43.94 | 3.39 | 3.61 |
Probability Distributions | 251 | 44.62 | 2.84 | 3.43 |
Processing and Analysis of Tracking Data | 47 | 46.81 | 3.30 | 2.93 |
Bayesian Inference | 138 | 51.45 | 3.27 | 3.08 |
Teaching Statistics | 45 | 53.33 | 5.49 | 7.43 |
Machine Learning & Statistical Learning | 103 | 53.40 | 5.48 | 13.43 |
Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization | 39 | 53.85 | 5.46 | 7.34 |
Empirical Finance | 167 | 55.09 | 3.48 | 5.90 |
Time Series Analysis | 342 | 56.73 | 3.07 | 4.85 |
Differential Equations | 28 | 57.14 | 5.32 | 6.89 |
Functional Data Analysis | 42 | 59.52 | 3.21 | 3.24 |
High-Performance and Parallel Computing with R | 87 | 60.92 | 4.47 | 7.37 |
Numerical Mathematics | 113 | 61.06 | 3.42 | 5.79 |
Natural Language Processing | 54 | 61.11 | 3.31 | 2.60 |
Handling and Analyzing Spatio-Temporal Data | 88 | 63.64 | 3.78 | 4.79 |
Psychometric Models and Methods | 242 | 67.77 | 3.06 | 3.64 |
Analysis of Ecological and Environmental Data | 97 | 71.13 | 4.60 | 5.46 |
Multivariate Statistics | 116 | 71.55 | 4.34 | 4.61 |
Survival Analysis | 251 | 73.31 | 2.84 | 2.49 |
Robust Statistical Methods | 59 | 74.58 | 3.51 | 3.59 |
Reproducible Research | 99 | 74.75 | 5.73 | 11.80 |
Model Deployment with R | 32 | 75.00 | 6.13 | 5.36 |
Databases with R | 38 | 76.32 | 3.58 | 3.33 |
gRaphical Models in R | 32 | 78.13 | 4.16 | 4.14 |
Econometrics | 150 | 78.67 | 3.54 | 3.91 |
Statistics for the Social Sciences | 86 | 79.07 | 4.41 | 4.40 |
Analysis of Spatial Data | 179 | 83.24 | 4.22 | 5.45 |
Web Technologies and Services | 201 | 89.55 | 3.03 | 3.48 |
Phylogenetics, Especially Comparative Methods | 81 | 91.36 | 4.12 | 5.52 |
Authors tend to work in silos, with perhaps limited knowledge sharing across silos.
Top 5 in 2016
Package | Downloads |
---|---|
agricolae | 73,521 |
AlgDesign | 57,037 |
ez | 37,488 |
lhs | 23,518 |
DoE.base | 20,651 |
Top 5 in 2020
Package | Downloads |
---|---|
agricolae | 171,813 |
lhs | 165,415 |
AlgDesign | 153,582 |
DiceKriging | 92,287 |
DiceDesign | 88,160 |
agricolae is one of the top downloaded packages (total downloads based on logs from the RStudio CRAN mirror, scrubbed by Danyang Dai)
agricolae
a case of classical named randomised designs
agricolae::design.crd
Completely randomised design for t = 3 treatments with 2 replicates each

    trt <- c("A", "B", "C")
    agricolae::design.crd(trt = trt, r = 2)

    ## $parameters
    ## $parameters$design
    ## [1] "crd"
    ##
    ## $parameters$trt
    ## [1] "A" "B" "C"
    ##
    ## $parameters$r
    ## [1] 2 2 2
    ##
    ## $parameters$serie
    ## [1] 2
    ##
    ## $parameters$seed
    ## [1] 241038711
    ##
    ## $parameters$kinds
    ## [1] "Super-Duper"
    ##
    ## $parameters[[7]]
    ## [1] TRUE
    ##
    ##
    ## $book
    ##   plots r trt
    ## 1   101 1   B
    ## 2   102 1   C
    ## 3   103 1   A
    ## 4   104 2   A
    ## 5   105 2   C
    ## 6   106 2   B

agricolae::design.rcbd
Randomised complete block design for t = 3 treatments with 2 blocks

    trt <- c("A", "B", "C")
    agricolae::design.rcbd(trt = trt, r = 2)

    ## $parameters
    ## $parameters$design
    ## [1] "rcbd"
    ##
    ## $parameters$trt
    ## [1] "A" "B" "C"
    ##
    ## $parameters$r
    ## [1] 2
    ##
    ## $parameters$serie
    ## [1] 2
    ##
    ## $parameters$seed
    ## [1] 351889891
    ##
    ## $parameters$kinds
    ## [1] "Super-Duper"
    ##
    ## $parameters[[7]]
    ## [1] TRUE
    ##
    ##
    ## $sketch
    ##      [,1] [,2] [,3]
    ## [1,] "A"  "C"  "B"
    ## [2,] "B"  "C"  "A"
    ##
    ## $book
    ##   plots block trt
    ## 1   101     1   A
    ## 2   102     1   C
    ## 3   103     1   B
    ## 4   201     2   B
    ## 5   202     2   C
    ## 6   203     2   A

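For comparison, the same kind of layout can be produced in base R by randomising the treatments separately within each block. This is a sketch of the idea only, not agricolae's implementation; the plot numbering merely imitates the output above.

```r
trt <- c("A", "B", "C")
n_blocks <- 2

set.seed(1)
# Randomise the full set of treatments independently within each block.
rcbd <- do.call(rbind, lapply(seq_len(n_blocks), function(b) {
  data.frame(plot = b * 100 + seq_along(trt),
             block = b,
             trt = sample(trt))
}))
rcbd
```

Because every block receives a fresh permutation of all treatments, each treatment appears exactly once per block.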
agricolae::design.ab()
Factorial design for $t = 3 \times 2$ treatments with 2 replications of each treatment

    agricolae::design.ab(trt = c(3, 2), r = 2, design = "crd")

    ## $parameters
    ## $parameters$design
    ## [1] "factorial"
    ##
    ## $parameters$trt
    ## [1] "1 1" "1 2" "2 1" "2 2" "3 1" "3 2"
    ##
    ## $parameters$r
    ## [1] 2 2 2 2 2 2
    ##
    ## $parameters$serie
    ## [1] 2
    ##
    ## $parameters$seed
    ## [1] 34955907
    ##
    ## $parameters$kinds
    ## [1] "Super-Duper"
    ##
    ## $parameters[[7]]
    ## [1] TRUE
    ##
    ## $parameters$applied
    ## [1] "crd"
    ##
    ##
    ## $book
    ##    plots r A B
    ## 1    101 1 2 2
    ## 2    102 1 2 1
    ## 3    103 1 1 2
    ## 4    104 2 2 1
    ## 5    105 1 3 1
    ## 6    106 2 2 2
    ## 7    107 1 1 1
    ## 8    108 2 3 1
    ## 9    109 2 1 2
    ## 10   110 2 1 1
    ## 11   111 1 3 2
    ## 12   112 2 3 2

Note: this is not A/B testing!
agricolae::design.split()
Split-plot design for $t = 2 \times 4$ treatments with 2 replications of each treatment

    trt1 <- c("I", "R"); trt2 <- LETTERS[1:4]
    agricolae::design.split(trt1 = trt1, trt2 = trt2, r = 2, design = "crd")

    ## $parameters
    ## $parameters$design
    ## [1] "split"
    ##
    ## $parameters[[2]]
    ## [1] TRUE
    ##
    ## $parameters$trt1
    ## [1] "I" "R"
    ##
    ## $parameters$applied
    ## [1] "crd"
    ##
    ## $parameters$r
    ## [1] 2 2
    ##
    ## $parameters$serie
    ## [1] 2
    ##
    ## $parameters$seed
    ## [1] -1981495681
    ##
    ## $parameters$kinds
    ## [1] "Super-Duper"
    ##
    ##
    ## $book
    ##    plots splots r trt1 trt2
    ## 1    101      1 1    R    D
    ## 2    101      2 1    R    C
    ## 3    101      3 1    R    B
    ## 4    101      4 1    R    A
    ## 5    102      1 1    I    A
    ## 6    102      2 1    I    C
    ## 7    102      3 1    I    D
    ## 8    102      4 1    I    B
    ## 9    103      1 2    R    B
    ## 10   103      2 2    R    C
    ## 11   103      3 2    R    A
    ## 12   103      4 2    R    D
    ## 13   104      1 2    I    C
    ## 14   104      2 2    I    A
    ## 15   104      3 2    I    B
    ## 16   104      4 2    I    D

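The two-stage randomisation behind a split-plot layout can be written out explicitly in base R. The sketch below is an assumption about how such a layout may be generated, not agricolae's code; the column names simply mirror the output above.

```r
trt1 <- c("I", "R")    # whole-plot treatment
trt2 <- LETTERS[1:4]   # subplot treatment
r <- 2                 # replicates of each whole-plot treatment

set.seed(1)
# Stage 1: randomise trt1 over the whole plots.
wholeplots <- data.frame(plots = 100 + seq_len(length(trt1) * r),
                         trt1 = sample(rep(trt1, times = r)))

# Stage 2: within each whole plot, randomise trt2 over the subplots.
split_plot <- do.call(rbind, lapply(seq_len(nrow(wholeplots)), function(i) {
  data.frame(plots = wholeplots$plots[i],
             splots = seq_along(trt2),
             trt1 = wholeplots$trt1[i],
             trt2 = sample(trt2))
}))
split_plot
```

Writing the stages separately makes the unit structure visible: trt1 is applied to whole plots and trt2 to subplots nested within them.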
Good design considers units and treatments first, and then allocates treatments to units. It does not choose from a menu of named designs.
—Rosemary Bailey (2008)
AlgDesign
a case of optimised (model-based) designs
AlgDesign::gen.factorial()

    dat <- AlgDesign::gen.factorial(levels = 3,
                                    nVars = 3,
                                    center = FALSE,
                                    varNames = c("irrigation", "fertilizer", "variety"),
                                    factors = "all")
    dat

    ##    irrigation fertilizer variety
    ## 1           1          1       1
    ## 2           2          1       1
    ## 3           3          1       1
    ## 4           1          2       1
    ## 5           2          2       1
    ## 6           3          2       1
    ## 7           1          3       1
    ## 8           2          3       1
    ## 9           3          3       1
    ## 10          1          1       2
    ## 11          2          1       2
    ## 12          3          1       2
    ## 13          1          2       2
    ## 14          2          2       2
    ## 15          3          2       2
    ## 16          1          3       2
    ## 17          2          3       2
    ## 18          3          3       2
    ## 19          1          1       3
    ## 20          2          1       3
    ## 21          3          1       3
    ## 22          1          2       3
    ## 23          2          2       3
    ## 24          3          2       3
    ## 25          1          3       3
    ## 26          2          3       3
    ## 27          3          3       3

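A full factorial candidate set like this can also be built with base R's `expand.grid()`, which likewise varies the first factor fastest. The name `dat_base` is a placeholder used only for this sketch.

```r
lv <- factor(1:3)

# All 3 x 3 x 3 = 27 combinations; the first factor varies fastest.
dat_base <- expand.grid(irrigation = lv,
                        fertilizer = lv,
                        variety = lv)
nrow(dat_base)
head(dat_base, 4)
```

The row order matches the listing above, so row numbers can be compared directly between the two.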
AlgDesign::optFederov

    AlgDesign::optFederov(frml = ~ ., # assume additive effects
                          data = dat,
                          nTrials = 14,
                          criterion = "D")

    ## $D
    ## [1] 0.2343815
    ##
    ## $A
    ## [1] 6.25
    ##
    ## $Ge
    ## [1] 0.727
    ##
    ## $Dea
    ## [1] 0.687
    ##
    ## $design
    ##    irrigation fertilizer variety
    ## 2           2          1       1
    ## 3           3          1       1
    ## 4           1          2       1
    ## 5           2          2       1
    ## 7           1          3       1
    ## 9           3          3       1
    ## 10          1          1       2
    ## 13          1          2       2
    ## 15          3          2       2
    ## 17          2          3       2
    ## 19          1          1       3
    ## 23          2          2       3
    ## 24          3          2       3
    ## 25          1          3       3
    ##
    ## $rows
    ##  [1]  2  3  4  5  7  9 10 13 15 17 19 23 24 25

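To see why the choice of rows matters, the determinant of the information matrix $X^\top X$ can be compared between the selected rows and a naive subset. This sketch uses base R only and default treatment contrasts, so its values are not expected to equal the `$D` reported by AlgDesign, which is scaled differently; the helper `d_value()` is hypothetical.

```r
lv <- factor(1:3)
dat <- expand.grid(irrigation = lv, fertilizer = lv, variety = lv)

# Determinant of X'X for an additive (main effects) model on the chosen rows.
d_value <- function(rows) {
  X <- model.matrix(~ ., data = dat[rows, ])
  det(crossprod(X))
}

rows_opt <- c(2, 3, 4, 5, 7, 9, 10, 13, 15, 17, 19, 23, 24, 25)  # rows selected above
rows_naive <- 1:14                                               # first 14 candidate rows

d_opt <- d_value(rows_opt)
d_naive <- d_value(rows_naive)
d_opt
d_naive
```

The first 14 rows never include the third variety, so its effect cannot be estimated and the determinant collapses to zero, while the selected rows keep every effect estimable.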
AlgDesign::optBlock()

    AlgDesign::optBlock(frml = ~ ., # assume additive effects
                        withinData = dat,
                        blocksizes = rep(9, 3),
                        criterion = "D")

    ## $D
    ## [1] 0.1924501
    ##
    ## $diagonality
    ## [1] 0.866
    ##
    ## $Blocks
    ## $Blocks$B1
    ##    irrigation fertilizer variety
    ## 2           2          1       1
    ## 5           2          2       1
    ## 8           2          3       1
    ## 12          3          1       2
    ## 15          3          2       2
    ## 16          1          3       2
    ## 21          3          1       3
    ## 22          1          2       3
    ## 25          1          3       3
    ##
    ## $Blocks$B2
    ##    irrigation fertilizer variety
    ## 1           1          1       1
    ## 6           3          2       1
    ## 9           3          3       1
    ## 10          1          1       2
    ## 13          1          2       2
    ## 18          3          3       2
    ## 20          2          1       3
    ## 23          2          2       3
    ## 26          2          3       3
    ##
    ## $Blocks$B3
    ##    irrigation fertilizer variety
    ## 3           3          1       1
    ## 4           1          2       1
    ## 7           1          3       1
    ## 11          2          1       2
    ## 14          2          2       2
    ## 17          2          3       2
    ## 19          1          1       3
    ## 24          3          2       3
    ## 27          3          3       3
    ##
    ##
    ## $design
    ##    irrigation fertilizer variety
    ## 2           2          1       1
    ## 5           2          2       1
    ## 8           2          3       1
    ## 12          3          1       2
    ## 15          3          2       2
    ## 16          1          3       2
    ## 21          3          1       3
    ## 22          1          2       3
    ## 25          1          3       3
    ## 1           1          1       1
    ## 6           3          2       1
    ## 9           3          3       1
    ## 10          1          1       2
    ## 13          1          2       2
    ## 18          3          3       2
    ## 20          2          1       3
    ## 23          2          2       3
    ## 26          2          3       3
    ## 3           3          1       1
    ## 4           1          2       1
    ## 7           1          3       1
    ## 11          2          1       2
    ## 14          2          2       2
    ## 17          2          3       2
    ## 19          1          1       3
    ## 24          3          2       3
    ## 27          3          3       3
    ##
    ## $rows
    ##  [1]  2  5  8 12 15 16 21 22 25  1  6  9 10 13 18 20 23 26  3  4  7 11 14 17 19 24 27

What were the experiments about?
Units and allocation are often implicitly understood
3
Computational reproducibility
Allows greater flexibility
Can promote higher order thinking
if the software is designed with the user in mind

    library(grid)
    # face shape
    grid.circle(x = 0.5, y = 0.5, r = 0.5)
    # eyes
    grid.circle(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05,
                gp = gpar(fill = "black"))
    # mouth
    grid.curve(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4, square = FALSE)


    library(grid)
    # face shape
    grid.circle(x = 0.5, y = 0.5, r = 0.5)
    # eyes
    grid.circle(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05,
                gp = gpar(fill = "black"))
    # mouth
    grid.curve(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4, square = FALSE,
               curvature = -1)


    # draws the face with the default mouth curvature
    face1 <- function() {
      grid::grid.circle(x = 0.5, y = 0.5, r = 0.5)
      grid::grid.circle(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05,
                        gp = grid::gpar(fill = "black"))
      grid::grid.curve(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4, square = FALSE)
    }
    # same face, with the mouth curving the other way
    face2 <- function() {
      grid::grid.circle(x = 0.5, y = 0.5, r = 0.5)
      grid::grid.circle(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05,
                        gp = grid::gpar(fill = "black"))
      grid::grid.curve(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4, square = FALSE,
                       curvature = -1)
    }

face1()
face1()
face2()
face2()
face1()
face1()
face1()
face2()
face3()
?
Alternative function names:
face_happy()
face_sad()
face_angry()
Now what do you expect for the output?
What if you want to draw a face that is winking?
... with a grin?
... or with the tongue out?
The differences between facial features are small, but you need an entire new function that contains instructions for the whole face and a new function name.
How would you design the system to draw faces?
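One possible answer, sketched with the grid package: a single function whose arguments carry the features that vary, so a new face needs a new argument value and not a new function. The function name `face_grob()` and its arguments are hypothetical and are not taken from the slides.

```r
library(grid)

# Hypothetical sketch: one function, with the varying features as arguments.
face_grob <- function(curvature = 1, eye_fill = "black") {
  gTree(children = gList(
    # face shape
    circleGrob(x = 0.5, y = 0.5, r = 0.5),
    # eyes
    circleGrob(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05,
               gp = gpar(fill = eye_fill)),
    # mouth: the sign of the curvature flips the direction of the curve
    curveGrob(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4,
              square = FALSE, curvature = curvature)
  ))
}

face_a <- face_grob()                # same settings as face1() in the slides
face_b <- face_grob(curvature = -1)  # same settings as face2() in the slides
# grid.draw(face_a)  # uncomment to draw on the current graphics device
```

Returning a grob without drawing it keeps the description of the face separate from rendering, which makes the pieces easier to reuse.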
https://github.com/emitanaka/portrait
library(portrait)
face(eyes = "googly", mouth = "smile", shape = "round")
face(eyes = "round", mouth = "sad", shape = "oval")
But what about hair, nose and other facial features?
face(eyes = "googly", mouth = "smile", shape = "round", hair = "none", nose = "simple")
face(eyes = "googly", mouth = "smile", shape = "round", hair = "mohawk", nose = "simple")
But what about other facial features?

    library(portrait)
    face() +
      cat_shape() +
      cat_eyes() +
      cat_nose() +
      cat_whiskers()

    library(portrait)
    face() +
      dog_shape() +
      cat_eyes(fill = "red") +
      cat_nose() +
      cat_whiskers()

    library(portrait)
    face() +
      dog_shape() +
      cat_eyes(fill = "red") +
      cat_nose() +
      cat_whiskers(size = 6, color = "brown") +
      sketch_mouth(smile = 0.3, size = 3)

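The layered `+` syntax can be supported in R with an S3 method for `+`. The following is a minimal sketch of that mechanism with made-up names (`new_face()`, `face_part()`, class `face_spec`); it is not the portrait package's implementation.

```r
# Hypothetical constructor for an empty face specification.
new_face <- function() {
  structure(list(parts = list()), class = "face_spec")
}

# Hypothetical constructor for one facial feature and its settings.
face_part <- function(name, ...) {
  structure(list(name = name, settings = list(...)), class = "face_part")
}

# S3 method for `+`: adding a part stores (or overrides) that feature.
"+.face_spec" <- function(e1, e2) {
  stopifnot(inherits(e2, "face_part"))
  e1$parts[[e2$name]] <- e2$settings
  e1
}

spec <- new_face() +
  face_part("eyes", fill = "red") +
  face_part("whiskers", size = 6, color = "brown")
names(spec$parts)
```

Each feature is specified independently, so adding a new kind of feature does not require changing the signature of an existing function.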
The tool you choose to use can enforce a certain way of thinking and may restrict you on what you can do.
4
edibble
edibble

    library(edibble)
    start_design("My experiment")

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_units(wholeplot = 4)

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_units(wholeplot = 4) %>%
      set_units(subplot = nested_in(wholeplot, 2))

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_units(wholeplot = 4,
                subplot = nested_in(wholeplot, 2))

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_units(wholeplot = 4,
                subplot = nested_in(wholeplot, 2)) %>%
      set_trts(water = c("irrigated", "rainfed"),
               fertilizer = c("A", "B"))

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_trts(water = c("irrigated", "rainfed"),
               fertilizer = c("A", "B")) %>%
      set_units(wholeplot = 4,
                subplot = nested_in(wholeplot, 2))

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_trts(water = c("irrigated", "rainfed")) %>%
      set_units(wholeplot = 4) %>%
      set_trts(fertilizer = c("A", "B")) %>%
      set_units(subplot = nested_in(wholeplot, 2))

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_units(wholeplot = 4,
                subplot = nested_in(wholeplot, 2)) %>%
      set_trts(water = c("irrigated", "rainfed"),
               fertilizer = c("A", "B")) %>%
      allot_trts(water ~ wholeplot,
                 fertilizer ~ subplot)

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_units(wholeplot = 4,
                subplot = nested_in(wholeplot, 2)) %>%
      set_trts(water = c("irrigated", "rainfed"),
               fertilizer = c("A", "B")) %>%
      allot_trts(water ~ wholeplot,
                 fertilizer ~ subplot) %>%
      assign_trts("random", seed = 1)

edibble

    library(edibble)
    start_design("My experiment") %>%
      set_units(wholeplot = 4,
                subplot = nested_in(wholeplot, 2)) %>%
      set_trts(water = c("irrigated", "rainfed"),
               fertilizer = c("A", "B")) %>%
      allot_trts(water ~ wholeplot,
                 fertilizer ~ subplot) %>%
      assign_trts("random", seed = 1) %>%
      serve_table()

edibble

    library(edibble)
    start_design("Modified design") %>%
      set_units(block = 2,
                subplot = nested_in(block, 4)) %>%
      set_trts(water = c("irrigated", "rainfed"),
               fertilizer = c("A", "B")) %>%
      allot_trts(water:fertilizer ~ subplot) %>%
      assign_trts("random", seed = 1) %>%
      serve_table()

edibble

    library(edibble)
    start_design("Modified design") %>%
      set_units(block = 2,
                subplot = nested_in(block, 4)) %>%
      set_trts(water = c("irrigated", "rainfed"),
               fertilizer = c("A", "B")) %>%
      allot_trts(water:fertilizer ~ subplot) %>%
      assign_trts("random", seed = 1) %>%
      set_rcrds_of(subplot = c("yield", "disease"),
                   block = "manager") %>%
      serve_table()

edibble

    out <- start_design("Modified design") %>%
      set_units(block = 2,
                subplot = nested_in(block, 4)) %>%
      set_trts(water = c("irrigated", "rainfed"),
               fertilizer = c("A", "B")) %>%
      allot_trts(water:fertilizer ~ subplot) %>%
      assign_trts("random", seed = 1) %>%
      set_rcrds_of(subplot = c("yield", "disease"),
                   block = "manager") %>%
      expect_rcrds(yield = to_be_numeric(with_value(">=", 0)),
                   disease = to_be_factor(levels = c("none", "moderate", "severe"))) %>%
      serve_table()

export_design(out, file = "design-layout.xlsx", overwrite = TRUE)
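The expectations declared for the records (a non-negative numeric yield and a disease rating restricted to three levels) describe checks that could also be run on the returned data. The sketch below shows such checks in base R on made-up entries; it illustrates the idea only and does not use or describe edibble's internals.

```r
# Hypothetical records entered after the experiment.
records <- data.frame(
  yield = c(3.2, 0, 5.1, -1),
  disease = c("none", "severe", "mild", "moderate")
)

disease_levels <- c("none", "moderate", "severe")

# Flag entries that break the stated expectations.
bad_yield <- is.na(records$yield) | records$yield < 0
bad_disease <- !(records$disease %in% disease_levels)

# Rows that need to be followed up with whoever recorded the data.
flagged <- records[bad_yield | bad_disease, ]
flagged
```

Stating the valid values at the design stage means that data entry errors can be caught before analysis.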
There are more (not well documented) features in edibble
More on those on Thursday!
The idea for edibble was conceived in early 2019, and the code base was released publicly on 31 Dec 2020.
Since its initial public release, the underlying structure of edibble has evolved drastically for the better.
The development of a good tool is a community effort so...
edibble is to help you plan experiments better
edibble gets better with feedback
Feedback on edibble? Submit or upvote here: github.com/emitanaka/edibble/issues, send me an email or tell me now!
edibble is a community effort
Keyboard shortcuts
↑, ←, Pg Up, k | Go to previous slide |
↓, →, Pg Dn, Space, j | Go to next slide |
Home | Go to first slide |
End | Go to last slide |
Number + Return | Go to specific slide |
b / m / f | Toggle blackout / mirrored / fullscreen mode |
c | Clone slideshow |
p | Toggle presenter mode |
t | Restart the presentation timer |
?, h | Toggle this help |
o | Tile View: Overview of Slides |
Esc | Back to slideshow |
These slides are viewed best by Chrome or Firefox and occasionally need to be refreshed if elements did not load properly. Please note this web version takes a while to load. See here for the PDF .
Press the right arrow to progress to the next slide!
edibble
R-packagePresenter: Emi Tanaka
Department of Econometrics and Business Statistics,
Monash University, Melbourne, Australia
emi.tanaka@monash.edu
9 Nov 2021 @ Applications of Statistical Procedures in Biological Data
Table of Contents
These slides made using R powered by HTML/CSS/JS can be found at
emitanaka.org/slides/stats4bio2021/edibble
1
Essential scientific endeavors to collect data to explore, understand or verify phenomena.
Big picture terminologies
The gold standard in data collection.
Collecting data to compare the effects of different conditions under a controlled environment with the goal of drawing generalisable conclusions
... to identify data-collection schemes that achieve sensitivity and specificity requirements despite biological and technical variability, while keeping time and resource costs low.
— Krzywinski & Altman (2014)
Krzywinski, M., Altman, N. Designing comparative experiments. Nat Methods 11, 597–598 (2014). https://doi.org/10.1038/nmeth.2974
... to identify data-collection schemes that achieve sensitivity and specificity requirements despite biological and technical variability, while keeping time and resource costs low.
— Krzywinski & Altman (2014)
Planning the controlled environment such that there is a higher confidence that effects can be attributed to selected conditions
Krzywinski, M., Altman, N. Designing comparative experiments. Nat Methods 11, 597–598 (2014). https://doi.org/10.1038/nmeth.2974
A treatment (\mathcal{T}) is the entire description of the condition applied to an experimental unit.
Experimental unit (\Omega) is the smallest unit that the treatment can be independently applied to.
Observational unit (\Omega_o) is the smallest unit in which the response will be measured on.
Block, also called cluster, is the unit that group some other units (e.g. experimental units) such that the units within the same block (cluster) are more alike (homogeneous).
A design (D: \Omega \rightarrow \mathcal{T}) is the allotment of treatments to particular set of units.
A plan or layout is the design translated into actual units. Randomisation is usually involved in the translation process.
Bailey, R. (2008). Design of Comparative Experiments (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511611483
Unit structure means meaningful ways of dividing up experimental units (\Omega) and observational units (\Omega_o).
For example:
Treatment structure means meaningful ways of dividing up \mathcal{T}.
For example:
Bailey, R. (2008). Design of Comparative Experiments (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511611483
Conclusion: one supplement produces the most milk, therefore it is the most effective supplement for higher milk yield from cows out of the three supplements tested.
Conclusion: one supplement produces the most milk on average, therefore it is the most effective supplement for higher milk yield from cows out of the three supplements tested.
The experimental units are the 3 pens, meaning there is only one replicate of each treatment.
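A quick way to see the problem is to count replication at the level of the experimental unit rather than the observational unit. The sketch below assumes 4 cows per pen and supplement labels `S1` to `S3`; both are hypothetical choices for illustration.

```r
# Hypothetical data: 3 pens, one supplement per pen, 4 cows per pen
cows <- data.frame(
  pen = rep(paste0("pen", 1:3), each = 4),
  supplement = rep(c("S1", "S2", "S3"), each = 4)
)

# Cows (observational units) measured per supplement
n_cows <- table(cows$supplement)

# Pens (experimental units) per supplement: the actual replication
pens <- unique(cows[, c("pen", "supplement")])
n_pens <- table(pens$supplement)

n_cows  # 4 cows per supplement
n_pens  # but only 1 pen per supplement
```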
The order of the experimental units may be confounded with some extraneous factor.
For example, the order of the experimental units may have been determined by the speed (fast to slow) at which each cow got to the feed.
This means that the more active cows are given one supplement and the least active cows are given another.
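The confounding can be illustrated with a small base R sketch. It assumes 9 hypothetical cows ranked by how fast they reach the feed and three made-up supplement labels; the seed is arbitrary.

```r
# Hypothetical cows ranked by arrival at the feed (1 = fastest)
cows <- data.frame(cow = 1:9, speed_rank = 1:9)
supplements <- rep(c("S1", "S2", "S3"), each = 3)

# Systematic allocation in arrival order: supplement is tied to speed
cows$systematic <- supplements
tapply(cows$speed_rank, cows$systematic, mean)  # S1 = 2, S2 = 5, S3 = 8

# Random allocation breaks the link between arrival order and supplement
set.seed(1)
cows$randomised <- sample(supplements)
tapply(cows$speed_rank, cows$randomised, mean)
```

Under the systematic allocation any difference between supplements cannot be separated from a difference in cow activity.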
Treatment | Replication |
---|---|
![]() ![]() | 3 |
![]() ![]() | 3 |
![]() ![]() | 3 |
![]() ![]() | 3 |

Treatment factor | Count |
---|---|
![]() | 6 |
![]() | 6 |
![]() | 6 |
![]() | 6 |
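Counts like those in the two tables can be tabulated directly in base R. The sketch below assumes two treatment factors with two levels each and 12 units, which is one layout consistent with the counts shown; the factor and level names are hypothetical.

```r
# Hypothetical 2 x 2 treatment structure
trts <- expand.grid(factor1 = c("a1", "a2"), factor2 = c("b1", "b2"))

# Three units per treatment combination gives 12 units in total
layout <- trts[rep(seq_len(nrow(trts)), times = 3), ]

# Replication of each treatment (combination of levels): 3 each
table(layout$factor1, layout$factor2)

# Count of each level of each treatment factor: 6 each
table(layout$factor1)
table(layout$factor2)
```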
A Completely Randomised Design
B Randomised Complete Block Design
C Factorial Design
D Split-Plot Design
2
contains
based on the ctv
package version 0.8.5
In contrast, only a handful of libraries exist in Python (namely pyDOE, pyDOE2, dexpy, experimenter and GPdoemd).
Thanks to Dewi Lestari Amaliah for the graph!
Topic | # of packages | % of packages connected within topic | Average # of authors | Standard dev. # of authors |
---|---|---|---|---|
Analysis of Pharmacokinetic Data | 18 | 16.67 | 3.28 | 2.74 |
Hydrological Data and Modeling | 96 | 21.88 | 3.01 | 2.76 |
Design of Experiments (DoE) & Analysis of Experimental Data | 109 | 24.77 | 2.27 | 1.53 |
Chemometrics and Computational Physics | 82 | 29.27 | 2.59 | 2.24 |
Optimization and Mathematical Programming | 135 | 29.63 | 3.00 | 2.31 |
Clinical Trial Design, Monitoring, and Analysis | 60 | 30.00 | 2.70 | 2.10 |
Medical Image Analysis | 23 | 30.43 | 2.74 | 2.18 |
Extreme Value Analysis | 24 | 33.33 | 2.71 | 1.99 |
Statistical Genetics | 22 | 36.36 | 5.86 | 7.78 |
Missing Data | 160 | 37.50 | 4.11 | 9.88 |
Cluster Analysis & Finite Mixture Models | 110 | 39.09 | 3.45 | 3.54 |
Meta-Analysis | 154 | 43.51 | 2.57 | 2.38 |
Official Statistics & Survey Statistics | 132 | 43.94 | 3.39 | 3.61 |
Probability Distributions | 251 | 44.62 | 2.84 | 3.43 |
Processing and Analysis of Tracking Data | 47 | 46.81 | 3.30 | 2.93 |
Bayesian Inference | 138 | 51.45 | 3.27 | 3.08 |
Teaching Statistics | 45 | 53.33 | 5.49 | 7.43 |
Machine Learning & Statistical Learning | 103 | 53.40 | 5.48 | 13.43 |
Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization | 39 | 53.85 | 5.46 | 7.34 |
Empirical Finance | 167 | 55.09 | 3.48 | 5.90 |
Time Series Analysis | 342 | 56.73 | 3.07 | 4.85 |
Differential Equations | 28 | 57.14 | 5.32 | 6.89 |
Functional Data Analysis | 42 | 59.52 | 3.21 | 3.24 |
High-Performance and Parallel Computing with R | 87 | 60.92 | 4.47 | 7.37 |
Numerical Mathematics | 113 | 61.06 | 3.42 | 5.79 |
Natural Language Processing | 54 | 61.11 | 3.31 | 2.60 |
Handling and Analyzing Spatio-Temporal Data | 88 | 63.64 | 3.78 | 4.79 |
Psychometric Models and Methods | 242 | 67.77 | 3.06 | 3.64 |
Analysis of Ecological and Environmental Data | 97 | 71.13 | 4.60 | 5.46 |
Multivariate Statistics | 116 | 71.55 | 4.34 | 4.61 |
Survival Analysis | 251 | 73.31 | 2.84 | 2.49 |
Robust Statistical Methods | 59 | 74.58 | 3.51 | 3.59 |
Reproducible Research | 99 | 74.75 | 5.73 | 11.80 |
Model Deployment with R | 32 | 75.00 | 6.13 | 5.36 |
Databases with R | 38 | 76.32 | 3.58 | 3.33 |
gRaphical Models in R | 32 | 78.13 | 4.16 | 4.14 |
Econometrics | 150 | 78.67 | 3.54 | 3.91 |
Statistics for the Social Sciences | 86 | 79.07 | 4.41 | 4.40 |
Analysis of Spatial Data | 179 | 83.24 | 4.22 | 5.45 |
Web Technologies and Services | 201 | 89.55 | 3.03 | 3.48 |
Phylogenetics, Especially Comparative Methods | 81 | 91.36 | 4.12 | 5.52 |
Authors tend to work in silos, perhaps with limited knowledge sharing across silos.
Top 5 in 2016
Package | Downloads |
---|---|
agricolae | 73,521 |
AlgDesign | 57,037 |
ez | 37,488 |
lhs | 23,518 |
DoE.base | 20,651 |
Top 5 in 2020
Package | Downloads |
---|---|
agricolae | 171,813 |
lhs | 165,415 |
AlgDesign | 153,582 |
DiceKriging | 92,287 |
DiceDesign | 88,160 |
agricolae is one of the top downloaded packages.
(Total downloads are based on logs from the RStudio CRAN mirror, scrubbed by Danyang Dai.)
agricolae: a case of classical named randomised designs
agricolae::design.crd()
Completely randomised design for t = 3 treatments with 2 replicates each
trt <- c("A", "B", "C")
agricolae::design.crd(trt = trt, r = 2)
## $parameters
## $parameters$design
## [1] "crd"
##
## $parameters$trt
## [1] "A" "B" "C"
##
## $parameters$r
## [1] 2 2 2
##
## $parameters$serie
## [1] 2
##
## $parameters$seed
## [1] 241038711
##
## $parameters$kinds
## [1] "Super-Duper"
##
## $parameters[[7]]
## [1] TRUE
##
##
## $book
##   plots r trt
## 1   101 1   B
## 2   102 1   C
## 3   103 1   A
## 4   104 2   A
## 5   105 2   C
## 6   106 2   B
agricolae::design.rcbd()
Randomised complete block design for t = 3 treatments with 2 blocks
trt <- c("A", "B", "C")
agricolae::design.rcbd(trt = trt, r = 2)
## $parameters
## $parameters$design
## [1] "rcbd"
##
## $parameters$trt
## [1] "A" "B" "C"
##
## $parameters$r
## [1] 2
##
## $parameters$serie
## [1] 2
##
## $parameters$seed
## [1] 351889891
##
## $parameters$kinds
## [1] "Super-Duper"
##
## $parameters[[7]]
## [1] TRUE
##
##
## $sketch
##      [,1] [,2] [,3]
## [1,] "A"  "C"  "B"
## [2,] "B"  "C"  "A"
##
## $book
##   plots block trt
## 1   101     1   A
## 2   102     1   C
## 3   103     1   B
## 4   201     2   B
## 5   202     2   C
## 6   203     2   A
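Stripped of the bookkeeping, the two named designs above differ only in how the randomisation is restricted. The base R sketch below illustrates that reading; it is not how agricolae implements these functions, and the object names `crd` and `rcbd` and the seed are arbitrary choices.

```r
trt <- c("A", "B", "C")
r <- 2  # replicates (CRD) or blocks (RCBD)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Completely randomised design: shuffle all r copies of each treatment
# over the whole set of plots, with no restriction
crd <- data.frame(plot = seq_len(length(trt) * r),
                  trt = sample(rep(trt, times = r)))

# Randomised complete block design: every block gets each treatment once,
# and the order is shuffled separately within each block
rcbd <- do.call(rbind, lapply(seq_len(r), function(b) {
  data.frame(block = b, plot = seq_along(trt), trt = sample(trt))
}))

table(crd$trt)               # each treatment appears r times overall
table(rcbd$block, rcbd$trt)  # each treatment appears once in every block
```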
agricolae::design.ab()
Factorial design for $t = 3 \times 2$ treatments with 2 replications of each treatment
agricolae::design.ab(trt = c(3, 2), r = 2, design = "crd")
## $parameters
## $parameters$design
## [1] "factorial"
##
## $parameters$trt
## [1] "1 1" "1 2" "2 1" "2 2" "3 1" "3 2"
##
## $parameters$r
## [1] 2 2 2 2 2 2
##
## $parameters$serie
## [1] 2
##
## $parameters$seed
## [1] 34955907
##
## $parameters$kinds
## [1] "Super-Duper"
##
## $parameters[[7]]
## [1] TRUE
##
## $parameters$applied
## [1] "crd"
##
##
## $book
##    plots r A B
## 1    101 1 2 2
## 2    102 1 2 1
## 3    103 1 1 2
## 4    104 2 2 1
## 5    105 1 3 1
## 6    106 2 2 2
## 7    107 1 1 1
## 8    108 2 3 1
## 9    109 2 1 2
## 10   110 2 1 1
## 11   111 1 3 2
## 12   112 2 3 2
Note: this is not A/B testing!
agricolae::design.split()
Split-plot design for $t = 2 \times 4$ treatments with 2 replications of each treatment
trt1 <- c("I", "R"); trt2 <- LETTERS[1:4]
agricolae::design.split(trt1 = trt1, trt2 = trt2, r = 2, design = "crd")
## $parameters
## $parameters$design
## [1] "split"
##
## $parameters[[2]]
## [1] TRUE
##
## $parameters$trt1
## [1] "I" "R"
##
## $parameters$applied
## [1] "crd"
##
## $parameters$r
## [1] 2 2
##
## $parameters$serie
## [1] 2
##
## $parameters$seed
## [1] -1981495681
##
## $parameters$kinds
## [1] "Super-Duper"
##
##
## $book
##    plots splots r trt1 trt2
## 1    101      1 1    R    D
## 2    101      2 1    R    C
## 3    101      3 1    R    B
## 4    101      4 1    R    A
## 5    102      1 1    I    A
## 6    102      2 1    I    C
## 7    102      3 1    I    D
## 8    102      4 1    I    B
## 9    103      1 2    R    B
## 10   103      2 2    R    C
## 11   103      3 2    R    A
## 12   103      4 2    R    D
## 13   104      1 2    I    C
## 14   104      2 2    I    A
## 15   104      3 2    I    B
## 16   104      4 2    I    D
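The structure behind the split-plot output can be spelled out as two randomisation steps, one per level of unit. This is a base R sketch under that interpretation, not agricolae's implementation; `wholeplots` and `sp_design` are hypothetical names and the seed is arbitrary.

```r
trt1 <- c("I", "R")    # treatment applied to whole plots
trt2 <- LETTERS[1:4]   # treatment applied to subplots
r <- 2                 # whole plots per level of trt1

set.seed(1)

# Step 1: randomise the whole-plot treatment over the whole plots
wholeplots <- data.frame(plot = seq_len(length(trt1) * r),
                         trt1 = sample(rep(trt1, times = r)))

# Step 2: within each whole plot, randomise the subplot treatment
sp_design <- do.call(rbind, lapply(seq_len(nrow(wholeplots)), function(i) {
  data.frame(plot = wholeplots$plot[i],
             splot = seq_along(trt2),
             trt1 = wholeplots$trt1[i],
             trt2 = sample(trt2))
}))
sp_design
```

Every subplot in a whole plot shares the same level of `trt1`, which is what distinguishes this from a completely randomised factorial.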
Good design considers units and treatments first, and then allocates treatments to units. It does not choose from a menu of named designs.
—Rosemary Bailey (2008)
AlgDesign: a case of optimised (model-based) designs
AlgDesign::gen.factorial()
dat <- AlgDesign::gen.factorial(levels = 3,
                                nVars = 3,
                                center = FALSE,
                                varNames = c("irrigation", "fertilizer", "variety"),
                                factors = "all")
dat
##    irrigation fertilizer variety
## 1           1          1       1
## 2           2          1       1
## 3           3          1       1
## 4           1          2       1
## 5           2          2       1
## 6           3          2       1
## 7           1          3       1
## 8           2          3       1
## 9           3          3       1
## 10          1          1       2
## 11          2          1       2
## 12          3          1       2
## 13          1          2       2
## 14          2          2       2
## 15          3          2       2
## 16          1          3       2
## 17          2          3       2
## 18          3          3       2
## 19          1          1       3
## 20          2          1       3
## 21          3          1       3
## 22          1          2       3
## 23          2          2       3
## 24          3          2       3
## 25          1          3       3
## 26          2          3       3
## 27          3          3       3
AlgDesign::optFederov()
AlgDesign::optFederov(frml = ~ ., # assume additive effects
                      data = dat,
                      nTrials = 14,
                      criterion = "D")
## $D
## [1] 0.2343815
##
## $A
## [1] 6.25
##
## $Ge
## [1] 0.727
##
## $Dea
## [1] 0.687
##
## $design
##    irrigation fertilizer variety
## 2           2          1       1
## 3           3          1       1
## 4           1          2       1
## 5           2          2       1
## 7           1          3       1
## 9           3          3       1
## 10          1          1       2
## 13          1          2       2
## 15          3          2       2
## 17          2          3       2
## 19          1          1       3
## 23          2          2       3
## 24          3          2       3
## 25          1          3       3
##
## $rows
##  [1]  2  3  4  5  7  9 10 13 15 17 19 23 24 25
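For intuition about what a D-type criterion rewards, the sketch below computes a common normalised form, det(X'X / n)^(1/k), for chosen rows of a candidate set under an additive model. The helper `d_criterion` is hypothetical, and because the value depends on how the factors are coded, it is not expected to reproduce the `$D` value reported above.

```r
# Candidate set: full 3 x 3 x 3 factorial with all variables as factors
cand <- expand.grid(irrigation = factor(1:3),
                    fertilizer = factor(1:3),
                    variety = factor(1:3))

# Hypothetical helper: normalised D-criterion for the selected rows
d_criterion <- function(rows, data = cand, frml = ~ .) {
  X <- model.matrix(frml, data = data[rows, , drop = FALSE])
  k <- ncol(X)                      # number of model parameters
  d <- det(crossprod(X) / nrow(X))  # determinant of X'X / n
  max(d, 0)^(1 / k)                 # guard against negative rounding error
}

d_criterion(1:27)  # all 27 runs: every effect is estimable
d_criterion(1:9)   # variety fixed at level 1: criterion collapses to 0
```

A search algorithm would look for the subset of rows of a given size that makes this quantity as large as possible.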
AlgDesign::optBlock()
AlgDesign::optBlock(frml = ~ ., # assume additive effects
                    withinData = dat,
                    blocksizes = rep(9, 3),
                    criterion = "D")
## $D
## [1] 0.1924501
##
## $diagonality
## [1] 0.866
##
## $Blocks
## $Blocks$B1
##    irrigation fertilizer variety
## 2           2          1       1
## 5           2          2       1
## 8           2          3       1
## 12          3          1       2
## 15          3          2       2
## 16          1          3       2
## 21          3          1       3
## 22          1          2       3
## 25          1          3       3
##
## $Blocks$B2
##    irrigation fertilizer variety
## 1           1          1       1
## 6           3          2       1
## 9           3          3       1
## 10          1          1       2
## 13          1          2       2
## 18          3          3       2
## 20          2          1       3
## 23          2          2       3
## 26          2          3       3
##
## $Blocks$B3
##    irrigation fertilizer variety
## 3           3          1       1
## 4           1          2       1
## 7           1          3       1
## 11          2          1       2
## 14          2          2       2
## 17          2          3       2
## 19          1          1       3
## 24          3          2       3
## 27          3          3       3
##
##
## $design
##    irrigation fertilizer variety
## 2           2          1       1
## 5           2          2       1
## 8           2          3       1
## 12          3          1       2
## 15          3          2       2
## 16          1          3       2
## 21          3          1       3
## 22          1          2       3
## 25          1          3       3
## 1           1          1       1
## 6           3          2       1
## 9           3          3       1
## 10          1          1       2
## 13          1          2       2
## 18          3          3       2
## 20          2          1       3
## 23          2          2       3
## 26          2          3       3
## 3           3          1       1
## 4           1          2       1
## 7           1          3       1
## 11          2          1       2
## 14          2          2       2
## 17          2          3       2
## 19          1          1       3
## 24          3          2       3
## 27          3          3       3
##
## $rows
##  [1]  2  5  8 12 15 16 21 22 25  1  6  9 10 13 18 20 23 26  3  4  7 11 14 17 19 24 27
What were the experiments about?
Units and allocation are often implicitly understood
3
Computational reproducibility
Allows greater flexibility
Can promote higher order thinking
if the software is designed with the user in mind
library(grid)
# face shape
grid.circle(x = 0.5, y = 0.5, r = 0.5)
# eyes
grid.circle(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05, gp = gpar(fill = "black"))
# mouth
grid.curve(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4, square = FALSE)
library(grid)
# face shape
grid.circle(x = 0.5, y = 0.5, r = 0.5)
# eyes
grid.circle(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05, gp = gpar(fill = "black"))
# mouth
grid.curve(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4, square = FALSE, curvature = -1)
face1 <- function() {
  # face shape
  grid::grid.circle(x = 0.5, y = 0.5, r = 0.5)
  # eyes
  grid::grid.circle(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05,
                    gp = grid::gpar(fill = "black"))
  # mouth
  grid::grid.curve(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4, square = FALSE)
}
face2 <- function() {
  # face shape
  grid::grid.circle(x = 0.5, y = 0.5, r = 0.5)
  # eyes
  grid::grid.circle(x = c(0.35, 0.65), y = c(0.6, 0.6), r = 0.05,
                    gp = grid::gpar(fill = "black"))
  # mouth, curved the other way
  grid::grid.curve(x1 = 0.4, y1 = 0.4, x2 = 0.6, y2 = 0.4, square = FALSE, curvature = -1)
}
face1()
face1()
face2()
face2()
face1()
face1()
face1()
face2()
face3()
?
Alternative function names:
face_happy()
face_sad()
face_angry()
Now what do you expect for the output?
What if you want to draw a face that is winking?
... with a grin?
... or with the tongue out?
The differences between facial features are small, but you need an entire new function that contains instructions for the whole face and a new function name.
How would you design the system to draw faces?
https://github.com/emitanaka/portrait
library(portrait)
face(eyes = "googly", mouth = "smile", shape = "round")
face(eyes = "round", mouth = "sad", shape = "oval")
But what about hair, nose and other facial features?
face(eyes = "googly", mouth = "smile", shape = "round", hair = "none", nose = "simple")
face(eyes = "googly", mouth = "smile", shape = "round", hair = "mohawk", nose = "simple")
But what about other facial features?
library(portrait)
face() +
  cat_shape() +
  cat_eyes() +
  cat_nose() +
  cat_whiskers()
library(portrait)
face() +
  dog_shape() +
  cat_eyes(fill = "red") +
  cat_nose() +
  cat_whiskers()
library(portrait)
face() +
  dog_shape() +
  cat_eyes(fill = "red") +
  cat_nose() +
  cat_whiskers(size = 6, color = "brown") +
  sketch_mouth(smile = 0.3, size = 3)
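One way such a layered interface could be built in base R is with an S3 method for `+`. The sketch below is a hypothetical illustration only: `face_spec()` and `feature()` are made-up names, and this is not the portrait package's implementation.

```r
# Hypothetical container for a face specification
face_spec <- function() {
  structure(list(layers = list()), class = "face_spec")
}

# Hypothetical constructor: each feature is a small "layer" object
feature <- function(part, style, ...) {
  structure(list(part = part, style = style, options = list(...)),
            class = "face_layer")
}

# Adding a layer stores it under its part name, so a later layer for the
# same part replaces the earlier one
"+.face_spec" <- function(e1, e2) {
  stopifnot(inherits(e2, "face_layer"))
  e1$layers[[e2$part]] <- e2
  e1
}

sketch <- face_spec() +
  feature("shape", "dog") +
  feature("eyes", "cat", fill = "red") +
  feature("mouth", "sketch", smile = 0.3)

names(sketch$layers)  # "shape" "eyes" "mouth"
```

With this structure a new facial feature only needs a new layer, not a new function describing the whole face.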
The tool you choose to use can enforce a certain way of thinking and may restrict you on what you can do.
4
edibble
library(edibble)
start_design("My experiment")
library(edibble)
start_design("My experiment") %>%
  set_units(wholeplot = 4)
library(edibble)
start_design("My experiment") %>%
  set_units(wholeplot = 4) %>%
  set_units(subplot = nested_in(wholeplot, 2))
library(edibble)
start_design("My experiment") %>%
  set_units(wholeplot = 4,
            subplot = nested_in(wholeplot, 2))
library(edibble)
start_design("My experiment") %>%
  set_units(wholeplot = 4,
            subplot = nested_in(wholeplot, 2)) %>%
  set_trts(water = c("irrigated", "rainfed"),
           fertilizer = c("A", "B"))
library(edibble)
start_design("My experiment") %>%
  set_trts(water = c("irrigated", "rainfed"),
           fertilizer = c("A", "B")) %>%
  set_units(wholeplot = 4,
            subplot = nested_in(wholeplot, 2))
library(edibble)
start_design("My experiment") %>%
  set_trts(water = c("irrigated", "rainfed")) %>%
  set_units(wholeplot = 4) %>%
  set_trts(fertilizer = c("A", "B")) %>%
  set_units(subplot = nested_in(wholeplot, 2))
library(edibble)
start_design("My experiment") %>%
  set_units(wholeplot = 4,
            subplot = nested_in(wholeplot, 2)) %>%
  set_trts(water = c("irrigated", "rainfed"),
           fertilizer = c("A", "B")) %>%
  allot_trts(water ~ wholeplot,
             fertilizer ~ subplot)
library(edibble)
start_design("My experiment") %>%
  set_units(wholeplot = 4,
            subplot = nested_in(wholeplot, 2)) %>%
  set_trts(water = c("irrigated", "rainfed"),
           fertilizer = c("A", "B")) %>%
  allot_trts(water ~ wholeplot,
             fertilizer ~ subplot) %>%
  assign_trts("random", seed = 1)
library(edibble)
start_design("My experiment") %>%
  set_units(wholeplot = 4,
            subplot = nested_in(wholeplot, 2)) %>%
  set_trts(water = c("irrigated", "rainfed"),
           fertilizer = c("A", "B")) %>%
  allot_trts(water ~ wholeplot,
             fertilizer ~ subplot) %>%
  assign_trts("random", seed = 1) %>%
  serve_table()
library(edibble)
start_design("Modified design") %>%
  set_units(block = 2,
            subplot = nested_in(block, 4)) %>%
  set_trts(water = c("irrigated", "rainfed"),
           fertilizer = c("A", "B")) %>%
  allot_trts(water:fertilizer ~ subplot) %>%
  assign_trts("random", seed = 1) %>%
  serve_table()
library(edibble)
start_design("Modified design") %>%
  set_units(block = 2,
            subplot = nested_in(block, 4)) %>%
  set_trts(water = c("irrigated", "rainfed"),
           fertilizer = c("A", "B")) %>%
  allot_trts(water:fertilizer ~ subplot) %>%
  assign_trts("random", seed = 1) %>%
  set_rcrds_of(subplot = c("yield", "disease"),
               block = "manager") %>%
  serve_table()
out <- start_design("Modified design") %>%
  set_units(block = 2,
            subplot = nested_in(block, 4)) %>%
  set_trts(water = c("irrigated", "rainfed"),
           fertilizer = c("A", "B")) %>%
  allot_trts(water:fertilizer ~ subplot) %>%
  assign_trts("random", seed = 1) %>%
  set_rcrds_of(subplot = c("yield", "disease"),
               block = "manager") %>%
  expect_rcrds(yield = to_be_numeric(with_value(">=", 0)),
               disease = to_be_factor(levels = c("none", "moderate", "severe"))) %>%
  serve_table()
export_design(out, file = "design-layout.xlsx", overwrite = TRUE)
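The expectations declared for the records (yield numeric and at least 0; disease one of three levels) amount to checks that could also be run on the collected data. The base R sketch below mimics that idea on made-up records; `check_records()` and the data are hypothetical, and this is not how edibble or the exported spreadsheet enforces the expectations.

```r
# Hypothetical recorded data for the 8 subplots, with two bad entries
records <- data.frame(
  subplot = 1:8,
  yield = c(3.2, 4.1, 2.8, -1, 5.0, 3.7, 4.4, 3.9),
  disease = c("none", "moderate", "severe", "none",
              "mild", "none", "moderate", "none")
)

# Hypothetical helper: flag entries that break the stated expectations
check_records <- function(data) {
  allowed <- c("none", "moderate", "severe")
  data.frame(
    subplot = data$subplot,
    yield_ok = is.numeric(data$yield) & data$yield >= 0,
    disease_ok = data$disease %in% allowed
  )
}

checks <- check_records(records)
checks[!(checks$yield_ok & checks$disease_ok), ]  # subplots 4 and 5
```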
There are more (not yet well-documented) features in edibble
More on those on Thursday!
The idea for edibble was conceived in early 2019, and the code base was released publicly on 31st Dec 2020.
Since its initial public release, the underlying structure of edibble has evolved drastically for the better.
The development of a good tool is a community effort so...
edibble is to help you plan experiments better
edibble gets better with feedback
Feedback on edibble? Submit or upvote here: github.com/emitanaka/edibble/issues, send me an email or tell me now!
edibble is a community effort