Current state and prospects of R-packages for the design of experiments

👩🏻‍💻 Dr. Emi Tanaka

emi.tanaka@monash.edu
@statsgen on Twitter
emitanaka on GitHub
emitanaka.org

29th June 2022

Statistical Society of Australia Canberra Branch

Take-away messages

Slides are available at emitanaka.org/slides/ssacanberra2022/
Our paper is available on arXiv:
Tanaka & Amaliah (2022) Current state and prospects of R-packages for the design of experiments. arxiv.org/abs/2206.07532
The edibble R-package v0.1.0 is on CRAN, with the development version on GitHub at github.com/emitanaka/edibble

install.packages("edibble")

EXP 1 🌾

MATERIALS AND METHODS

Experimental design

The present study took place in the BAC experiment site established at the Cedar Creek Ecosystem Science Reserve, Minnesota, USA. The site occurs on a glacial outwash plain with sandy soils. Mean temperature during the growing season (April–September) was 15.98°C in 2011 and 17.18°C in 2012. Precipitation during the growing season was 721 mm in 2011. The growing season in 2012 was considerably drier, with 545 mm rainfall.

Experimental plots (9×9 m) were planted in 1994 and 1995 with different plant communities spanning a plant diversity gradient of one, four, and 16 species, which were randomly chosen from the species listed below (Tilman et al. 2001). The grassland prairie species belonged to one of five plant functional groups: C₃ grasses (Agropyron smithii Rydb., Elymus canadensis L., Koeleria cristata (Ledeb.) Schult., Poa pratensis L.), C₄ grasses (Andropogon gerardii Vitman., Panicum virgatum L., Schizachyrium scoparium (Michx.) Nash, Sorghastrum nutans (L.) Nash), legumes (Amorpha canescens Pursh., Lespedeza capitata Michx., Lupinus perennis L., Petalostemum purpureum (Vent.) Rydb., Petalostemum villosum Spreng.), nonlegume forbs (Achillea millefolium L., Asclepias tuberosa L., Liatris aspera Michx., Monarda fistulosa L., Solidago rigida L.), and woody species (Quercus ellipsoidalis E. J. Hill, Quercus macrocarpa Michx.). The individuals of those two woody species (Quercus spp.), which were small in size and rare because of low survival, were removed from all plots in which they occurred in 2010.

In addition to the manipulation of plant diversity, the plots were divided into three subplots (2.5×3.0 m). Heat treatments were applied from March to November each year, beginning in 2009, using infrared lamps 1.8 m above ground emitting 600 W (which caused a 1.5°C increase in soil temperature for vegetation-free soils) and 1200 W (which caused a 3°C increase; Valpine and Harte 2001, Kimball 2005, Whittington et al. 2013) to increase the surface soil temperature of each subplot (see Plate 1). To account for possible shading effects, metal flanges and frames were hung over control subplots. On average across all vegetated plots, temperature manipulations elevated soil temperature at 1 cm depth by 1.18°C in the low warming (+1.5°C) treatment and by 2.69°C in the high warming (+3°C) treatment, and at 10 cm depth by 1.00°C in the low warming (+1.5°C) treatment and by 2.16°C in the high warming (+3°C) treatment.

Soil samples of three subplots in each of 27 experimental plots were taken; due to technical difficulties we could only analyze 66 samples out of 81 existing subplots (monoculture, 10 replicates in ambient +0°C treatment, eight replicates in +1.5°C treatment, nine replicates in +3°C treatment; four species mixture, six replicates in ambient +0°C treatment, six replicates in +1.5°C treatment, seven replicates in +3°C treatment; 16 species mixture, six replicates in ambient +0°C treatment, six replicates in +1.5°C treatment, eight replicates in +3°C treatment). The BAC plots are a representative subset of the plots in the biodiversity experiment E120 at Cedar Creek, which were assembled as random draws of a given number of species from the species pool (Zak et al. 2003). Given low heterogeneity of soil abiotic conditions at the start of the experiment, the experiment was not blocked.
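The replicate counts quoted above can be tallied to check the "66 out of 81" figure. This is a minimal base R sketch; the object names (analysed, planned, total_analysed) are made up for illustration and the counts are copied from the paragraph above.

```r
# Replicates analysed per diversity level (rows) and warming treatment (columns),
# copied from the counts quoted in the text.
analysed <- matrix(
  c(10, 8, 9,
     6, 6, 7,
     6, 6, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(nspecies    = c("1", "4", "16"),
                  temperature = c("+0", "+1.5", "+3"))
)

planned        <- 27 * 3        # 27 plots x 3 subplots
total_analysed <- sum(analysed) # samples actually analysed

rowSums(analysed)               # samples per diversity level
colSums(analysed)               # samples per warming treatment
planned - total_analysed        # subplots without an analysed sample
```

The cell counts add up to 66, so 15 of the 81 planned subplot samples are missing and the analysed data are unbalanced.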

EXP 1 🌾

MATERIALS AND METHODS

Experimental design (condensed version)

Experimental plots were planted with different plant communities spanning a plant diversity gradient of one, four, and 16 species, which were randomly chosen from the species listed (5 plant functional groups – 19 species in total)
Plots were divided into three subplots
Heat treatments were applied to subplots using lamps emitting 600 W (which caused a 1.5°C increase in soil temperature for vegetation-free soils) and 1200 W (which caused a 3°C increase); a control (+0°C) was included
Soil samples of three subplots in each of 27 experimental plots were taken
Given low heterogeneity of soil abiotic conditions at the start of the experiment, the experiment was not blocked.

This is in fact a split-plot design!

library(edibble)
des1 <- design("Steinauer et al. 2015") %>% 
  set_units(plot = 27,
            subplot = nested_in(plot, 3)) %>% 
  set_trts(nspecies = c(1, 4, 16),
           temperature = c(0, 1.5, 3)) %>% 
  allot_trts(   nspecies ~ plot,
             temperature ~ subplot) %>% 
  assign_trts("random") %>% 
  serve_table()
des1

# Steinauer et al. 2015 
# An edibble: 81 x 4
         plot    subplot nspecies temperature
   <unit(27)> <unit(81)> <trt(3)>    <trt(3)>
 1      plot1  subplot1        4          0  
 2      plot1  subplot2        4          3  
 3      plot1  subplot3        4          1.5
 4      plot2  subplot4        1          0  
 5      plot2  subplot5        1          1.5
 6      plot2  subplot6        1          3  
 7      plot3  subplot7        16         1.5
 8      plot3  subplot8        16         0  
 9      plot3  subplot9        16         3  
10      plot4  subplot10       16         1.5
# … with 71 more rows
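For intuition, the two-stage randomisation of a split-plot design can be sketched in base R. This is only a sketch under assumptions: it assumes equal replication (9 plots per diversity level), the object names are hypothetical, and it is not meant to describe how edibble's assign_trts("random") works internally.

```r
set.seed(2022)  # arbitrary seed, only so the sketch is reproducible

n_plot             <- 27
nspecies_levels    <- c(1, 4, 16)
temperature_levels <- c(0, 1.5, 3)

# Whole-plot stage: each diversity level goes to 9 of the 27 plots, in random order.
plot_trt <- sample(rep(nspecies_levels, each = n_plot / length(nspecies_levels)))

# Subplot stage: an independent random ordering of the three temperatures in every plot.
subplot_trt <- unlist(lapply(seq_len(n_plot), function(i) sample(temperature_levels)))

sketch <- data.frame(
  plot        = rep(paste0("plot", seq_len(n_plot)), each = 3),
  subplot     = paste0("subplot", seq_len(n_plot * 3)),
  nspecies    = rep(plot_trt, each = 3),  # constant within a plot
  temperature = subplot_trt               # varies within a plot
)
head(sketch, 9)
```

The point of the sketch is the restriction on randomisation: nspecies is constant within a plot, while every plot receives all three temperature levels.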

Downstream Benefit #3 Set record

In edibble, records are intended variables, e.g. responses, that will be measured or observed
You can set expectations for a record (its plausible values), simulate records with values outside the expectations censored (set to missing by default), or export the data with data validation

library(simulate) # remotes::install_github("emitanaka/simulate")
des1 %>% 
  set_rcrds(microbial_biomass = subplot) %>% 
  expect_rcrds(microbial_biomass >= 0) %>% 
  simulate_rcrds(microbial_biomass = sim_normal(mean = 0.05, sd = 0.5)) # dev only

# Steinauer et al. 2015 
# An edibble: 81 x 5
         plot    subplot nspecies temperature microbial_biomass
   <unit(27)> <unit(81)> <trt(3)>    <trt(3)>             <dbl>
 1      plot1  subplot1        4          0            NA      
 2      plot1  subplot2        4          3            NA      
 3      plot1  subplot3        4          1.5           0.380  
 4      plot2  subplot4        1          0            NA      
 5      plot2  subplot5        1          1.5           0.334  
 6      plot2  subplot6        1          3            NA      
 7      plot3  subplot7        16         1.5           0.00427
 8      plot3  subplot8        16         0             0.836  
 9      plot3  subplot9        16         3             0.0667 
10      plot4  subplot10       16         1.5           0.742  
# … with 71 more rows
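The NA values in the output come from the censoring: simulated values that break the expectation microbial_biomass >= 0 are set to missing. A rough base R imitation of that step, assuming one independent normal draw per subplot (the variable names here are made up, and this is not the simulate package's implementation):

```r
set.seed(1)  # arbitrary seed for the sketch

# Simulate one value per subplot (81 in total) from the same normal distribution.
microbial_biomass <- rnorm(81, mean = 0.05, sd = 0.5)

# Expectation: microbial_biomass >= 0. Values outside it are censored to missing.
censored <- ifelse(microbial_biomass >= 0, microbial_biomass, NA_real_)

mean(is.na(censored))            # share of simulated values that were censored
pnorm(0, mean = 0.05, sd = 0.5)  # theoretical share of draws below zero
```

With mean 0.05 and sd 0.5, roughly 46% of draws fall below zero, which is why so many rows are missing in the simulated table.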

Specifying unbalanced designs

MATERIALS AND METHODS

Experimental design (condensed version)

Experimental plots were planted with different plant communities spanning a plant diversity gradient of one, four, and 16 species, which were randomly chosen from the species listed (5 plant functional groups – 19 species in total)

des2 <- design("Steinauer et al. 2015 Part II") %>% 
  set_units(plot = 27,
            plant = nested_in(plot, 1:9 ~ 1,
                                  10:18 ~ 4,
                                  19:27 ~ 16)) %>% 
  set_trts(species = 19) %>% 
  allot_trts(species ~ plant) %>% 
  assign_trts("random") %>% 
  serve_table()

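library(ggplot2) # for facet_wrap()
library(deggust) # for autoplot(); remotes::install_github("emitanaka/deggust")
options(deggust.nfill_max = 3)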
autoplot(des2) + facet_wrap(~plot, nrow = 2)
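To see what the unbalanced nesting implies for the number of units, the plot and plant structure can be built by hand in base R. This is a sketch only: the plant1, plant2, ... labels are an assumption made for illustration and need not match how edibble names the nested units.

```r
# Number of plant units in each of the 27 plots: 1, 4 or 16, as in the nesting above.
n_plant <- rep(c(1, 4, 16), each = 9)

units <- data.frame(
  plot  = rep(paste0("plot", 1:27), times = n_plant),
  plant = paste0("plant", seq_len(sum(n_plant)))  # hypothetical labels
)

nrow(units)               # total number of plant units
table(table(units$plot))  # how many plots have 1, 4 and 16 plants
```

The structure has 9 × 1 + 9 × 4 + 9 × 16 = 189 plant units, so the unit table is far from rectangular.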

Hi everyone, this is the first talk I'm delivering in person in over two and a half years, so bear with me as I get used to actually seeing people in front of me as I present.

Current state and prospects of R-packages for the design of experiments

Take-away messages

Which vaccines are effective against COVID-19?

What diet works best for lowering insulin?

R-packages for the design of experiments

Why R-packages?

Downloads of ExperimentalDesign packages

Small number of packages dominate downloads

ExperimentalDesign is the least collaborative

`agricolae` package

`AlgDesign` package

EXPERIMENT 1

EXP 1 🌾

Why `edibble`?

Downstream Benefit #1 Easy-and-quick visualisation

Downstream Benefit #2 Customise visualisation

Downstream Benefit #3 Set record

Specifying unbalanced designs

EXPERIMENT 2

EXP 2 🌱
