14  Getting started

šŸ“– Status of the book

Hi there! This book is a work-in-progress. You may like to come back later when itā€™s closer to a complete state. If you would like to raise issues or leave feedback, please feel to do this:

The experimental planning in this book is supported by the edibble R-package and its extensions. The edibble system is based on the grammar of experimental designs in ?sec-grammar. To get started with using the system, you need to first load the package:

library(edibble)

14.1 Setting the experimental structure and context

In the edibble system, an experiment design is built step-by-step. In the first step, you start with initialising the design by using the design() where the user can supply an optional title of the experiment. All subsequent steps are best specified after using the pipe operator (%>%) re-exported from the magrittr package or the native pipe operator (|>) available from R version 4.1 onwards. For example, below we carry out the following steps:

  1. initiate an experiment called ā€œMy first experimentā€ via design();
  2. set 24 pots with set_units(); and then
  3. set treatments with 3 levels: 2 bacterial strains (PM398 and ZNP1) and no innoculation with set_trts().
mydesign1 <- design("My first experiment") %>% 
  set_units(pot = 24) %>% 
  set_trts(innoculation = c("PM398", "ZNP1", "none"))

These steps create and modify a so-called ā€œedibble designā€ (edbl_design) object that is used to represent an intermediate construct of an experimental design. The arguments in the functions set_units() and set_trts() employ the convention that the left hand side (LHS) is the name of the factor and the input on the right hand side (RHS) defines the levels. If the RHS is a single integer then itā€™s assumed to be the number of levels and if itā€™s a vector then itā€™s the name of the levels.

The print out of this object, as seen below, is displayed like a tree summarising the variables defined thus far.

mydesign1
My first experiment
ā”œā”€pot (24 levels)
ā””ā”€innoculation (3 levels)

If you view the print out in the terminal or console, you will actually see that text are color coded so itā€™s easier to distinguish between the type of variables (e.g. unit or treatment). The above print out in a terminal will be viewed like below. These colors are customisable as discussed in Section @ref(aes-custom).

The argument names in set_units and set_trts are not fixed, so you could easily change the name of the units or treatment to something more meaningful. Below we have another experiment which has a statistically equivalent experimental structure as the previous experiment (24 units and 3 levels of treatment), but it ā€œreadsā€ as a different experimental context ā€“ a typical user will realise that the experimental units are pigs and treatments are diet.

mydesign2 <- design("My second experiment") %>% 
  set_units(pig = 24) %>% 
  set_trts(diet = c("high-fat", "low-fat", "standard"))

mydesign2
My second experiment
ā”œā”€pig (24 levels)
ā””ā”€diet (3 levels)

You can define more than one unit factor or treatment factor. For example below, it reads as that there are: 3 pens (named ā€œNorthā€, ā€œShadeā€ and ā€œMeadowā€), 18 pigs, 3 type of diets and 2 types of supplement given at a frequency of either daily or weekly.

design("My experiment with multiple factors") %>% 
  set_units(pen = c("North", "Shade", "Meadow"),
            pig = 18) %>% 
  set_trts(diet = c("high-fat", "low-fat", "standard"),
           supplement = 2, 
           # frequency of supplements:
           frequency = c("daily", "weekly"))
My experiment with multiple factors
ā”œā”€pen (3 levels)
ā”œā”€pig (18 levels)
ā”œā”€diet (3 levels)
ā”œā”€supplement (2 levels)
ā””ā”€frequency (2 levels)

The unit and treatment factors do not need to be defined in one call. The ordering of the functions is commutative where the factor is not dependent directly on another factor.

design("My experiment with multiple factors") %>% 
  set_units(pen = c("North", "Shade", "Meadow")) %>%
  set_trts(diet = c("high-fat", "low-fat", "standard"),
           supplement = 2) %>% 
  # frequency of supplements
  set_trts(frequency = c("daily", "weekly")) %>% 
  set_units(pig = 18) 
My experiment with multiple factors
ā”œā”€pen (3 levels)
ā”œā”€diet (3 levels)
ā”œā”€supplement (2 levels)
ā”œā”€frequency (2 levels)
ā””ā”€pig (18 levels)

14.1.1 A viable experimental design

A viable experimental design is only specified if the relationship between the variables can be reconciled to a single observational unit.

For example below

design("An invalid unit structure") %>% 
  set_units(pen = 6,
            pig = 18) %>% 
  serve_table()
Error in `serve_table()`:
! The graph cannot be converted to a table format.

14.1.2 Understanding the experimental context

14.1.3 Fitting a variety of mental mode

14.2 Mapping treatment to units

The above code creates an object that represents an intermediate construct of an experimental design. To complete the specification of a minimum experimental design, we still need to specify:

  • the mapping of the treatments to units, which is achieved using allot_trts(), and
  • how the treatments are actually allocated to units via assign_trts().

These step may feel redundant in experiments where there is exactly one unit factor and one treatment factor but itā€™s required for further generalisation to other experimental structures. In addition, it serves to re-enforce the specific actions to the user.

des <- design("My first animal experiment") %>% 
  set_units(pig = 24) %>% 
  set_trts(diet = c("high-fat", "low-fat", "standard")) %>% 
  allot_trts(diet ~ pig) %>% 
  assign_trts(order = "random", seed = 1) 

Once a minimum viable experimental design has been specified, then you can generate the experimental design table (or tibble) called edibble by parsing the object to serve_table().

serve_table(des)
# My first animal experiment 
# An edibble: 24 x 2
       pig     diet
   <U(24)>   <T(3)>
     <chr>    <chr>
 1   pig01 high-fat
 2   pig02 low-fat 
 3   pig03 low-fat 
 4   pig04 low-fat 
 5   pig05 standard
 6   pig06 high-fat
 7   pig07 standard
 8   pig08 high-fat
 9   pig09 standard
10   pig10 standard
# ā„¹ 14 more rows

14.2.1 Treatment allotment

diet_design <- function(...) {
  design("A valid nested design") %>% 
    set_units(pen = 6,
              pig = nested_in(pen, 3)) %>% 
    set_trts(diet = c("high-fat", "low-fat", "standard")) %>% 
    allot_trts(...) %>% 
    assign_trts(order = "systematic") %>% 
    serve_table()
}

diet_design(diet ~ pig)
# A valid nested design 
# An edibble: 18 x 3
      pen     pig     diet
   <U(6)> <U(18)>   <T(3)>
    <chr>   <chr>    <chr>
 1   pen1   pig01 high-fat
 2   pen1   pig02 low-fat 
 3   pen1   pig03 standard
 4   pen2   pig04 high-fat
 5   pen2   pig05 low-fat 
 6   pen2   pig06 standard
 7   pen3   pig07 high-fat
 8   pen3   pig08 low-fat 
 9   pen3   pig09 standard
10   pen4   pig10 high-fat
11   pen4   pig11 low-fat 
12   pen4   pig12 standard
13   pen5   pig13 high-fat
14   pen5   pig14 low-fat 
15   pen5   pig15 standard
16   pen6   pig16 high-fat
17   pen6   pig17 low-fat 
18   pen6   pig18 standard
diet_design(diet ~ pen)
# A valid nested design 
# An edibble: 18 x 3
      pen     pig     diet
   <U(6)> <U(18)>   <T(3)>
    <chr>   <chr>    <chr>
 1   pen1   pig01 high-fat
 2   pen1   pig02 high-fat
 3   pen1   pig03 high-fat
 4   pen2   pig04 low-fat 
 5   pen2   pig05 low-fat 
 6   pen2   pig06 low-fat 
 7   pen3   pig07 standard
 8   pen3   pig08 standard
 9   pen3   pig09 standard
10   pen4   pig10 high-fat
11   pen4   pig11 high-fat
12   pen4   pig12 high-fat
13   pen5   pig13 low-fat 
14   pen5   pig14 low-fat 
15   pen5   pig15 low-fat 
16   pen6   pig16 standard
17   pen6   pig17 standard
18   pen6   pig18 standard

14.3 Setting records with expectations

The term ā€œrecordsā€, often abbreviated as rcrd in the edibble system, refers to any observational variables, including response variables. Setting records is a way to describe the intent of the observations that you will collect in the experiment. All records must be made on a unit. The primary way to set the record is using set_rcrds() where the LHS is the factor name of the record and the RHS is the unit factor on which the record will be measured. The LHS of the input in set_rcrds() follow the same pattern as set_units() and set_trts() as the LHS is the name of the new factor. In the example below, we are measuring weight and sex of the pig and recording the manager of each pen. In this example there are only 6 pens, therefore there should be only a maximum of 6 managers recorded.

desr <- diet_design(diet ~ pen) %>% 
  set_rcrds(weight = pig,
            sex = pig,
            manager = pen) 

desr
# A valid nested design 
# An edibble: 18 x 6
      pen     pig     diet  weight     sex manager
   <U(6)> <U(18)>   <T(3)> <R(18)> <R(18)>  <R(6)>
    <chr>   <chr>    <chr>   <dbl>   <dbl>   <dbl>
 1   pen1   pig01 high-fat       o       o       o
 2   pen1   pig02 high-fat       o       o       x
 3   pen1   pig03 high-fat       o       o       x
 4   pen2   pig04 low-fat        o       o       o
 5   pen2   pig05 low-fat        o       o       x
 6   pen2   pig06 low-fat        o       o       x
 7   pen3   pig07 standard       o       o       o
 8   pen3   pig08 standard       o       o       x
 9   pen3   pig09 standard       o       o       x
10   pen4   pig10 high-fat       o       o       o
11   pen4   pig11 high-fat       o       o       x
12   pen4   pig12 high-fat       o       o       x
13   pen5   pig13 low-fat        o       o       o
14   pen5   pig14 low-fat        o       o       x
15   pen5   pig15 low-fat        o       o       x
16   pen6   pig16 standard       o       o       o
17   pen6   pig17 standard       o       o       x
18   pen6   pig18 standard       o       o       x

Another equivalent way to set the record is using set_rcrds_of(). The format of the input is such that the unit factors are on the LHS and the RHS are character vectors of new record factor names. The reason that names are set as characters is because the factors in the RHS do not yet exist. In contrast, the RHS of the input in set_rcrds() is unquoted because those factors exist. The suffix _of in the function name is purposely chosen to signal that the pattern for the LHS no longer follows that of set_rcrds(), set_trts() and set_units() where the LHS is always a new factor name.

diet_design(diet ~ pen) %>% 
  set_rcrds_of(pig = c("weight", "sex"),
               pen = "manager")

You may choose to also set expecations of the record. For example, we assert below that the weight of a grown pig must be a minimum of 15 kg.

desre <- desr %>% 
  expect_rcrds(weight > 15,
               factor(sex, levels = c("F", "M")))

You might see that it is good principles to be explicit about the records but you may not feel motivated for such verbose coding. There are downstream benefits for this as explained in Sections @ref(simulate) and @ref(export).

14.4 Simulating records

dat <- desre %>% 
  simulate_rcrds(weight = sim_normal(~diet + pen + pig, sd = 25) %>% 
                   params("mean", 
                          diet = c("high-fat" = 300,
                                    "low-fat" = 50,
                                   "standard" = 150),
                          pen = c("pen4" = -270),
                          pig = rnorm(18)),
                 manager = sim_form(~pen) %>% 
                   params(pen = sample(rep(c("John", "Mary", "Jane"), 2))),
                 .seed = 1, .censor = NA)

dat
ggplot(dat, aes(pen, weight, color = diet)) + 
  geom_point(size = 3) 

14.5 Exporting the design

14.6 Aesthetic customisations

The edibble system offers you many levels of aesthetic customisations. These customisation are mostly frivolous and serve as means for users to customise elements visually to their liking. The level of aethetic customisation in the edibble system is probably of little value in the grand aim of constructing an experimental design, but if a user finds the developerā€™s default choice is not to their liking, at least they have the ability to modify it.

14.7 Visualisation

library(deggust)

14.8 Examples

Bailey (2008) Exercise 8.5

In a biotechnology experiment, plants were grown in 90 pots. Five different quantities of potassium were randomized to the pots, so that each quantity was applied to the soil in 18 pots. After each of 15, 30 and 45 days, one plant was randomly chosen from each pot and removed; nine small pieces were cut from it to be used for tissue culture. Three different levels of nutrition were each applied to three of the pieces of each plant removed at each time. After a certain further length of time, the number of plantlets growing in each piece of tissue was counted.

design("Bailey (2008) Exercise 8.5") %>% 
  set_units(pot = 90,
            plant = nested_in(pot, 3),
            piece = nested_in(plant, 9)) %>% 
  set_trts(pottasium = c("q1", "q2", "q3", "q4", "q5"), 
           nutrition = 3) %>% 
  allot_trts(pottasium ~ pot,
             nutrition ~ piece) %>% 
  assign_trts() %>% 
  set_rcrds(plantlets = piece) %>% 
  serve_table()
# Bailey (2008) Exercise 8.5 
# An edibble: 2,430 x 6
       pot    plant     piece pottasium  nutrition plantlets
   <U(90)> <U(270)>  <U(~2k)>    <T(5)>     <T(3)>  <R(~2k)>
     <chr>    <chr>     <chr>     <chr>      <chr>     <dbl>
 1   pot01 plant001 piece0001        q5 nutrition1         o
 2   pot01 plant001 piece0002        q5 nutrition3         o
 3   pot01 plant001 piece0003        q5 nutrition1         o
 4   pot01 plant001 piece0004        q5 nutrition2         o
 5   pot01 plant001 piece0005        q5 nutrition1         o
 6   pot01 plant001 piece0006        q5 nutrition3         o
 7   pot01 plant001 piece0007        q5 nutrition3         o
 8   pot01 plant001 piece0008        q5 nutrition2         o
 9   pot01 plant001 piece0009        q5 nutrition2         o
10   pot01 plant002 piece0010        q5 nutrition2         o
# ā„¹ 2,420 more rows

Bailey (2008) Exercise 9.6

A horticultural research station intends to investigate the effects of two treatment factors on the total weight of apples produced from apple trees. One treatment factor is the method of thinning; that is, removing fruitlets at an early stage of development so that those remaining will be able to grow larger. There are five methods of thinning, coded as A,B,C,D,E. The second treatment factor is the type of grass to grow around the base of the tree to prevent the growth of weeds. There are five types of grass, coded as a,b,c,d,e. It is assumed that there is no interaction between method of thinning and type of grass. There are 25 trees available for the experiment. They are arranged in a 5Ɨ5 rectangle. Construct a suitable design.

df <- design("Bailey (2008) Exercise 9.6") %>% 
  set_trts(thinning = c("A", "B", "C", "D", "E"),
           grass = c("a", "b", "c", "d", "e")) %>% 
  set_units(row = 5,
            col = 5,
            tree = crossed_by(row, col)) %>% 
  set_rcrds(apple_weight_total = tree) %>% 
  allot_table(thinning ~ tree,
              grass ~ tree)

design_anatomy(df)


Summary table of the decomposition for unit & trt (based on adjusted quantities)

 Source.unit df1 Source.trt     df2 aefficiency eefficiency order
 row           4 thinning#grass   4      0.5385      0.4667     2
 col           4 thinning#grass   4      0.5410      0.3333     3
 tree         16 thinning         4      1.0000      1.0000     1
                 grass            4      1.0000      1.0000     1
                 thinning#grass   5      0.5970      0.4000     4
                 Residual         3                              

Table of information (partially) aliased with previous sources derived from the same formula

 Source df Alias                    In  aefficiency eefficiency order
 grass  3  thinning                 trt      0.2667      0.1600     2
 grass  4  ## Information remaining trt      0.7241      0.6000     3

The design is not orthogonal
autoplot(df)

Bailey (2008) Exercise 9.6

A road safety organization wishes to compare four makes of car tyre. The organization has four test cars and four test drivers. One working day is needed to fit new tyres to a car, take it for an exhaustive test-drive, take relevant measurements on tyre treads, record all details of the test-drive, and prepare the car for the next session. The organization has only one week in which to perform its tests. To keep each car properly balanced, the organization has decided that all four tyres on a car at any one time should be of the same make. Construct a suitable design for this trial.

df <- design("Tyre test") %>%
  set_units(car = 4,
            driver = 4,
            day = 5,
            test_drive = crossed_by(day, car, driver),
            #tyre = nested_in(test_drive, 4)
            ) %>% 
  set_trts(make = 4) %>% 
  #set_rcrds(measure = tyre) %>% 
  allot_table(make ~ test_drive)

design_anatomy(df)


Summary table of the decomposition for unit & trt

 Source.unit df1 Source.trt df2 aefficiency eefficiency order
 car           3                                             
 driver        3                                             
 day           4                                             
 test_drive   69 make         3      1.0000      1.0000     1
                 Residual    66                              
autoplot(df)

Bailey (2008) Example 10.19

A manufacturer of household appliances wants to find the best combination of wash temperature and drying temperature to produce unwrinkled cotton sheets at the end of the laundry session. He wants to compare four different wash temperatures and three different drying temperatures. He uses eight similar washing machines and six similar dryers. First, 48 cotton sheets are randomly allocated to the washing machines, six per machine. The wash temperatures are randomly allocated to the washing machines so that two machines are run at each temperature. After the wash, the six sheets in each machine are randomly allocated to the dryers, one per dryer. Then the drying temperatures are randomly allocated to the dryers so that two machines are run at each temperature. After the drying, all 48 sheets are scored by experts for how wrinkled they are.

df <- design("Bailey (2008) Example 10.19") %>%
  set_trts(wash_temp = 4,
           dry_temp = 3) %>% 
  set_units(washer = 8,
            dryer = 6,
            sheet = 48) %>% 
  allot_units(washer ~ sheet,
              dryer ~ washer/sheet) %>% 
  allot_trts(wash_temp ~ washer,
              dry_temp ~ dryer) %>% 
  assign_trts("random") %>% 
  assign_units("random") %>% 
  set_rcrds(score = sheet) %>% 
  serve_table()

design_anatomy(df)
autoplot(df)

Bailey (2008) Example 10.20

An agronomy institute in Brazil conducted a factorial experiment on beans. The treatment factors were cultivar (there were three cultivars), seeding (either conventional or with no tillage), and molybdenum (either applied or not). The experimental area was divided into four blocks. Each block was a 3Ɨ4 rectangle. The three rows were called strips; cultivars were applied to these. The four columns, which were called quarters, were grouped into two halves. The different seeding methods were used in the two halves of each block, and the Mo (molybdenum) was applied to one quarter in each half.

df <- design("Bailey (2008) Example 10.20") %>% 
  set_trts(cultivar = 3,
           seeding = c("conventional", "no tillage"),
           molybdenum = c("applied", "none")) %>% 
  set_units(block = 4,
            strip = nested_in(block, 3),
            half = nested_in(block, 2),
            quarter = nested_in(half, 2),
            plot = nested_in(block, crossed_by(strip, quarter))) %>% 
  allot_table(cultivar ~ strip,
              seeding ~ half,
              molybdenum ~ quarter) 

df
# Bailey (2008) Example 10.20 
# An edibble: 48 x 8
    cultivar     seeding molybdenum  block   strip   half   quarter    plot
      <T(3)>      <T(2)>     <T(2)> <U(4)> <U(12)> <U(8)>   <U(16)> <U(48)>
       <chr>       <chr>      <chr>  <chr>   <chr>  <chr>     <chr>   <chr>
 1 cultivar3 no tillageā€¦    applied block1 strip01  half1 quarter01  plot01
 2 cultivar1 no tillageā€¦    applied block1 strip02  half1 quarter01  plot02
 3 cultivar2 no tillageā€¦    applied block1 strip03  half1 quarter01  plot03
 4 cultivar3 no tillageā€¦    none    block1 strip01  half1 quarter02  plot04
 5 cultivar1 no tillageā€¦    none    block1 strip02  half1 quarter02  plot05
 6 cultivar2 no tillageā€¦    none    block1 strip03  half1 quarter02  plot06
 7 cultivar3 conventionā€¦    none    block1 strip01  half2 quarter03  plot07
 8 cultivar1 conventionā€¦    none    block1 strip02  half2 quarter03  plot08
 9 cultivar2 conventionā€¦    none    block1 strip03  half2 quarter03  plot09
10 cultivar3 conventionā€¦    applied block1 strip01  half2 quarter04  plot10
# ā„¹ 38 more rows
design_anatomy(df)
Warning in pstructure.formula(formulae[[1]], keep.order = keep.order, grandMean
= grandMean, : block:half is aliased with previous terms in the formula and has
been removed


Summary table of the decomposition for unit & trt (based on adjusted quantities)

 Source.unit         df1 Source.trt                  df2 aefficiency
 block                 3                                            
 half[block]           4 seeding                       1      1.0000
                         Residual                      3            
 strip[block]          8 cultivar                      2      1.0000
                         Residual                      6            
 quarter[block:half]   8 molybdenum                    1      1.0000
                         seeding#molybdenum            1      1.0000
                         Residual                      6            
 plot[block]          24 cultivar#seeding              2      1.0000
                         cultivar#molybdenum           2      1.0000
                         cultivar#seeding#molybdenum   2      1.0000
                         Residual                     18            
 eefficiency order
                  
      1.0000     1
                  
      1.0000     1
                  
      1.0000     1
      1.0000     1
                  
      1.0000     1
      1.0000     1
      1.0000     1
                  

Table of information (partially) aliased with previous sources derived from the same formula

 Source      df Alias       In   aefficiency eefficiency order
 half[block] 4  half[block] unit      1.0000      1.0000     1
 half[block] 0  ## Aliased  unit      1.0000      1.0000     1

Bailey (2008) Exercise 10.5

In an experiment into the digestibility of stubble, four feed treatments are to be applied to sheep. There are 16 sheep, in four rooms of four animals each. There are four test periods of four weeks each, separated by two-week recovery periods. Each sheep is to be fed all treatments, one in each test period. During the recovery periods all animals will receive their usual feed, so that they will return to normal conditions before being subjected to a new treatment.

des <- design("Bailey (2008) Exercise 10.5") %>% 
  set_trts(feed = 4) %>% 
  set_units(room = 4,
            sheep = nested_in(room, 4),
            period = 4,
            unit = crossed_by(period, sheep)) %>% 
  allot_table(feed ~ unit)

# anatomy(des)
autoplot(des)

Bailey (2008) Exercise 10.6

An experiment was carried out to find out if so-called ā€˜non-herbicidalā€™ pesticides affect photosynthesis in plants. Six pesticides were compared: diuron, carbofuran, chlorpyrifos, tributyltin chloride, phorate and fonofos. Each of these was dissolved in water at five different concentrations. In addition, plain water was used as a control treatment. Two petri dishes were used for each treatment. Each petri dish was filled with pesticide solution or water. Five freshly cut leaves from mung beans were floated on the surface ofthe solution in each dish. After two hours, the chlorophyll fluorescence of each leaf was measured.

design("Bailey (2008) Exercise 10.6") %>% 
  set_trts(pesticide = c("diuron", "carbofuran", "chlorpyrifos", "tributyltin chloride", 
                         "phorate", "fonofos"),
           concentration = 5) %>% 
  add_trts(pesticide = "none", concentration = NA) %>% 
  set_units(petri = 2 * ntrts())
# emylyn's design
library(edibble)
des1 <- design("sensory evaluation") %>% 
  set_units(day = 4,
            consumer = nested_in(day, 42)) %>% 
  set_trts(cover_story = c("yes", "no"),
           test_half = c("triangle", "paired"),
           test_first = c("triangle/paired", "monadic")) %>% 
  allot_table(test_half:cover_story ~ day,
                         test_first ~ consumer, 
              order = c("systematic", "random"))

library(tidyverse)

des1 %>% 
    filter(test_half == "triangle") %>% 
    select(day, consumer) %>% 
    # do the next design
    restart_design("triangle test") %>% 
    set_units(day, consumer) %>% 
    allot_units(consumer ~ nested_in(day)) %>% 
    # finished described relationship - new design variables
    edbl_design() %>% 
    set_units(test = nested_in(consumer, 2), 
              product = nested_in(test, 3)) %>% 
    set_trts(spiked = c("1/3", "2/3"),
             pattern = c("XXO", "XOX", "OXX")) %>% 
    allot_table(spiked ~ test, 
                pattern ~ product)