
Lessons from the Tidyverse
Australian National University
28th October 2025






'data.frame': 1420 obs. of 13 variables:
$ gen : Factor w/ 9 levels "Danja","Gungurru",..: 3 2 4 2 2 8 1 2 1 4 ...
$ site : Factor w/ 11 levels "S01","S02","S03",..: 1 1 1 1 1 1 1 1 1 1 ...
$ rep : Factor w/ 3 levels "R1","R2","R3": 1 1 1 1 1 1 1 1 1 1 ...
$ rate : int 40 60 40 50 40 40 10 10 60 10 ...
$ row : int 1 2 3 4 5 6 7 8 9 10 ...
$ col : int 1 1 1 1 1 1 1 1 1 1 ...
$ serp : Factor w/ 4 levels "SE1","SE2","SE3",..: 1 2 3 4 1 2 3 4 1 2 ...
$ linrow : num -0.75 -0.65 -0.55 -0.45 -0.35 -0.25 -0.15 -0.05 0.05 0.15 ...
$ lincol : num -2.5 -2.5 -2.5 -2.5 -2.5 -2.5 -2.5 -2.5 -2.5 -2.5 ...
$ linrate: num -0.169 1.831 -0.169 0.831 -0.169 ...
$ yield : num 0.617 1.194 1.099 0.941 0.983 ...
$ year : int 91 91 91 91 91 91 91 91 91 91 ...
$ loc : Factor w/ 8 levels "Badgingerra",..: 1 1 1 1 1 1 1 1 1 1 ...
Base R
Subset the data to the year 1991
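A minimal base R sketch of this step. The data frame `lupin_toy` is a hypothetical stand-in that only borrows column names from the structure shown above. Note that `year` is stored as a two-digit integer, so 1991 is `91`.

```r
# Hypothetical toy data; only the column names come from the talk's data set
lupin_toy <- data.frame(
  gen   = c("Danja", "Merrit", "Danja", "Merrit"),
  loc   = c("MtBarker", "MtBarker", "Corrigin", "Corrigin"),
  year  = c(91L, 91L, 92L, 92L),
  yield = c(1.5, 1.7, 1.3, 1.2)
)

# Logical row index: keep rows where year equals 91, keep all columns
lupin_91 <- lupin_toy[lupin_toy$year == 91, ]

# Equivalent result here, using subset()
lupin_91_alt <- subset(lupin_toy, year == 91)
```

The two forms differ when `year` contains `NA`: logical indexing returns an all-`NA` row for each missing value, whereas `subset()` drops those rows.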
Base R
Select the columns gen, loc, and yield
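One way this could look in base R, again on a hypothetical toy data frame (`lupin_toy` and its values are made up; only the column names follow the talk's data):

```r
# Hypothetical toy data; only the column names come from the talk's data set
lupin_toy <- data.frame(
  gen   = c("Danja", "Merrit", "Danja"),
  loc   = c("MtBarker", "MtBarker", "Corrigin"),
  rate  = c(10L, 20L, 10L),
  year  = c(91L, 91L, 91L),
  yield = c(1.5, 1.7, 1.3)
)

# Column selection by name: a character vector in the column position of `[`
lupin_sel <- lupin_toy[, c("gen", "loc", "yield")]
```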
Base R
Multiply yield by 10 to convert t/ha to units of 100 kg/ha
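A sketch of the rescaling in base R, using the slide's factor of 10. The data frame `lupin_toy` is hypothetical.

```r
# Hypothetical toy data; only the column names come from the talk's data set
lupin_toy <- data.frame(
  gen   = c("Danja", "Merrit", "Danja"),
  loc   = c("MtBarker", "MtBarker", "Corrigin"),
  yield = c(1.5, 1.7, 1.3)
)

# Overwrite the column in place with the rescaled values (factor 10, as on the slide)
lupin_toy$yield <- lupin_toy$yield * 10
```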
Base R
Get the mean yield by genotype and location
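A possible base R version of the grouped mean, using `aggregate()` on a hypothetical toy data frame. This is a sketch, not necessarily the code shown in the talk.

```r
# Hypothetical toy data; only the column names come from the talk's data set
lupin_toy <- data.frame(
  gen   = c("Danja", "Danja", "Merrit", "Merrit", "Danja", "Merrit"),
  loc   = c("MtBarker", "MtBarker", "MtBarker", "MtBarker", "Corrigin", "Corrigin"),
  yield = c(15, 17, 16, 18, 13, NA)
)

# Data-frame interface: one output row per observed gen x loc combination.
# mean() returns NA for any group that contains a missing yield.
means <- aggregate(
  lupin_toy["yield"],
  by  = lupin_toy[c("gen", "loc")],
  FUN = mean
)
```

The formula interface, `aggregate(yield ~ gen + loc, data = lupin_toy, FUN = mean)`, drops rows with a missing yield before grouping (its default is `na.action = na.omit`), so a group whose only yields are missing disappears from the result instead of appearing with an `NA` mean.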
Base R
Arrange results by descending mean yield
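Sorting in base R is typically done by indexing rows with `order()`. A small sketch on hypothetical values (`means` here is made up for illustration):

```r
# Hypothetical summary table of mean yields
means <- data.frame(
  gen   = c("Danja", "Merrit", "Danja", "Merrit"),
  loc   = c("Corrigin", "Corrigin", "MtBarker", "MtBarker"),
  yield = c(13, NA, 16, 17)
)

# order() returns the row permutation; with the default na.last = TRUE,
# missing means are placed at the end even when decreasing = TRUE
means_sorted <- means[order(means$yield, decreasing = TRUE), ]

# Reset row names so they run 1, 2, 3, ... in the new order
rownames(means_sorted) <- NULL
```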
library(dplyr) (Tidyverse)
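For comparison, the same five steps could be written as a single dplyr pipeline. This is a sketch under assumptions: `lupin_toy` is a hypothetical toy data frame, the native pipe `|>` needs R 4.1 or later, and dplyr must be installed. Because the data are made up, the printed result will not match the table from the talk.

```r
# install.packages("dplyr")  # once, if the package is not already installed
library(dplyr)

# Hypothetical toy data; only the column names come from the talk's data set
lupin_toy <- data.frame(
  gen   = c("Danja", "Danja", "Merrit", "Merrit", "Danja", "Merrit", "Danja"),
  loc   = c("MtBarker", "MtBarker", "MtBarker", "MtBarker",
            "Corrigin", "Corrigin", "MtBarker"),
  year  = c(91L, 91L, 91L, 91L, 91L, 91L, 92L),
  yield = c(1.5, 1.7, 1.6, 1.8, 1.3, NA, 9.9)
)

result <- lupin_toy |>
  filter(year == 91) |>                                  # subset to 1991
  select(gen, loc, yield) |>                             # keep three columns
  mutate(yield = yield * 10) |>                          # rescale yield
  group_by(gen, loc) |>                                  # one group per gen x loc
  summarise(yield = mean(yield), .groups = "drop") |>    # group means
  arrange(desc(yield))                                   # highest first, NA last

result
```

Each verb names one of the tasks above, so the pipeline reads top to bottom in the same order as the task list.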
         gen         loc     yield
 1    Merrit    MtBarker 17.175571
 2  Gungurru    MtBarker 15.887556
 3    Warrah    MtBarker 15.725571
 4     Danja    MtBarker 14.991444
 5    Yorrel   Newdegate 14.598500
 6   Unicrop    MtBarker 14.170000
 7     Danja    Corrigin 13.582083
 8    Yandee    Corrigin 13.184667
 9 Illyarrie    MtBarker 13.078857
10  Gungurru    Corrigin 12.657417
11     Danja   Newdegate 12.476917
12   Unicrop    Corrigin 12.067000
13    Yorrel    MtBarker 11.992286
14    Yorrel    Corrigin 11.767167
15    Merrit   Newdegate 11.762583
16 Illyarrie    Corrigin 11.655583
17    Warrah    Corrigin 10.219583
18     Danja WonganHills 10.051417
19 Illyarrie   Newdegate  9.702250
20    Yandee    MtBarker  9.593091
21    Yorrel WonganHills  9.233667
22    Yandee   Newdegate  8.911750
23  Gungurru WonganHills  8.591333
24   Unicrop WonganHills  8.044667
25 Illyarrie WonganHills  8.022750
26    Merrit Badgingerra  6.997583
27    Warrah   Newdegate  6.976417
28  Gungurru Badgingerra  6.601833
29     Danja Badgingerra  6.348167
30    Yandee Badgingerra  5.981750
31    Warrah Badgingerra  5.864500
32 Illyarrie Badgingerra  5.545417
33   Unicrop Badgingerra  4.624083
34    Yorrel Badgingerra  4.211500
35    Merrit    Corrigin        NA
36  Gungurru   Newdegate        NA
37   Unicrop   Newdegate        NA
38    Merrit WonganHills        NA
39    Yandee WonganHills        NA
40    Warrah WonganHills        NA
Base R
dplyr: you need a tiny bit more effort (install and load the package)

dplyr for data wrangling in R
“[Tidyverse’s] primary goal is to facilitate the conversation that a human has with a dataset, and we want to help dig a ‘pit of success’ where the least-effort path trends towards a positive outcome. The primary tool to dig the pit is API design: by carefully considering the external interface to a function, we can help guide the user towards success.”
Tanaka (2025) Australian & New Zealand Journal of Statistics (to appear)
“While Tidyverse has been lauded for adopting a user-centered design, arguably some elements of the design focus on the work domain instead of the end-user.”


Skill-based behaviour
Automatic actions performed with little conscious thought

Rule-based behaviour
Actions guided by stored rules or procedures

Knowledge-based behaviour
Actions that require conscious problem solving and decision making

Tanaka (2025) Examining the Interface Design of Tidyverse. ANZJS (to appear) https://arxiv.org/abs/2510.10382
“We recommend that developers adopt an iterative design that is informed by user feedback, analysis and complete coverage of the work domain, and ensure perceptual visibility of system constraints and relationships.”

Emi Tanaka - Toward a unified system for crop analytics: Lessons from the Tidyverse