Current state and prospects of R-packages for the design of experiments
Abstract
Re-running an experiment is generally costly and, in some cases, impossible due to limited resources; therefore, the design of an experiment plays a critical role in increasing the quality of experimental data. In this paper, we describe the current state of R-packages for the design of experiments through an exploratory data analysis of package downloads, package metadata, and a comparison of characteristics with other topics. We observed that experimental designs in practice appear to be adequately generated by a small number of packages, and that the development of experimental designs often occurs in silos. We also discuss the interface designs of widely used R packages in the field of experimental design and their future prospects for advancing the field in practice.
Keywords: experimental design, CRAN task view, interface design.
1 Introduction
The critical role of data collection is well captured in the expression “garbage in, garbage out” – in other words, if the collected data are rubbish, then no analysis, however complex it may be, can make something out of it. Therefore, a carefully crafted data-collection scheme is critical for optimizing the information from the data. The field of experimental design is specifically devoted to planning the collection of experimental data, largely based on the founding principles of Fisher (1935) or an optimization framework like those described in Pukelsheim (2006). These experimental designs are often constructed with the aid of statistical software such as R (R Core Team 2021), Python (Rossum 1995), and SAS (SAS Institute 1985); thus the use of experimental design software can inform us about some aspects of experimental designs in practice.
Methods for data collection can be dichotomized by the type of data collected – namely, experimental or observational – or alternatively, categorized as experimental design (including quasi-experimental design) or survey design. This dichotomization is, to a great extent, reflected in the Comprehensive R Archive Network (CRAN) task views (volunteer-maintained lists of R-packages by topic), where R-packages for experimental design are in the ExperimentalDesign task view and R-packages for survey design are in the OfficialStatistics task view. A full list of available topics is provided in Table S1 in the Supplementary Materials. A subset of experimental designs is separated into the ClinicalTrials task view, where the focus is on clinical trials with primary interest in sample size calculations. This paper focuses on packages in the ExperimentalDesign task view, henceforth referred to as “DoE packages”.
The ExperimentalDesign task view lists 105 R packages for experimental design and the analysis of data from experiments. The sheer quantity and variety of experimental designs in these R-packages are arguably unmatched by any other programming language; in Python, for example, only a handful of packages generate experimental designs (namely pyDOE, pyDOE2, dexpy, experimenter, and GPdoemd), and these cover a limited range of designs. Thus, the study of DoE packages, based on quantitative and qualitative data, can provide an objective view of the state of current experimental designs in practice.
The utility of the software can also be described by how well its design facilitates the clear expression and interpretation of the desired experimental design. Certain programming language designs can hinder or discourage the development of reliable programs (Wasserman 1975). The immense popularity of the tidyverse (Wickham et al. 2019), a collection of R-packages for various stages of data analysis that places enormous emphasis on interface design, is a testament to the impact that an interface design can have in practice. The practice of experimental design can be advanced by adopting similar interface design principles across the DoE packages.
The remainder of this paper is organized as follows. Section 2 briefly describes the data sources used for the analysis; Section 3 presents some insights into the current state of the DoE packages through an exploratory data analysis of package download data, text descriptions, and comparisons with other CRAN task views; Section 4 discusses the interface designs of widely used DoE packages; and Section 5 concludes with a discussion of future prospects in the software development of experimental designs.
2 Data
To study the DoE packages, we analyse data from the three sources described below.
2.1 RStudio CRAN download logs
The Comprehensive R Archive Network (CRAN) is a network of servers located across the world that store mirrored copies of R and R packages. The most popular server is the RStudio mirror (the default server for those who use the RStudio IDE). The RStudio mirror is also the only server that provides comprehensive daily download logs of R and R packages, available since October 2012. The summary data can be easily accessed using the cranlogs package (Csárdi 2019). This paper uses the data from the beginning of 2013 to the end of 2021 (a total of nine years) for the packages in the CRAN task views.
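As a minimal sketch of how such download summaries might be retrieved, the R snippet below wraps `cranlogs::cran_downloads()` and sums the daily counts by calendar year. The helper names `fetch_daily_downloads()` and `yearly_totals()` and the example package names are illustrative assumptions rather than the analysis code behind this paper; the fetch step needs the cranlogs package and an internet connection.

```r
# Illustrative helpers (hypothetical names); not the authors' analysis code.

# Fetch daily download counts from the RStudio mirror logs.
# Requires the cranlogs package and network access when called.
fetch_daily_downloads <- function(pkgs,
                                  from = "2013-01-01",
                                  to = "2021-12-31") {
  cranlogs::cran_downloads(packages = pkgs, from = from, to = to)
}

# Sum daily counts into one row per package and calendar year.
# Expects columns `date` (Date), `count` (numeric) and `package`,
# the layout cran_downloads() returns when package names are supplied.
yearly_totals <- function(daily) {
  daily$year <- format(daily$date, "%Y")
  aggregate(count ~ package + year, data = daily, FUN = sum)
}

# Example (online only), with package names chosen purely for illustration:
# daily <- fetch_daily_downloads(c("AlgDesign", "agricolae", "lhs"))
# head(yearly_totals(daily))
```

Keeping the aggregation separate from the fetch means the yearly summary can be checked on a small hand-made data frame without touching the network.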
2.2 Package descriptions
All CRAN packages have a title, description, package connections (suggests, depends, and imports of other packages), and other meta-information in the DESCRIPTION file. We use text data from the title and description (accessed on 2022-12-12).
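One way to obtain the title and description fields, sketched here under the assumption that the current CRAN metadata table is an acceptable stand-in for the snapshot used in the paper, is `tools::CRAN_package_db()`. The helper names `fetch_package_text()` and `clean_text()` are hypothetical.

```r
# Hypothetical helper names; a sketch, not the authors' code.

# Download the CRAN package metadata table (needs network access when called).
# tools::CRAN_package_db() returns one row per package, with DESCRIPTION
# fields such as Package, Title and Description as columns.
fetch_package_text <- function(pkgs) {
  db <- tools::CRAN_package_db()
  db[db$Package %in% pkgs, c("Package", "Title", "Description")]
}

# Collapse the line breaks and repeated spaces that DESCRIPTION fields
# often contain, so the text is easier to tokenize afterwards.
clean_text <- function(x) {
  trimws(gsub("[[:space:]]+", " ", x))
}

# Example (online only):
# txt <- fetch_package_text(c("AlgDesign", "agricolae"))
# txt$Description <- clean_text(txt$Description)
```

Because the table reflects CRAN at the time of the call, results retrieved today may differ from those accessed on the date stated above.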
2.3 CRAN task views
CRAN task views are volunteer-maintained lists of R-packages on CRAN relevant to the corresponding topic. There were 39 CRAN task views in total. Table S1 in the Supplementary Materials lists the available topics, obtained from the ctv package (Zeileis 2005).
The list of packages in each CRAN task view (as of 2022-12-12) is used to contrast the characteristics of the DoE packages.
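The package list for a task view can be pulled from the objects returned by `ctv::available.views()`. The sketch below assumes that each view object carries a `name` string and a `packagelist` data frame with a `name` column; `view_packages()` is a hypothetical helper, and the number of views returned today may differ from the count reported above.

```r
# Extract the package names for one task view from a list of view objects.
# Assumes each element has a `name` string and a `packagelist` data frame
# with a `name` column (an assumption about the ctv object layout).
view_packages <- function(views, view = "ExperimentalDesign") {
  view_names <- vapply(views, function(v) v$name, character(1))
  hit <- which(view_names == view)
  if (length(hit) == 0) stop("Task view not found: ", view)
  as.character(views[[hit[1]]]$packagelist$name)
}

# Example (online only; requires the ctv package):
# views <- ctv::available.views(repos = "https://cloud.r-project.org")
# length(views)                     # number of task views currently listed
# doe_pkgs <- view_packages(views)  # packages in ExperimentalDesign
```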
3 Exploratory data analysis
In this section, we derive some conjectures based on an analysis of the data described in Section 2. All results presented are from exploratory data analysis of observational data; consequently, all interpretations are somewhat speculative and may not be indicative of the true state of the field of experimental design. In particular, any analysis over time is confounded by the fact that the nature of users and package management has changed over the years. It should be noted that some DoE packages may have been archived or removed from the task view over the years; therefore, any cross-sectional analysis presented may not reflect the set of all DoE packages at that particular time period (although we assume such instances are few).
A subset of DoE packages is not primarily about the design of experiments but about the analysis of experimental data. A complete delineation of these packages is difficult, as there is almost always at least one function that can aid decisions about, or the construction of, experimental designs (and any categorization is prone to our subjective bias); therefore, we opted not to remove any DoE packages from the analysis.
3.1 A small, but diverse, set of packages is sufficient for most experimental designs in practice
There have been at least 50 DoE packages available since 2013, but most of the downloads are concentrated in only a handful of them. For example, Figure 3.1 shows a Lorenz curve (Lorenz 1905) for the total package downloads in 2021 for 102 DoE packages (first released prior to 2021). We can see from Figure 3.1 that the bottom 90% of DoE packages (in terms of total download count in 2021) account for only approximately 32% of total downloads across all DoE packages; in other words, 68% of the total downloads are due to 10 packages (10% of the DoE packages).
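To make the reading of the Lorenz curve concrete, the following base R sketch computes the curve and the share of downloads held by the bottom proportion of packages. The function names and the download counts are made up for illustration and are not the data behind Figure 3.1.

```r
# Points of the Lorenz curve: cumulative share of packages (x) against
# cumulative share of downloads (y), with packages sorted from least to
# most downloaded.
lorenz_curve <- function(downloads) {
  x <- sort(downloads)
  data.frame(
    pkg_share = seq_along(x) / length(x),
    dl_share = cumsum(x) / sum(x)
  )
}

# Share of total downloads held by the bottom proportion `p` of packages,
# i.e. the height of the Lorenz curve at `p`.
lorenz_share <- function(downloads, p) {
  x <- sort(downloads)
  k <- round(p * length(x))  # number of packages in the bottom group
  sum(x[seq_len(k)]) / sum(x)
}

# Synthetic counts for ten hypothetical packages (not data from the paper).
downloads <- c(5, 5, 10, 10, 10, 20, 20, 20, 100, 800)

bottom90 <- lorenz_share(downloads, 0.9)  # 200 / 1000
top10 <- 1 - bottom90                     # share held by the top package

# plot(lorenz_curve(downloads), type = "s")  # optional quick look
```

In this toy example the bottom nine packages hold 20% of downloads, so the single most downloaded package accounts for the remaining 80%; the statement about 32% and 68% above is read off the curve in the same way.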