STAT1003 – Statistical Techniques
Dr. Emi Tanaka
Australian National University
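- A file path can be relative (e.g. data/data.csv) or absolute (e.g. C:\\user/myproject/data.csv).
- You should avoid using absolute paths! Why?
- You can get and set the working directory using getwd() and setwd(), respectively.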
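A minimal sketch of why relative paths travel better than absolute ones, using only base R. The path data/data.csv is the illustrative one from above; the file does not need to exist for this to run.

```r
# The working directory is where R resolves relative paths from.
wd <- getwd()

# Build a relative path; file.path() joins components with "/" on every platform.
rel_path <- file.path("data", "data.csv")

# The same relative path resolves to a different absolute path on each machine,
# which is why a hard-coded absolute path breaks when the project moves.
abs_path <- file.path(wd, rel_path)

# setwd() changes the working directory and invisibly returns the previous one,
# so the change can be undone afterwards.
old_wd <- setwd(tempdir())
setwd(old_wd)
```

Because only the relative part is written in the script, the code keeps working when the project folder is copied to another computer.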
- Data may be stored as R data files (.RData, .rda or .rds).
- .RData, .rda or .rds files save R objects, so the data does not need to be in a data.frame.

data/template_Morris.xlsx

In RStudio Desktop, you can click on a file to import it via the GUI.


Unless you are responsible for entering the data, you should never modify the original, stored data (note: exceptions do apply).
- Use readr::read_csv() and readr::write_csv() to read and write CSV files.
- Use readxl::read_xlsx() to read Excel files.
- Save a single R object using saveRDS() (recommended) and multiple objects using save().
- Read saved R objects using readRDS() or load().
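The difference between saveRDS()/readRDS() and save()/load() can be sketched with base R alone. The data frame, the lookup vector and the file names are made up for illustration, and everything is written to a temporary folder.

```r
# A small example data frame and vector (made-up values for illustration only).
df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
lookup <- c(a = "low", b = "high")

# Write to a temporary folder so no project files are touched.
rds_file <- file.path(tempdir(), "df.rds")
rda_file <- file.path(tempdir(), "objects.rda")

# saveRDS() stores one object; readRDS() returns it, so you choose its new name.
saveRDS(df, rds_file)
df_restored <- readRDS(rds_file)

# save() stores several objects by name; load() restores them under those
# same names and returns the names it restored.
save(df, lookup, file = rda_file)
rm(df, lookup)
loaded_names <- load(rda_file)
```

One reason saveRDS() is often preferred is visible here: readRDS() never overwrites an existing object silently, whereas load() restores objects under whatever names they were saved with.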

Quarto integrates text + code in one source document with ability to render to many output formats (via Pandoc), e.g. docx, pdf or html.
R Markdown

Quarto

There are so many possible output formats you can create with Quarto, including but not limited to:
Primary languages supported:
But Quarto includes engines for many more languages!
These HTML slides are made using Quarto.
These dynamic reports are made using Quarto.
This PhD thesis (online and pdf) is made using Quarto.
Available at https://thesis.patrickli.org/
Quarto via knitr/jupyter: qmd → md
Pandoc: md → html, pdf, docx
- Boolean values are written as true or false (all lowercase) in YAML (not TRUE or FALSE like in R).
- If a value contains special characters (e.g. :, #, -), it should be enclosed in quotes.

In Quarto documents, YAML metadata is usually placed at the very top of the document, enclosed by triple dashes ---.
- title: the title of the document
- subtitle: the subtitle of the document
- author: the author of the document
- date: the date of the document
- abstract: a brief summary of the document
- format: the output format of the document (e.g., html, pdf, docx)

You can find available keys by format at https://quarto.org/docs/reference/
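As a sketch, a YAML header using the keys above might look like the following. The title, author, date and abstract values are hypothetical placeholders, and toc is one further Quarto key added here only to show a boolean.

```yaml
---
title: "Penguins: a short report"   # quoted because the value contains a colon
subtitle: An example document
author: Jane Doe                    # hypothetical author
date: 2025-01-01                    # hypothetical date
abstract: A brief summary of the document.
format: html
toc: true                           # booleans are lowercase in YAML
---
```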
A value can span multiple lines in two ways:
- | to preserve line breaks.
- > to fold lines (line breaks become spaces).

RStudio > Help > Markdown Quick Reference
- Code chunks are fenced by ``` with the language specified after the opening backticks.
- Common chunk options:
  - label: label for the chunk (for cross-referencing)
  - eval: whether to evaluate the code (true or false)
  - echo: whether to show the code in the output (true or false)
  - fig-width: width of the figure (in inches)
  - fig-height: height of the figure (in inches)
  - fig-cap: caption for the figure

See more options for the knitr engine here.
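A minimal sketch of a chunk body that uses the options listed above. In a Quarto document these lines would sit inside an `{r}` code chunk; Quarto reads the `#|` lines as chunk options, while R treats them as ordinary comments. The label and caption are hypothetical.

```r
#| label: fig-weight-hist
#| eval: true
#| echo: false
#| fig-width: 6
#| fig-height: 4
#| fig-cap: "Distribution of chick weights in the ChickWeight dataset."

# hist() draws the figure and invisibly returns the binned counts.
weight_hist <- hist(ChickWeight$weight, main = NULL, xlab = "Weight (g)")
```

Because the label starts with fig-, the rendered figure could then be cross-referenced in the text as @fig-weight-hist.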
Chunks are not limited to R; for example, python.

The following languages are supported by knitr:
asis, asy, awk, bash, block, block2, bslib, c, cat, cc, coffee, comment, css, ditaa, dot, embed, eviews, exec, fortran, fortran95, gawk, glue, glue_sql, gluesql, go, groovy, haskell, highlight, js, julia, lein, mermaid, mysql, node, octave, ojs, perl, php, psql, python, r, rcpp, rscript, ruby, sas, sass, scala, scss, sed, sh, sql, stan, stata, targets, tikz, verbatim, webr, zsh

`r some_r_code()`
The number of observations in the ChickWeight dataset is 578.
The value of \(\pi\) is 3.1415927.
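This inline syntax requires engine: knitr.

- References are stored in .bib files.
- BibTeX entries for R packages can be obtained with the citation function.

To cite ggplot2 in publications, please use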
H. Wickham. ggplot2: Elegant Graphics for Data
Analysis. Springer-Verlag New York, 2016.
A BibTeX entry for LaTeX users is
@Book{,
author = {Hadley Wickham},
title = {ggplot2: Elegant Graphics for Data Analysis},
publisher = {Springer-Verlag New York},
year = {2016},
isbn = {978-3-319-24277-4},
url = {https://ggplot2.tidyverse.org},
}
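As a sketch of how such an entry can be produced and stored, the base R functions citation() and toBibtex() can write a .bib file. The example cites R itself so that no extra package is needed, and the file is written to a temporary folder.

```r
# citation() with no arguments returns the citation for R itself;
# citation("pkg") does the same for an installed package.
r_cit <- citation()

# toBibtex() converts the citation into lines of BibTeX.
r_bib <- toBibtex(r_cit)

# Write the entry to a .bib file (a temporary location here).
bib_file <- file.path(tempdir(), "ref.bib")
writeLines(r_bib, bib_file)
```

The generated entry has an empty citation key, so a key still has to be typed in by hand after the opening brace before the entry can be cited.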
You can cite references like:
- [@key] or @key, where key is the citation key defined in the .bib file.

ref.bib
@Book{ggplot2,
author = {Hadley Wickham},
title = {ggplot2: Elegant Graphics for Data Analysis},
publisher = {Springer-Verlag New York},
year = {2016},
isbn = {978-3-319-24277-4},
url = {https://ggplot2.tidyverse.org},
}
@Manual{rstats,
title = {R: A Language and Environment for Statistical Computing},
author = {{R Core Team}},
organization = {R Foundation for Statistical Computing},
address = {Vienna, Austria},
year = {2025},
url = {https://www.R-project.org/},
}

The chunk label with prefix fig- can be referenced in text as a figure.
The body mass distribution of penguins is shown in Figure 2.
- This example uses the ggplot2 package to create a figure, but you can use base R plotting functions or other packages.

The chunk label with prefix tbl- can be referenced in text as a table.
| species | Mean | SD | N |
|---|---|---|---|
| Adelie | 3700.7 | 458.6 | 152 |
| Gentoo | 5076.0 | 504.1 | 124 |
| Chinstrap | 3733.1 | 384.3 | 68 |
- This example uses the knitr::kable() function to create a table.
- Other packages for creating tables include gt, flextable, kableExtra, etc.

If you have enabled numbering of sections:
then you can refer to them by their label prefixed by sec-.
Use Quarto to weave together text, code, and output (figures, tables, etc.) in a single document and render it to various output formats (HTML, PDF, Word, etc.).





What should have been submitted:

Literate programming is a programming paradigm introduced by Donald Knuth that emphasises writing code for humans (i.e. intertwining code with natural language explanations).
Tidy data
Tools
… a statistical value chain is constructed by defining a number of meaningful intermediate data products, for which a chosen set of quality attributes are well described …
– van der Loo & de Jonge (2018)
A suggested folder structure for data projects:
project-root-folder/ # Root of the project folder
│
├── README.md # README file
│
├── data/ # Raw and derived data
│ ├── data-raw/ # Read-only files
│ ├── data-input/ # Extracted and coerced from raw data
│ ├── data-valid/ # Edited and imputed from input data
│ └── data-stats/ # Analysed results (R objects, .csv, etc.)
│
├── analysis/ # Scripts (not functions) to run analysis
│
├── figures/ # Figures (.png, .pdf, etc.)
│
├── misc/ # Misc
│
├── report.qmd # Report, paper, or thesis output
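The structure above can be created from R in a few lines. This is only a sketch: it builds the folders inside a temporary directory, and the root folder name is the placeholder from the diagram.

```r
# Create the suggested structure inside a temporary folder for illustration.
root <- file.path(tempdir(), "project-root-folder")

folders <- c(
  file.path("data", "data-raw"),
  file.path("data", "data-input"),
  file.path("data", "data-valid"),
  file.path("data", "data-stats"),
  "analysis",
  "figures",
  "misc"
)

# recursive = TRUE also creates the parent folders (root and data/).
for (folder in folders) {
  dir.create(file.path(root, folder), recursive = TRUE, showWarnings = FALSE)
}

# Create empty placeholder files for the README and the report.
file.create(file.path(root, c("README.md", "report.qmd")))
```

For a real project, replace tempdir() with the location where the project should live.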
via Quarto Pubs
Happy writing and sharing 😊
