STAT1003 – Statistical Techniques
Dr. Emi Tanaka
Australian National University


[Figure labels: one function; one complete plot type; the number of plots that can be drawn; the number of plot functions]

ggplot2 R package
The ggplot2 R package is part of the tidyverse suite of R packages.
ggplot2 is widely used by the scientific community and even by news outlets (e.g. Financial Times and BBC).
Wilkinson (2005) introduced “the grammar of graphics” as a paradigm to describe plots by combining a finite number of components.
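Hadley Wickham implemented the grammar of graphics in the ggplot2 R package (as part of his PhD project).
The grammar has also been implemented in Python (e.g. plotnine), Julia (e.g. Gadfly.jl, VegaLite.jl), and JavaScript (e.g. VegaLite).
ggplot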
data.frame
geom_histogram()
geom_density()
stat_ecdf()
stat_qq()
geom_boxplot()
geom_violin()
geom_jitter()
faithful is a built-in data set in R.
geom layers in ggplot2
stat layers in ggplot2
ggplot
A layer in ggplot has five main components:
geom - the geometric object used to display the data
stat - the statistical transformation to use on the data
data - the data to be displayed in this layer (usually inherited)
mapping - the aesthetic mappings (usually inherited)
position - the position adjustment
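The five components can be spelled out explicitly with layer(). The sketch below assumes ggplot2 is installed and uses the built-in faithful data; the choice of 30 bins is arbitrary. It is meant to illustrate that geom_histogram() is a shortcut for one particular combination of geom, stat and position, not to suggest that layer() should be used in everyday code.

```r
library(ggplot2)

# A layer written out in full: geom, stat, data, mapping, position.
# data = NULL and mapping = NULL mean "inherit from ggplot()".
p_layer <- ggplot(faithful, aes(x = waiting)) +
  layer(
    geom = "bar",
    stat = "bin",
    data = NULL,
    mapping = NULL,
    position = "stack",
    params = list(bins = 30)  # arbitrary choice for illustration
  )

# The usual shortcut, which sets the same geom, stat and position
p_geom <- ggplot(faithful, aes(x = waiting)) +
  geom_histogram(bins = 30)

# layer_data() returns the data computed by the stat for a given layer
counts_layer <- layer_data(p_layer, 1)$count
counts_geom <- layer_data(p_geom, 1)$count
```

Both plots should show the same histogram, since the stat computes the same bin counts in each case.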

y = after_stat(density) replaces the older forms y = stat(density) and y = ..density..
Density is calculated as the count divided by the product of the total number of observations and the bin width.
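The density calculation can be checked by hand. This is a minimal sketch assuming ggplot2 3.3.0 or later, where after_stat() is the current way to refer to a computed variable; the bin width of 5 is an arbitrary choice for the faithful data.

```r
library(ggplot2)

bw <- 5  # bin width, chosen arbitrarily for illustration

p <- ggplot(faithful, aes(x = waiting)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = bw)

# Extract the values computed by the stat
d <- layer_data(p, 1)

# Density by hand: count / (total number of observations * bin width)
manual_density <- d$count / (sum(d$count) * bw)

# Compare with the density computed by ggplot2
density_matches <- isTRUE(all.equal(d$density, manual_density))
```

Because of this scaling, the bar areas (density times bin width) add up to one.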
geom_point()
geom_smooth()
geom_bin2d()
geom_hex()
geom_line()
Palmer penguins

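Aesthetic specifications are described in vignette("ggplot2-specs").
The aesthetics that a layer understands are listed in its help page (e.g. ?geom_point).
Common aesthetics include: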
x and y
alpha
color
fill
size
Data variables:
species island bill_len bill_dep flipper_len body_mass sex year
geom_point()
Make the following target plot:

shape
stroke vs size
stroke and fill are only for the “filled” shapes.
color
linetype
linewidth
lineend
linejoin
When you put a value inside aes(), ggplot assumes that it’s a data variable.
"dodgerblue" gets converted into a variable with one level, and it gets colored by ggplot’s default color palette.
Don’t put attributes inside aes()!
Make this target plot:

Use the I() operator to mean “as-is” in an aesthetic mapping.
Attributes should be defined in specific layers.
Aesthetic mappings can be defined in ggplot(), but not attributes.
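The difference between a mapped aesthetic and an attribute can be seen by inspecting the colour that ggplot2 actually assigns. The sketch below uses a small made-up data frame (df is hypothetical and not part of the course data).

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
df <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 1))

# Inside aes(): "dodgerblue" is treated as a data variable with one level,
# so the points get the first colour of the default palette (plus a legend)
p_mapped <- ggplot(df, aes(x, y)) +
  geom_point(aes(color = "dodgerblue"))

# Outside aes(): "dodgerblue" is an attribute of the layer
p_attr <- ggplot(df, aes(x, y)) +
  geom_point(color = "dodgerblue")

col_mapped <- unique(layer_data(p_mapped, 1)$colour)
col_attr <- unique(layer_data(p_attr, 1)$colour)
```

Here col_attr is "dodgerblue", while col_mapped is a colour taken from the default palette.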
data as a data.frame
layers (geom_ or stat_ functions), which describe what to render
aesthetic mappings such as x, y, color, fill, size, alpha, shape, linetype, linewidth, etc.
geom - the geometric object used to display the data
stat - the statistical transformation to use on the data
data - the data to be displayed in this layer (usually inherited)
mapping - the aesthetic mappings (usually inherited)
position - the position adjustment
ggplot2 cheatsheet

geom_bar()
geom_col()
geom_point()
geom_tile()
geom_density()
geom_bar()
With geom_bar(), stat = "count" computes the frequencies for each category for you.
Alternatively, use stat_count() and change the geom.
geom_col()
If the counts are already computed, you don’t need stat = "count" to do the counting for you; use geom_col() instead.
This is the same as geom_bar(stat = "identity"), where stat = "identity" means that the values are taken as supplied, without any statistical transformation.
position_dodge() for grouped barplots
position_dodge2() for improved grouped barplots
position_fill() for stacked percentage barplots
position_identity() to use the raw positions
position_jitter() to add random noise to points
position_jitterdodge() for jittered and dodged points
position_nudge() to shift the position by a fixed amount
position_stack() for stacked barplots
coord_polar()
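As an illustration of the geom_bar() and geom_col() distinction described above, the following sketch draws the same barplot twice: once from raw observations and once from pre-computed counts. The data frame df and its group column are made up for this example.

```r
library(ggplot2)

# Hypothetical raw data: one row per observation
df <- data.frame(group = c("a", "b", "a", "c", "a", "b"))

# geom_bar() counts the rows in each category (stat = "count")
p_bar <- ggplot(df, aes(x = group)) +
  geom_bar()

# Pre-computed counts: one row per category, with the count in Freq
counts <- as.data.frame(table(group = df$group))

# geom_col() uses the supplied values as-is (stat = "identity")
p_col <- ggplot(counts, aes(x = group, y = Freq)) +
  geom_col()

bar_heights <- layer_data(p_bar, 1)$count
col_heights <- layer_data(p_col, 1)$y
```

Both plots have bars of height 3, 2 and 1 for categories a, b and c.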

coord_cartesian() for Cartesian coordinate systems (default)
coord_equal() is essentially coord_fixed(ratio = 1)
coord_fixed() to use a fixed aspect ratio
coord_flip() to flip the x and y axes
coord_map() to use a projection based on mapproj
coord_munch() to split lines into many small pieces so that they transform correctly in non-linear coordinate systems
coord_polar() to use polar coordinates
coord_quickmap() for a quick map coordinate system
coord_radial() for radial coordinates
coord_sf() for spatial data frames
coord_transform() to transform the coordinates after the statistical transformation
ggplot2
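The effect of a position adjustment can be seen by comparing the same bar layer under different positions. This is a minimal sketch with made-up data (df, group and type are hypothetical names); it compares position_fill() with position_dodge().

```r
library(ggplot2)

# Hypothetical data: two groups, each split into two types
df <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  type  = c("x", "x", "y", "x", "y", "y")
)

base <- ggplot(df, aes(x = group, fill = type))

# Stacked and rescaled so that every bar has total height 1
p_fill <- base + geom_bar(position = position_fill())

# Side by side: every bar starts at zero and keeps its own count
p_dodge <- base + geom_bar(position = position_dodge())

d_fill <- layer_data(p_fill, 1)
d_dodge <- layer_data(p_dodge, 1)

top_of_fill <- max(d_fill$ymax)      # top of the stacked bars
dodge_starts <- unique(d_dodge$ymin) # baseline of the dodged bars
```

Adding a coordinate system such as coord_flip() or coord_polar() to either plot changes how these positions are drawn, not how they are computed.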
annotate() allows you to add elements to plots without a data.frame
ggplot2
theme() to modify non-data components of the plot
facet_wrap() and facet_grid() for small multiples
scale_*() to modify scales
guides() to modify legends
labs() to modify labels and titles
ggsave() to save plots to files
ggplot2 extensions
ggincerta
ggincerta to visualise uncertainty in ggplot2 as part of her PhD work!
ggplot2.
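Several of the helper functions listed above (annotate(), facet_wrap(), labs(), theme()) can be combined in one plot. The sketch below uses a made-up data frame; the column names, labels and annotation position are all hypothetical.

```r
library(ggplot2)

# Hypothetical toy data with a grouping variable for the facets
df <- data.frame(
  x   = c(1, 2, 3, 1, 2, 3),
  y   = c(2, 3, 5, 1, 4, 2),
  grp = rep(c("A", "B"), each = 3)
)

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  # annotate() adds a text label without needing a data.frame
  annotate("text", x = 2, y = 5, label = "note") +
  # one panel (small multiple) per level of grp
  facet_wrap(~ grp) +
  labs(title = "Toy example", x = "x value", y = "y value") +
  # theme() changes non-data elements, here the title font
  theme(plot.title = element_text(face = "bold"))

# ggsave("toy-example.png", p, width = 6, height = 4)  # uncomment to save

n_panels <- nlevels(layer_data(p, 1)$PANEL)
note_labels <- unique(layer_data(p, 2)$label)
```

The annotation is drawn in every panel, because it is not tied to any row of the data.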
ggplot2 to explore!