Scales and guides in ggplot2

Data Visualisation with R

👩🏻‍💻 Emi Tanaka @ Monash University

  • emi.tanaka@monash.edu
  • @statsgen
  • github.com/emitanaka
  • emitanaka.org



28th November 2022 Australasian Applied Statistics Conference 2022

Scales

  • Scales control the mapping from data to aesthetics
  • Scale functions usually follow the naming pattern scale_<aesthetic>_<type>()
  • E.g. scale_x_continuous(), scale_fill_discrete(), scale_y_log10() and so on.
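
As a rough illustration of how scale functions are named (scale, then the aesthetic, then the type of scale), the sketch below sets one scale per mapped aesthetic. It assumes the mpg data that ships with ggplot2; the axis and legend titles are made up for this example.

```r
library(ggplot2)

# One scale per mapped aesthetic: x, y and colour
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point() +
  scale_x_continuous(name = "Engine displacement (L)") +  # x aesthetic, continuous
  scale_y_log10(name = "Highway miles per gallon") +      # y aesthetic, log10 transformed
  scale_colour_discrete(name = "Vehicle class")           # colour aesthetic, discrete
p
```

Each scale_ call here only changes the title, so the plot looks like the default apart from the log10 y-axis.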

Guide

  • The scale creates a guide: an axis or legend
  • To modify these, you generally use scale_*() functions, guide_*() functions within guides(), or other handy functions (e.g. labs(), xlab(), ylab()).

Guides for scales

library(ggplot2)
library(palmerpenguins)
ggplot(penguins, 
       aes(x = bill_depth_mm,
           y = bill_length_mm, 
           color = flipper_length_mm, 
           shape = species, 
           size = body_mass_g)) +
  geom_point()

Guides for scales

library(palmerpenguins)
ggplot(penguins, 
       aes(x = bill_depth_mm,
           y = bill_length_mm, 
           color = flipper_length_mm, 
           shape = species, 
           size = body_mass_g)) +
  geom_point() +
  # defaults
  guides(x = guide_axis(),
         y = guide_axis(),
         color = guide_colorbar(),
         shape = guide_legend(),
         size = guide_legend())

Guides for scales

library(palmerpenguins)
ggplot(penguins, 
       aes(x = bill_depth_mm,
           y = bill_length_mm, 
           color = flipper_length_mm, 
           shape = species, 
           size = body_mass_g)) +
  geom_point() +
  guides(
    x = guide_axis(position = "top"),
    y = guide_axis(angle = 30),
    color = guide_colorsteps(order = 1),
    shape = guide_legend(title.position = "bottom"),
    size = guide_bins(title = "body mass")
  )

Scale demonstrations

Modifying axis

ggplot(diamonds, aes(carat, price)) + 
  geom_hex() + 
  scale_y_continuous(
    name = "Price", 
    breaks = c(0, 10000),
    labels = stringr::str_wrap(c("0", "More than 10K"), 10)
  ) + 
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_hline(yintercept = 10000, color = "red", linewidth = 2)

  • Notice how the axis title has been modified to “Price”
  • The breaks are at 0 and 10000
  • And the associated labels for the breaks are “0” and “More than 10K”

Modifying labels

ggplot(diamonds, aes(carat, price)) + 
  geom_hex() + 
  scale_y_continuous(
    labels = scales::dollar_format()
  )

  • Sometimes you may want to modify the labels based on the existing axis labels.
  • You can give a function to the labels argument instead.
  • Most of the handy functions are in the scales package.
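
A function passed to labels receives the vector of break values and returns one label per break. Here is a minimal sketch with a hand-written helper (label_thousands is a made-up name, not part of the scales package). It uses geom_point() only so that the example needs no packages beyond ggplot2.

```r
library(ggplot2)

# Hypothetical helper: format prices in thousands of dollars
label_thousands <- function(x) paste0("$", x / 1000, "K")

p <- ggplot(diamonds, aes(carat, price)) +
  geom_point(alpha = 0.1) +
  scale_y_continuous(labels = label_thousands)  # pass the function itself, not a call
p
```

In practice the ready-made formatters in scales are usually the better choice; writing your own is mainly useful for one-off formats.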

Modifying legend scale

ggplot(diamonds, aes(carat, price)) +
  geom_hex() + 
  scale_fill_continuous(
    breaks = c(0, 10, 100, 1000, 4000),
    trans = "log10"
  )

  • An axis is not just the x-axis and y-axis!
  • The legend can have an axis and we can modify its scale as well.
  • Here the fill scale is log10-transformed with breaks requested at 0, 10, 100, 1000, and 4000. Since log10(0) is -Inf, the break at 0 cannot be placed and does not appear in the legend.
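
A small sketch of the same idea using only positive breaks, since zero has no finite position on a log10 scale. It swaps in geom_bin2d() for geom_hex() purely so that the example does not need the hexbin package; the choice of breaks is an assumption for illustration.

```r
library(ggplot2)

log10(0)  # -Inf, so a break at 0 cannot be drawn on a log10 scale

p <- ggplot(diamonds, aes(carat, price)) +
  geom_bin2d() +                         # fill is mapped to the bin count
  scale_fill_continuous(
    trans = "log10",
    breaks = c(1, 10, 100, 1000, 4000)   # positive breaks only
  )
p
```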

Removing legend

ggplot(diamonds, aes(carat, price)) + 
  geom_hex() + 
  scale_fill_continuous(
    guide = "none"
  )

  • If you want to remove a legend for an associated aesthetic, you can use guide = "none" in the associated scale.
  • There are other handy ways of doing this as well!

Alternative control of guides

  • There are often other ways to modify scales and guides:
ggplot(diamonds, aes(carat, price)) + 
  ylab("Price") + # Changes the y axis label
  labs(x = "Carat", # Changes the x axis label
       fill = "Count") # Changes the legend name
ggplot(diamonds, aes(carat, price)) + guides(fill = "none") # remove the legend
  • Each user has a different mental model, so use what suits you (and others in your team)
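
To make the alternatives concrete, the sketch below removes the fill legend in three ways. geom_bin2d() stands in for geom_hex() so that only ggplot2 is required; the object names are made up for the example.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(carat, price)) + geom_bin2d()

p1 <- base + scale_fill_continuous(guide = "none")  # via the scale
p2 <- base + guides(fill = "none")                  # via guides()
p3 <- base + theme(legend.position = "none")        # via the theme (hides all legends)
```

The first two target the fill aesthetic only, while the theme setting hides every legend in the plot.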

Color

Color palettes

  • There are a few different color palettes… choose what suits your purpose!
ggplot(diamonds, aes(carat, price)) +
  geom_hex() + 
  scale_fill_viridis_c(option = "A")

Color space

Zeileis, Fisher, Hornik, Ihaka, McWhite, Murrell, Stauffer, Wilke (2019). colorspace: A Toolbox for Manipulating and Assessing Colors and Palettes. arXiv 1903.06490

Zeileis, Hornik, Murrell (2009). Escaping RGBland: Selecting Colors for Statistical Graphics. Computational Statistics & Data Analysis 53(9) 3259-3270

Qualitative palettes

  • Designed for a categorical variable with no particular ordering
colorspace::hcl_palettes("Qualitative", plot = TRUE, n = 7)

Sequential palettes

  • Designed for an ordered categorical variable or a numeric variable going from low to high (or vice versa)
colorspace::hcl_palettes("Sequential", plot = TRUE, n = 7)

Diverging palettes

  • Designed for an ordered categorical variable or a numeric variable going from low to high (or vice versa) with a neutral value in between
colorspace::hcl_palettes("Diverging", plot = TRUE, n = 7)
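
As a rough sketch, one palette of each type can be generated as hex codes and compared side by side. The palette names "Dark 3", "Blues 3" and "Blue-Red" are picked here only as examples from the colorspace collection.

```r
library(colorspace)

qual <- qualitative_hcl(5, palette = "Dark 3")    # distinct hues, similar lightness
sequ <- sequential_hcl(5, palette = "Blues 3")    # dark to light
dive <- diverging_hcl(5, palette = "Blue-Red")    # two hues around a neutral middle

swatchplot(Qualitative = qual, Sequential = sequ, Diverging = dive)
```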

RGB color space

Made for screen projection

HCL color space

Made for the human visual system
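
One way to see the difference, sketched with colorspace: the three RGB primaries are all "full intensity" on screen, yet their HCL coordinates show very different luminance. The variable names are made up for the example.

```r
library(colorspace)

primaries <- c("#FF0000", "#00FF00", "#0000FF")  # red, green, blue

# Convert hex codes to sRGB, then to polar LUV (the HCL space)
hcl <- as(hex2RGB(primaries), "polarLUV")
m <- coords(hcl)   # matrix with columns L (luminance), C (chroma), H (hue)
round(m)
```

Green comes out much lighter than red, and red lighter than blue, which is why palettes built by varying RGB values alone tend to look uneven.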

colorspace 📦

  • Interactively choose/create a palette using the HCL color space.
library(colorspace)
hcl_wizard() # OR choose_palette()

hcl_wizard


Choose your palette > Export > R > Copy the command

Registering your own palette

library(colorspace)
# register your palette
sequential_hcl(n = 7, 
               h = c(300, 200), 
               c = c(60, 0), 
               l = c(25, 95), 
               power = c(2.1, 0.8), 
               register = "my-set")
# now generate from your palette
sequential_hcl(n = 3, 
               palette = "my-set")
#> [1] "#6B0077" "#7C8393" "#F1F1F1"
hcl_palettes(n = 5, palette = "my-set", plot = TRUE)

Applying your own palette with scale_

Combining with ggplot:

ggplot(penguins, 
       aes(bill_length_mm, fill = species)) + 
 geom_density(alpha = 0.6) + 
  # notice here you don't need to specify the n!
 scale_fill_discrete_sequential(palette = "my-set")

Manually selecting colors

g <- ggplot(penguins, 
       aes(bill_length_mm, fill = species)) + 
 geom_density(alpha = 0.6) + 
 scale_fill_manual(
   breaks = c("Adelie", "Chinstrap", "Gentoo"), # optional but makes it more robust
   values = c("darkorange", "purple", "cyan4"))
g
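
An alternative sketch that ties each color to its level by name, so the assignment does not depend on the order of the factor levels. The names species_cols and g2 are made up for the example.

```r
library(ggplot2)
library(palmerpenguins)

# Named vector: each species is matched to its color by name
species_cols <- c(Adelie = "darkorange", Chinstrap = "purple", Gentoo = "cyan4")

g2 <- ggplot(penguins, aes(bill_length_mm, fill = species)) +
  geom_density(alpha = 0.6) +
  scale_fill_manual(values = species_cols)
g2
```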

Colorblindness

Colorblindness affects roughly 1 in 12 men (about 8%).

colorblindr::cvd_grid(g)

Check your color choices using the colorblindr package or otherwise.
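
As one "otherwise" option, colorspace can simulate color vision deficiencies directly on a vector of colors. This is a sketch of that alternative, not the colorblindr approach; it converts the color names to hex codes first to keep the input unambiguous.

```r
library(colorspace)

cols <- c("darkorange", "purple", "cyan4")
hex <- rgb(t(col2rgb(cols)), maxColorValue = 255)  # color names to hex codes

deutan_cols <- deutan(hex)  # simulated deuteranopia (green deficiency)

swatchplot(
  Original = hex,
  Deutan = deutan_cols,
  Protan = protan(hex),     # simulated protanopia (red deficiency)
  Tritan = tritan(hex)      # simulated tritanopia (blue deficiency)
)
```

If two swatches in a simulated row look alike, those colors are likely hard to tell apart for viewers with that deficiency.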

Summary

  • Scales control the mapping from data to aesthetics.
  • Each scale creates a guide that allows you to “read” the data from the plot.
  • Scales and guides are primarily modified using scale_ functions or guide_ functions within guides().
  • There are many built-in color palettes to choose from, but be sure to check how colorblind-friendly they are.

Your turn!

30:00

> Go to emitanaka.org/dataviz-workshop/exercises/
> Click Exercise 5