Getting started with R

STAT1003 – Statistical Techniques

Dr. Emi Tanaka

Australian National University

What is R?

  • R is a programming language used predominantly for data analysis.
  • RStudio Desktop is an integrated development environment (IDE) that helps you to use R.

  • Visual Studio Code and Positron are other popular IDEs.

Interactively working with R

  • You can use R like a calculator:
    1. \(1 + 1\)
    2. \(\dfrac{6}{2}+ 0.5\)
    3. \((1 - 4) \times 3 - 6^2\)
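As a minimal sketch, the three expressions above could be typed at the R console like this. The values in the comments are what the arithmetic works out to.

```r
# 1. Addition
1 + 1                # 2

# 2. Division is done before addition
6 / 2 + 0.5          # 3.5

# 3. Brackets first, then the power (^), then multiplication and subtraction
(1 - 4) * 3 - 6^2    # -45
```

Note that multiplication needs an explicit `*`, and powers are written with `^`.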

How do you use R?

  • RStudio Desktop (or RStudio IDE) is the most common way to use R.

Customise Global Options

  • Go to RStudio > Tools > Global Options…
  • Under the General tab, make sure “Restore .RData into workspace at startup” is unticked.
  • This avoids unexpectedly loading (old) data into your workspace, which can make your code work only in your workspace and not for others (poor practice for reproducibility).

Arithmetic

  1. \(\sqrt{3}\)

  2. \(|-3|\)

  3. \(e^1 = e\)

  4. \(\log_e (4) = \ln (4)\)

  5. \(1 + 2 + 3 = \displaystyle\sum_{i = 1}^3 i\)
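One possible way to write the five expressions above in R, using functions that ship with R. The choice of `1:3` to build the sequence 1, 2, 3 is just one option.

```r
sqrt(3)     # 1. square root of 3
abs(-3)     # 2. absolute value, gives 3
exp(1)      # 3. e to the power of 1, i.e. Euler's number e
log(4)      # 4. natural logarithm; log() uses base e unless told otherwise
sum(1:3)    # 5. 1 + 2 + 3, gives 6
```

For a logarithm in another base, `log()` accepts a `base` argument, e.g. `log(100, base = 10)`.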

Functions

  • There are many functions in R.
  • You can look at a function's documentation to see how to use it, with ?function or help(function).
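A short sketch of looking up documentation. `sqrt` and `log` are only example functions chosen here for illustration.

```r
?sqrt           # open the help page for sqrt()
help("sqrt")    # the same request written as a function call
args(log)       # show a function's arguments without opening the help page
```

In RStudio the help page appears in the Help pane. In a plain terminal it is shown as text.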

Finding functions

  • Browse a package's help index to find its documented functions.
  • Google it with a good set of keywords.
  • A recent trend is to ask a large language model.
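A few base R tools can help with the first approach. This is a sketch that assumes the `stats` package as the example; the original slide may have used a different package.

```r
help(package = "stats")      # index of documented functions in the stats package
head(ls("package:stats"))    # first few objects exported by an attached package
apropos("mean")              # names of visible objects that contain "mean"
```

`apropos()` is handy when you remember only part of a function's name.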

Why learn R?

  • R is one of the top programming languages for statistics or data science.
    • Python is also a good alternative language for data science.
    • Better to have a mastery of at least one language rather than none.
  • R was initially developed by statisticians for statisticians.
    • State-of-the-art statistical methods are more readily available in R.
  • R has a very active and friendly community.
  • R is free and open source software (FOSS) and is a cross-platform language:
    • free = money is not a barrier to use it,
    • open source software = transparency,
    • cross-platform = can be used on Windows, Mac, and Linux.

Base R

R has 7 packages that are loaded automatically when you launch it:

  • base,
  • datasets,
  • graphics,
  • grDevices,
  • utils,
  • stats,
  • methods,

collectively referred to as “Base R”.

  • The functions in the base packages are generally well-tested and trustworthy.
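To check which packages are attached in your own session, something like the following should work in a fresh R session. The exact output depends on your setup.

```r
search()                  # the search path: the global environment plus attached packages
sessionInfo()$basePkgs    # names of the base packages attached in this session
```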

Contributed R Packages

  • R packages are community-developed extensions to R (much like apps on your mobile).
  • The Comprehensive R Archive Network (CRAN) is a volunteer-maintained repository that hosts submitted R packages that have been approved (much like an app store).
  • There are close to 20,000 packages available on CRAN, but the quality of R packages varies.
  • There are other repositories that host R packages, e.g. Bioconductor for bioinformatics, R Universe, R-Forge, GitHub (we won’t cover these).

Using packages on CRAN

  • If the package (say praise) is on CRAN, you can install it by:
install.packages("praise")
  • You only need to run install.packages() once!
  • Use library() to load the exported functions from a package.
  • Use package::function() to call a function without loading the package.
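Putting the steps together, a minimal sketch with the praise package might look like this. The `requireNamespace()` guard and the `stats::sd()` line are additions for illustration, so that the script still runs when praise is not installed.

```r
# Install once; left commented out so re-running the script does not reinstall.
# install.packages("praise")

if (requireNamespace("praise", quietly = TRUE)) {
  library(praise)           # load the package so its functions can be called by name
  print(praise())           # call praise() after loading
  print(praise::praise())   # same function, named explicitly with package::function()
}

# The :: syntax works for any installed package, including the base ones:
stats::sd(c(1, 2, 3))       # standard deviation of 1, 2, 3
```

Writing `package::function()` also makes it clear to readers which package a function comes from.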

Summary

RStudio Desktop (or RStudio IDE)

Console or Source


  • Use ?function or help(function) to look at the function documentation.
  • Use install.packages() to install a package (only once).
  • Use library() to load a package.
  • Use package::function() to use a function from a package without loading it.

RStudio Desktop Cheatsheet
