R Objects

STAT1003 – Statistical Techniques

Dr. Emi Tanaka

Australian National University


Using R as a calculator

  • \(e^{3 + 4}\)
  • \(e^{3 + 4} + \frac{1}{3}(1 + 3 + 5)\)
  • But we want to save results to reuse later!
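
The two expressions above can be typed straight into the R console. A minimal sketch, using only the base function exp():

```r
# e^(3 + 4)
exp(3 + 4)
#> [1] 1096.633

# e^(3 + 4) + (1/3)(1 + 3 + 5)
exp(3 + 4) + (1 / 3) * (1 + 3 + 5)
#> [1] 1099.633
```

Neither result is stored anywhere, so each would have to be recomputed to be used again.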

Assignment

  • You can assign values to objects using <- or = or even ->
  • Just be consistent about which one you use!
  • The name of the object can be almost anything, so long as it is syntactically valid (no spaces or most special characters, and the name cannot start with a digit)
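
A short sketch of the three assignment forms. The object names x, y and total are illustrative choices, not names from the original slides:

```r
x <- exp(3 + 4)              # leftward assignment, the most common style
y = (1 / 3) * (1 + 3 + 5)    # = also assigns at the top level
x + y -> total               # rightward assignment

total                        # reuse the saved result
#> [1] 1099.633

# Invalid names are rejected by the parser, e.g.
# 2x <- 1       # error: a name cannot start with a digit
# my var <- 1   # error: a name cannot contain a space
```

The three forms are mixed here only for demonstration; in practice one style would be used throughout.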

Vectors

  • We can combine scalars to form vectors using c(), e.g. c(1, 2, 3)
  • This example is a vector of length 3
  • A vector of numbers like this is stored as type double and has class numeric
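
A minimal sketch of these points, assuming an illustrative vector v (the values are not from the original slides):

```r
v <- c(1, 2, 3)   # combine three scalars into one vector

length(v)         # number of elements
#> [1] 3
typeof(v)         # how R stores the values internally
#> [1] "double"
class(v)          # the class that functions dispatch on
#> [1] "numeric"
```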

Vector types

There are four primary types of atomic vectors: logical, integer, double and character.

  • If a logical value is coerced to numeric or integer, then
    • TRUE is 1 and
    • FALSE is 0.
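
The four types can be inspected with typeof(). The sketch below uses arbitrary example values and also shows the logical-to-number coercion described above:

```r
typeof(TRUE)     # "logical"
typeof(1L)       # "integer" (the L suffix marks an integer literal)
typeof(1.5)      # "double"
typeof("a")      # "character"

as.integer(c(TRUE, FALSE))   # TRUE becomes 1, FALSE becomes 0
#> [1] 1 0
sum(c(TRUE, FALSE, TRUE))    # so summing a logical vector counts the TRUEs
#> [1] 2
```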

Vector coercion

  • An atomic vector can only contain elements of the same type.
  • If you attempt to combine mismatched types, R will try to coerce all values to a common type.
  • There are functions to explicitly coerce types, e.g., as.numeric() tries to coerce its input to numeric values.
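
A small sketch of implicit and explicit coercion. The values and the names mixed1 to mixed3 and converted are made up for illustration:

```r
mixed1 <- c(TRUE, 2L)     # logical + integer   -> integer
mixed2 <- c(1L, 2.5)      # integer + double    -> double
mixed3 <- c(1, "a")       # double  + character -> character

typeof(mixed1)   # "integer"
typeof(mixed2)   # "double"
mixed3
#> [1] "1" "a"

# Explicit coercion: values that cannot be converted become NA (with a warning)
converted <- as.numeric(c("1", "2.5", "three"))
converted
#> [1] 1.0 2.5  NA
```

Implicit coercion moves towards the more general type, in the order logical, integer, double, character.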

Factor

A factor in R is a special type of integer vector, typically used to encode categorical variables.
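
A minimal sketch, assuming a hypothetical categorical variable sizes with three levels:

```r
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"))

class(sizes)        # "factor"
typeof(sizes)       # "integer": the underlying storage
levels(sizes)       # the category labels
#> [1] "small"  "medium" "large"
as.integer(sizes)   # integer codes that index into the levels
#> [1] 1 3 2 1
```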

Lists

  • Lists allow you to combine elements of different types.
  • You can use str() to see the internal structure of an object in R.
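
As a sketch, the hypothetical list person below mixes a character, an integer and a double vector, and str() prints a compact summary of each element:

```r
person <- list(name = "Ada", age = 36L, scores = c(90.5, 88))

length(person)   # number of elements in the list
#> [1] 3
str(person)      # one line per element, showing its type and values
```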

Data frames

A data.frame is a special type of named list where each element (column) is a vector of the same length.

  • tibble is a Tidyverse version of data.frame in R.
  • It is still a data.frame, so functions that work with data.frame objects will generally also work with tibble objects.
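
A sketch with a made-up data frame df. The tibble part only runs if the tibble package happens to be installed, which is an assumption about the reader's setup:

```r
df <- data.frame(id    = 1:3,
                 group = c("a", "b", "a"),
                 score = c(2.5, 3.1, 4.0))

typeof(df)          # "list": a data.frame is built on a list
class(df)           # "data.frame"
sapply(df, length)  # every column has the same length (3)

# Optional: convert to a tibble and confirm it is still a data.frame
if (requireNamespace("tibble", quietly = TRUE)) {
  tb <- tibble::as_tibble(df)
  print(inherits(tb, "data.frame"))   # TRUE
}
```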

Subsetting vectors Part 1

A vector can be subsetted using integers in [].

  • Positive integers select elements at the specified positions:
  • Negative integers exclude elements at the specified positions:
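
Both cases can be sketched with an illustrative vector x:

```r
x <- c(10, 20, 30, 40, 50)

x[c(1, 3)]    # keep the elements at positions 1 and 3
#> [1] 10 30
x[-c(1, 3)]   # drop the elements at positions 1 and 3
#> [1] 20 40 50
```

Positive and negative integers cannot be mixed in the same subsetting call.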

Subsetting vectors Part 2

Logical vectors in [] select the elements where the logical value is TRUE.

  • If the logical vector is shorter than the vector being subsetted, it is recycled to match that vector's length.
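
A minimal sketch, again with an illustrative vector x:

```r
x <- c(10, 20, 30, 40, 50)

x[c(TRUE, FALSE, TRUE, FALSE, TRUE)]   # one logical value per element
#> [1] 10 30 50
x[x > 25]                              # a comparison produces the logical vector
#> [1] 30 40 50
x[c(TRUE, FALSE)]                      # recycled to TRUE FALSE TRUE FALSE TRUE
#> [1] 10 30 50
```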

Subsetting named vectors

Character vectors select elements based on the names of the vector's elements (if any):
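
A sketch using a hypothetical named vector heights:

```r
heights <- c(alice = 160, bob = 175, carol = 168)

heights[c("alice", "carol")]   # select by name; the names are kept
#> alice carol 
#>   160   168 
heights["dave"]                # a name that does not exist gives NA
```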

Subsetting lists

Lists can be subsetted using integers in [] (which returns a list) or [[ ]] (which returns a single element), or by name with $ or [[ ]].
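
The difference between the operators can be sketched with a hypothetical list person:

```r
person <- list(name = "Ada", age = 36L, scores = c(90.5, 88))

person[1]             # single brackets return a (shorter) list
class(person[1])      # "list"
person[[1]]           # double brackets extract the element itself
#> [1] "Ada"
person$age            # extract an element by name
#> [1] 36
person[["scores"]]    # same idea, with the name given as a string
#> [1] 90.5 88.0
```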

Subsetting data frames

A data.frame can be subsetted using integers in [ , ] or names with $ or [[ ]].
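
A sketch with a made-up data frame df, where the first index in [ , ] refers to rows and the second to columns:

```r
df <- data.frame(id    = 1:3,
                 group = c("a", "b", "a"),
                 score = c(2.5, 3.1, 4.0))

df[2, 3]        # row 2, column 3
#> [1] 3.1
df[1:2, ]       # first two rows, all columns (still a data.frame)
df[, "score"]   # all rows of one column, returned as a vector
#> [1] 2.5 3.1 4.0
df$group        # a column by name
df[["id"]]      # a column by name, given as a string
#> [1] 1 2 3
```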

Missing values

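  • NA in R denotes a missing value. There are in fact different types of missing values: NA (logical), NA_integer_, NA_real_, NA_complex_ and NA_character_ in base R, while NA_Date_ and NA_POSIXct_ are provided by the lubridate package.
  • Missing values can cause issues in computations: for example, mean() returns NA if any element is NA.
  • One option is to remove the missing values, e.g. with the na.rm = TRUE argument or by subsetting with !is.na():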
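
A minimal sketch with an illustrative vector x containing one missing value:

```r
x <- c(1, NA, 3)

mean(x)                  # the NA propagates through the computation
#> [1] NA
is.na(x)                 # locate the missing values
#> [1] FALSE  TRUE FALSE

x[!is.na(x)]             # keep only the non-missing elements
#> [1] 1 3
mean(x, na.rm = TRUE)    # or ask the function to drop them
#> [1] 2

typeof(NA)               # "logical"
typeof(NA_real_)         # "double"
typeof(NA_character_)    # "character"
```

Whether dropping missing values is appropriate depends on the analysis; this only shows the mechanics.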

Summary

  • Four primary types of atomic vectors: logical, integer, double and character.
  • An atomic vector can only contain elements of the same type.
  • Other object types: factor, list, and data.frame.
  • There are several ways of subsetting vectors, lists and data frames.
  • Missing values are represented as NA and may need to be handled specially.

Base R Cheatsheet
