STAT1003 – Statistical Techniques
Dr. Emi Tanaka
Australian National University
A string in R is written inside single ' or double " quotes.
The string may be manipulated using base R functions, e.g. paste0(), strsplit().
But instead we use the stringr package from the Tidyverse.
The stringr package is powered by the stringi package, which in turn uses the ICU C library to provide fast performance for string manipulation.
The main functions in stringr are prefixed with str_ (those in stringi with stri_), and their first argument is a string (or a vector of strings).
What do you think str_trim and str_squish do?
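As a small illustration of the question above, the sketch below applies both functions to a made-up string. The input is an assumption for demonstration and does not come from the slides.

```r
library(stringr)

# Hypothetical input with leading, trailing and repeated inner whitespace
x <- "  Statistical   Techniques  "

# str_trim() removes whitespace at the start and end only
trimmed <- str_trim(x)

# str_squish() does the same and also collapses repeated
# inner whitespace into a single space
squished <- str_squish(x)

print(trimmed)
print(squished)
```

Comparing the two printed results shows that only str_squish() changes the spacing between the words.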
Why stringr?
stringr ensures consistency in syntax and user expectations.
Consider an address that is composed of street number, street name, suburb, state (or territory), and postcode.
<street number> <street name>, <suburb> <state> <postcode>
[digits] [alphabets], [alphabets] [NSW|VIC|WA|ACT|QLD|SA|NT|TAS] [4 digits]
[:digit:] or [0-9] matches any digit (0-9)
. matches any single character
[:alpha:] or [A-Za-z] matches any alphabetic character (a-z, A-Z)
+ matches 1 or more of the preceding character
( and ) are used to create capture groups
| acts as a logical OR
{n} matches exactly n occurrences of the preceding character
But in the context of data, it may be better to use separate_wider_regex() from the tidyr package.
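The pieces above can be combined into one pattern for an address. The following is a minimal sketch, not the slides' own code: the address, the column names and the exact pattern are assumptions chosen for illustration, and separate_wider_regex() assumes tidyr 1.3.0 or later.

```r
library(stringr)
library(tidyr)

# Hypothetical address, made up for this example
address <- "12 Example Street, Acton ACT 2601"

# One capture group per component of the address
pattern <- paste0(
  "([0-9]+) ",                          # street number: 1 or more digits
  "([A-Za-z ]+), ",                     # street name: letters and spaces
  "([A-Za-z ]+) ",                      # suburb: letters and spaces
  "(NSW|VIC|WA|ACT|QLD|SA|NT|TAS) ",    # state or territory
  "([0-9]{4})"                          # postcode: exactly 4 digits
)

# str_match() returns a character matrix: column 1 is the full match,
# the remaining columns are the capture groups in order
parts <- str_match(address, pattern)
print(parts)

# The same idea for a data frame: named elements become new columns,
# unnamed elements are matched but dropped
df <- tibble::tibble(address = address)
out <- separate_wider_regex(
  df,
  address,
  patterns = c(
    street_number = "[0-9]+", " ",
    street_name   = "[A-Za-z ]+", ", ",
    suburb        = "[A-Za-z ]+", " ",
    state         = "NSW|VIC|WA|ACT|QLD|SA|NT|TAS", " ",
    postcode      = "[0-9]{4}"
  )
)
print(out)
```

With separate_wider_regex() the column names sit next to the pattern that produces them, which tends to be easier to read than counting capture groups.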
🎯 Extract the LGA status from the data
Recall: paste0(), paste() or stringr::str_c() can combine strings:
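A brief sketch of the three functions side by side, using made-up input strings (the variable names are hypothetical):

```r
library(stringr)

first  <- "Statistical"
second <- "Techniques"

# paste0() joins strings with no separator
joined_paste0 <- paste0(first, second)

# paste() joins strings with a single space by default
joined_paste <- paste(first, second)

# str_c() joins with no separator unless sep is given
joined_str_c <- str_c(first, second, sep = " ")

print(joined_paste0)
print(joined_paste)
print(joined_str_c)
```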
{}:
stringr cheatsheet

