Dealing with dates and time

STAT1003 – Statistical Techniques

Dr. Emi Tanaka

Australian National University


Date in R

  • Dates in R have class Date 📅 even though they look like character strings 🔢
  • It’s actually a numerical value under the hood
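
A minimal sketch of the two points above, using only base R. The date 25 December 2020 is an arbitrary choice for illustration.

```r
# An arbitrary example date written in the default "YYYY-MM-DD" format
xmas <- as.Date("2020-12-25")

print(xmas)    # prints like a character string
class(xmas)    # "Date"
typeof(xmas)   # "double": the value is stored as a number
```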

Reference point for Date objects

  • 1st January 1970 is a special reference point

  • Let’s have a look at the numerical value under the hood of Date objects

  • Yup, the number under the hood is the number of days after (if positive) or before (if negative) 1st January 1970

  • And yes, you can use as.Date to convert objects to Date
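
As a sketch of the reference point, unclass() strips the Date class and reveals the stored number. The dates below are arbitrary examples.

```r
# The reference point itself is stored as 0
origin <- as.Date("1970-01-01")
unclass(origin)                   # 0

# One day after is 1, one day before is -1
day_after  <- as.Date("1970-01-02")
day_before <- as.Date("1969-12-31")
unclass(day_after)                # 1
unclass(day_before)               # -1

# as.Date also converts a number of days back to a Date
xmas <- as.Date(18621, origin = "1970-01-01")
xmas                              # "2020-12-25"
```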

Converting string to Date

  • Dates do not have to be in the format “YYYY/MM/DD” (in fact, there are many formats in the wild)
  • If a date has a different format, you can describe it with conversion specifications: a “%” symbol followed by a single letter (not quite regex, but similar in spirit)
  • You can find the widely used conversion specifications in the documentation at
    ?strptime, but some depend on your operating system

  • Below are some common ones:

  • %b abbreviated month
  • %B full month
  • %d day of the month (01, 02, …, 31)
  • %e day of the month (1, 2, …, 31), with a leading space for single digits
  • %y year without century (00-99)
  • %Y year with century, e.g. 1999
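
A short sketch of how these specifications combine in the format argument of as.Date. The input strings are made up for illustration, and the examples with month names assume an English system locale.

```r
# Numeric-only formats do not depend on the system locale
d1 <- as.Date("25/12/2020", format = "%d/%m/%Y")
d2 <- as.Date("12-25-20", format = "%m-%d-%y")

# Month names depend on the locale; these assume an English locale
d3 <- as.Date("25 Dec 2020", format = "%d %b %Y")
d4 <- as.Date("December 25, 2020", format = "%B %d, %Y")

# format() goes the other way, from a Date to a string
s1 <- format(d1, "%Y/%m/%d")
s1   # "2020/12/25"
```

If the string does not match the format, as.Date returns NA rather than raising an error, so it is worth checking the result for missing values.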

System locale

  • “aralık” is December in Turkish
as.Date("Xmas is 25 aralık 2020", format = "Xmas is %d %B %Y")
[1] NA
  • Let’s temporarily set our system locale to Turkish
Sys.setlocale("LC_TIME", "tr_TR.UTF-8") # temporarily set to the Turkish locale
[1] "tr_TR.UTF-8"
as.Date("Xmas is 25 aralık 2020", format = "Xmas is %d %B %Y")
[1] "2020-12-25"

(And set it back to English again.) Locale names ending in “UTF-8” might only work on Unix and Linux systems.

Sys.setlocale("LC_TIME", "en_AU.UTF-8")
[1] "en_AU.UTF-8"

Date and Time in R: POSIXct

  • R has two main date-time classes: POSIXct and POSIXlt (avoid using POSIXlt if possible)

  • POSIX stands for Portable Operating System Interface

  • ct stands for calendar time

  • 1970/01/01 00:00:00 UTC is a special reference point called the Unix epoch, and the number under the hood of a POSIXct object is the number of seconds after the Unix epoch
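
A minimal base R sketch of the number under the hood of a POSIXct object. The date-times are arbitrary, and the time zone is fixed to UTC so the result does not depend on the system settings.

```r
# Ten seconds after the Unix epoch
ten_sec <- as.POSIXct("1970-01-01 00:00:10", tz = "UTC")
class(ten_sec)        # "POSIXct" "POSIXt"
as.numeric(ten_sec)   # 10

# A more recent date-time is a much larger number of seconds
dt <- as.POSIXct("2020-12-25 10:30:00", tz = "UTC")
as.numeric(dt)        # 1608892200
```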

Date and Time in R: POSIXlt

  • POSIXlt looks the same as POSIXct when printed
  • But under the hood, it’s a list of time components (lt stands for local time)
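
As a sketch, unclass() shows the list behind a POSIXlt object. The date-time is arbitrary and the time zone is fixed to UTC.

```r
lt <- as.POSIXlt("2020-12-25 10:30:00", tz = "UTC")
class(lt)            # "POSIXlt" "POSIXt"

# The components are stored as named list elements
names(unclass(lt))   # includes "sec", "min", "hour", "mday", "mon", "year"

lt$hour              # 10
lt$mon + 1           # months are counted from 0, so this gives 12
lt$year + 1900       # years are counted from 1900, so this gives 2020
```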

Time zone

  • You can find the names of the time zones using OlsonNames()
  • If you want to know which time zone your system is using:
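
A minimal sketch of both functions in base R. The output of Sys.timezone() depends on the machine, so no result is shown for it.

```r
# Names of the time zones known to R
tz_names <- OlsonNames()
head(tz_names)
length(tz_names)

# Time zone used by the current system (may be NA if it cannot be determined)
Sys.timezone()
```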

Date in R with lubridate

  • To convert a string to a Date, you can use ymd and friends (e.g. dmy, mdy)

You might have guessed it but:

  • y = year, m = month, and d = day.

The order determines the expected order of its appearance in the string
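
A short sketch with lubridate, assuming the package is installed. The input strings are made up and all describe the same day.

```r
library(lubridate)

# The function name gives the order of year, month and day in the input
a <- ymd("2020-12-25")
b <- dmy("25/12/2020")
m <- mdy("12.25.2020")

# Numbers are accepted as well
n <- ymd(20201225)

class(a)   # "Date"
```

The separators are handled automatically, so there is no need to write a conversion specification.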

Date and time in R with lubridate

  • To convert a string to POSIXct, you can use ymd_hms and friends
  • y = year, m = month, and d = day
  • h = hour, m = minute, and s = second.

It’s remarkably clever!

The time has to come after the date though.
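
A sketch of ymd_hms and two of its friends, assuming lubridate is installed. The input strings are arbitrary.

```r
library(lubridate)

# Date part first, then the time part
t1 <- ymd_hms("2020-12-25 10:30:00")
t2 <- dmy_hm("25/12/2020 10:30")

# A time zone can be supplied; the default is UTC
t3 <- ymd_h("2020-12-25 10", tz = "Australia/Melbourne")

class(t1)   # "POSIXct" "POSIXt"
tz(t1)      # "UTC"
tz(t3)      # "Australia/Melbourne"
```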

Conversion to date and time with lubridate

Making Date from individual date components:

Making POSIXct from individual components:
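
A minimal sketch of both cases using make_date and make_datetime from lubridate. The component values are arbitrary.

```r
library(lubridate)

# A Date from year, month and day
d <- make_date(year = 2020, month = 12, day = 25)
class(d)    # "Date"

# A POSIXct from date and time components (time zone defaults to UTC)
dt <- make_datetime(year = 2020, month = 12, day = 25,
                    hour = 10, min = 30, sec = 0)
class(dt)   # "POSIXct" "POSIXt"
```

This is handy when the components are stored in separate columns of a data frame, since both functions are vectorised.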

Extracting date or time components with lubridate
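
A minimal sketch of the accessor functions in lubridate, applied to an arbitrary date-time.

```r
library(lubridate)

x <- ymd_hms("2020-12-25 10:30:45")

year(x)     # 2020
month(x)    # 12
day(x)      # 25
hour(x)     # 10
minute(x)   # 30
second(x)   # 45

yday(x)                   # day of the year: 360
wday(x, week_start = 7)   # day of the week counting Sunday as 1: 6 (a Friday)
date(x)                   # the date part as a Date: "2020-12-25"
```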

Date and time modifiers
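
A minimal sketch of modifying components, assuming the modifiers meant here are the setter forms of the accessor functions, update() and floor_date() from lubridate. The values are arbitrary.

```r
library(lubridate)

x <- ymd_hms("2020-12-25 10:30:45")

# Accessor functions can also be used on the left-hand side to set a component
year(x) <- 2021
month(x) <- 1
x                       # 2021-01-25 10:30:45 UTC

# update() changes several components at once and returns a new object
y <- update(x, day = 1, hour = 0)
y                       # 2021-01-01 00:30:45 UTC

# floor_date() rounds down to the start of a unit
z <- floor_date(x, "month")
z                       # 2021-01-01 UTC
```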

Durations

  • Duration is a special class in lubridate
  • Some convenient constructors for Duration are:
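
A sketch of some Duration constructors, assuming lubridate is installed. A Duration is stored as an exact number of seconds.

```r
library(lubridate)

half_minute <- dseconds(30)
five_min    <- dminutes(5)
two_hours   <- dhours(2)
one_day     <- ddays(1)
one_week    <- dweeks(1)

is.duration(one_day)    # TRUE
as.numeric(one_day)     # 86400 seconds
as.numeric(one_week)    # 604800 seconds
```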

Maths with Durations

  • Daylight saving started on Sun 4th Oct 2020 at 2AM in Melbourne
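
A sketch of what adding a Duration does across the start of daylight saving, assuming lubridate is installed and the system has time zone data for Australia/Melbourne. The starting time of noon on the previous day is an arbitrary choice.

```r
library(lubridate)

# Noon on the day before daylight saving started in Melbourne
start <- ymd_hms("2020-10-03 12:00:00", tz = "Australia/Melbourne")

# A Duration is an exact number of seconds, so one day adds 86400 seconds
after_duration <- start + ddays(1)
after_duration   # 2020-10-04 13:00:00 AEDT: the clock reads one hour later

# Durations can also be combined arithmetically
ninety_minutes <- dhours(1) + dminutes(30)
as.numeric(ninety_minutes)   # 5400 seconds
```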

Period

  • Period is a special class in lubridate
  • Constructors for Period are like for Duration but without the prefix “d”:
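
A sketch of some Period constructors, assuming lubridate is installed. Unlike a Duration, a Period records units of clock and calendar time rather than a fixed number of seconds.

```r
library(lubridate)

half_minute <- seconds(30)
five_min    <- minutes(5)
two_hours   <- hours(2)
one_day     <- days(1)
one_week    <- weeks(1)
one_month   <- months(1)
one_year    <- years(1)

is.period(one_day)     # TRUE
is.duration(one_day)   # FALSE
```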

Maths with Period
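
A sketch of adding a Period across the start of daylight saving in Melbourne on 4 October 2020, assuming lubridate is installed and the system has time zone data for Australia/Melbourne. The starting time is an arbitrary choice.

```r
library(lubridate)

# Noon on the day before daylight saving started in Melbourne
start <- ymd_hms("2020-10-03 12:00:00", tz = "Australia/Melbourne")

# A Period of one day moves the calendar forward and keeps the clock time
after_period <- start + days(1)
after_period   # 2020-10-04 12:00:00 AEDT

# Only 23 hours have actually elapsed because the clocks moved forward
elapsed <- difftime(after_period, start, units = "hours")
elapsed        # Time difference of 23 hours
```

A Period is usually what people mean by "the same time tomorrow", while a Duration measures elapsed time exactly.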

lubridate cheatsheet