Presentation ready plots 2:
Telling a story

Lecture 19

Dr. Mine Çetinkaya-Rundel

Duke University
STA 113 - Fall 2023

Warm-up

Announcements

  • Project 2 presentations in class on Thursday – food preference?

  • Any questions?

Exam 2 redo (optional)

  • Due: Monday 12/11 at 5 pm
  • Take home:
    • Message me on Slack to let me know if you want to work on this.
    • Work in exam-2-redo.qmd. This is a copy of your exam submission, without any changes I might have made to get it to render. Do not overwrite exam-2.qmd.
  • In class:
    • Write/type up corrections + reasoning for corrections (even for questions that didn’t originally ask for reasoning) on a separate piece of paper.
    • Return original exam + redo together at my office (slide under door if I’m not there).
  • Improve your answers working on your own. The same rules as the exam apply.
  • You will be eligible to receive up to 50% of the points you missed on each portion of the exam.

Code review

  • Clone your assigned team’s project and (try to) render it
  • First, review the organization of the project/repo
  • Then, review the code and open issues associated with any lines of code you want to make specific comments about [DEMO]
  • Finally, fill out the “Code review” issue
  • Be critical, but constructive

Setup

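# load packages
library(tidyverse)       # ggplot2, dplyr, forcats; lubridate is attached as of tidyverse 2.0.0
library(palmerpenguins)  # penguins data
library(fs)              # dir_ls() for listing data files
library(openintro)       # state_stats data
library(glue)            # glue() for building plot titles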

# set theme for ggplot2
ggplot2::theme_set(ggplot2::theme_minimal(base_size = 14))

Animation (as requested)

Philosophy

  • The purpose of interactivity is to display more than can be achieved with persistent plot elements, and to invite the reader to engage with the plot.

  • Animation allows more information to be displayed, but the developer keeps control

  • Beware that it is easy to forget what was just displayed, so keeping some elements persistent, maybe faint, can be useful for the reader

gganimate

  • gganimate extends the grammar of graphics as implemented by ggplot2 to include the description of animation

  • It provides a range of new grammar classes that can be added to the plot object in order to customize how it should change with time

How does gganimate work?

  • Start with a ggplot2 specification

  • Add layers with graphical primitives (geoms)

  • Add formatting specification

  • Add animation specification
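
The four steps above can be sketched as follows. This is a minimal illustration, not code from the lecture: it assumes the gganimate package is installed, and the choice of `species` as the animated variable is arbitrary. The `{closest_state}` placeholder in the title is filled in by `transition_states()`.

```r
library(ggplot2)
library(gganimate)
library(palmerpenguins)

anim <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) + # 1. ggplot2 specification
  geom_point(na.rm = TRUE) +                                            # 2. layer with a geom
  labs(                                                                 # 3. formatting
    title = "Species: {closest_state}",
    x = "Flipper length (mm)", y = "Body mass (g)"
  ) +
  transition_states(species, transition_length = 1, state_length = 2)   # 4. animation

# Rendering is the slow step and needs a renderer such as the gifski package:
# animate(anim, nframes = 60, fps = 10)
```

Until `animate()` is called (or the object is printed), `anim` is only a specification, so it can be built up and adjusted like any other ggplot object.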

Grammar of animation

  • Transitions: transition_*() defines how the data should be spread out and how it relates to itself across time

  • Views: view_*() defines how the positional scales should change along the animation

  • Shadows: shadow_*() defines how data from other points in time should be presented in the given point in time

  • Entrances/Exits: enter_*()/exit_*() defines how new data should appear and how old data should disappear during the course of the animation

  • Easing: ease_aes() defines how different aesthetics should be eased during transitions
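
A hedged sketch of how these pieces might combine on one plot, assuming gganimate is installed. The specific choices (animating over `year`, fading, cubic easing, the shadow's alpha and size) are illustrative assumptions. No `view_*()` is added, so the positional scales stay fixed.

```r
library(ggplot2)
library(gganimate)
library(palmerpenguins)

anim_grammar <- ggplot(
  penguins,
  aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point(na.rm = TRUE) +
  labs(title = "Year: {closest_state}") +
  # Transition: one state per study year
  transition_states(year, transition_length = 2, state_length = 1) +
  # Shadow: keep earlier years on screen, but faint and small
  shadow_mark(alpha = 0.2, size = 1) +
  # Entrance/exit: points fade in and out rather than popping
  enter_fade() +
  exit_fade() +
  # Easing: slow start and end for every animated aesthetic
  ease_aes("cubic-in-out")

# animate(anim_grammar, nframes = 60, fps = 10)
```

The shadow layer is one way to keep some elements persistent but faint, so the reader does not have to remember what was just displayed.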

Learn more

Themes

Complete themes

p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point()

p + theme_gray() + labs(title = "Gray")
p + theme_void() + labs(title = "Void")
p + theme_dark() + labs(title = "Dark")

Themes from ggthemes

library(ggthemes)

p + theme_fivethirtyeight() + labs(title = "FiveThirtyEight")
p + theme_economist() + labs(title = "Economist")
p + theme_wsj() + labs(title = "Wall Street Journal")

Themes and color scales from ggthemes

p + 
  aes(color = species) +
  scale_color_wsj() +
  theme_wsj() + 
  labs(title = "Wall Street Journal")

Modifying theme elements

p + 
  labs(title = "Palmer penguins") +
  theme(
    plot.title = element_text(color = "red", face = "bold", family = "Comic Sans MS"),
    plot.background = element_rect(color = "red", fill = "mistyrose")
  )

Axes

Presidential terms

How can the following figure be improved with custom breaks in axes, if at all?

ggplot(presidential, aes(x = start, xend = end, y = name, yend = name)) +
  geom_segment()
presidential
# A tibble: 12 × 4
   name       start      end        party     
   <chr>      <date>     <date>     <chr>     
 1 Eisenhower 1953-01-20 1961-01-20 Republican
 2 Kennedy    1961-01-20 1963-11-22 Democratic
 3 Johnson    1963-11-22 1969-01-20 Democratic
 4 Nixon      1969-01-20 1974-08-09 Republican
 5 Ford       1974-08-09 1977-01-20 Republican
 6 Carter     1977-01-20 1981-01-20 Democratic
 7 Reagan     1981-01-20 1989-01-20 Republican
 8 Bush       1989-01-20 1993-01-20 Republican
 9 Clinton    1993-01-20 2001-01-20 Democratic
10 Bush       2001-01-20 2009-01-20 Republican
11 Obama      2009-01-20 2017-01-20 Democratic
12 Trump      2017-01-20 2021-01-20 Republican

Context matters: y-axis breaks

presidential |>
  mutate(name = fct_reorder(name, start)) |>
  ggplot(aes(x = start, xend = end, y = name, yend = name)) +
  geom_segment()
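
Reordering alone does not separate presidents who share a name. A quick check, assuming the `presidential` data that ships with ggplot2, shows why: a factor has one level per distinct name, so rows with the same name land on the same y-axis position.

```r
library(tidyverse)

# Names that appear on more than one row of the original data
repeated <- ggplot2::presidential |>
  count(name) |>
  filter(n > 1)
repeated

# One level per distinct name, so there are fewer levels than rows
reordered <- fct_reorder(ggplot2::presidential$name, ggplot2::presidential$start)
nlevels(reordered) < nrow(ggplot2::presidential)
```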

Context matters: y-axis breaks

presidential |>
  mutate(
    name = case_when(
      name == "Bush" & year(start) == 1989 ~ "Bush I",
      name == "Bush" & year(start) == 2001 ~ "Bush II",
      .default = name
    ),
    name = fct_reorder(name, start)
  ) |>
  ggplot(aes(x = start, xend = end, y = name, yend = name)) +
  geom_segment()

Context matters: x-axis breaks

presidential <- presidential |>
  mutate(
    name = case_when(
      name == "Bush" & year(start) == 1989 ~ "Bush I",
      name == "Bush" & year(start) == 2001 ~ "Bush II",
      .default = name
    ),
    name = fct_reorder(name, start),
    start = year(start),
    end = year(end)
  )

ggplot(
  presidential,
  aes(x = start, xend = end, y = name, yend = name)
  ) +
  geom_segment() +
  scale_x_continuous(
    breaks = seq(from = 1952, to = 2020, by = 4),
    minor_breaks = NULL
  )

Colors matter

ggplot(
  presidential,
  aes(x = start, xend = end, y = name, yend = name,
      color = party)
  ) +
  geom_segment(show.legend = FALSE) +
  scale_x_continuous(
    breaks = seq(from = 1952, to = 2020, by = 4),
    minor_breaks = NULL
  ) +
  scale_color_manual(
    values = c(
      "Democratic" = "blue",
      "Republican" = "red"
    )
  )

Precision matters

ggplot(
  presidential,
  aes(x = start, xend = end, y = name, yend = name, color = party)
  ) +
  geom_segment(show.legend = FALSE) +
  scale_x_continuous(
    breaks = seq(from = 1952, to = 2020, by = 4),
    minor_breaks = NULL
  ) +
  scale_color_manual(
    values = c(
      "Democratic" = "blue",
      "Republican" = "red"
    )
  ) +
  labs(x = "Election year", y = "President")

Annotation

Why annotate?

geom_text()

Can be useful when individual observations are identifiable, but can also get overwhelming…

How would you improve this visualization?

ggplot(state_stats, aes(x = homeownership, y = pop2010)) + 
  geom_point()

ggplot(state_stats, aes(x = homeownership, y = pop2010)) + 
  geom_text(aes(label = abbr))
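
One possible direction, sketched under the assumption that the story is about the five most populous states (the cutoff of five is arbitrary): draw every state as a faint point and label only that subset.

```r
library(tidyverse)
library(openintro)

# Assumption for illustration: the states worth naming are the five largest
top_states <- state_stats |>
  slice_max(pop2010, n = 5, with_ties = FALSE)

p_labeled <- ggplot(state_stats, aes(x = homeownership, y = pop2010)) +
  geom_point(color = "gray") +                   # all states, de-emphasized
  geom_point(data = top_states, color = "red") + # the subset, emphasized
  geom_text(data = top_states, aes(label = abbr), vjust = -0.8)

p_labeled
```

Each layer can take its own `data` argument, which is what makes this kind of selective labeling possible without any extra packages.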

All of the data doesn’t tell a story

Highlighting in ggplot2

We have (at least) two options:

  1. Native ggplot2 – use layers

  2. gghighlight: https://yutannihilation.github.io/gghighlight/articles/gghighlight.html
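
A minimal sketch of the second option, assuming the gghighlight package is installed. Because the SF files are local, this uses base R's `airquality` data (daily ozone in New York, May to September 1973) as a stand-in; the threshold of 150 is an arbitrary choice for illustration.

```r
library(tidyverse)
library(gghighlight)

# Stand-in data (an assumption, not the SF AQI files)
aq <- as_tibble(airquality) |>
  drop_na(Ozone)

p_highlight <- ggplot(aq, aes(x = Day, y = Ozone, color = factor(Month))) +
  geom_line() +
  # For lines, the condition is evaluated once per group (here, per month);
  # groups that fail it are drawn in gray
  gghighlight(max(Ozone) > 150, use_direct_label = FALSE) +
  labs(x = "Day of month", y = "Ozone (ppb)", color = "Month")

p_highlight
```

Compared with the layered approach, the highlighting rule lives in a single `gghighlight()` call instead of a second, filtered layer.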

Data: SF AQI

sf_files <- fs::dir_ls(here::here("data/san-francisco"))
sf <- read_csv(sf_files, na = c(".", ""))

sf <- sf |>
  janitor::clean_names() |>
  mutate(date = mdy(date)) |>
  arrange(date) |>
  select(date, aqi_value)

sf
# A tibble: 2,557 × 2
   date       aqi_value
   <date>         <dbl>
 1 2016-01-01        32
 2 2016-01-02        37
 3 2016-01-03        45
 4 2016-01-04        33
 5 2016-01-05        27
 6 2016-01-06        39
 7 2016-01-07        39
 8 2016-01-08        31
 9 2016-01-09        20
10 2016-01-10        20
# ℹ 2,547 more rows

Data prep

sf <- sf |>
  mutate(
    year = year(date),
    day_of_year = yday(date)
  )
# check
sf |>
  filter(day_of_year < 3)
# A tibble: 14 × 4
   date       aqi_value  year day_of_year
   <date>         <dbl> <dbl>       <dbl>
 1 2016-01-01        32  2016           1
 2 2016-01-02        37  2016           2
 3 2017-01-01        55  2017           1
 4 2017-01-02        36  2017           2
 5 2018-01-01        87  2018           1
 6 2018-01-02        95  2018           2
 7 2019-01-01        33  2019           1
 8 2019-01-02        50  2019           2
 9 2020-01-01        53  2020           1
10 2020-01-02        43  2020           2
11 2021-01-01        79  2021           1
12 2021-01-02        57  2021           2
13 2022-01-01        53  2022           1
14 2022-01-02        55  2022           2

Plot AQI over years

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line()

Plot AQI over years

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year, color = year)) +
  geom_line()

Plot AQI over years

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year, color = factor(year))) +
  geom_line()

Highlight 2016

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "gray") +
  geom_line(data = sf |> filter(year == 2016), color = "red") +
  labs(
    title = "AQI levels in SF in 2016",
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
    )

Highlight 2017

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "gray") +
  geom_line(data = sf |> filter(year == 2017), color = "red") +
  labs(
    title = "AQI levels in SF in 2017",
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
    )

Highlight 2018

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "gray") +
  geom_line(data = sf |> filter(year == 2018), color = "red") +
  labs(
    title = "AQI levels in SF in 2018",
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
    )

Highlight any year

year_to_highlight <- 2018

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  geom_line(color = "gray") +
  geom_line(data = sf |> filter(year == year_to_highlight), color = "red") +
  labs(
    title = glue("AQI levels in SF in {year_to_highlight}"),
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
    )
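
A natural next step is to wrap the plot in a function, so the year becomes an argument. The sketch below is one way this might look: `highlight_year()` is a hypothetical helper name, and `sf_sample` holds only the six rows for 2016 to 2018 copied from the "check" output above, so the block runs without the data files.

```r
library(tidyverse)
library(glue)

# Hypothetical helper: one year in red against all years in gray
highlight_year <- function(data, year_to_highlight) {
  ggplot(data, aes(x = day_of_year, y = aqi_value, group = year)) +
    geom_line(color = "gray") +
    geom_line(data = data |> filter(year == year_to_highlight), color = "red") +
    labs(
      title = glue("AQI levels in SF in {year_to_highlight}"),
      subtitle = glue("Versus all years {min(data$year)} - {max(data$year)}"),
      x = "Day of year", y = "AQI value"
    )
}

# Small stand-in for `sf`: the first two days of 2016-2018 from the check output
sf_sample <- tribble(
  ~year, ~day_of_year, ~aqi_value,
  2016, 1, 32,
  2016, 2, 37,
  2017, 1, 55,
  2017, 2, 36,
  2018, 1, 87,
  2018, 2, 95
)

# One plot per year, without copying and pasting the plotting code
plots <- map(2016:2018, \(y) highlight_year(sf_sample, y))
plots[[3]]
```

With the full data, `highlight_year(sf, 2020)` would produce the same figure as the copy-and-paste version, with the subtitle's year range taken from the data.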

Quarto

Quarto tips

  • Figures and tables

  • Cross references

  • Bibliography

  • Slides
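
As a minimal sketch of the first two tips, assuming an R code cell inside a .qmd file: the `#|` comment lines at the top of the cell are Quarto cell options, and a label that starts with `fig-` makes the figure available for cross-referencing as `@fig-penguins` in the surrounding prose. The label, caption, and sizing values are made up for illustration.

```r
#| label: fig-penguins
#| fig-cap: "Body mass versus flipper length of Palmer penguins."
#| fig-width: 6
#| fig-asp: 0.618

library(ggplot2)
library(palmerpenguins)

fig_penguins <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(na.rm = TRUE)

fig_penguins
```

Because the options are ordinary R comments, the same code also runs unchanged outside Quarto.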