Lecture 19
Duke University
STA 113 - Fall 2023
Project 2 presentations in class on Thursday – food preference?
Any questions?
exam-2-redo.qmd: this is a copy of your exam submission, without any changes I might have implemented to get it to render – do not overwrite exam-2.qmd.
The purpose of interactivity is to display more than can be achieved with persistent plot elements, and to invite the reader to engage with the plot.
Animation allows more information to be displayed, but the developer keeps control
Beware that it is easy to forget what was just displayed, so keeping some elements persistent, maybe faint, can be useful for the reader
gganimate extends the grammar of graphics as implemented by ggplot2 to include the description of animation
It provides a range of new grammar classes that can be added to the plot object in order to customize how it should change with time
Start with a ggplot2 specification
Add layers with graphical primitives (geoms)
Add formatting specification
Add animation specification
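As a minimal sketch of these four steps, the following uses the built-in airquality data set, which is an assumption chosen for illustration and not the lecture's data, together with transition_reveal() from gganimate. Nothing is rendered until the object is printed or passed to animate().

```r
library(ggplot2)
library(gganimate)

# 1. Start with a ggplot2 specification
anim <- ggplot(
  airquality,
  aes(x = Day, y = Temp, group = Month, color = factor(Month))
) +
  # 2. Add layers with graphical primitives (geoms)
  geom_line() +
  # 3. Add formatting specification
  labs(x = "Day of month", y = "Temperature (F)", color = "Month") +
  theme_minimal() +
  # 4. Add animation specification: reveal each line gradually along Day
  transition_reveal(Day)
```

Calling animate(anim) renders the frames; a renderer such as the gifski package typically needs to be installed for GIF output.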
Transitions: transition_*()
defines how the data should be spread out and how it relates to itself across time
Views: view_*()
defines how the positional scales should change along the animation
Shadows: shadow_*()
defines how data from other points in time should be presented in the given point in time
Entrances/Exits: enter_*()/exit_*()
defines how new data should appear and how old data should disappear during the course of the animation
Easing: ease_aes()
defines how different aesthetics should be eased during transitions
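To make the five grammar classes concrete, here is a hedged sketch that adds one member of each to a single plot. The iris data set and the particular functions and argument values are assumptions picked for illustration; other members of each family could be swapped in.

```r
library(ggplot2)
library(gganimate)

anim <- ggplot(iris, aes(x = Petal.Width, y = Petal.Length, color = Species)) +
  geom_point(size = 2) +
  # Transition: step through one species at a time
  transition_states(Species, transition_length = 2, state_length = 1) +
  # View: let the axis ranges follow the data currently shown
  view_follow() +
  # Shadow: keep earlier states on screen as faint, smaller points
  shadow_mark(alpha = 0.3, size = 0.5) +
  # Entrances/Exits: new points fade in, old points shrink away
  enter_fade() +
  exit_shrink() +
  # Easing: slow start and slow end for every animated aesthetic
  ease_aes("cubic-in-out")
```

The shadow layer is one way to act on the earlier advice about keeping some elements persistent but faint.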
STA 313 lecture on animation: https://vizdata.org/slides/23/23-animation#/animation
gganimate vignette: https://gganimate.com/articles/gganimate.html
How can the following figure be improved with custom breaks in axes, if at all?
# A tibble: 12 × 4
name start end party
<chr> <date> <date> <chr>
1 Eisenhower 1953-01-20 1961-01-20 Republican
2 Kennedy 1961-01-20 1963-11-22 Democratic
3 Johnson 1963-11-22 1969-01-20 Democratic
4 Nixon 1969-01-20 1974-08-09 Republican
5 Ford 1974-08-09 1977-01-20 Republican
6 Carter 1977-01-20 1981-01-20 Democratic
7 Reagan 1981-01-20 1989-01-20 Republican
8 Bush 1989-01-20 1993-01-20 Republican
9 Clinton 1993-01-20 2001-01-20 Democratic
10 Bush 2001-01-20 2009-01-20 Republican
11 Obama 2009-01-20 2017-01-20 Democratic
12 Trump 2017-01-20 2021-01-20 Republican
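# The tidyverse attaches dplyr, forcats, and lubridate, all used here
library(tidyverse)

presidential <- presidential |>
  mutate(
    # Disambiguate the two Bush presidencies by their start year
    name = case_when(
      name == "Bush" & year(start) == 1989 ~ "Bush I",
      name == "Bush" & year(start) == 2001 ~ "Bush II",
      .default = name
    ),
    # Order presidents chronologically along the y-axis
    name = fct_reorder(name, start),
    # Keep only the year of each date
    start = year(start),
    end = year(end)
  )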
ggplot(
  presidential,
  aes(x = start, xend = end, y = name, yend = name)
) +
  geom_segment() +
  scale_x_continuous(
    breaks = seq(from = 1952, to = 2020, by = 4),
    minor_breaks = NULL
  )
ggplot(
  presidential,
  aes(x = start, xend = end, y = name, yend = name, color = party)
) +
  geom_segment(show.legend = FALSE) +
  scale_x_continuous(
    breaks = seq(from = 1952, to = 2020, by = 4),
    minor_breaks = NULL
  ) +
  scale_color_manual(
    values = c(
      "Democratic" = "blue",
      "Republican" = "red"
    )
  ) +
  labs(x = "Election year", y = "President")
geom_text()
Can be useful when individual observations are identifiable, but can also get overwhelming…
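A small sketch of that trade-off, using the built-in mtcars data as a stand-in (an assumption, not the lecture's example): labeling every point quickly becomes cluttered, while labeling a chosen subset keeps the plot readable. The cutoff of 30 mpg is arbitrary.

```r
library(ggplot2)

# Row names hold the car models; move them into a column for labeling
cars <- mtcars
cars$model <- rownames(mtcars)

# Labeling every observation: identifiable, but overwhelming
p_all <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_text(aes(label = model))

# Labeling only a subset: pass filtered data to the text layer
efficient <- cars[cars$mpg > 30, ]

p_some <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_text(data = efficient, aes(label = model), vjust = -0.8)
```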
How would you improve this visualization?
We have (at least) two options:
Native ggplot2 – use layers
gghighlight: https://yutannihilation.github.io/gghighlight/articles/gghighlight.html
# sf_files is defined elsewhere (not shown here)
sf <- read_csv(sf_files, na = c(".", ""))
sf <- sf |>
  janitor::clean_names() |>
  mutate(date = mdy(date)) |>
  arrange(date) |>
  select(date, aqi_value)
sf
# A tibble: 2,557 × 2
date aqi_value
<date> <dbl>
1 2016-01-01 32
2 2016-01-02 37
3 2016-01-03 45
4 2016-01-04 33
5 2016-01-05 27
6 2016-01-06 39
7 2016-01-07 39
8 2016-01-08 31
9 2016-01-09 20
10 2016-01-10 20
# ℹ 2,547 more rows
# A tibble: 14 × 4
date aqi_value year day_of_year
<date> <dbl> <dbl> <dbl>
1 2016-01-01 32 2016 1
2 2016-01-02 37 2016 2
3 2017-01-01 55 2017 1
4 2017-01-02 36 2017 2
5 2018-01-01 87 2018 1
6 2018-01-02 95 2018 2
7 2019-01-01 33 2019 1
8 2019-01-02 50 2019 2
9 2020-01-01 53 2020 1
10 2020-01-02 43 2020 2
11 2021-01-01 79 2021 1
12 2021-01-02 57 2021 2
13 2022-01-01 53 2022 1
14 2022-01-02 55 2022 2
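The output above includes year and day_of_year columns, but the code that creates them is not shown. One plausible way to derive them, sketched here on a hypothetical stand-in data frame (sf_demo, with made-up AQI values), is with year() and yday() from lubridate:

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the sf data: one row per day, 2016 through 2022,
# with random AQI values (not the real measurements)
set.seed(1)
dates <- seq(as.Date("2016-01-01"), as.Date("2022-12-31"), by = "day")
sf_demo <- tibble(
  date = dates,
  aqi_value = round(runif(length(dates), min = 20, max = 100))
)

# Derive the calendar year and the day within that year
sf_demo <- sf_demo |>
  mutate(
    year = year(date),
    day_of_year = yday(date)
  )

# First two days of each year, matching the shape of the output above
first_days <- sf_demo |>
  filter(day_of_year <= 2)
```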
# glue() comes from the glue package
library(glue)

year_to_highlight <- 2018

ggplot(sf, aes(x = day_of_year, y = aqi_value, group = year)) +
  # All years in gray as context
  geom_line(color = "gray") +
  # The highlighted year drawn on top in red
  geom_line(data = sf |> filter(year == year_to_highlight), color = "red") +
  labs(
    title = glue("AQI levels in SF in {year_to_highlight}"),
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
  )
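For comparison, the second option, gghighlight, might look roughly like the sketch below. It runs on hypothetical stand-in data (sf_demo, random values) rather than the real sf data, and the arguments use_group_by = FALSE and use_direct_label = FALSE are choices made here to keep the predicate a plain row filter and to suppress direct labels. This is not necessarily how the lecture did it.

```r
library(ggplot2)
library(gghighlight)

# Hypothetical stand-in data: 365 days for each year 2016 through 2022
set.seed(1)
sf_demo <- expand.grid(day_of_year = 1:365, year = 2016:2022)
sf_demo$aqi_value <- round(runif(nrow(sf_demo), min = 20, max = 100))

year_to_highlight <- 2018

p <- ggplot(sf_demo, aes(x = day_of_year, y = aqi_value, group = year)) +
  # Draw every line in the highlight color ...
  geom_line(color = "red") +
  # ... then let gghighlight gray out rows that fail the predicate
  gghighlight(
    year == year_to_highlight,
    use_group_by = FALSE,
    use_direct_label = FALSE
  ) +
  labs(
    title = paste("AQI levels in SF in", year_to_highlight),
    subtitle = "Versus all years 2016 - 2022",
    x = "Day of year", y = "AQI value"
  )
```

Compared with the layered approach, the plot needs only one geom_line() call, and the highlighting rule lives in a single predicate.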
Figures and tables
Cross references
Bibliography
Slides