library(tidyverse)
library(tidymodels)
library(openintro)
Yawners
In this application exercise, we’ll introduce pipelines for conducting hypothesis tests with randomization.
Packages and data
We’ll use the tidyverse and tidymodels packages as usual and the openintro package for the datasets.
Using the yawn
dataset in the openintro package, conduct a hypothesis test for evaluating whether yawning is contagious. First, set the hypotheses. Then, conduct a randomization test using 1000 simulations. Visualize and calculate the p-value and use it to make a conclusion about the statistical discernability of the difference in proportions of yawners in the two groups. Then, comment on whether you “buy” this conclusion.
Exercise 1
Simulate a single difference in proportions assuming the null hypothesis (\(p_{treatment} = p_{control}\)) is true. Record the value.
set.seed(1234)
|>
yawn specify(response = result, explanatory = group, success = "yawn") |>
hypothesize(null = "independence") |>
generate(reps = 1, type = "permute") |>
calculate(stat = "diff in props", order = c("trmt", "ctrl"))
Response: result (factor)
Explanatory: group (factor)
Null Hypothesis: independence
# A tibble: 1 × 1
stat
<dbl>
1 0.0441
Exercise 2
Simulate one more difference in proportions assuming the null hypothesis (\(p_{treatment} = p_{control}\)) is true. Record the value.
# add code here
Exercise 3
Construct the null distribution with 100 resamples. Name it null_dist
. How many rows does null_dist
have? How many columns? What does each row and each column represent?
# add code here
Add response here.
Exercise 4
Where do you expect the center of the null distribution to be? Visualize it to confirm. First, make a dot plot, coloring each simulated statistic differently. Then, make a histogram.
# add code here
# option 1
# add code here
# option 2
# add code here
Exercise 5
Calculate the observed difference in proportions. Name it obs_stat
.
# add code here
Exercise 6
Overlay the observed statistic on the null distribution and comment on whether an observed outcome as extreme as the observed statistic, or lower, is a likely or unlikely outcome, if in fact the null hypothesis is true.
# option 1
# add code here
# option 2
# add code here
Exercise 7
Calculate the p-value and comment on whether it provides convincing evidence that yawning is contagious.
Add response here.
# option 1
# add code here
# option 2
# add code here
Exercise 8
Let’s get real! Redo the test with 10,000 simulations. Note: This can take some time to run.
# add code here
Exercise 9
Use the p-value to make a conclusion about the statistical discernability of the difference in proportions of yawners in the two groups. Then, comment on whether you “buy” this conclusion.
Add response here.