Yawners
In this application exercise, we’ll introduce pipelines for conducting hypothesis tests with randomization.
Packages and data
We’ll use the tidyverse and tidymodels packages as usual and the openintro package for the datasets.
library(tidyverse)
library(tidymodels)
library(openintro)
Using the yawn dataset in the openintro package, conduct a hypothesis test to evaluate whether yawning is contagious. First, set the hypotheses. Then, conduct a randomization test using 1000 simulations. Visualize and calculate the p-value, and use it to draw a conclusion about the statistical discernibility of the difference in proportions of yawners in the two groups. Finally, comment on whether you “buy” this conclusion.
Exercise 1
Simulate a single difference in proportions assuming the null hypothesis (\(p_{treatment} = p_{control}\)) is true. Record the value.
set.seed(1234)
yawn |>
  specify(response = result, explanatory = group, success = "yawn") |>
  hypothesize(null = "independence") |>
  generate(reps = 1, type = "permute") |>
  calculate(stat = "diff in props", order = c("trmt", "ctrl"))
Response: result (factor)
Explanatory: group (factor)
Null Hypothesis: independence
# A tibble: 1 × 1
stat
<dbl>
1 0.0441
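To see what the permute step is doing, here is a minimal base R sketch of a single shuffle. It uses a small made-up data frame, toy, rather than the yawn data, and the helper diff_in_props() is a name invented for this illustration; infer's internals may differ in detail.

```r
# Hypothetical toy data (NOT the yawn dataset): 10 treated, 10 control.
toy <- data.frame(
  group  = rep(c("trmt", "ctrl"), each = 10),
  result = c(rep("yawn", 4), rep("not yawn", 6),   # trmt: 4 of 10 yawn
             rep("yawn", 2), rep("not yawn", 8))   # ctrl: 2 of 10 yawn
)

# Difference in proportions of "yawn", treatment minus control.
# (Helper name is an assumption made for this sketch.)
diff_in_props <- function(result, group) {
  mean(result[group == "trmt"] == "yawn") -
    mean(result[group == "ctrl"] == "yawn")
}

# Observed difference in the toy data: 0.4 - 0.2.
obs_diff <- diff_in_props(toy$result, toy$group)

# One permutation: shuffle the group labels, which breaks any association
# between group and result, then recompute the statistic.
set.seed(1234)
shuffled_group <- sample(toy$group)
perm_diff <- diff_in_props(toy$result, shuffled_group)

obs_diff
perm_diff
```

Because the shuffle assigns outcomes to groups at random, perm_diff is one draw from the distribution of differences we would expect if group and result were independent.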
Exercise 2
Simulate one more difference in proportions assuming the null hypothesis (\(p_{treatment} = p_{control}\)) is true. Record the value.
# add code here
Exercise 3
Construct the null distribution with 100 resamples. Name it null_dist. How many rows does null_dist have? How many columns? What does each row and each column represent?
# add code here
Add response here.
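For intuition about what repeating the shuffle many times produces, the following base R sketch builds a small null distribution by hand. The data are made up (not the yawn dataset), the name toy_null is invented here, and the two-column layout is a choice made for this sketch rather than a statement about infer's output.

```r
# Hypothetical toy data (NOT the yawn dataset): 10 treated, 10 control.
group  <- rep(c("trmt", "ctrl"), each = 10)
result <- c(rep("yawn", 4), rep("not yawn", 6),
            rep("yawn", 2), rep("not yawn", 8))

set.seed(1234)
n_reps <- 100

# Each replicate shuffles the group labels and records one simulated
# difference in proportions (treatment minus control).
stats <- replicate(n_reps, {
  g <- sample(group)
  mean(result[g == "trmt"] == "yawn") - mean(result[g == "ctrl"] == "yawn")
})

# One row per resample: its index and its simulated statistic.
toy_null <- data.frame(replicate = seq_len(n_reps), stat = stats)

dim(toy_null)        # 100 rows, 2 columns
mean(toy_null$stat)  # expected to be close to 0 under the null
```

Each shuffle gives one simulated statistic, so the number of rows matches the number of resamples.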
Exercise 4
Where do you expect the center of the null distribution to be? Visualize it to confirm. First, make a dot plot, coloring each simulated statistic differently. Then, make a histogram.
# add code here
# option 1
# add code here
# option 2
# add code here
Exercise 5
Calculate the observed difference in proportions. Name it obs_stat.
# add code here
Exercise 6
Overlay the observed statistic on the null distribution and comment on whether an outcome at least as extreme as the observed statistic is likely or unlikely if the null hypothesis is in fact true.
# option 1
# add code here
# option 2
# add code here
Exercise 7
Calculate the p-value and comment on whether it provides convincing evidence that yawning is contagious.
Add response here.
# option 1
# add code here
# option 2
# add code here
Exercise 8
Let’s get real! Redo the test with 10,000 simulations. Note: This can take some time to run.
# add code here
Exercise 9
Use the p-value to draw a conclusion about the statistical discernibility of the difference in proportions of yawners in the two groups. Then, comment on whether you “buy” this conclusion.
Add response here.
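As a reminder of what a randomization p-value measures, the sketch below computes one by hand in base R. The vector null_stats and the value obs are made-up numbers for illustration, not results from the yawn data, and the two-sided version shown is one common definition that may not match how infer computes it.

```r
# Hypothetical simulated statistics and observed statistic (NOT from yawn).
null_stats <- c(-0.3, -0.2, -0.1, -0.1, 0.0, 0.0, 0.1, 0.1, 0.2, 0.3)
obs <- 0.2

# One-sided ("greater") p-value: the share of simulated statistics that are
# at least as large as the observed statistic.
p_greater <- mean(null_stats >= obs)

# Two-sided p-value (one common definition): the share of simulated
# statistics at least as far from 0 as the observed statistic.
p_two_sided <- mean(abs(null_stats) >= abs(obs))

p_greater    # 0.2
p_two_sided  # 0.4
```

A small p-value means that statistics like the observed one rarely arise from shuffling alone. Whether that is convincing also depends on the study design and sample size.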