<- read.csv("http://math.carleton.edu/Stats215/Textbook/SleepCaffeine.csv")
SleepCaffeine # Create a boxplot using ggplot2
library(ggplot2)
ggplot(SleepCaffeine, aes(x = Group, y = Words)) +
geom_boxplot() +
labs(title = "Boxplot of Words by Group")
Practice Problems 12
Problem 1 Revisited:
(a). If the results of a test for ESP are statistically discernible, what does that mean in terms of ESP?
Click for answer
Answer:
It means we can conclude that \(p > 0.2\) and that the sample results were so strong that we can conclude that ESP does exist and get more right than would be expected by random chance.
Click for answer
Answer:
The sample results are inconclusive. People may or may not have ESP. Sample results could be just random chance.Problem 2: Sleep or Caffeine for Memory
In an experiment, 24 students were given words to memorize, then were randomly assigned to take a 90 minute nap or take a caffeine pill (12 in each group). They were then tested on their recall ability. We test to see if the sample provides evidence that there is a difference in mean number of words people can recall depending on whether they take a nap or have some caffeine. The hypotheses are:
\[ H_0: \mu_S - \mu_C = 0 \ \ H_A: \mu_S - \mu_C \neq 0 \]
The sample mean difference is \(\bar{x}_S - \bar{x}_C = 3\). We want to know if this difference in sample means is statistically discernible.
(a) Explain how to generate a randomization distribution for \(\bar{x}_S - \bar{x}_C\) that is consistent with \(H_0: \mu_S - \mu_C = 0\).
Click for answer
Answer: We could randomly reassign the treatment to the study participants since, under the null, their recall abilities would be the same under either treatment. For each reassignment, we recomputed the sample mean difference and plot it in the dotplot shown below
(c) Compute the randomization p-value
Select the Two-Tail button at the top of the plot. Change the positive x-axis value to the observed difference of 3.0. The p-value is 2 times the proportion of resamples that have a difference of 3 or above. What is the p-value?
Click for answer
Answer: We see in the image that the proportion in the tail beyond the sample statistic of 3.0 is 0.022. Because this is a two-tail test, we have to account for both tails, so the p-value is 2(0.022) = 0.044.
(d) Interpret + Conclusion
Interpret the p-value. Does the p-value support the alternative hypothesis (do you think difference of means of 3 is statistically discernible) or is it inconclusive? Explain.
Click for answer
Answer: We would see a difference of at least 3 words recalled, on average, in about 4.4% of all possible samples if the influence of sleep and caffeine on recall was the same The results show some evidence of statistical significance, meaning that the caffeine and sleep may have some difference effects on word recall ability.
(e) Redo in Rstudio
First get the data from the Lock website and check important summary stats:
# Summary statistics using dplyr for 'Words'
library(dplyr)
%>%
SleepCaffeine group_by(Group) %>%
summarize(
min = min(Words, na.rm = TRUE),
q1 = quantile(Words, 0.25, na.rm = TRUE),
median = median(Words, na.rm = TRUE),
mean = mean(Words, na.rm = TRUE),
q3 = quantile(Words, 0.75, na.rm = TRUE),
max = max(Words, na.rm = TRUE),
sd = sd(Words, na.rm = TRUE)
%>% knitr::kable(caption="Summary Statistics of Words Recalled by Treatment Group") )
Group | min | q1 | median | mean | q3 | max | sd |
---|---|---|---|---|---|---|---|
Caffeine | 6 | 10.00 | 12.5 | 12.25 | 14.25 | 18 | 3.545163 |
Sleep | 9 | 13.75 | 15.5 | 15.25 | 17.25 | 21 | 3.306330 |
Then load the CarletonStats
package and run the permTest(y ~ x, data=)
command where y
is your quantitative (or 0/1 coded) response and x
defines the two groups you are comparing.
set.seed(123)
library(CarletonStats)
permTest(Words ~ Group, data=SleepCaffeine)
** Permutation test **
Permutation test with alternative: two.sided
Observed statistic
Caffeine : 12.25 Sleep : 15.25
Observed difference: -3
Mean of permutation distribution: 0.01573
Standard error of permutation distribution: 1.49817
P-value: 0.0492
*-------------*
- Why is the observed difference reported as -3?
Click for answer
Answer: The difference is computed alphabetically: Caffeine minus Sleep so the difference in now -3 instead of +3.
- What is the p-value? Is it the same as the Statkey p-value? The same as your neighbors p-value? Why not?
Click for answer
Answer: The p-value is around 5%. Any difference between Statkey, neighbors or different runs of the permTest
command stem from the fact that different resamples are obtained each time a randomization distribution is generated. There may be some small (inconsequential) difference in p-values due to this.
Problem 3: Resident vs Non-resident Tuition
The lab manual data set Tuition2006
is a random sample of state colleges and universities in the U.S. We want to know if the average tuition charged to non-residents is higher than residents for all state colleges and universities:
\[ H_0: \mu_{Non-res} - \mu_{Res} = 0 \ \ H_A: \mu_{Non-res} - \mu_{Res} > 0 \]
(a) Paired Data
Read in the data. Note that each case (school) has a response value for the resident and non-resident tuition variables. This makes this a paired data example. Contrast this with the word recall example in which each case (student) only had one response (word recall) and treatment (caffeine/sleep).
<- read.csv("http://math.carleton.edu/Stats215/RLabManual/Tuition2006.csv")
tuition head(tuition)
X Institution Res NonRes Diff
1 1 Univ of Akron (OH) 4200 8800 -4600
2 2 Athens State (AL) 1900 3600 -1700
3 3 Ball State (IN) 3400 8600 -5200
4 4 Bloomsburg U (PA) 3200 7000 -3800
5 5 UC Irvine (CA) 3400 12700 -9300
6 6 Central State (OH) 2600 5700 -3100
(b) Permutation test for paired data
Let’s compute the difference of non-resident and resident tuition (NR minus R):
library(dplyr)
<- tuition %>%
tuition mutate(diff = NonRes - Res)
# Summary statistics of 'diff'
%>%
tuition summarize(
min = min(diff, na.rm = TRUE),
q1 = quantile(diff, 0.25, na.rm = TRUE),
median = median(diff, na.rm = TRUE),
mean = mean(diff, na.rm = TRUE),
q3 = quantile(diff, 0.75, na.rm = TRUE),
max = max(diff, na.rm = TRUE),
sd = sd(diff, na.rm = TRUE)
%>% knitr::kable(caption="Summary Statistics of Difference in Residential and Non-residential Tuition") )
min | q1 | median | mean | q3 | max | sd |
---|---|---|---|---|---|---|
200 | 2650 | 3100 | 3584.211 | 4500 | 9300 | 2073.447 |
# Histogram of 'diff'
ggplot(tuition, aes(x = diff)) +
geom_histogram(binwidth = 1300, fill = "turquoise", color = "black") +
labs(title = "Histogram of Tuition Differences", x = "Difference (NonRes - Res)", y = "Frequency")
- What is the average difference in tuition costs?
Click for answer
Answer: The observed mean difference is $3584
- Is this observed mean difference statistically significant? To test use the command
permTestPaired
:
set.seed(123)
permTestPaired(NonRes ~ Res,data = tuition, alt = "greater")
** Permutation test **
Permutation test with alternative: greater
Observed mean
NonRes : 2821.053 Res : 6405.263
Observed difference: 3584.211
Mean of permutation distribution: 4.69679
Standard error of permutation distribution: 950.3981
P-value: 1e-04
*-------------*
The alt
of greater
was used because the function permTestPaired(A ~ B)
computes paired differences as “A” minus “B”.
- What is the p-value for this test?
Click for answer
Answer: Less than 0.0001
- Is this observed mean difference statistically significant?
Click for answer
Answer: Yes, an observed mean difference of at least $3584 would rarely occur just by chance which provides us strong evidence that the mean tuition amount of non-residents is higher than residents in the population of state colleges and universities (in 2006).
Problem 4: Evaluating Drugs to Fight Cocaine Addition
In a randomized experiment on treating cocaine addiction, 48 cocaine addicts who were trying to quit were randomly assigned to take either desipramine (a new drug), or Lithium (an existing drug). The response variable is whether or not the person relapsed (which means the person was unable to break out of the cycle of addiction and returned to using cocaine.) We are testing to see if desipramine is better than lithium at treating cocaine addiction. The results are shown in the two-way table.
\ | Relapse | No Relapse | total |
---|---|---|---|
Desipramine | 10 | 14 | 24 |
Lithium | 18 | 6 | 24 |
(a) Using \(p_D\) for the true proportion of desipramine users who relapse and \(p_L\) for the true proportion of lithium users who relapse, write the null and alternative hypotheses.
Click for answer
Answer: \(H_0: p_D - p_L =0\) vs. \(H_A: p_D - p_L < 0\)
(b) Compute the appropriate sample statistic needed to assess the hypotheses above.
Click for answer
Answer: We see that \(\hat{p}_D = \dfrac{10}{24} = 0.417\) and \(\hat{p}_L = \dfrac{18}{24} = 0.75\) so we have \(\hat{p}_D -\hat{p}_L = 0.417 - 0.75 = -0.333\). Be sure to compute the difference since we need one number (observed difference) to test the hypotheses, not two separate numbers. You could also compute the difference as \(L-D\) and get +0.333.
(c) How might we compute a randomization sample for this data?
Click for answer
Answer: Since drug doesn’t matter, we combine all 48 patients together and see that 28 relapsed and 20 didn’t. To see what happens by random chance, we randomly divide them into two groups and compute the difference in proportions of relapses between the two groups. The difference in proportions is the statistic.
(e) Compute and interpret the p-value for this test.
Click for answer
Answer: This is a left-tail test when computing the difference as D - L, and we see on StatKey that the p-value (proportion of randomization samples with a difference -.333 or smaller) is about 2 (Figure 1). About 2% of the time we would see at least 33% fewer relapse cases using despramine than lithium just due to chance if there was no difference in the relapse rates of the two treatments.
Note the two key features of this “in context” interpretation of 2%: it assumes that the null is true (no treatment difference) and it uses the observed statistic (data) used to compute the p-value (rate of despramine relapse is .33 below the rate of lithium).
(f) Make a formal decision (reject or not) using a 5% significance level, then restate your conclusion in context for the problem (do not use words like “reject” or “hypothesis”).
Click for answer
Answer: We reject the null hypothesis since the p-value of 2% is less than 5%. We can conclude that despramine is better at helping people kick the cocaine habit. Note the “in context” conclusion: Just state your conclusion in english, no need to talk about the value of the p-value or “just by chance.”
(h) Use Statkey to compute and interpret a 95% bootstrap confidence interval for the difference in the relapse proportion for the two treatments. Explain how this CI agrees with your test conclusion in (f).
Click for answer
Answer: I am 95% confident that the relapse rate for despramine will be between 8.3 to 58.3 percent less than the relapse rate for lithium. This completely agrees with the test conclusion that despramine is a better treatment for cocaine addiction. (Figure 2 shows the bootstrap distribution that is centered at the sample difference of -0.333.)