example_data = data.frame(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                          Greeting = c(rep("Hello", 5), rep("Goodbye", 5)),
                          Male = rep(c(TRUE, FALSE), 5),
                          Weight = runif(n = 10, 50, 300))
Practice Problems 1
Problem 1
- Run the following chunk. Comment on the output.
Click for answer
example_data
   ID Greeting  Male    Weight
1   1    Hello  TRUE 255.61071
2   2    Hello FALSE 232.02432
3   3    Hello  TRUE 206.48462
4   4    Hello FALSE 184.51848
5   5    Hello  TRUE  92.40136
6   6  Goodbye FALSE  65.14615
7   7  Goodbye  TRUE 261.19309
8   8  Goodbye FALSE 257.97052
9   9  Goodbye  TRUE 245.32082
10 10  Goodbye FALSE 149.85527
Answer: The ID column is an identifier for the cases. The remaining columns record the greeting type, whether the case is male, and the weight of each case.
- What is the dimension of the dataset called ‘example_data’?
Click for answer
dim(example_data)
## [1] 10 4
nrow(example_data)
## [1] 10
ncol(example_data)
## [1] 4
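Because Weight is drawn with runif(), the values printed above will differ each time the chunk is run. The following is a minimal sketch of how the data frame could be made reproducible, and of how dim() relates to nrow() and ncol(). The seed value 123 and the use of 1:10 for ID are arbitrary choices for illustration, not part of the original exercise.

```r
# Fix the random seed so that runif() draws the same weights on every run.
# The seed value 123 is an arbitrary, illustrative choice.
set.seed(123)

example_data <- data.frame(
  ID = 1:10,                                   # same values as c(1, 2, ..., 10)
  Greeting = c(rep("Hello", 5), rep("Goodbye", 5)),
  Male = rep(c(TRUE, FALSE), 5),
  Weight = runif(n = 10, min = 50, max = 300)  # 10 uniform draws between 50 and 300
)

# dim() returns the row count followed by the column count,
# so it should agree with nrow() and ncol() taken together.
dims_agree <- identical(dim(example_data),
                        c(nrow(example_data), ncol(example_data)))
print(dims_agree)
```

Rerunning the whole block should give the same Weight column each time, since the seed is reset before the draw.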
Problem 2
- Read the dataset EducationLiteracy from the Lock5 second edition book.
Click for answer
# read in the data
library(readr)
education_lock5 <- read_csv("https://www.lock5stat.com/datasets2e/EducationLiteracy.csv")
- Print the header (i.e. first 6 cases by default) of the dataset in part a.
Click for answer
head(education_lock5)
# A tibble: 6 × 3
  Country             EducationExpenditure Literacy
  <chr>                              <dbl>    <dbl>
1 Afghanistan                          3.1     31.7
2 Albania                              3.2     96.8
3 Algeria                              4.3     NA
4 Andorra                              3.2     NA
5 Angola                               3.5     70.6
6 Antigua and Barbuda                  2.6     99
- What is the dimension of the dataset in part a?
Click for answer
dim(education_lock5)
[1] 188 3
Answer: There are 188 rows and 3 columns.
- What type of variables are Country, EducationExpenditure, and Literacy?
Click for answer
Answer: Country is a categorical variable. EducationExpenditure and Literacy are both quantitative variables.
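One way to check variable types in R is to look at the class of each column. The sketch below assumes a small hand-typed stand-in, here called education_head (a hypothetical name), built from the first three rows of the head() output above so that it runs without downloading the file.

```r
# Stand-in for the first rows of the dataset, copied by hand from the
# head() output. The name education_head is illustrative only.
education_head <- data.frame(
  Country = c("Afghanistan", "Albania", "Algeria"),
  EducationExpenditure = c(3.1, 3.2, 4.3),
  Literacy = c(31.7, 96.8, NA),
  stringsAsFactors = FALSE  # keep Country as character on older R versions too
)

# class() of each column: "character" usually indicates a categorical
# variable, while "numeric" usually indicates a quantitative one.
column_classes <- sapply(education_head, class)
print(column_classes)
```

The storage class is only a hint. A numeric column such as an ID is stored as a number but acts as a label rather than a quantity, so the meaning of the variable still has to be considered.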
- If we would like to use education expenditure to predict the literacy rate of each country, which variable is the explanatory variable and which is the response?