Practice Problems 1

Problem 1

Run the following chunk. Comment on the output.

example_data = data.frame(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                          Greeting = c(rep("Hello", 5), rep("Goodbye",5)),
                          Male = rep(c(TRUE, FALSE), 5),
                          Weight = runif(n=10, 50, 300))

example_data

   ID Greeting  Male    Weight
1   1    Hello  TRUE 255.61071
2   2    Hello FALSE 232.02432
3   3    Hello  TRUE 206.48462
4   4    Hello FALSE 184.51848
5   5    Hello  TRUE  92.40136
6   6  Goodbye FALSE  65.14615
7   7  Goodbye  TRUE 261.19309
8   8  Goodbye FALSE 257.97052
9   9  Goodbye  TRUE 245.32082
10 10  Goodbye FALSE 149.85527

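Answer: The output is a data frame with 10 rows and four columns. ID identifies each case, Greeting records the greeting type, Male indicates whether the case is male (TRUE or FALSE), and Weight is a value drawn uniformly at random between 50 and 300. Because runif() is called without a fixed seed, the Weight values will differ each time the chunk is run.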
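
To make the output reproducible and to check how R stores each column, one option is to set a seed before building the data frame and then call str(). This is only a sketch: the seed value 1 is an arbitrary choice and is not part of the original problem, so the weights it produces will not match the printout above.

```r
# Arbitrary seed (an assumption, not from the problem) so that runif()
# returns the same weights on every run
set.seed(1)

example_data <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                           Greeting = c(rep("Hello", 5), rep("Goodbye", 5)),
                           Male = rep(c(TRUE, FALSE), 5),
                           Weight = runif(n = 10, min = 50, max = 300))

# str() lists every column with its storage type and first few values
str(example_data)
```

Rerunning the whole chunk, including set.seed(), should give the same Weight column each time.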

What are the dimensions of the dataset called ‘example_data’?

dim(example_data)
## [1] 10  4
nrow(example_data)
## [1] 10
ncol(example_data)
## [1] 4

Answer: There are 10 rows and 4 columns.

Problem 2

Read in the EducationLiteracy dataset from the second edition of the Lock5 book.

# read in the data
library(readr)
education_lock5 <- read_csv("https://www.lock5stat.com/datasets2e/EducationLiteracy.csv")

Print the head (i.e., the first 6 cases by default) of the EducationLiteracy dataset you just read in.

head(education_lock5)

# A tibble: 6 × 3
  Country             EducationExpenditure Literacy
  <chr>                              <dbl>    <dbl>
1 Afghanistan                          3.1     31.7
2 Albania                              3.2     96.8
3 Algeria                              4.3     NA  
4 Andorra                              3.2     NA  
5 Angola                               3.5     70.6
6 Antigua and Barbuda                  2.6     99

What are the dimensions of the EducationLiteracy dataset?

dim(education_lock5)

[1] 188   3

Answer: There are 188 rows and 3 columns.

What type of variables are Country, EducationExpenditure, and Literacy?

Answer: Country is a categorical variable (stored as character, shown as <chr> in the tibble). EducationExpenditure and Literacy are both quantitative variables (stored as numeric, shown as <dbl>).
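
The storage type of each column can also be checked in code. The sketch below avoids the download by typing in the six rows printed by head() earlier; the name education_head is hypothetical and stands in for the full dataset.

```r
# Hypothetical stand-in: the six rows shown by head(), entered by hand
education_head <- data.frame(
  Country = c("Afghanistan", "Albania", "Algeria",
              "Andorra", "Angola", "Antigua and Barbuda"),
  EducationExpenditure = c(3.1, 3.2, 4.3, 3.2, 3.5, 2.6),
  Literacy = c(31.7, 96.8, NA, NA, 70.6, 99),
  stringsAsFactors = FALSE
)

# class() of each column: character suggests a categorical variable,
# numeric suggests a quantitative one
sapply(education_head, class)
```

Storage type is only a hint. A numeric column of ID codes, for example, would still be categorical, so the final call depends on what the variable measures.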

If we would like to use education expenditure to predict the literacy rate of each country, which variable is the explanatory variable and which one is the response?

Answer: The education expenditure is the explanatory variable, and the literacy rate is the response.
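
In R's formula notation the response goes on the left of ~ and the explanatory variable on the right. The sketch below only illustrates that syntax: it uses the six rows printed by head() earlier under the hypothetical name education_head, and a fit on so few cases is not meant as an analysis of the data.

```r
# Hypothetical stand-in: the six rows shown by head(), entered by hand
education_head <- data.frame(
  Country = c("Afghanistan", "Albania", "Algeria",
              "Andorra", "Angola", "Antigua and Barbuda"),
  EducationExpenditure = c(3.1, 3.2, 4.3, 3.2, 3.5, 2.6),
  Literacy = c(31.7, 96.8, NA, NA, 70.6, 99),
  stringsAsFactors = FALSE
)

# response ~ explanatory: Literacy is predicted from EducationExpenditure.
# lm() drops the rows with a missing Literacy value by default.
fit <- lm(Literacy ~ EducationExpenditure, data = education_head)
coef(fit)
```

The same formula works with plot(Literacy ~ EducationExpenditure, data = education_head), which puts the explanatory variable on the horizontal axis and the response on the vertical axis.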