1) Using a for loop, write a function to calculate the number of zeroes in a numeric vector. Before entering the loop, set up a counter variable with counter <- 0. Inside the loop, add 1 to counter each time you encounter a zero in the vector. Finally, use return(counter) for the output.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.1.2
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.5 v dplyr 1.0.7
## v tidyr 1.1.4 v stringr 1.4.0
## v readr 2.0.2 v forcats 0.5.1
## Warning: package 'ggplot2' was built under R version 4.1.2
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(data.table)
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:dplyr':
##
## between, first, last
## The following object is masked from 'package:purrr':
##
## transpose
myvec <- c(1,0,3,0,6,0,3,7,2,0,3,5,8,0,33,0,5,2,6,0,0,0,3,4,7,8,5,0,0,6)
##############################
# FUNCTION: loop_for_0_HW10
# purpose: loop through a vector and count the number of zeros
# input: numeric vector
# output: Number of 0 in the vector
# ------------------------------------------
loop_for_0_HW10 <- function(x) {
  counter <- 0
  for (i in seq_along(x)) {   # loop over the argument x, not the global myvec
    if (x[i] == 0) {
      counter <- counter + 1
    }
  }
  return(c("Number of 0's in your vector = ", counter))
}
loop_for_0_HW10(myvec)
## [1] "Number of 0's in your vector = " "11"
2) Use subsetting instead of a loop to rewrite the function as a single line of code.
##############################
# FUNCTION: subset_for_0_HW10
# purpose: Subset a vector and count the number of zeros
# input: numeric vector
# output: Number of 0 in the vector
# ------------------------------------------
subset_for_0_HW10 <- function(x) {
  return(length(subset(x, x == 0)))   # keep only the zeros, then count them
}
subset_for_0_HW10(myvec)
## [1] 11
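As a possible alternative to `subset()`, and not part of the original answer: the comparison `x == 0` yields a logical vector, and `TRUE` counts as 1 when summed, so the count can be sketched as a single expression. The name `count_zeros` and the short example vector are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: count zeros by summing a logical vector.
# na.rm = TRUE is an assumption here so that missing values are ignored.
count_zeros <- function(x) sum(x == 0, na.rm = TRUE)

example_vec <- c(1, 0, 3, 0, 6, 0)   # illustrative data, not myvec
count_zeros(example_vec)
```

This avoids building an intermediate vector of zeros, which may matter only for very long inputs.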
4) In the next few lectures, you will learn how to do a randomization test on your data. We will complete some of the steps today to practice calling custom functions within a for loop. Use the code from the March 31st lecture (Randomization Tests) to complete the following steps.
A) Simulate a dataset with 3 groups of data, each group drawn from a distribution with a different mean. The final data frame should have 1 column for group and 1 column for the response variable.
G1 <- rnorm(n=10, mean=5, sd=1)
G2 <- rnorm(n=10, mean=7, sd=1)
G3 <- rnorm(n=10, mean=3, sd=1)
df <- data.frame(G1,G2,G3)
df_long <- pivot_longer(df,cols= 1:3, names_to = "Group", values_to = "Response_Var")
df_long
## # A tibble: 30 x 2
## Group Response_Var
## <chr> <dbl>
## 1 G1 4.71
## 2 G2 7.92
## 3 G3 3.34
## 4 G1 5.41
## 5 G2 6.95
## 6 G3 3.38
## 7 G1 5.86
## 8 G2 6.73
## 9 G3 2.67
## 10 G1 4.98
## # ... with 20 more rows
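The same two-column layout can also be built directly in base R, without reshaping. The sketch below assumes the same group means (5, 7, 3) and group size (10) as above; the seed value and the name `sim_long` are assumptions added only so the example is reproducible and does not overwrite `df_long`.

```r
set.seed(42)   # assumed seed, only to make the sketch reproducible
n <- 10        # observations per group, as in the code above

# Build the long format directly: one row per observation
sim_long <- data.frame(
  Group = rep(c("G1", "G2", "G3"), each = n),
  Response_Var = c(rnorm(n, mean = 5, sd = 1),
                   rnorm(n, mean = 7, sd = 1),
                   rnorm(n, mean = 3, sd = 1))
)
str(sim_long)
```

Note that the rows here are ordered by group (all G1 first), whereas `pivot_longer()` interleaves the groups row by row; the group means are unaffected by row order.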
B) Write a custom function that 1) reshuffles the response variable, and 2) calculates the mean of each group in the reshuffled data. Store the means in a vector of length 3.
##############################
# FUNCTION: shuffle
# purpose: reshuffle the response variable and calculate the mean of each group in the reshuffled data
# input: data frame with columns Group and Response_Var
# output: numeric vector of length 3 (one mean per group)
# ------------------------------------------
shuffle <- function(df) {
  df_temp <- df   # work on the argument df, not the global df_long
  df_temp$Response_Var_shuffle <- sample(df_temp$Response_Var, replace = FALSE)
  setDT(df_temp)
  means <- df_temp[, list(mean = mean(Response_Var_shuffle)), by = Group]
  means_list <- as.numeric(means$mean)   # group means, in order of first appearance
  return(means_list)
}
shuffle(df_long)
## [1] 5.075201 5.462703 5.081489
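For comparison, the reshuffle-then-average step can be sketched in base R with `tapply()`, which avoids the data.table conversion. This is an alternative under stated assumptions, not the method used above: the function name `shuffle_base` and the `demo` data frame are hypothetical, and the column names `Group` and `Response_Var` are assumed to match those in `df_long`.

```r
# Hypothetical base-R variant of the shuffle step
shuffle_base <- function(df) {
  shuffled <- sample(df$Response_Var)          # permute responses without replacement
  means <- tapply(shuffled, df$Group, mean)    # one mean per group, sorted by group label
  as.numeric(means)                            # drop names, return a plain numeric vector
}

set.seed(1)   # assumed seed, for reproducibility of the sketch only
demo <- data.frame(Group = rep(c("G1", "G2", "G3"), each = 10),
                   Response_Var = rnorm(30))
shuffle_base(demo)
```

Because shuffling only reassigns values among equally sized groups, the average of the three returned means always equals the overall mean of the response.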
C) Use a for loop to repeat the function in B 100 times. Store the results in a data frame that has 1 column indicating the replicate number and 1 column for each new group mean, for a total of 4 columns.
df_shuffle_100 <- data.frame(replicate = rep(NA, 100),
                             mean_G1 = rep(NA, 100),
                             mean_G2 = rep(NA, 100),
                             mean_G3 = rep(NA, 100))
for (i in 1:100) {
  means <- shuffle(df = df_long)        # run the shuffle() function
  df_shuffle_100$replicate[i] <- i      # fill in replicate column
  df_shuffle_100$mean_G1[i] <- means[1]
  df_shuffle_100$mean_G2[i] <- means[2]
  df_shuffle_100$mean_G3[i] <- means[3]
} # end of for loop
head(df_shuffle_100)
## replicate mean_G1 mean_G2 mean_G3
## 1 1 4.191441 6.438086 4.989866
## 2 2 4.637633 4.944126 6.037634
## 3 3 4.561880 5.594946 5.462568
## 4 4 4.844076 5.835348 4.939970
## 5 5 5.189249 4.019154 6.410990
## 6 6 4.747870 5.005089 5.866435
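The exercise asks for a for loop, but the same 100 x 4 result can also be sketched with `replicate()`, which collects the repeated calls into a matrix. Everything named here (`shuffle_means`, `demo`, `reps`, `res`, and the seed) is hypothetical and self-contained, so the sketch does not depend on the objects defined above.

```r
# Hypothetical helper: shuffle the response and return one mean per group
shuffle_means <- function(df) {
  as.numeric(tapply(sample(df$Response_Var), df$Group, mean))
}

set.seed(1)   # assumed seed, for reproducibility of the sketch only
demo <- data.frame(Group = rep(c("G1", "G2", "G3"), each = 10),
                   Response_Var = rnorm(30))

# replicate() returns a 3 x 100 matrix; transpose to get one row per replicate
reps <- t(replicate(100, shuffle_means(demo)))

res <- data.frame(replicate = seq_len(100),
                  mean_G1 = reps[, 1],
                  mean_G2 = reps[, 2],
                  mean_G3 = reps[, 3])
head(res)
```

The for loop with a pre-allocated data frame makes each step explicit, while `replicate()` is more compact; both produce one row per reshuffle.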