
Working with Matrices, Lists, and Data Frames

1. Assign to the variable n_dims a single random integer between 3 and 10.

# floor() rounds down, so drawing from the interval 3 to 11 yields the integers 3 through 10 (for example, 10.999 becomes 10)
n_dims <- floor(runif(1,3,11))
n_dims
## [1] 3
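As an alternative sketch (not the approach used above), sample() can draw the integer directly from 3:10, which avoids reasoning about rounding at the upper bound. The names n_dims_alt and draws are hypothetical and used only for illustration.

```r
# Draw one integer uniformly from the set {3, 4, ..., 10}
n_dims_alt <- sample(3:10, 1)
n_dims_alt

# Check that the floor(runif()) approach stays in the same range
draws <- floor(runif(1000, 3, 11))
range(draws)   # both ends fall within 3..10
```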
  • Create a vector of consecutive integers from 1 to n_dims^2
myvec <- seq(1,(n_dims)^2)
myvec
## [1] 1 2 3 4 5 6 7 8 9
  • Use the sample function to randomly reshuffle these values.
myvec <- sample(myvec)
myvec
## [1] 9 1 7 3 5 8 2 6 4
  • create a square matrix with these elements.
m <- matrix(myvec, nrow=sqrt(length(myvec)))
m
##      [,1] [,2] [,3]
## [1,]    9    3    2
## [2,]    1    5    6
## [3,]    7    8    4
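The output above shows that matrix() fills column by column: the first three shuffled values land in column 1. A minimal sketch of that behavior, using a small hypothetical vector vals rather than the data above:

```r
vals <- 1:6

# Default: values fill down the columns first
by_col <- matrix(vals, nrow = 2)
by_col[1, ]   # 1 3 5

# byrow = TRUE fills across the rows instead
by_row <- matrix(vals, nrow = 2, byrow = TRUE)
by_row[1, ]   # 1 2 3
```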
  • find a function in R to transpose the matrix.
m <- t(m)
m
##      [,1] [,2] [,3]
## [1,]    9    1    7
## [2,]    3    5    8
## [3,]    2    6    4
  • calculate the sum and the mean of the elements in the first row and the last row.
# m[-1,] would drop the first row and keep all the others; nrow(m) indexes the last row
sum(m[1,])
## [1] 17
sum(m[nrow(m),])
## [1] 12
mean(m[1,])
## [1] 5.666667
mean(m[nrow(m),])
## [1] 4
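A note on indexing, as a minimal sketch: in R a negative index drops elements rather than counting from the end, so m[-1, ] is every row except the first, while the last row is m[nrow(m), ]. The matrix m_demo below is a hand-typed copy of the transposed 3 x 3 matrix shown above, so the sketch runs on its own.

```r
# Rebuild the transposed 3 x 3 matrix shown above
m_demo <- matrix(c(9, 1, 7,
                   3, 5, 8,
                   2, 6, 4), nrow = 3, byrow = TRUE)

m_demo[-1, ]            # negative index: every row except the first
m_demo[nrow(m_demo), ]  # last row only, whatever the matrix size

sum(m_demo[nrow(m_demo), ])   # 2 + 6 + 4 = 12
mean(m_demo[nrow(m_demo), ])  # 12 / 3 = 4
```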
  • read about the eigen() function and use it on your matrix
eigen_m <- eigen(m)
  • look carefully at the elements of $values and $vectors. What kind of numbers are these?
    • $values holds the eigenvalues of the matrix: the scalars λ for which M v = λ v has a non-zero solution v.
    • $vectors holds the corresponding eigenvectors, one per column. An eigenvector stays on the same line through the origin when the linear transformation M is applied to it; it is only scaled by its eigenvalue λ.
    • For a non-symmetric matrix such as this one, the eigenvalues and eigenvectors can be real or complex numbers.
  • dig in with the typeof() function to figure out their type.
typeof(eigen_m$values)
## [1] "double"
typeof(eigen_m$vectors)
## [1] "double"
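# typeof() returns "double" here because every eigenvalue of this matrix is real. The matrix is not symmetric, so another run may produce complex eigenvalues, in which case typeof() returns "complex".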
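The defining property of an eigenpair can be checked numerically. The sketch below rebuilds the 3 x 3 matrix shown above as m_demo (a hand-typed copy) and confirms M v = λ v for the first pair; the rotation matrix rot is a hypothetical extra example, chosen because it has no real eigenvalues.

```r
# Same 3 x 3 matrix as in the output above, rebuilt so the sketch runs on its own
m_demo <- matrix(c(9, 1, 7,
                   3, 5, 8,
                   2, 6, 4), nrow = 3, byrow = TRUE)
e <- eigen(m_demo)

# Defining property: M %*% v equals lambda * v for each eigenpair
v1 <- e$vectors[, 1]
lambda1 <- e$values[1]
all.equal(as.vector(m_demo %*% v1), lambda1 * v1)

# A 90-degree rotation matrix has no real eigenvalues,
# so eigen() returns complex values for it
rot <- matrix(c(0, 1, -1, 0), nrow = 2)
typeof(eigen(rot)$values)
```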
  • if you have set your code up properly, you should be able to re-run it and create a matrix of a different size because n_dims will change.

2. Create a list with the following named elements:

  • mymatrix, which is a 4 x 4 matrix filled with random uniform values
  • mylogical, which is a 100-element vector of TRUE or FALSE values. Do this efficiently by setting up a vector of random values and then applying an inequality to it.
  • my_letters, which is a 26-element vector of all the lower-case letters in random order.
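# note: the third element is named myletters here (the task calls it my_letters)
mylist <- list(mymatrix = matrix(data = runif(16), ncol = 4),
            mylogical = runif(100) < 0.5,
            myletters = sample(letters))  # sample() puts the letters in random order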
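A minimal sketch of the two idioms the task asks for: applying an inequality to a numeric vector yields a logical vector in one step, and sample() with no size argument returns a random permutation. The seed value and the names x, flags, and shuffled are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# The comparison is vectorized: one TRUE/FALSE per element of x
x <- runif(100)
flags <- x < 0.5
typeof(flags)   # "logical"
length(flags)   # 100

# sample() on a vector returns all of its elements in random order
shuffled <- sample(letters)
setequal(shuffled, letters)   # TRUE: same 26 letters, different order
```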
  • create a new list, which has the element [2,2] from the matrix, the second element of the logical vector, and the second element of the letters vector.
newlist <- list(mylist$mymatrix[2,2],
                mylist$mylogical[2],
                mylist$myletters[2])
newlist
## [[1]]
## [1] 0.9025716
## 
## [[2]]
## [1] TRUE
## 
## [[3]]
## [1] "b"
  • use the typeof() function to confirm the underlying data types of each component in this list
typeof(newlist[[1]])
## [1] "double"
typeof(newlist[[2]])
## [1] "logical"
typeof(newlist[[3]])
## [1] "character"
  • combine the underlying elements from the new list into a single atomic vector with the c() function.
newvec <- c(unlist(newlist))
newvec
## [1] "0.902571638813242" "TRUE"              "b"
  • what is the data type of this vector?
typeof(newvec)
## [1] "character"
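The result is "character" because an atomic vector holds a single type, so c() coerces mixed inputs to the most flexible type present. A short sketch of that ordering, using hypothetical values:

```r
# Coercion order: logical < integer < double < character
typeof(c(TRUE, 2L))        # "integer"
typeof(c(TRUE, 0.5))       # "double"
typeof(c(TRUE, 0.5, "b"))  # "character"

# Logical values become 1/0 as numbers and "TRUE"/"FALSE" as text
c(TRUE, 0.5)
c(TRUE, "b")
```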

3. Create a data frame with two variables (= columns) and 26 cases (= rows).

  • call the first variable my_unis and fill it with 26 random uniform values from 0 to 10
  • call the second variable my_letters and fill it with 26 capital letters in random order.
myunits <- runif(26,0, 10)
my_letters <- sample(LETTERS[1:26]) 
mydata <- data.frame(myunits, my_letters)
mydata
##      myunits my_letters
## 1  8.8610031          O
## 2  9.7938322          W
## 3  9.1802644          Y
## 4  3.9839185          N
## 5  4.2179669          K
## 6  5.4920706          A
## 7  7.1201427          S
## 8  1.2730415          C
## 9  3.6511823          Z
## 10 2.1633328          D
## 11 0.8091063          R
## 12 7.7666673          P
## 13 4.6999839          U
## 14 5.3893999          I
## 15 3.6704946          V
## 16 8.8476281          L
## 17 4.1933916          E
## 18 3.2968438          M
## 19 9.3185131          J
## 20 1.9981771          G
## 21 9.3375367          B
## 22 4.0843477          H
## 23 3.8818783          F
## 24 3.4493561          T
## 25 6.7252571          X
## 26 6.2594360          Q
  • for the first variable, use a single line of code in R to select 4 random rows and replace the numerical values in those rows with NA.
mydata$myunits[sample(1:26, 4)] <- NA
  • for the first variable, write a single line of R code to identify which rows have the missing values.
which(is.na(mydata$myunits))
## [1] 12 19 20 21
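To see how the pieces fit together, here is a small sketch on a hypothetical vector x with missing values at known positions: is.na() builds a logical mask, and which() converts that mask to positions.

```r
# Hypothetical stand-in for the first column
x <- c(2.5, NA, 7.1, NA, 4.0)

is.na(x)         # logical mask: FALSE TRUE FALSE TRUE FALSE
which(is.na(x))  # positions of the TRUE entries: 2 4
sum(is.na(x))    # number of missing values: 2
anyNA(x)         # TRUE if at least one value is missing
```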
  • for the second variable, sort it in alphabetical order
mydata <- mydata[order(mydata$my_letters),]
mydata
##      myunits my_letters
## 6  5.4920706          A
## 21        NA          B
## 8  1.2730415          C
## 10 2.1633328          D
## 17 4.1933916          E
## 23 3.8818783          F
## 20        NA          G
## 22 4.0843477          H
## 14 5.3893999          I
## 19        NA          J
## 5  4.2179669          K
## 16 8.8476281          L
## 18 3.2968438          M
## 4  3.9839185          N
## 1  8.8610031          O
## 12        NA          P
## 26 6.2594360          Q
## 11 0.8091063          R
## 7  7.1201427          S
## 24 3.4493561          T
## 13 4.6999839          U
## 15 3.6704946          V
## 2  9.7938322          W
## 25 6.7252571          X
## 3  9.1802644          Y
## 9  3.6511823          Z
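The row labels in the output above are the original row numbers, which is what order() returns: the positions that would put the vector in sorted order. A minimal sketch with a hypothetical four-row data frame df:

```r
v <- c("O", "W", "A", "C")

order(v)      # 3 4 1 2: positions of the values in sorted order
v[order(v)]   # "A" "C" "O" "W"

# Indexing the rows with order() keeps each row's values together
df <- data.frame(num = c(1.5, 2.5, 3.5, 4.5), let = v)
df_sorted <- df[order(df$let), ]
df_sorted$num   # 3.5 4.5 1.5 2.5
```

By contrast, sort(df$let) would return only the sorted letters and leave the data frame unchanged.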
  • calculate the column mean for the first variable.
summary(mydata$myunits)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  0.8091  3.6560  4.2057  5.0474  6.6088  9.7938       4
# or compute the mean directly, ignoring the NA values
mean(mydata$myunits, na.rm = TRUE)
## [1] 5.047449
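The na.rm = TRUE argument matters because mean() returns NA as soon as any value is missing. A short sketch on hypothetical data, also showing colMeans() for computing every column mean at once:

```r
x <- c(1, 2, NA, 6)

mean(x)                # NA: a missing value propagates
mean(x, na.rm = TRUE)  # (1 + 2 + 6) / 3 = 3

# colMeans() accepts the same argument and works column by column
df <- data.frame(a = x, b = c(4, 5, 6, 7))
col_means <- colMeans(df, na.rm = TRUE)
col_means              # a = 3, b = 5.5
```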