Exercises

  1. R has a function called t() that computes the transpose of a given matrix. This means that it switches around the rows and columns of the matrix, like this:

    myMatrix <- matrix(1:24, nrow = 6)
    myMatrix
    ##      [,1] [,2] [,3] [,4]
    ## [1,]    1    7   13   19
    ## [2,]    2    8   14   20
    ## [3,]    3    9   15   21
    ## [4,]    4   10   16   22
    ## [5,]    5   11   17   23
    ## [6,]    6   12   18   24
    t(myMatrix)
    ##      [,1] [,2] [,3] [,4] [,5] [,6]
    ## [1,]    1    2    3    4    5    6
    ## [2,]    7    8    9   10   11   12
    ## [3,]   13   14   15   16   17   18
    ## [4,]   19   20   21   22   23   24

    Write your own function called transpose() that will perform the same task on any given matrix. The function should take a single parameter called mat, the matrix to be transposed. Of course you may NOT use t() in the code for your function!

    Hints: Your function will have to:

    1. break mat down into the vector of its elements, and then
    2. build the new matrix from those elements, with a number of rows equal to the number of columns of mat.

    For the first task, as.vector() will be useful.

    For the second task, recall (see previous Practice Exercises from this Chapter) that there is a function nrow() that returns the number of rows of a given matrix. It will also be helpful to remember the function ncol() that computes the number of columns of a given matrix.
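
    As a small illustration of these helper functions (not a solution to the exercise), the sketch below applies them to a hypothetical 2-by-3 matrix named small. Note that as.vector() returns the elements column by column.

    ```r
    # 'small' is a made-up matrix used only for illustration
    small <- matrix(1:6, nrow = 2)

    # as.vector() lists the elements one column after another
    elements <- as.vector(small)
    elements
    ## [1] 1 2 3 4 5 6

    # the dimensions of the matrix
    nrow(small)
    ## [1] 2
    ncol(small)
    ## [1] 3
    ```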

  2. R has functions called rowSums() and colSums() that will respectively sum the rows and the columns of a matrix. Here is an example:

    myMatrix <- matrix(1:24, nrow = 6)
    rowSums(myMatrix)
    ## [1] 40 44 48 52 56 60

    Your task is to write your own function called dimSum() that will sum either the rows or the columns of a given matrix. The function should have two parameters:

    • mat: the matrix to be summed.
    • dim: the dimension to sum along, either rows or columns. The default value should be "rows". If the user sets dim to "columns", the function computes the column sums instead.

    You may NOT use rowSums() or colSums() in the code for your function. A typical example of use should look like this:

    myMatrix <- matrix(1:24, nrow = 6)
    dimSum(myMatrix)
    ## [1] 40 44 48 52 56 60
    dimSum(myMatrix, "columns")
    ## [1]  21  57  93 129

    Hint: Recall that in previous Practice Exercises of this Chapter we made a function called myRowSums() that sums the rows of any given matrix. Modify the idea for myRowSums() to write a function called myColSums() that finds the column-sums of any given matrix. You may then use these two functions to write the required function dimSum().
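
    In case the earlier exercise is not at hand, here is one possible way to sum rows with a loop. The name rowTotals() is hypothetical, and this sketch is not necessarily how myRowSums() was written in the Practice Exercises.

    ```r
    # hypothetical helper: sums each row of a matrix with a for-loop
    rowTotals <- function(mat) {
      totals <- numeric(nrow(mat))      # one slot per row
      for (i in seq_len(nrow(mat))) {
        totals[i] <- sum(mat[i, ])      # add up the i-th row
      }
      totals
    }

    rowTotals(matrix(1:24, nrow = 6))
    ## [1] 40 44 48 52 56 60
    ```

    A column version would loop over seq_len(ncol(mat)) and sum mat[, j] instead.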

  3. Starting with m111survey in the bcscr package, write the code necessary to create a new data frame called smaller that consists precisely of the male students who believe in extraterrestrial life and who are more than 68 inches tall. The new data frame should contain all of the original variables except for sex and extra_life.
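
    The general pattern (keep some rows, drop some columns) can be sketched on a toy data frame. Everything in the example below, including the frame pets and its variables, is made up for illustration and has nothing to do with m111survey.

    ```r
    # a made-up data frame, used only to show the pattern
    pets <- data.frame(
      species = c("cat", "dog", "dog", "cat"),
      indoor  = c("yes", "yes", "no", "yes"),
      weight  = c(4.1, 20.5, 31.0, 5.3)
    )

    # logical vector marking the rows to keep: dogs heavier than 25
    keep <- pets$species == "dog" & pets$weight > 25

    # select those rows, and every column except 'species'
    heavyDogs <- pets[keep, !(names(pets) %in% "species")]
    heavyDogs
    ```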

  4. Write a function called dfRandSelect() that randomly selects (without replacement) a specified number of rows from a given data frame. The function should have two parameters:

    • df: the data frame from which to select;
    • n: the number of rows to select.

    If n is greater than the number of rows in df, the function should return immediately with a message telling the user that the task is not possible and reporting the number of rows in df. Typical examples of use should be as follows:

    dfRandSelect(bcscr::fuel, 5)
    ##    speed efficiency
    ## 12   120       9.87
    ## 15   150      12.83
    ## 7     70       6.30
    ## 6     60       5.90
    ## 8     80       6.95
    dfRandSelect(bcscr::fuel, 200)
    ## No can do!  The frame has only 15 rows.

    Hint: Use the function nrow(), which gives the number of rows of a matrix or data frame.
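
    One way to pick rows at random is to sample row numbers and then subset with them. The sketch below tries this on women, a 15-row data frame that ships with base R; the seed and the object names are arbitrary choices made for the illustration.

    ```r
    set.seed(2020)  # arbitrary seed, only so the example is repeatable

    # choose 3 distinct row numbers from 1, 2, ..., nrow(women)
    rowNumbers <- sample(nrow(women), size = 3)

    # keep just those rows
    picked <- women[rowNumbers, ]
    nrow(picked)
    ## [1] 3
    ```

    By default sample() draws without replacement, and it stops with an error when size exceeds the number of items available. That is why your function needs to check n before sampling.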

  5. (*) Create your own data frame, named myFrame. The frame should have 100 rows, along with the following variables:

    • lowerLetters: a character vector of randomly produced 3-letter strings, like “chj,” “bbw,” and so on. The letters should all be lowercase.
    • height: a numerical vector consisting of real numbers chosen randomly between 60 and 75.
    • sex: a factor whose possible values are “female” and “male.” Again, these values should be chosen randomly.

    A call to str(myFrame) would come out like this (although your results will vary a bit since the vectors are constructed randomly):

    str(myFrame)
    ## 'data.frame':    100 obs. of  3 variables:
    ##  $ lowerLetters: chr  "usu" "uhl" "xyj" "uyd" ...
    ##  $ height      : num  73.7 72.4 73.8 65.2 61.3 ...
    ##  $ sex         : Factor w/ 2 levels "female","male": 1 2 2 2 2 1 2 1 1 1 ...

    summary() is useful when working with data frames. Here is how a call to summary(myFrame) might look:

    summary(myFrame)
    ##  lowerLetters           height          sex    
    ##  Length:100         Min.   :60.00   female:57  
    ##  Class :character   1st Qu.:63.63   male  :43  
    ##  Mode  :character   Median :68.28              
    ##                     Mean   :67.62              
    ##                     3rd Qu.:71.63              
    ##                     Max.   :74.57

    Hint: If you have a vector of three letters, such as

    vec <- c("g", "a", "r")

    then you can paste them together as follows:

    paste0(vec, collapse = "")
    ## [1] "gar"
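
    To get random letters in the first place, you might combine sample() with the built-in constant letters, and runif() is one option for random real numbers in an interval. The sketch below makes a single string and a few heights; the seed and the object names are arbitrary, and scaling this up to 100 rows is left to you.

    ```r
    set.seed(3030)  # arbitrary seed, only so the example is repeatable

    # draw 3 lowercase letters (repeats allowed) and paste them together
    oneString <- paste0(sample(letters, size = 3, replace = TRUE), collapse = "")

    # draw 4 real numbers uniformly between 60 and 75
    someHeights <- runif(4, min = 60, max = 75)
    ```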

  6. (*) Study the data frame fuel in the bcscr package. Note that the fuel efficiency is reported as the number of liters of fuel required to travel 100 kilometers. Look up the conversion between gallons and liters and between kilometers and miles, and use this information to create a new variable called mpg that gives the fuel efficiency as miles per gallon. While you are at it, create a new variable mph that gives the speed in miles per hour. Finally, add these new variables to the fuel data frame.
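
    The arithmetic can be checked on single numbers before you apply it to whole columns. The sketch below assumes US liquid gallons (the imperial gallon is larger) and uses made-up values of 8 liters per 100 kilometers and 100 kilometers per hour; all object names are hypothetical.

    ```r
    litersPerGallon <- 3.785411784  # US liquid gallon (an assumption)
    kmPerMile <- 1.609344

    # made-up example value: 8 liters needed to travel 100 km
    litersPer100km <- 8
    milesTravelled <- 100 / kmPerMile                 # 100 km expressed in miles
    gallonsUsed <- litersPer100km / litersPerGallon   # 8 liters expressed in gallons
    mpgExample <- milesTravelled / gallonsUsed        # roughly 29.4

    # made-up example value: 100 km per hour
    mphExample <- 100 / kmPerMile                     # roughly 62.1
    ```

    Note that a larger liters-per-100-km figure corresponds to a smaller miles-per-gallon figure, since the two measures are inversely related.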

  7. (*) Use matrices to generalize the simulation in the Appeals Court Paradox (see Section 6.5). Your goal is to write a simulation function called appealsSimPlus() that comes with all the options provided in the text, but with additional parameters so that the user can choose:

    • the number of judges on the court;
    • the probability for each judge to make a correct decision;
    • the voting pattern (how many votes each judge gets).

    A typical call to the function should look like this:

    appealsSimPlus(reps = 10000, seed = 5252, 
                   probs = c(0.95, 0.90, 0.90, 0.90, 0.80),
                   votes = c(2, 1, 1, 1, 0))

    In the above call the court consists of five judges. The best one decides cases correctly 95% of the time, three are right 90% of the time, and one is right 80% of the time. The voting arrangement is that the best judge gets two votes, the next three get one vote each, and the worst gets no vote. Any voting scheme, even one involving fractional votes, should be allowed so long as the votes add up to the number of judges.

    Here is a hint. When you write the function it may be helpful to use the fact that rbinom() can take a prob parameter that is a vector of any length. Here’s an example:

    results <- rbinom(6, size = 100, prob = c(0.10, 0.50, 0.90))
    results
    ## [1] 20 49 94 15 50 88

    Because prob has only three values while six results are requested, R recycles the probabilities. The first and fourth entries simulate a person tossing a coin 100 times when she has only a 10% chance of heads on each toss. The second and fifth entries simulate the same, when the chance of heads is 50%. The third and sixth simulate coin-tossing when there is a 90% chance of heads.

    If you would like to arrange the results more nicely—say in a matrix where each column gives the results for a different person—you can do so:

    resultsMat <- matrix(results, ncol = 3, byrow = TRUE)
    resultsMat
    ##      [,1] [,2] [,3]
    ## [1,]   20   49   94
    ## [2,]   15   50   88

    Of course judges don’t flip a coin 100 times; they decide one case at a time. Suppose you have five judges with probabilities as follows:

    probCorrect <- c(0.95, 0.90, 0.90, 0.90, 0.80)

    If you would like to simulate the judges deciding, say, 6 cases, try this:

    results <- rbinom(5 * 6, size = 1, prob = rep(probCorrect, 6))
    resultsMat <- matrix(results, nrow = 6, byrow = TRUE)
    resultsMat
    ##      [,1] [,2] [,3] [,4] [,5]
    ## [1,]    1    1    1    0    1
    ## [2,]    0    1    1    1    1
    ## [3,]    1    1    1    1    1
    ## [4,]    1    1    1    1    1
    ## [5,]    1    1    1    1    1
    ## [6,]    1    1    1    1    0

    When it comes to applying the voting pattern to compute the decision in each case, consider matrix multiplication. For example, suppose that the pattern is:

    votes <- c(2, 1, 1, 1, 0)

    Then make votes a one-column matrix and perform matrix multiplication:

    correctVotes <- resultsMat %*% matrix(votes, nrow = 5)
    correctVotes
    ##      [,1]
    ## [1,]    4
    ## [2,]    3
    ## [3,]    5
    ## [4,]    5
    ## [5,]    5
    ## [6,]    5
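
    From the weighted vote totals it is a short step to the court's decision in each case. The sketch below assumes that the court decides a case correctly when the correct votes exceed half of all the votes cast; check this rule against Section 6.5 before relying on it. The matrix is typed in by hand from the output above so that the example is repeatable, and the name courtCorrect is hypothetical.

    ```r
    # the six simulated cases shown above, entered by hand (one row per case)
    resultsMat <- matrix(c(1, 1, 1, 0, 1,
                           0, 1, 1, 1, 1,
                           1, 1, 1, 1, 1,
                           1, 1, 1, 1, 1,
                           1, 1, 1, 1, 1,
                           1, 1, 1, 1, 0), nrow = 6, byrow = TRUE)
    votes <- c(2, 1, 1, 1, 0)

    # weighted number of correct votes in each case
    correctVotes <- resultsMat %*% matrix(votes, nrow = 5)

    # assumed rule: correct when more than half of the total votes are correct
    courtCorrect <- as.vector(correctVotes) > sum(votes) / 2

    # proportion of cases decided correctly
    mean(courtCorrect)
    ## [1] 1
    ```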

    Think about how to encapsulate all of this into a nice, general simulation function.