Exercises

Explain in words what the following line of code produces when given a numerical vector y:
```
map(y, function(x) x^3 + 1)
```
In the course of your explanation, say whether the result is a vector or a list.

Which do you think works faster for a given numerical vector y? This code:
```
map(y, function(x) sqrt(x))
```
Or this code?
```
sqrt(y)
```
Justify your answer with a convincing example, using system.time(). What moral do you draw from this?
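
If you need a reminder of how system.time() is called, the following sketch times an unrelated computation. Sorting a million random numbers is chosen only for illustration, and the name timing is made up for this example.

```r
# Time a single expression; the value is an object of class "proc_time".
timing <- system.time(sort(runif(1e6)))

# Printing the object shows user, system and elapsed times in seconds.
timing

# Extract the elapsed (wall-clock) time on its own.
timing[["elapsed"]]
```

Timings vary from run to run and from machine to machine, so a convincing comparison should use a vector large enough that the difference is not lost in the noise.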

To refactor computer code is to rewrite the code so that it does the same thing, but in a different way. We might refactor code in order to make it more readable by humans, or to make it perform its task more quickly.
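
As a small illustration of the idea, here is a sketch in which a loop that counts the characters in each word is refactored to use map_int() from the purrr package. The vector words and the names lengthsLoop and lengthsMap are made up for this example.

```r
library(purrr)

words <- c("apple", "fig", "banana")

# Original version: a loop fills a pre-allocated integer vector.
lengthsLoop <- integer(length(words))
for ( i in seq_along(words) ) {
  lengthsLoop[i] <- nchar(words[i])
}

# Refactored version: map_int() applies nchar() to each element
# and collects the results in an integer vector.
lengthsMap <- map_int(words, nchar)

# Check that the two versions agree.
identical(lengthsLoop, lengthsMap)
```

Checking that the old and new versions give the same result, as in the last line, is a useful habit whenever you refactor.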

Refactor the following code so that it uses keep() instead of a loop:

```
df <- bcscr::m111survey
keepVariable <- logical(length(names(df)))
for ( col in seq_along(keepVariable) ) {
  var <- df[, col]
  isNumeric <- is.numeric(var)
  allThere <- !any(is.na(var))
  keepVariable[col] <- isNumeric && allThere
}
newFrame <- df[, keepVariable]
head(newFrame)
```

The following function produces a list of vectors of uniform random numbers, where the lower and upper bounds of the numbers are given by the arguments to the parameters lower and upper respectively, and the number of vectors in the list and the number of random numbers in each vector are given by a vector supplied to the parameter vecs.
```
randomSims <- function(vecs, lower = 0, upper = 1, seed = NULL) {
  # set seed if one is provided by the user
  if ( !is.null(seed) ) {
    set.seed(seed)
  }

  lst <- vector(mode = "list", length = length(vecs))
  for ( i in seq_along(vecs) ) {
    lst[[i]] <- runif(vecs[i], min = lower, max = upper)
  }
  lst
}
```
Refactor the code for randomSims() so that it uses map() instead of a loop.

The following enhanced version of randomSims() is even more flexible, as it allows both the upper and lower limits for the randomly-generated numbers to vary with each vector of numbers that is produced.

```
randomSims2 <- function(vecs, lower, upper, seed = NULL) {
  # validate input
  if (!(length(vecs) == length(upper) && length(upper) == length(lower)) ) {
    return(cat("All vectors entered must have the same length."))
  }
  if ( any(upper < lower) ) {
    return(cat(paste0("Every upper bound must be at least as ",
                      "big as the corresponding lower bound.")))
  }
  # set seed if one is provided by the user
  if ( !is.null(seed) ) {
    set.seed(seed)
  }

  lst <- vector(mode = "list", length = length(vecs))
  for ( i in seq_along(vecs) ) {
    lst[[i]] <- runif(vecs[i], min = lower[i], max = upper[i])
  }
  lst
}
```

Use pmap() to refactor the code for randomSims2() so as to avoid using the loop.

Supposing that y is a numerical vector, explain in words what the following code produces:
```
y %>% keep(function(x) x >= 4)
```

Write a line of code using the subsetting operator [ that produces the same result as the code in the previous problem.

Use keep() to write a function called oddMembers() that, given any numerical vector, returns a vector containing the odd numbers in the given vector. Your function should take a single argument called vec, the given vector. A typical example of use would be as follows:
```
oddMembers(vec = 1:10)
```
```
## [1] 1 3 5 7 9
```

You are given the following list of character vectors:

```
lst <- list(c("Akash", "Bipan", "Chandra", "Devadatta", "Raj"),
            c("Raj", "Vikram", "Sita", "Akash", "Chandra"),
            c("Akash", "Raj", "Chandra", "Bipan", "Lila"),
            c("Akash", "Vikram", "Devadatta", "Raj", "Lila"))
```

Use reduce() and the union() function to obtain a character vector that is the union of all the vectors in lst.

Remember the function subStrings() from the exercises of the Chapter on Strings? Refactor it so that it does EXACTLY the same thing but makes no use of loops.

(*) Run the following code to see what it does. Then analyse the code and explain in detail how it works. Along the way you will have to study some unfamiliar R functions.

```
dataDir <- "datafiles"
if ( !dir.exists(dataDir) ) dir.create(dataDir)

urls <- c(
  "https://data.cityofnewyork.us/api/views/25th-nujf/rows.csv",
  "https://data.seattle.gov/api/views/cf52-s8er/rows.csv",
  "https://data.ct.gov/api/views/kbxi-4ia7/rows.csv"
)

fileNameFrags <-
  urls %>%
  str_match("data\\.(\\w+)\\.") %>%
  .[, 2]

processFiles <- function(url, frag) {
  csvDest <- file.path(dataDir,
                       paste0(frag, ".csv"))
  download.file(url, csvDest)
  assign(frag, read_csv(csvDest))
  do.call(
    save,
    args = list(
      frag,
      file = file.path(
        dataDir,
        paste0(frag, ".rda"))))
}

list(urls, fileNameFrags) %>%
  pwalk(processFiles)
```