4.2 Making Decisions: Conditionals

Another type of flow control involves determining which parts of the program to execute, depending upon certain conditions.

4.2.1 If Statements

Let’s design a simple guessing-game for the user:

  • The computer will randomly pick a whole number between 1 and 4.
  • The user will then be asked to guess the number.
  • If the user is correct, then the computer will congratulate the user.

number <- sample(1:4, size = 1)
guess <- as.numeric(readline("Guess the number (1-4):  "))
if ( guess == number ) {
  cat("Congratulations!  You are correct.")
}

The sample() function randomly picks values from the vector that it is given. The size parameter specifies how many values to pick. (This time we only want one number.)
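
As a brief illustration of the size parameter, the sketch below draws one value and then two values from 1:4. The variable names oneDraw and twoDraws are made up for this example. Because the picks are random, no particular values are shown as output.

```r
# Draw a single number from 1 to 4, as in the guessing game.
oneDraw <- sample(1:4, size = 1)

# Draw two numbers. By default sample() picks without replacement,
# so the two values will be different from each other.
twoDraws <- sample(1:4, size = 2)

length(oneDraw)
## [1] 1
length(twoDraws)
## [1] 2
```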

Flow control enters the picture with the reserved word if. Immediately after if is a Boolean expression enclosed in parentheses. This expression is often called the condition. If the condition evaluates to TRUE, then the body of the if statement (the code enclosed in the curly brackets) will be executed. On the other hand, if the condition evaluates to FALSE, then R skips past the bracketed code.[13]

The general form of an if expression is as follows:

if ( condition ) {
  ## code to run when the condition evaluates to TRUE
}
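
To see both outcomes without waiting for user input, here is a small sketch in which the number and the guess are fixed by hand (the values are chosen only for illustration). The last statement shows the bracket-free form mentioned in the footnote.

```r
number <- 3

guess <- 3
if ( guess == number ) {
  cat("The condition is TRUE, so the body runs.\n")
}
## The condition is TRUE, so the body runs.

guess <- 1
if ( guess == number ) {
  cat("This line is skipped, because the condition is FALSE.\n")
}

# With a single expression in the body, the brackets may be omitted:
if ( number > 0 ) cat("The number is positive.\n")
## The number is positive.
```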

The code above congratulates a lucky guesser, but it has nothing at all to say to someone who did not guess correctly. The way to provide an alternative is through the addition of the else reserved-word:

number <- sample(1:4, size = 1)
guess <- as.numeric(readline("Guess the number (1-4):  "))
if ( guess == number ) {
  cat("Congratulations!  You are correct.")
} else {
  cat("Sorry, the correct number was ", number, ".\n", sep = "")
  cat("Better luck next time!")
}

The general form of an if-else expression is as follows:

if ( condition ) {
  # code to run if the condition evaluates to TRUE
} else {
  # code to run if condition evaluates to FALSE
}

An if can be followed by any number of else-if’s and, optionally, a final else, setting up a chain of alternative responses:

number <- sample(1:4, size = 1)
guess <- as.numeric(readline("Guess the number (1-4):  "))
if ( guess == number ) {
  cat("Congratulations!  You are correct.")
} else if ( abs(guess - number) == 1 ) {
  cat("You were close!\n")
  cat("The correct number was ", number, ".\n", sep = "")
} else {
  cat("You were way off.\n")
  cat("The correct number was ", number, ".\n", sep = "")
}

In general, a chain looks like this:

if ( condition1 ) {
  # code to run if condition1 evaluates to TRUE
} else if ( condition2 ) {
  # code to run if condition2 evaluates to TRUE
} else if ( condition3 ) {
  # code to run if condition3 evaluates to TRUE
} else if ......

  # and so on until
} else if ( conditionN ) {
  # code to run if conditionN evaluates to TRUE
}
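
Since readline() needs an interactive session, one way to experiment with a chain is to wrap it in a function that takes the guess and the number as arguments. The function name judgeGuess() is hypothetical and is not part of the original game; this is only a sketch.

```r
# Hypothetical helper: classify a guess without asking for input.
judgeGuess <- function(guess, number) {
  if ( guess == number ) {
    "correct"
  } else if ( abs(guess - number) == 1 ) {
    "close"
  } else {
    "way off"
  }
}

judgeGuess(guess = 2, number = 2)
## [1] "correct"
judgeGuess(guess = 3, number = 2)
## [1] "close"
judgeGuess(guess = 4, number = 1)
## [1] "way off"
```

R checks the conditions from top to bottom and runs only the code that goes with the first condition that evaluates to TRUE. The later conditions are not checked at all.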

4.2.2 Application: Validating Arguments

Recall the function manyCat():

manyCat <- function(word, n) {
  wordWithNewline <- paste(word, "\n", sep = "")
  lines <- rep(wordWithNewline, times = n)
  cat(lines, sep = "")
}

What would happen if a user were to call it with an unusable argument for the parameter n, a negative number, for instance?

manyCat(word = "Hello", n = -3)
## Error in rep(wordWithNewline, times = n): invalid 'times' argument

For us, it’s clear enough what is wrong. After all, we wrote the function in the previous chapter, and we know that the rep() function in its body requires that the times parameter not be negative. On the other hand, to someone who is unfamiliar with the body of manyCat() and who has no access to help on how manyCat() is to be used, it may not be so obvious what has gone wrong and why.

In the case of complex functions, we cannot expect ordinary users to search through the function’s definition to learn how to fix an error that arises from improper input. Accordingly, it can be good practice to validate user-input. Conditionals allow us to do this.

Here, a possible approach is to attempt to coerce the user’s input for n into an integer, using the as.integer() function:

as.integer(3.6) # will drop the fractional part (no rounding)
## [1] 3
as.integer("4") # will convert string to number 4
## [1] 4
as.integer("4.3") # will convert AND drop the fractional part
## [1] 4
as.integer("two") # cannot convert to integer
## Warning: NAs introduced by coercion
## [1] NA

In the last example, the result is NA, and a cryptic warning was issued. In order to keep the warning from the user, we should wrap any call to as.integer() in the suppressWarnings() function.
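
The short sketch below checks two points about as.integer(): it drops the fractional part of a number rather than rounding it (compare with round()), and suppressWarnings() hides the warning without changing the NA result.

```r
as.integer(3.6)     # the fractional part is dropped
## [1] 3
round(3.6)          # round() gives the nearest whole number
## [1] 4
as.integer(-3.6)    # dropping the fraction moves toward zero
## [1] -3
suppressWarnings(as.integer("two"))   # still NA, but no warning appears
## [1] NA
```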

Let’s try out a piece of code that checks for validity:

n <- "two"  # this will not convert to a number at all
converted <- suppressWarnings(as.integer(n))
!is.na(converted) && converted >= 1
## [1] FALSE

Think about what happened:

  • First we attempted to convert the string "two" to an integer.
  • Since the result of as.integer("two") is NA, the expression is.na(converted) evaluates to TRUE.
  • Hence !is.na(converted) evaluates to FALSE.
  • Hence !is.na(converted) && converted >= 1 evaluates to FALSE.
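
The is.na() check is what makes the expression safe to use as a condition. Here is a sketch of what happens with and without it, using the same converted value:

```r
converted <- suppressWarnings(as.integer("two"))

converted >= 1                        # comparing NA with a number gives NA
## [1] NA
!is.na(converted) && converted >= 1   # FALSE && NA is FALSE
## [1] FALSE
```

An NA cannot serve as the condition of an if statement (R stops with an error in that case), so including the is.na() check guarantees that the result is always either TRUE or FALSE.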

Let’s try it on another “wrong” value:

n <- -2  # number, but it's negative
converted <- suppressWarnings(as.integer(n))
!is.na(converted) && converted >= 1
## [1] FALSE

This time converted gets an assigned value, namely -2, but since it’s not at least 1 the expression !is.na(converted) && converted >= 1 evaluates to FALSE.

Now let’s try it on a “good” value:

n <- 3
converted <- suppressWarnings(as.integer(n))
!is.na(converted) && converted >= 1
## [1] TRUE

Our code appears to be working well.

Now that we’ve figured out a way to determine whether any given input is a usable number, let’s employ a conditional to implement validation in manyCat():

manyCat <- function(word, n) {
  n <- suppressWarnings(as.integer(n))
  isValid <- !is.na(n) && n >= 1
  if (!isValid) {
    message <- "Sorry, n must be a whole number at least 1.\n"
    return(cat(message))
  }
  wordWithNewline <- paste(word, "\n", sep = "")
  lines <- rep(wordWithNewline, times = n)
  cat(lines, sep = "")
}

The idea is to force an early return—along with a helpful message to the Console—if the user’s argument for n is no good.

Let’s watch it in action:

manyCat(word = "Hello", n = "two")   # problem!
## Sorry, n must be a whole number at least 1.
manyCat(word = "Hello", n = 3)       # OK
## Hello
## Hello
## Hello
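
Because the validation relies on as.integer(), some inputs that are not whole numbers are still accepted after conversion. The sketch below repeats the definition of manyCat() so that it can be run on its own, and then tries a few such inputs.

```r
# Definition repeated from above so that this sketch is self-contained.
manyCat <- function(word, n) {
  n <- suppressWarnings(as.integer(n))
  isValid <- !is.na(n) && n >= 1
  if (!isValid) {
    message <- "Sorry, n must be a whole number at least 1.\n"
    return(cat(message))
  }
  wordWithNewline <- paste(word, "\n", sep = "")
  lines <- rep(wordWithNewline, times = n)
  cat(lines, sep = "")
}

manyCat(word = "Hello", n = "2")   # a string that converts cleanly
## Hello
## Hello
manyCat(word = "Hello", n = 2.9)   # the fractional part is dropped
## Hello
## Hello
manyCat(word = "Hello", n = 0.5)   # becomes 0, which is rejected
## Sorry, n must be a whole number at least 1.
```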

4.2.3 Application: Invisible Returns

Let’s think again about the \(\pi\)-computing function from Section 3.4.1:

madhavaPI <- function(n = 1000000) {
  k <- 1:n
  terms <- (-1)^(k+1)*4/(2*k-1)
  sum(terms)
}

We could use if to write in a “talky” option:

madhavaPI <- function(n = 1000000, verbose = FALSE) {
  k <- 1:n
  terms <- (-1)^(k+1)*4/(2*k-1)
  approx <- sum(terms)
  if ( verbose ) {
    cat("Madhava's approximation is:  ", approx, ".\n", sep = "")
    cat("This is based on ", n, " terms.\n", sep = "")
  }
  approx
}

Try it out:

madhavaPI(n = 1000, verbose = TRUE)
## Madhava's approximation is:  3.140593.
## This is based on 1000 terms.
## [1] 3.140593

It’s a bit awkward that the approximation gets printed out at the end: after the message on the console, the user doesn’t need to see it. But if we were to remove the final approx expression, then the function would not return an approximation that could be used for further computations.

The solution to this dilemma is R’s invisible() function.

madhavaPI <- function(n = 1000000, verbose = FALSE) {
  k <- 1:n
  terms <- (-1)^(k+1)*4/(2*k-1)
  approx <- sum(terms)
  if ( verbose ) {
    cat("Madhava's approximation is:  ", approx, ".\n", sep = "")
    cat("This is based on ", n, " terms.\n", sep = "")
  }
  invisible(approx)
}

If you wrap an expression in invisible(), then it won’t be printed out to the console:

madhavaPI(n = 1000, verbose = TRUE)
## Madhava's approximation is:  3.140593.
## This is based on 1000 terms.

Nevertheless it is still returned, as we can see from the following code, in which the approximation is computed without any output to the console and stored in the variable p for use later on in a cat() statement.

p <- madhavaPI() # verbose is FALSE by default
cat("Pi plus 10 is about ", p + 10, ".", sep = "")
## Pi plus 10 is about 13.14159.
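
Two further ways to confirm that an invisible value really is returned are sketched below. Wrapping a call in parentheses makes R print the result after all, and the base R function withVisible() reports whether a value would have been printed. The definition of madhavaPI() is repeated so that the sketch runs on its own.

```r
# Definition repeated from above so that this sketch is self-contained.
madhavaPI <- function(n = 1000000, verbose = FALSE) {
  k <- 1:n
  terms <- (-1)^(k+1)*4/(2*k-1)
  approx <- sum(terms)
  if ( verbose ) {
    cat("Madhava's approximation is:  ", approx, ".\n", sep = "")
    cat("This is based on ", n, " terms.\n", sep = "")
  }
  invisible(approx)
}

(madhavaPI(n = 1000))   # parentheses force the value to print
## [1] 3.140593
withVisible(madhavaPI(n = 1000))$visible
## [1] FALSE
```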

4.2.4 Ifelse

The ifelse() function is a special form of the if-else construct that is used to make assignments, and is especially handy in the context of vectorization.

Suppose that you have a lot of heights:

height <- c(69, 67, 70, 72, 65, 63, 75, 70)

You would like to classify each person as either “tall” or “short,” depending on whether or not they are more than 70 inches in height. ifelse() makes quick work of it:

heightClass <- ifelse(test = height > 70, 
                      yes = "tall", no = "short")
heightClass
## [1] "short" "short" "short" "tall"  "short" "short" "tall"  "short"

Note that ifelse() takes three parameters:

  • test: the condition you want to evaluate;
  • yes: the value that gets assigned when test is true;
  • no: the value assigned when test is false.

Most programmers don’t name the parameters. This is fine—just remember to keep the test-yes-no order:

ifelse(height > 70, "tall", "short")
## [1] "short" "short" "short" "tall"  "short" "short" "tall"  "short"
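
Calls to ifelse() can also be nested when there are more than two categories. The following sketch adds a “medium” class. The 65-inch cutoff and the name heightClass3 are assumptions made only for this illustration.

```r
height <- c(69, 67, 70, 72, 65, 63, 75, 70)

# The inner ifelse() is consulted only where height > 70 is FALSE.
heightClass3 <- ifelse(height > 70, "tall",
                       ifelse(height > 65, "medium", "short"))
heightClass3
## [1] "medium" "medium" "medium" "tall"   "short"  "short"  "tall"   "medium"
```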

Here’s another example of the power of ifelse(). If a triangle has three sides of length \(x\), \(y\) and \(z\), then the sum of any two sides must be greater than the remaining side:

\[\begin{aligned} x + y &> z, \\ x + z &> y, \\ y + z &> x. \end{aligned}\]

This fact is known as the Triangle Inequality. It works the other way around, too: if three positive numbers are such that the sum of any two exceeds the third, then three line segments having those numbers as lengths could be arranged into a triangle.

We can write a function that, when given three lengths, determines whether or not they can make a triangle:

isTriangle <- function(x, y, z) {
  (x + y > z) & (x + z > y) & (y + z > x)
}

isTriangle() simply evaluates a Boolean expression involving x, y and z. It will return TRUE when the three quantities satisfy the Triangle Inequality; otherwise, it returns FALSE. Let’s try it out:

isTriangle(x = 3, y = 4, z = 5)
## [1] TRUE

Recall that Boolean expressions can involve vectors of any length. So suppose that we would like to know which of the following six triples of numbers could be the side-lengths of a triangle:

\[\begin{gathered} (2, 4, 5), (4.7, 1, 3.8), (5.2, 2.8, 12), \\ (6, 6, 13), (6, 6, 11), (9, 3.5, 6.2) \end{gathered}\]

We could enter the triples one at a time into isTriangle(). On the other hand, we could arrange the sides into three vectors of length six each:

a <- c(2, 4.7, 5.2, 6, 6, 9)
b <- c(4, 1, 2.8, 6, 6, 3.5)
c <- c(5, 3.8, 12, 13, 11, 6.2)

Then we can decide about all six triples at once:

isTriangle(x = a, y = b, z = c)
## [1]  TRUE  TRUE FALSE FALSE  TRUE  TRUE

We could also use ifelse() to create a new character-vector that expresses our results verbally:

triangle <- ifelse(isTriangle(a, b, c), "triangle", "not")
triangle
## [1] "triangle" "triangle" "not"      "not"      "triangle" "triangle"
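
Since isTriangle() returns a logical vector, its result can also be summarized directly. The sketch below repeats the definitions so that it runs on its own, then counts the valid triples with sum() and locates the invalid ones with which().

```r
# Definitions repeated from above so that this sketch is self-contained.
isTriangle <- function(x, y, z) {
  (x + y > z) & (x + z > y) & (y + z > x)
}
a <- c(2, 4.7, 5.2, 6, 6, 9)
b <- c(4, 1, 2.8, 6, 6, 3.5)
c <- c(5, 3.8, 12, 13, 11, 6.2)

sum(isTriangle(a, b, c))      # TRUE counts as 1, FALSE as 0
## [1] 4
which(!isTriangle(a, b, c))   # positions of the triples that fail
## [1] 3 4
```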

4.2.5 Switch

If you have to make a decision involving two or more alternatives you can use a chain of if ... else constructions. When the alternatives involve no more than the assignment of a value to a variable, you might also consider using the switch() function.

For example, suppose that you have days of the week expressed as numbers. Maybe it’s like this:

  • 1 stands for Sunday
  • 2 for Monday
  • 3 for Tuesday
  • and so on.

If you would like to convert a day-number to the right day name, then you could write a function like this:

dayWord <- function(dayNumber) {
  switch(dayNumber,
         "Sunday",
         "Monday",
         "Tuesday",
         "Wednesday",
         "Thursday",
         "Friday",
         "Saturday")
}
dayWord(3)
## [1] "Tuesday"

In switch() above, the first argument after dayNumber is what goes with 1, the second argument is what goes with 2, and so on.
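
It is worth knowing what happens when the number does not correspond to any of the alternatives. In that case switch() returns NULL invisibly, so nothing is printed at all. Here is a sketch, with dayWord() repeated so that it runs on its own:

```r
# Definition repeated from above so that this sketch is self-contained.
dayWord <- function(dayNumber) {
  switch(dayNumber,
         "Sunday",
         "Monday",
         "Tuesday",
         "Wednesday",
         "Thursday",
         "Friday",
         "Saturday")
}

dayWord(9)             # no ninth alternative: nothing is printed
result <- dayWord(9)
is.null(result)
## [1] TRUE
```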

When the item you want to convert is a string rather than a number, then the switch() function works a little bit differently. Suppose, for instance, that you want to abbreviate the names of the weekdays. You might write an abbreviation-function as follows:

abbrDay <- function(day) {
  switch(day,
         Monday = "Mon",
         Tuesday = "Tue",
         Wednesday = "Wed",
         Thursday = "Th",
         Friday = "Fri",
         Saturday = "Sat")
}
abbrDay("Wednesday")
## [1] "Wed"

In the above call to switch(), the weekday names you want to abbreviate appear as argument names, each attached to a character vector of length one. The value of each vector is what its name will be converted to.

When you are converting strings you have the option to provide a default conversion for values that don’t fit into the pattern you have in mind. All you have to do is to provide the default value as an additional argument. (It should NOT have a name.) Thus:

abbrDay <- function(day) {
  switch(day,
         Monday = "Mon",
         Tuesday = "Tue",
         Wednesday = "Wed",
         Thursday = "Th",
         Friday = "Fri",
         Saturday = "Sat",
         "not a weekday!")
}
abbrDay("Wednesday")
## [1] "Wed"
abbrDay("Neptune")
## [1] "not a weekday!"
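
When several strings should convert to the same value, switch() lets you leave an alternative empty: a match on an empty alternative falls through to the next one that has a value. The sketch below uses this feature in a hypothetical function dayKind(), which is not part of the examples above.

```r
# Hypothetical helper: classify a day name as weekday or weekend.
dayKind <- function(day) {
  switch(day,
         Monday = ,
         Tuesday = ,
         Wednesday = ,
         Thursday = ,
         Friday = "weekday",    # Monday through Friday all end up here
         Saturday = ,
         Sunday = "weekend",    # Saturday falls through to Sunday
         "not a day!")          # unnamed default
}

dayKind("Tuesday")
## [1] "weekday"
dayKind("Saturday")
## [1] "weekend"
dayKind("Neptune")
## [1] "not a day!"
```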

4.2.6 Practice Exercises

  1. Consider the following function:

    computeSquare <- function(x) {
      x^2
    }

    Write a “talky” version of the above function that returns the square invisibly. It should be called computeSquare2() and should take two parameters:

    • x: the number to be squared
    • verbose: whether or not to cat() out a report to the Console.

    Typical examples of use would be:

    computeSquare2(4)
    ## The square of 4 is 16.
    mySquare <- computeSquare2(6, verbose = FALSE)
    mySquare
    ## [1] 36
  2. Write a function called findKarl() that takes a character vector and returns a character vector that reports whether or not each element of the given vector was equal to the string "Karl". It should work like this:

    vec1 <- c("three", "blowfish", "Karl", "Grindel")
    findKarl(vec1)
    ## [1] "Sorry, not our guy." "Sorry, not our guy." "Yep, that's Karl!"  
    ## [4] "Sorry, not our guy."
  3. Here’s a function that is supposed to return "small!" when given a number less than 100, and return "big!" when the number is at least 100:

    sizeComment <- function(x) {
      if ( x < 100 ) {
        "small!"
      }
      "big!"
    }

    But it doesn’t work:

    sizeComment(200)  # this will be OK
    ## [1] "big!"
    sizeComment(50)   # this won't be OK
    ## [1] "big!"

    Fix the code.

  4. Add some validation to the isTriangle() function so that it stops the user if one or more of the parameters x, y and z cannot be interpreted as a positive real number.

4.2.7 Solutions to Practice Exercises

  1. Here’s the desired function:

    computeSquare2 <- function(x, verbose = TRUE) {
      if ( verbose ) {
        cat("The square of ", x, " is ", x^2, ".\n", sep = "")
      }
      invisible(x^2)
    }
  2. Here’s one way to write it:

    findKarl <- function(x) {
      ifelse(x == "Karl",
             "Yep, that's Karl!",
             "Sorry, not our guy.")
    }
  3. A function always returns the value of the last expression that it evaluates. As it stands, the function will always end at the line "big!", so "big!" will always be returned. One way to get the desired behavior is to force the function to stop executing as soon as it determines that the answer is "small!". You can do this with the return() function:

    sizeComment <- function(x) {
      if ( x < 100 ) {
        return("small!")
      }
      "big!"
    }

    Another way is to use the if ... else construction:

    sizeComment <- function(x) {
      if ( x < 100 ) {
        "small!"
      } else {
        "big!"
      }
    }
  4. Here is one approach:

    isTriangle <- function(x, y, z) {
      x <- suppressWarnings(as.numeric(x))
      y <- suppressWarnings(as.numeric(y))
      z <- suppressWarnings(as.numeric(z))
      xValid <- all(!is.na(x) & x > 0)
      yValid <- all(!is.na(y) & y > 0)
      zValid <- all(!is.na(z) & z > 0)
      if (!(xValid & yValid & zValid)) {
        return(cat("Sorry, all inputs must be positive real numbers.\n"))
      }
      (x + y > z) & (x + z > y) & (y + z > x)
    }

    Try it out:

    isTriangle(x = c(2,4,7),
               y = c(3, -4, 5),    # oops, a negative number
               z = c(6, 8, 10))
    ## Sorry, all inputs must be positive real numbers.
    isTriangle(x = c(2,4,7),
               y = c(3, 4, 5),     # fixed it
               z = c(6, 8, 10))
    ## [1] FALSE FALSE  TRUE

  13. Actually, you don’t need the brackets if you plan to put only one expression in them. Many people keep the brackets, though, for the sake of clarity.