6.4 Example: Will They Connect?

Anna and Raj make a date for coffee tomorrow at the local coffee shop. After making the date, both of them forget the exact time they agreed to meet: they remember only that it was to be sometime between 10 and 11 a.m. Each person, independently of the other, randomly picks a time between 10 and 11 to arrive. If Anna arrives and Raj is not there, she will wait up to ten minutes for him, but will leave if he does not show within that time period. Raj is similarly disposed: he will wait ten minutes, but no more, for Anna. What is the chance that they meet?

The key to designing a simulation procedure is to realize that Anna and Raj will connect if and only if the difference between their arrival times is less than 10 minutes. It doesn’t matter who arrived first, so long as the difference is less than 10. This means that we want to check on the absolute value of Anna’s arrival time minus Raj’s arrival time. The following code implements this idea:

meetupSim <- function(reps = 10000, table = FALSE, seed = NULL) {
  if ( !is.null(seed) ) {
    set.seed(seed)
  }
  anna <- runif(reps, 0, 60)
  raj <- runif(reps, 0, 60)
  connect <- (abs(anna - raj) < 10)
  if ( table ) {
    cat("Here is a table of the results:\n\n")
    print(table(connect))
    cat("\n")
  }
  cat("The proportion of times they met was ", mean(connect), ".\n", sep = "")
}

Let’s try it out:

meetupSim(reps = 100000, table = TRUE, seed = 3939)
## Here is a table of the results:
## 
## connect
## FALSE  TRUE 
## 69781 30219 
## 
## The proportion of times they met was 0.30219.
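
The simulated proportion can be checked against an exact answer. Think of the pair of arrival times as a point chosen uniformly from a 60-by-60 square. Anna and Raj miss each other when their times differ by 10 minutes or more, which corresponds to two corner triangles with legs of length 50. The chance of missing is therefore 50^2/60^2 = 25/36, and the chance of meeting is 11/36, or about 0.3056. The sketch below is not part of meetupSim(): the names exact, estimate and se are chosen here purely for illustration. It repeats the simulation steps by hand and compares the estimate with the exact value, using the usual standard error of a proportion as a rough yardstick.

```r
# Exact chance of meeting: 1 minus the combined area of the two "miss"
# triangles, relative to the area of the 60-by-60 square.
exact <- 1 - (50 / 60)^2   # equals 11/36

# Repeat the simulation by hand (same steps as inside meetupSim):
set.seed(3939)
reps <- 100000
anna <- runif(reps, 0, 60)
raj <- runif(reps, 0, 60)
estimate <- mean(abs(anna - raj) < 10)

# Approximate standard error of a simulated proportion:
se <- sqrt(exact * (1 - exact) / reps)

cat("Exact probability: ", exact, "\n", sep = "")
cat("Simulated estimate: ", estimate, "\n", sep = "")
cat("Difference, in standard errors: ", (estimate - exact) / se, "\n", sep = "")
```

With 100,000 repetitions the standard error is roughly 0.0015, so the 0.30219 reported above is a bit more than two standard errors below 11/36: on the low side, but within the range that chance variation can produce.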

6.4.1 Vectorization vs. Looping

Whenever you perform a simulation you have the option to use a for-loop, running through the loop once for each repetition of the random process in question. For example, you could rewrite the meetup-simulation as follows:

meetupSim2 <- function(reps = 10000, table = FALSE, seed = NULL) {
  if ( !is.null(seed) ) {
    set.seed(seed)
  }
  # create an empty logical vector to hold the results:
  connect <- logical(reps)
  # loop through:
  for ( i in 1:reps ) {
    # get one arrival time for anna:
    anna <- runif(1, 0, 60)
    # and one for raj:
    raj <- runif(1, 0, 60)
    # compute result and record in connect:
    connect[i] <- (abs(anna - raj) < 10)
  }
  # the rest is the same as in meetupSim:
  if ( table ) {
    cat("Here is a table of the results:\n\n")
    print(table(connect))
    cat("\n")
  }
  cat("The proportion of times they met was ", mean(connect), ".\n", sep = "")
}

Be aware, though, that when you use a loop in place of vectorization your routine is liable to run more slowly. We can measure the difference with the R-function system.time(). As an illustration, let’s compare the running times for the vectorized and the looping versions of the meetup simulation.

First we get the time for ten thousand repetitions of the meetup, using the original vectorized function:

system.time(meetupSim(reps = 10000, seed = 4040))
## The proportion of times they met was 0.3079.
##   user  system elapsed 
##  0.002   0.001   0.003 

Our concern is with the total elapsed time: a mere three-thousandths of a second!

Next we get the time for the same number of repetitions, using the looping implementation:

system.time(meetupSim2(reps = 10000, seed = 4040))
## The proportion of times they met was 0.2993.
##    user  system elapsed 
##  0.056   0.007   0.065 

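Here the elapsed time is 0.065 seconds. This still seems pretty fast, but it’s more than 20 times as long as for the vectorized simulation. For more complex simulations such a dramatic slowdown could pose serious practical problems. (The two estimates differ even though the seed is the same, because the two functions consume the random numbers in a different order: the vectorized version generates all of Anna’s times and then all of Raj’s, whereas the loop alternates between them.)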
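
Elapsed times this small can vary noticeably from one run to the next, so a single call to system.time() gives only a rough picture. One way to get a steadier comparison is to time each version several times and take the median. The following is a minimal sketch under that assumption; meetVectorized(), meetLooped() and timeIt() are hypothetical helpers written for this illustration (they return values instead of printing, so that the timing is not cluttered with output), and the timings you see will depend on your machine.

```r
# Hypothetical quiet versions of the two simulations:
# each returns the proportion of meetings instead of printing it.
meetVectorized <- function(reps) {
  anna <- runif(reps, 0, 60)
  raj <- runif(reps, 0, 60)
  mean(abs(anna - raj) < 10)
}

meetLooped <- function(reps) {
  connect <- logical(reps)
  for ( i in seq_len(reps) ) {
    anna <- runif(1, 0, 60)
    raj <- runif(1, 0, 60)
    connect[i] <- (abs(anna - raj) < 10)
  }
  mean(connect)
}

# Time f(reps) several times and return the median elapsed time (seconds).
timeIt <- function(f, reps, trials = 5) {
  times <- replicate(trials, system.time(f(reps))["elapsed"])
  median(times)
}

vecTime <- timeIt(meetVectorized, reps = 100000)
loopTime <- timeIt(meetLooped, reps = 100000)
cat("Median elapsed time, vectorized: ", vecTime, " seconds\n", sep = "")
cat("Median elapsed time, looping: ", loopTime, " seconds\n", sep = "")
```

Using more repetitions per call, as here, also helps: the longer each run takes, the less the result is affected by the limited resolution of the timer.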

When performance is an issue, prefer vectorization to looping as much as possible.