6.5 Example: the Appeals Court Paradox

The following example is discussed in (J. Nahin 2008).

An appeals court generally consists of an odd number of justices, so that the court will not be hampered by tie votes and will always be able to render a decision in any case brought before it. Let’s imagine an appeals court that has five members. Let us further imagine that for each case before the court, each of the justices, independently of the others, decides for one or the other of the two opposing parties. Assume also, for the sake of simplicity, that there is always one side that is “in the right.” It follows that each judge makes a decision that is either correct or incorrect. The judges then report their decisions to one another, and the decision of the court is determined by majority vote. To be precise: whichever side gets three or more votes is the side that wins.

Although all of the members of the court are pretty sharp, they differ somewhat in their legal abilities:

- In any case that comes before her, Judge A has a 95% chance to decide correctly.
- Judge B has a 94% chance to judge correctly.
- Judges C and D each have a 90% chance to judge correctly.
- Judge E is the weak link on the court, with a mere 80% chance to judge correctly.

We are interested in estimating the probability that the majority opinion of the court will be correct.

Before we write a simulation in full, we should consider how we are going to simulate something that has a specified percentage chance of happening. For example, how would we simulate decisions made by Judge A?

One approach is to use runif() with a cutoff:

number <- runif(1) # random number between 0 and 1
number < 0.95

## [1] FALSE

In repeated trials, a random real number between 0 and 1 will turn out to be less than 0.95 about 95% of the time.
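To watch the 95% figure emerge, one can repeat the cutoff comparison many times and look at the proportion of TRUE results. The following is only an illustrative sketch: the seed and the names `trials` and `decisions` are arbitrary choices of ours, not part of the original example.

```r
set.seed(2024)     # arbitrary seed, used only for reproducibility
trials <- 10000    # number of simulated decisions (our choice)

# each comparison is TRUE with probability 0.95
decisions <- runif(trials) < 0.95

# proportion of TRUE values; this should land close to 0.95
mean(decisions)
```

Taking the mean of a logical vector works because R treats TRUE as 1 and FALSE as 0.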

Another, more convenient way is to use the function rbinom(). This function simulates the results of flipping a coin a number of times and counting how many heads one gets. The general form of a call to rbinom() is as follows:

rbinom(n, size, prob)

In this call:

- size is how many times you plan to flip the coin;
- prob is the chance on any flip that the coin will turn up heads;
- n is how many times you plan to repeat the process of flipping the coin size times (counting up the number of heads each time).

In order to simulate flipping a fair coin 100 times, you could ask for:

rbinom(n = 1, size = 100, prob = 0.5)

## [1] 52

It seems that we got 52 heads!

If you want to go through the above process twenty times instead of just doing it once, you could ask for:

rbinom(n = 20, size = 100, prob = 0.5)

##  [1] 52 49 45 57 53 60 46 53 50 54 59 47 42 45 41 56 47 57 56 49

Nothing in rbinom() says that it addresses only coin-flipping. In general it serves as a model for any situation in which:

- there is a fixed number of trials (e.g., flips of a coin, free-throw shots, decisions about court cases);
- each trial has two possible outcomes, often termed Success or Failure (e.g., the coin lands heads or tails, you make the shot or you don’t, you decide the case correctly or you don’t);
- the chance of the Success outcome is the same on every trial (e.g., the coin always has the same chance of coming up heads; for any case Judge A has a 95% chance of being right);
- the outcome of any trial is independent of the outcomes of all other trials (e.g., making the first free throw does not increase or decrease one’s chances of making the next free throw);
- you are counting up the number of successes in your trials.

In such a situation the count of the number of successes is called a binomial random variable.¹⁹ The size parameter in rbinom() gives the fixed number of trials. The prob parameter specifies the chance of success on each trial.
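As a rough check on the roles of size and prob, the average of many simulated counts should sit near size times prob, and no count can fall outside the range from 0 to size. The sketch below assumes an arbitrary seed and a variable name, `counts`, of our own choosing.

```r
set.seed(2024)   # arbitrary seed, used only for reproducibility

# 10,000 repetitions of "flip a fair coin 100 times and count the heads"
counts <- rbinom(n = 10000, size = 100, prob = 0.5)

mean(counts)     # should be close to size * prob = 50
range(counts)    # smallest and largest counts observed; always within 0..100
```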

In this example we would like to simulate each judge deciding about each case. Suppose, for example, that there are twenty cases. We would like to know, for each judge and each case, whether or not the judge was correct.

Suppose we were to watch Judge A making 20 decisions, recording each time whether she was correct or not. In order to simulate the results with R, we could ask for:

rbinom(n = 20, size = 1, prob = 0.95)

##  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1

In the above results, each 1 stands for a correct decision and the 0 stands for an incorrect decision.

Note that the following call, although it is valid R, would not give us what we need:

rbinom(n = 1, size = 20, prob = 0.95)

This would give us a count of the number of times Judge A was correct, but it would not tell us which of the cases she judged correctly and which she judged incorrectly.
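The per-case vector carries strictly more information than the single count: summing it recovers the count, while the positions of the zeros identify the cases that went wrong. A small sketch, in which the seed and the name `perCase` are our own:

```r
set.seed(2024)   # arbitrary seed, used only for reproducibility

# one 0/1 result for each of the 20 cases
perCase <- rbinom(n = 20, size = 1, prob = 0.95)
perCase

sum(perCase)          # total number of correct decisions (a single count)
which(perCase == 0)   # indices of the cases decided incorrectly, if any
```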

Assuming that the court hears 20 cases, we would like to simulate the decision of each judge in each case. We could store the results in variables, as follows:

a <- rbinom(n = 20, size = 1, prob = 0.95)  # Judge A
b <- rbinom(n = 20, size = 1, prob = 0.94)  # Judge B
c <- rbinom(n = 20, size = 1, prob = 0.90)  # Judge C
d <- rbinom(n = 20, size = 1, prob = 0.90)  # Judge D
e <- rbinom(n = 20, size = 1, prob = 0.80)  # Judge E

Now we come to the key idea in coding our simulation: in order to determine the total number of correct votes in each case, all we need to do is add the five vectors we created above:

correctVotes <- a + b + c + d + e

The court will decide correctly when the number of votes for the correct option is at least 3. The following logical vector records this:

courtCorrect <- (correctVotes >= 3)

Table 6.2 shows some possible results:

Table 6.2: Results of 20 simulated appeals court decisions.
A	B	C	D	E	Number Correct	Court Correct
1	1	1	0	0	3	TRUE
1	1	1	0	1	4	TRUE
1	1	1	1	1	5	TRUE
1	1	1	1	1	5	TRUE
1	1	1	1	1	5	TRUE
1	1	1	0	1	4	TRUE
1	1	1	1	1	5	TRUE
1	1	1	1	0	4	TRUE
1	1	1	1	1	5	TRUE
1	1	1	1	1	5	TRUE
1	1	1	0	1	4	TRUE
1	1	0	1	0	3	TRUE
1	1	1	1	1	5	TRUE
1	1	1	0	1	4	TRUE
0	1	1	1	1	4	TRUE
1	1	1	1	1	5	TRUE
1	1	1	1	1	5	TRUE
1	1	1	1	1	5	TRUE
1	1	1	1	0	4	TRUE
1	1	0	1	1	4	TRUE

Note that all of the court’s decisions were correct, even though in a couple of cases the correct decisions were barely in the majority.
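A table of this kind can be assembled by collecting the five vote vectors into a data frame. The following is a sketch rather than the code that produced Table 6.2: the seed and the names `results`, `NumberCorrect` and `CourtCorrect` are our own choices, so the rows it prints will differ from those shown above.

```r
set.seed(2024)   # arbitrary seed; rows will not match Table 6.2

a <- rbinom(n = 20, size = 1, prob = 0.95)  # Judge A
b <- rbinom(n = 20, size = 1, prob = 0.94)  # Judge B
c <- rbinom(n = 20, size = 1, prob = 0.90)  # Judge C
d <- rbinom(n = 20, size = 1, prob = 0.90)  # Judge D
e <- rbinom(n = 20, size = 1, prob = 0.80)  # Judge E

correctVotes <- a + b + c + d + e

# one row per case: the five votes, their total, and the court's result
results <- data.frame(
  A = a, B = b, C = c, D = d, E = e,
  NumberCorrect = correctVotes,
  CourtCorrect  = correctVotes >= 3
)
head(results)
```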

We now have the basic idea of the simulation. In order to estimate the probability of a correct decision, we simply need to recast our idea in the form of a function that permits us to simulate a very large number of imaginary court cases and that will report the results to us. Here is the code for such a function.

courtSim <- function(reps = 10000,
                     seed = NULL,
                     table = FALSE,
                     probs = c(0.95, 0.94, 0.90, 0.90, 0.80)) {
  
  if ( !is.null(seed) ) {
    set.seed(seed)
  }
  
  # get the probabilities
  aProb <- probs[1]
  bProb <- probs[2]
  cProb <- probs[3]
  dProb <- probs[4]
  eProb <- probs[5]
  
  # simulate decisions of each judge:                 
  a <- rbinom(n = reps, size = 1, prob = aProb)
  b <- rbinom(n = reps, size = 1, prob = bProb)
  c <- rbinom(n = reps, size = 1, prob = cProb)
  d <- rbinom(n = reps, size = 1, prob = dProb)
  e <- rbinom(n = reps, size = 1, prob = eProb)
  
  # count the number of correct votes in each case:
  correctVotes <- a + b + c + d + e
  
  # determine whether court decided correctly, in each case:
  courtCorrect <- (correctVotes >= 3)
  
  # record results
  if ( table ) {
    cat("Here is a table of the results:\n\n")
    print(table(courtCorrect))
    cat("\n")
  }
  cat("The proportion of times the court was correct was ", 
      mean(courtCorrect), ".\n", sep = "")
}

Let’s now estimate the probability of the court rendering a correct verdict, using one hundred thousand simulated cases:

courtSim(reps = 100000, seed = 3838, table = TRUE)

## Here is a table of the results:
## 
## courtCorrect
## FALSE  TRUE 
##   787 99213 
## 
## The proportion of times the court was correct was 0.99213.

The court seems to be doing quite well! But of course this is not very surprising: after all, most of the judges make the correct decision almost all the time. It is interesting, though, that the chance of the full court rendering a correct verdict is higher than the chance that any individual judge is correct. There appears to be a benefit to the voting procedure.
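One way to see the benefit of voting directly is to put each judge's observed accuracy next to the court's. The sketch below stores the votes in a matrix with one column per judge; the seed and the names `votes`, `judgeRates` and `courtRate` are assumptions of ours, not part of courtSim().

```r
set.seed(2024)   # arbitrary seed, used only for reproducibility
reps  <- 100000
probs <- c(A = 0.95, B = 0.94, C = 0.90, D = 0.90, E = 0.80)

# matrix of 0/1 votes: one row per simulated case, one column per judge
votes <- sapply(probs, function(p) rbinom(n = reps, size = 1, prob = p))

judgeRates <- colMeans(votes)             # each judge's observed accuracy
courtRate  <- mean(rowSums(votes) >= 3)   # accuracy of the majority vote

round(judgeRates, 3)
courtRate
```

With this many repetitions the court's rate should come out above even Judge A's.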

But things get even more interesting if we imagine that the judges recognize, after some time, that Judge E simply isn’t as sharp as Judge A, and that they pressure him into voting whichever way Judge A votes. “Surely,” they reason, “since Judge A is so good this new policy will increase our chance to hand down a correct verdict!”

We will simulate to see if the judges’ reasoning is correct. All we need to do is to count Judge A’s vote twice in the sum, and not count Judge E’s vote at all:

correctVotes <- 2*a + b + c + d
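To convince oneself that doubling Judge A's vote is the same as having Judge E copy it, one can construct E's votes explicitly and compare the two sums. This is only a sketch with an arbitrary seed; the names `withCopy` and `withDouble` are ours.

```r
set.seed(2024)   # arbitrary seed, used only for reproducibility

a <- rbinom(n = 20, size = 1, prob = 0.95)  # Judge A
b <- rbinom(n = 20, size = 1, prob = 0.94)  # Judge B
c <- rbinom(n = 20, size = 1, prob = 0.90)  # Judge C
d <- rbinom(n = 20, size = 1, prob = 0.90)  # Judge D
e <- a                                      # Judge E copies Judge A's vote

withCopy   <- a + b + c + d + e   # five votes, E identical to A
withDouble <- 2*a + b + c + d     # A counted twice, E left out

all(withCopy == withDouble)       # TRUE: the two tallies agree in every case
```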

Here is the new simulation-function:

courtSim2 <- function(reps = 10000,
                      seed = NULL,
                      table = FALSE,
                      probs = c(0.95, 0.94, 0.90, 0.90, 0.80)) {
  
  if ( !is.null(seed) ) {
    set.seed(seed)
  }
  
  # get the probabilities
  aProb <- probs[1]
  bProb <- probs[2]
  cProb <- probs[3]
  dProb <- probs[4]
  eProb <- probs[5]
  
  # simulate decisions (this time, no need for Judge E):
  a <- rbinom(n = reps, size = 1, prob = aProb)
  b <- rbinom(n = reps, size = 1, prob = bProb)
  c <- rbinom(n = reps, size = 1, prob = cProb)
  d <- rbinom(n = reps, size = 1, prob = dProb)
  
  # count the number of correct votes in each case:
  correctVotes <- 2*a + b + c + d
  
  # determine whether court decided correctly, in each case:
  courtCorrect <- (correctVotes >= 3)
  
  # record results
  if ( table ) {
    cat("Here is a table of the results:\n\n")
    print(table(courtCorrect))
    cat("\n")
  }
  cat("The proportion of times the court was correct was ", 
      mean(courtCorrect), ".\n", sep = "")
}

Let’s try it out:

courtSim2(reps = 100000, seed = 3838, table = TRUE)

## Here is a table of the results:
## 
## courtCorrect
## FALSE  TRUE 
##  1221 98779 
## 
## The proportion of times the court was correct was 0.98779.

Hey! That’s a lower chance of success than before! The difference is small but significant: compelling the weakest judge to vote with the strongest judge actually decreased the court’s chance of rendering a correct verdict overall. This circumstance is sometimes called the Appeals Court Paradox, but it occurs in many other practical situations. More often than you might think, the benefit of independent voting can outweigh the advantage associated with relying solely on a small number of “experts.”
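Because there are only 2^5 = 32 possible patterns of right and wrong votes, the two probabilities can also be computed exactly by enumeration, which confirms that the gap is not just simulation noise. The following sketch assumes, as the simulations do, that the judges decide independently; the names `outcomes`, `outcomeProb`, `independent` and `copying` are our own.

```r
probs <- c(A = 0.95, B = 0.94, C = 0.90, D = 0.90, E = 0.80)

# all 32 combinations of correct (1) and incorrect (0) decisions
outcomes <- as.matrix(expand.grid(A = 0:1, B = 0:1, C = 0:1, D = 0:1, E = 0:1))

# probability of each combination under independence
outcomeProb <- apply(outcomes, 1, function(v) {
  prod(ifelse(v == 1, probs, 1 - probs))
})

# independent voting: at least 3 of the 5 votes are correct
independent <- sum(outcomeProb[rowSums(outcomes) >= 3])

# E copies A: A's vote counts twice and E's own decision is ignored
weighted <- 2 * outcomes[, "A"] + outcomes[, "B"] +
  outcomes[, "C"] + outcomes[, "D"]
copying <- sum(outcomeProb[weighted >= 3])

c(independent = independent, copying = copying)
```

Under these assumptions the two values come out to roughly 0.9924 and 0.9875, in line with the simulated proportions.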

6.5.1 Practice Exercises

1. Use rbinom() to simulate taking twenty free throws, where you have a 70% chance to make each shot. Your simulation should produce a vector of 20 ones and zeros, where 1 stands for making the shot and 0 stands for missing it.
2. Use rbinom() to simulate taking twenty free throws, where you have a 70% chance to make each shot, and counting up how many shots you make. Your simulation should produce a vector of length one that represents the number of shots you made.
3. Every morning for 100 mornings in a row you take twenty free throws. You have a 70% chance to make each shot, and you always count up how many shots you make. Use rbinom() to simulate the counts. Your simulation should produce a vector of length 100 where the first element represents the number of shots you made on day 1, the second element represents the number of shots you made on day 2, etc.
4. You have a 70% chance to make each free throw shot you take. Your friend Lester has a 60% chance to make each shot. Both of you will take 20 shots. You are interested in knowing the probability that you will make more shots than Lester does. Write a function called freeThrowSim() that will help you estimate this probability. The function should take two parameters:
   - seed, so that the user can set a seed;
   - reps, the number of times to simulate a competition between you and Lester.

   The function should return the estimated probability.

6.5.2 Solutions to the Practice Exercises

Try this:

rbinom(20, size = 1, prob = 0.70)

##  [1] 0 0 0 0 1 0 1 0 1 1 1 0 0 1 0 0 1 1 1 1

Try this:

rbinom(1, size = 20, prob = 0.70)

## [1] 14

Try this:
```
rbinom(100, size = 20, prob = 0.70)
```

Here’s the function if you decide to use a loop:

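freeThrowSim <- function(reps = 10000, seed = NULL) {
  # set the seed only if the user supplies one
  if ( !is.null(seed) ) {
    set.seed(seed)
  }
  # youWin[i] records whether you out-shot Lester in competition i
  youWin <- logical(reps)
  for ( i in seq_len(reps) ) {
    youMake <- rbinom(1, size = 20, prob = 0.70)
    lesterMakes <- rbinom(1, size = 20, prob = 0.60)
    youWin[i] <- (youMake > lesterMakes)
  }
  # proportion of competitions that you won
  mean(youWin)
}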

Here’s the function if you vectorize:

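freeThrowSim <- function(reps = 10000, seed = NULL) {
  # set the seed only if the user supplies one
  if ( !is.null(seed) ) {
    set.seed(seed)
  }
  # simulate all competitions at once
  youMake <- rbinom(reps, size = 20, prob = 0.70)
  lesterMakes <- rbinom(reps, size = 20, prob = 0.60)
  youWin <- (youMake > lesterMakes)
  mean(youWin)
}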

Let’s try it:

freeThrowSim(reps = 100000, seed = 3535)

## [1] 0.69132

You have nearly a 70% chance to make more shots than Lester.
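The simulated estimate can be compared with an exact calculation that uses R's binomial functions dbinom() and pbinom(): for each possible number k of shots you make, multiply the chance that you make exactly k by the chance that Lester makes fewer than k, then add up the products. This is a sketch that assumes your shots and Lester's are independent; the names `pYou`, `pLesterFewer` and `exact` are ours.

```r
k <- 0:20   # possible numbers of shots made out of 20

# P(you make exactly k shots)
pYou <- dbinom(k, size = 20, prob = 0.70)

# P(Lester makes fewer than k shots), i.e. at most k - 1
pLesterFewer <- pbinom(k - 1, size = 20, prob = 0.60)

# P(you make more shots than Lester)
exact <- sum(pYou * pLesterFewer)
exact
```

The result should be close to the simulated proportion above.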

¹⁹ “Binomial” is borrowed from a Greek word whose literal meaning is “two names,” corresponding to the two possible outcomes of a trial.