6.1 Probability and Random Variables

Your favorite college basketball team is in the NCAA Tournament. What is the chance that it will make the Final Four?

An insurance company provides flood insurance to homeowners. How much can the company expect to pay out in insurance claims in the coming year?

What is the chance of a large asteroid striking our planet in the next ten years? in the next ten million years?

All of the above questions involve random processes—some of them rather complex. It is the aim of this Chapter to apply our knowledge of vectors, functions and flow control to give approximate answers to questions involving chance processes like these.

Sometimes we will be interested in finding the probability that an event of interest occurs: for example, the event that your team makes the final Four, or the event that the Earth is struck by a large asteroid sometime in the next ten years.

One popular definition of probability goes as follows:

If \(E\) is an event, then the probability that \(E\) occurs is the long-term proportion of times that the event occurs, if we could repeat the random process many, many times.

To be more precise, imagine that you can repeat the random process \(n\) times. (Imagine that your team gets into the NCAA Tournament \(n\) times, for example.) Each time, the event of interest will either occur or not occur. Count the number of times that the event occurs. Then divide by \(n\). You now have the proportion of times that it occurred. Now imagine that \(n\) gets larger and larger. Our intuition says that the proportion of times that the event occurs will stabilize at some number between 0 and 1. This number is the probability that the event occurs. occurs.

We will also concern ourselves with random variables. A random variable is simply a number whose value is determined by the outcome of a chance process. The amount that the insurance company will pay out in the next year is an example of a random variable: its value depends on a complex chance process—how many homeowners experience a flood, how damaging each flood is, etc.

When it comes to random variables, we are often interested in what it might turn out to be on average. That is, suppose we could repeat the random process many, many times—say \(n\) times, where \(n\) is some large number. (Suppose that we could observe the insurance company for many, many years, with each year being the same in terms of how many people are insured, how much their houses are worth, what the climate is like, and so on.) Each time the process is repeated, we get a value for the random variable. We end up with \(n\) values. Now add up these values and divide by \(n\). We have computed their average—the mean, as it is properly called. Now imagine that \(n\) could be larger and larger, without bound.

The expected value of the random variable is the number that this average converges to.

In other words, the expected value of a random variable is what we expect to get on average, in the long run, if we could repeat the random process that produces the value many, many times.