1  Our First Computer Programs

The Yellow Brick Road, by Jorge Cruz. clipartfest.com.

1.1 Hello, World!

Let’s write our very first computer program. Type the following code into the console and press Enter:

cat("Hello, World!")

In the console you should see the following output:

Hello, World!

That’s it: you just wrote a computer program. A computer program is simply a sequence of instructions that performs a specific task when executed (carried out) by the computer. In the above example, there was only one instruction in the sequence: the command to reproduce the string “Hello, World!” in the console exactly as it is. The name of that command is cat.

A couple of things to note about your first program:

  • "Hello, World!" is known as a string. A string is how a computer represents text. In R, a string begins and ends with the quote-mark ". The quote-marks function as delimiters, meaning that they let R know where the string starts and where it stops. (So actually, the string is just what’s between the quotes: Hello, World!.)
  • In R, the command cat is a function. Just as in mathematics, a function in R is prepared to take input and to return some corresponding output. The cat function takes its input and prints it out to the console in a very simple, readable way. In this program, the input to cat is the string "Hello, World!".
  • To make the cat function do its work, the program had to call it. This was accomplished by following cat, the name of the function, with a matching pair of parentheses that contained the input between them. (So actually, because the parentheses tell R where the input begins and ends, they are another type of delimiter.)

Let’s try another small program.

Type the following code into the console and press Enter:

cat(2+2)

In the console you should see the following output:

4
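
To see the role that the quote-marks play, it may help to compare the same characters with and without them. The following is a small sketch, not part of the original examples:

```r
# With quote-marks, R treats 2+2 as a string and prints it unchanged.
cat("2+2")   # prints 2+2

# Without quote-marks, R evaluates the arithmetic first and prints the result.
cat(2+2)     # prints 4
```

In the first call the input is text; in the second it is a number that R computes before cat() ever sees it.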

We have been using the cat() function so far. Actually you can get output to the console without it. For example, suppose you type:

"Hello, World!"

Then in the console you see:

[1] "Hello, World!"

There are quote-marks around the text: that’s not so pretty, but we still get the basic output.

Similarly, you can try:

2 + 2
[1] 4

Notice that when we don’t use cat() the output begins with a strange [1]. In the next chapter we’ll learn why this happens; for now, just ignore it.

1.2 A Quick Tour

We now embark on a tour of some of R’s basic capabilities. In later chapters we will examine in detail the programming concepts that underlie the features we explore in the remainder of this Chapter.

1.3 Basic Arithmetic

R can be treated like a calculator. You can:

  • add numbers (+)
  • subtract numbers (-)
  • multiply numbers (*)
  • divide numbers (/)
  • raise a number to a power (^)

Just as on a graphing calculator, parentheses can be used to clarify the order of operations.

Here are some examples:

To get \(\frac{27-3}{10}\), use:

(27-3)/10
[1] 2.4
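
The parentheses matter here. As a quick check, compare the expression above with the same expression written without them; R, like a calculator, performs division before subtraction:

```r
(27 - 3) / 10   # subtraction first, then division: 2.4
27 - 3 / 10     # division first (3/10 = 0.3), then subtraction: 26.7
```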

To get \(3^2 + 4^2\) try:

3^2+4^2
[1] 25

Sometimes you’ll want to take roots. As with a calculator, you can accomplish this by raising your number to a fractional power. So if you want \(\sqrt[3]{64}\) then you could try:

64^(1/3)
[1] 4

If you would like square roots then you can either raise your number to the \(1/2\)-power or you could use R’s special square-root function:

sqrt(64)
[1] 8
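
As a brief illustration, the two approaches to square roots agree. The last line is a cautionary sketch of what happens if the parentheses around the fractional power are left out:

```r
64^(1/2)   # raise 64 to the 1/2 power: 8
sqrt(64)   # the square-root function gives the same value: 8
64^1/2     # without parentheses R computes (64^1)/2, which is 32
```

The power operator ^ is applied before division, so the parentheses around 1/2 are needed to get a root.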

One way or another, you can evaluate quite complex mathematical expressions. For example, to get \(\sqrt{3^2 + 4^2}\) simply type:

sqrt(3^2+4^2)
[1] 5

1.3.1 Read-Evaluate-Print-Loop

So far you have been using R in what computer scientists call interactive mode. This means that you type something in at the console; R immediately reads what you type, evaluates it, and prints the resulting value to the console for you to see. Then you type something else, and so on. This back-and-forth process is often called the Read-Evaluate-Print Loop, or REPL for short. R is one of several computer languages that make it easy for you to see the results of its computations in the console. That’s because it was originally designed for use by statisticians and data analysts, who often want to run a small procedure, check on the results and then try a new or related procedure and check on the results … until their analysis is complete. From our point of view as beginning programmers, though, the REPL makes it easy to see what R is doing and to get immediate feedback on the very simple programs that we are now writing.

1.4 Variables

Quite often you will want to use the same value several different times. You can do this by creating a variable with the assignment operator <-.

a <- 10

The previous statement puts the value 10 in the computer’s memory and causes the name a to be bound to it. This means that if you ask R to show you a, you’ll get that value:

a
[1] 10

Now you can use a as much as you like. Whenever you use it, R will know that it stands for the value 10:

a + 23
[1] 33
sqrt(a)
[1] 3.162278

Later on if you want to bind the name a to a different value, you can do so, with another assignment-statement:

a <- 4
a + 23
[1] 27

Let’s write some code to introduce creatures of various types. A creature should give his or her name, say what type of creature he or she is, and name a favorite food.

creatureType <- "Munchkin"
creatureName <- "Boq"
creatureFood <- "corn"

Notice that I chose variable-names that are descriptive of the values to which they are bound. That’s often a good practice.

Next, let’s combine our items into a greeting:

paste(
  "Hello, I am a ",
  creatureType,
  ". My name is ",
  creatureName,
  ".  I like to eat ",
  creatureFood,
  ".",
  sep = ""
)
[1] "Hello, I am a Munchkin. My name is Boq.  I like to eat corn."

We see that the paste() function puts strings together. The sep = "" argument at the end specifies that no space is to be inserted between the strings when they are combined.
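
To see why the sep argument is worth setting, here is a short sketch of what paste() does when sep is left out. In that case paste() inserts a single space between its inputs:

```r
creatureType <- "Munchkin"

# No sep argument: paste() separates its inputs with one space each,
# which adds an extra space after "a " and another before the period.
paste("Hello, I am a ", creatureType, ".")
# [1] "Hello, I am a  Munchkin ."
```

The doubled space and the stray space before the period are the reason for sep = "" in the greeting above.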

Another thing we notice in the previous code is that R can ignore white space: we were able to place the parts of the command on different lines. This helps prevent our lines from being too long, and allows us to arrange the code so that it’s easy to read.

Spaces do matter inside a string, though:

kalidah <- "Teddy"
kalidah
[1] "Teddy"
kalidah2 <- "Ted  dy"
kalidah2
[1] "Ted  dy"

You must also be careful not to insert spaces within the name of any object:

kali dah2
## Error: unexpected symbol in "kali dah2"

R got confused by the unexpected space: it knows about the name kalidah2, but kali dah2 means nothing to R.

Getting back to the Oz-creatures: it would be nice if a creature’s greeting could be split over several lines. This is possible if you use the special string “\n”, which produces a newline. Just incorporate it into your message, as follows:

paste(
  "Hello, I am a ",
  creatureType,
  ".\nMy name is ",
  creatureName,
  ".\nI like to eat ",
  creatureFood,
  ".",
  sep = ""
)
[1] "Hello, I am a Munchkin.\nMy name is Boq.\nI like to eat corn."

That doesn’t look like an improvement at all! But what if we were to cat() it?

message <- paste(
  "Hello, I am a ",
  creatureType,
  ".\nMy name is ",
  creatureName,
  ".\nI like to eat ",
  creatureFood,
  ".",
  sep = ""
)
cat(message)
Hello, I am a Munchkin.
My name is Boq.
I like to eat corn.

That’s much nicer.

That last example showed that you can use variables together with functions to create new variables. Here is another example:

a <- 10
b <- 27
mySum <- a + b
mySum
[1] 37
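
One point is easy to miss: a variable such as mySum holds the value that was computed at the moment of assignment. The following sketch suggests that binding a to a new value afterwards does not change mySum:

```r
a <- 10
b <- 27
mySum <- a + b   # mySum is bound to 37

a <- 4           # bind a to a different value
mySum            # still 37: it is not recomputed automatically
a + b            # a fresh computation uses the new value of a: 31
```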

1.4.1 More on cat() and paste()

The cat() function can deal with more than just one input. When you give it two or more inputs, it behaves just like the paste() function does: it puts them together (concatenates them, thus the name cat). Then, unlike the paste() function, it puts the result out to the console in its own simple way.

For example:

cat("Hello", "big", "beautiful", "world!")
Hello big beautiful world!

Notice that when cat() combines its inputs, it puts a single space between them. If you want something different in between, you can make use of the sep argument. For example, to put asterisks in between, write:

cat("Hello", "big", "beautiful", "world!", sep = "*")
Hello*big*beautiful*world!

To put three hyphens between the inputs, you would write:

cat("Hello", "big", "beautiful", "world!", sep = "---")
Hello---big---beautiful---world!

sep can be set to anything you like, including the empty string "": two quote-marks with nothing in between them. The empty string can be quite useful as a separator when you want to control exactly how your inputs go together, as in:

cat(
  "This is a long message so I am ",
  "breaking it up into several inputs. ",
  "\nI just moved to a new line, and now ",
  "for some\nhyphenation-",
  "experimentation!",
  sep = ""
)
This is a long message so I am breaking it up into several inputs. 
I just moved to a new line, and now for some
hyphenation-experimentation!

The paste() function uses the sep argument in the same way as cat().
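
Here is a brief sketch of that similarity, together with the one difference mentioned earlier: paste() hands back a string that you can store in a variable, whereas cat() only prints. The variable names greeting and printed are made up for this illustration.

```r
# paste() returns the combined string, so it can be stored and reused.
greeting <- paste("Hello", "big", "beautiful", "world!", sep = "*")
greeting
# [1] "Hello*big*beautiful*world!"

# cat() prints to the console but returns nothing useful (NULL).
printed <- cat("Hello", "world!", sep = "*")
is.null(printed)
# [1] TRUE
```

This is why the earlier examples built the message with paste() first and then passed it to cat().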

1.4.2 Practice Exercises

  1. With just one call to cat(), get the following output:
Hello,
big, wide
wonderful
world!

1.4.3 Solutions to the Practice Exercises

  1. There are many ways to do it. Here is one way:
cat(
  "Hello,\n",
  "big, wide\n",
  "wonderful\n",
  "world!",
  sep = ""
)
Hello,
big, wide
wonderful
world!
Here is another:
cat("Hello,\nbig, wide\nwonderful\nworld!")
Hello,
big, wide
wonderful
world!

1.5 Functions

Let’s say that we want to introduce George the Quadling. We might try:

creatureName <- "George"
creatureType <- "Quadling"
creatureFood <- "cookies"
cat(message)
Hello, I am a Munchkin.
My name is Boq.
I like to eat corn.

Hmm, that didn’t go so well: we got Boq instead. The problem is that the variable message was created using the original values of creatureName, creatureType and creatureFood, not the new values that we are interested in. To do it right we should have re-made message, as follows:

creatureName <- "George"
creatureType <- "Quadling"
creatureFood <- "cookies"
message <- paste(
  "Hello, I am a ",
  creatureType,
  ".  \nMy name is ",
  creatureName,
  ".\nI like to eat ",
  creatureFood,
  ".",
  sep = ""
)
cat(message)
Hello, I am a Quadling.  
My name is George.
I like to eat cookies.

That’s great, but it seems that every time we introduce a new creature we have to type a lot of code. It would be much better if we could find a way to re-use code, rather than repeating it.

Functions allow us to re-use code. Let’s define a function to do introductions:

intro <- function(name, type, food) {
  message <- paste(
    "Hello, I am a ",
    type,
    ".  \nMy name is ",
    name,
    ".\nI like to eat ",
    food,
    ".",
    sep = ''
  )
  cat(message)
}

In the console nothing happens. We have only created the function intro(); we haven’t called it yet. Let’s call intro:

intro(name = "Frederick", type = "Winkie", food = "macaroni")
Hello, I am a Winkie.  
My name is Frederick.
I like to eat macaroni.

R allows you to be lazy: you can omit the parameter names name, type and food, so long as you supply their values in the correct order:

intro("Frederick", "Winkie", "macaroni")
Hello, I am a Winkie.  
My name is Frederick.
I like to eat macaroni.
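
The order only matters when the names are left out. The sketch below repeats the definition of intro() from above so that it can be run on its own, then contrasts named inputs given in a shuffled order with unnamed inputs given in the wrong order:

```r
intro <- function(name, type, food) {
  message <- paste(
    "Hello, I am a ",
    type,
    ".  \nMy name is ",
    name,
    ".\nI like to eat ",
    food,
    ".",
    sep = ""
  )
  cat(message)
}

# Named inputs can come in any order: R matches each one by its name.
intro(food = "macaroni", name = "Frederick", type = "Winkie")

# Unnamed inputs are matched by position, so swapping the first two
# makes "Winkie" the name and "Frederick" the type.
intro("Winkie", "Frederick", "macaroni")
```

The first call produces the same greeting as before; the second introduces a creature of type Frederick whose name is Winkie.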

1.5.1 Practice Exercises

  1. Modify the function intro() so that the person who introduces herself says:
  • what her name is (name);
  • a favorite hobby (hobby);
  • a color that she does not like (disliked_color).

For example, the result of the following function call:

intro(
  name = "Maria",
  hobby = "throwing darts", 
  disliked_color = "turquoise"
)

should be:

Hello, my name is Maria!
One of my favorite hobbies is throwing darts.
A color that I do not like is turquoise.

1.5.2 Solutions to the Practice Exercises

  1. This will do:
intro <- function(name, hobby, disliked_color) {
  message <- paste(
    "Hello, my name is ",
    name,
    "!\nOne of my favorite hobbies is ",
    hobby,
    ".\nA color that I do not like is ",
    disliked_color,
    ".",
    sep = ""
  )
  cat(message)
}

To make sure, let’s try it out:

intro(
   name = "Maria",
   hobby = "throwing darts", 
   disliked_color = "turquoise"
)
Hello, my name is Maria!
One of my favorite hobbies is throwing darts.
A color that I do not like is turquoise.

To make extra sure, try it out with new inputs:

intro(
   name = "Thomas",
   hobby = "throwing hand grenades", 
   disliked_color = "pale green"
)
Hello, my name is Thomas!
One of my favorite hobbies is throwing hand grenades.
A color that I do not like is pale green.

1.6 Data and Graphics

Anyone can use R, but it was created for statisticians, so it has many features that are helpful in data analysis. Let’s take a quick look at a data set from a contributed R package, the package mosaicData (Pruim, Kaplan, and Horton 2023).

First, we’ll attach the package, so R can find all of the goodies it contains:

library(mosaicData)

Package mosaicData contains a number of interesting datasets that are useful in the teaching of statistics. Let’s look into one of them—Births78—using R’s help() function:

help("Births78")

Description

 Number of births in the United States. There are several data sets
 covering different date ranges and obtaining data from different
 sources.

Format

 A data.frame with the following 8 variables.
 
 'date' Date
 
 'births' Number of births on 'date' (integer)
 
 'wday' Day of week (ordered factor)
 
 'year' Year (integer)
 
 'month' Month (integer)
 
 'day_of_year' Day of year (integer)
 
 'day_of_month' Day of month (integer)
 
 'day_of_week' Day of week (integer)

Data Table 1.1

We learn that Births78 is a data frame containing information on the number of births each day during the year 1978. (A data frame is one of R’s most important data structures. We’ll learn more about them in Chapter 7.) The frame has 365 rows, one for each day in the year, and eight columns. Each column contains the values of a variable recorded for each day. Four of them are of particular interest here:

  • date: the calendar date of that day;
  • births: the number of children born in the United States on that day;
  • day_of_year: the number of the day within the year 1978 (1 being January 1, 2 being January 2, and so on);
  • wday: the day of week for that day (Sunday, Monday, etc.).

We can view the first few rows of the data frame using R’s head() function:

head(Births78, n = 10)
         date births wday year month day_of_year day_of_month day_of_week
1  1978-01-01   7701  Sun 1978     1           1            1           1
2  1978-01-02   7527  Mon 1978     1           2            2           2
3  1978-01-03   8825  Tue 1978     1           3            3           3
4  1978-01-04   8859  Wed 1978     1           4            4           4
5  1978-01-05   9043  Thu 1978     1           5            5           5
6  1978-01-06   9208  Fri 1978     1           6            6           6
7  1978-01-07   8084  Sat 1978     1           7            7           7
8  1978-01-08   7611  Sun 1978     1           8            8           1
9  1978-01-09   9172  Mon 1978     1           9            9           2
10 1978-01-10   9089  Tue 1978     1          10           10           3

We might wonder whether the number of births varies with the time of year. One way to investigate this question is to make a scatterplot, where the days of the year (numbered 1 through 365) are on the horizontal axis and the number of births for each day is on the vertical axis. Figure 1.1 shows such a plot.¹

Figure 1.1: A simple scatterplot with R’s ggplot2 graphics system.

Clearly the number of births varies seasonally: more babies are born in late summer and early fall, whereas spring births are not as frequent. But there is something mysterious about the plot: Why do there appear to be two clearly separated groups of days, one with considerably more births than the other? What is going on here? As we learn to program in R, we will gradually acquire the skills needed to answer this and many other intriguing questions.
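
The code behind Figure 1.1 is not shown in this chapter. For the curious, here is a minimal sketch of how a similar scatterplot might be made. It assumes that the ggplot2 and mosaicData packages are installed, and the name birthsPlot is made up for this illustration; the styling of the actual figure may differ.

```r
# Only attempt the plot if both packages are available on this computer.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("mosaicData", quietly = TRUE)) {
  library(ggplot2)
  library(mosaicData)

  # Day of the year on the horizontal axis, number of births on the vertical.
  birthsPlot <- ggplot(Births78, aes(x = day_of_year, y = births)) +
    geom_point()

  # Display the plot.
  print(birthsPlot)
}
```

Do not worry about the details yet: the sketch is only meant to suggest that a few lines of R are enough to turn a data frame into a graph.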

1.7 More in Depth

Read this section if you are keen to learn more about what you can do with R, as soon as possible.

1.7.1 Error Messages and Strings

It’s easy to make mistakes when you program—even when you are very experienced! Incorrect computer code is said to have a bug, and the art of finding bugs and correcting them is called debugging.

Consider the following code:

scarecrowQuote <- "It is such an uncomfortable feeling to know one is a fool."
paste("The Scarecrow says: ", scarecrowquote)

When we run it we get the following error message in the console:

Error: object 'scarecrowquote' not found

R’s error messages are often quite mysterious (they are intended to be maximally useful to experienced R programmers), but it’s always a good idea to read them anyway. In this case the message clearly tells us the problem: R cannot find the object scarecrowquote on its search path. This prompts us to look more closely at the name scarecrowquote, and sooner or later we will realize that we have a misspelling: the variable that was actually defined was scarecrowQuote, with a capital Q.

The correct code is:

scarecrowQuote <- "It is such an uncomfortable feeling to know one is a fool."
paste("The Scarecrow says: ", scarecrowQuote)
[1] "The Scarecrow says:  It is such an uncomfortable feeling to know one is a fool."
Important

Always bear in mind that R is case-sensitive!

Here’s another buggy bit of code:

SermonMountComment <- 
  paste0(
    "Oh, it's "blessed are the meek."",
    "\nI'm glad they are getting something:\n",
    "they have a hell of a time."
  )
cat(SermonMountComment)

The idea is to produce:

Oh, it's "blessed are the meek."
I'm glad they are getting something:
they have a hell of a time.

But when we run the code we get the following results instead:

SermonMountComment <- 
  paste0(
    "Oh, it's "blessed are the meek."",
    "\nI'm glad they are getting something:\n",
    "they have a hell of a time."
  )
Error in parse(text = input): <text>:3:16: unexpected symbol
2:   paste0(
3:     "Oh, it's "blessed
                  ^
cat(SermonMountComment)
Error: object 'SermonMountComment' not found

This can be a bit more difficult to read. The problems appear to start near the beginning of the construction of the string SermonMountComment.

After looking at it a while we focus on the first string argument to the paste0() function:

"Oh, it's "blessed are the meek.""

We see that this string has quotes within quotes. Now R uses quotes as delimiters for strings: that is, quote-marks indicate where a string begins and where it ends. Hence from R’s point of view, the first string consists of just: "Oh, it's ". But then there is no comma to separate this string from the next string argument that the paste0() function expects. Instead R sees the b in blessed; that’s an unexpected symbol. Things go downhill from there.

There are a couple of ways to correct the problem. One approach is to use single quotes inside any string that is delimited with double quotes, thus:

SermonMountComment <- 
  paste0(
    "Oh, it's 'blessed are the meek.'",
    "\nI'm glad they are getting something:\n",
    "they have a hell of a time."
  )
cat(SermonMountComment)
Oh, it's 'blessed are the meek.'
I'm glad they are getting something:
they have a hell of a time.

On the other hand if you really want those double-quotes inside the string, you can escape their special meaning as string-delimiter by prepending a backslash (\) to them, thus:

SermonMountComment <- 
  paste0(
    "Oh, it's \"blessed are the meek.\"",
    "\nI'm glad they are getting something:\n",
    "they have a hell of a time."
  )
cat(SermonMountComment)
Oh, it's "blessed are the meek."
I'm glad they are getting something:
they have a hell of a time.

There are a number of special characters that are formed by “escaping” the usual meaning of some other character. Some common examples are:

  • \n: produces a newline instead of n
  • \t: produces a tab-space instead of t
  • \": produces an actual quote-mark, instead of beginning or ending a string.
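
As a small sketch of these escapes in action, the code below uses \t to line up two columns. It also uses nchar(), a base R function not otherwise covered in this chapter that counts the characters in a string, to suggest that each escape, although typed with two keystrokes, counts as a single character:

```r
# \t inserts a tab between the columns; \n ends each line.
cat("Type\tName\nMunchkin\tBoq\n")

# Each escaped character counts as one character, not two.
nchar("\n")   # 1
nchar("\t")   # 1
nchar("\"")   # 1
```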

Strings are a tricky topic in any computer programming language: in fact we will devote all of Chapter 11 to them.

The Main Ideas of This Chapter

This Chapter is mostly an overview to get you interested in the prospect of programming, but there are a few things you should know for use in future chapters:

  • Study the definitions of key terms (see the Glossary in the next section).
  • Know how to use paste() and cat(), and know how to use sep with them when they have more than one input.
  • To make something a string, you have to put "s around it.
  • In a string, \n takes you to a new line.

Glossary

Computer Program

A sequence of instructions that performs a specific task when executed by a computer.

String

A value in a computer program that constitutes text (as opposed to numbers or some other type of data).

Interactive Mode

A type of engagement between a human and a computer in which the computer prompts the human for data and/or commands and may respond with output that the human can read and/or interpret.

Read-Evaluate-Print Loop

An interactive cycle in which the R-interpreter reads an expression from the console, evaluates it, and prints out the value to the console.

Data Structure

A particular way of organizing information in a computer program so that it can be used efficiently.

Delimiter

A character in a programming language that is used to mark the beginning and/or end of a value.

Exercises

Exercise 1

Write a program that modifies the function intro() (see Section 1.5) so that the person who introduces him or herself states a favorite sport. For example, the result of the following function call:

intro(name = "Bettina", type = "human", sport = "lacrosse")

should be:

Hello, I am a human.  
My name is Bettina.
My favorite sport is lacrosse.

Exercise 2

Write a program that modifies the function intro() (see Section 1.5) so that the person who introduces him or herself says what planet he or she is from. For example, the result of the following function call:

 intro(name = "Zarg", planet = "Jupiter", food = "rubber bands")

should be:

Hello, my name is Zarg!
My home planet is Jupiter.
I like to eat rubber bands.

Exercise 3

Write a program to produce the following output to the console:

*
**
***
**
*

Exercise 4

Write a program to produce the following output to the console:

meow!
hiss!  hiss!
meow!
hiss! hiss! hiss!
meow

Exercise 5

Suppose we want to cat “Hello, World” to the console, and we enter:

cat(Hello, World!)
What does R say?  What did we do wrong?

Exercise 6

Suppose we want to cat “Hello, World” to the console, and we enter:

cat["Hello, World!"]
What does R say?  What did we do wrong?

  1. The plot is made with the ggplot2 graphics package (Wickham et al. 2024). Graphing will not be a major focus of the course at first, but we will return to the subject of graphing in ggplot2 from time to time, as our need for graphs dictates.