10.2 The Pipe Operator

The pipe operator looks like this: %>%. It comes from the magrittr package.[26]

The pipe operator connects two function calls by making the value returned by the first call the first argument of the second call. Here’s an example:

"hello" %>% rep(times = 4)
## [1] "hello" "hello" "hello" "hello"

This is the same as the more familiar:

rep("hello", times = 4)
## [1] "hello" "hello" "hello" "hello"

Here’s another example:

# same as nrow(bcscr::m111survey)
bcscr::m111survey %>% nrow()
## [1] 71

Here are two pipes in succession:

"hello" %>% rep(times = 4) %>% length()
## [1] 4
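
To see what a chain like this is doing, it can help to unwind it into ordinary nested calls. The following is a small sketch, assuming magrittr has been attached with library(magrittr); the variable names piped and nested are ours, chosen only for illustration.

```r
library(magrittr)

# Two pipes, read from left to right:
piped <- "hello" %>% rep(times = 4) %>% length()

# The same computation as nested calls, read from the inside out:
nested <- length(rep("hello", times = 4))

identical(piped, nested)
## [1] TRUE
```

The piped version lists the steps in the order in which they happen, whereas the nested version lists them in reverse.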

By default the value of the left-hand call is piped into the right-hand call as the first argument. To pass it as some other argument instead, refer to it with the dot placeholder (.), for example:

4 %>% rep("hello", times = .)
## [1] "hello" "hello" "hello" "hello"
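
The dot can stand in for a named argument just as well as a positional one. Here is a short sketch contrasting the two behaviors, again assuming library(magrittr) has been run; with_dot and without_dot are illustrative names.

```r
library(magrittr)

# With the dot, the piped value becomes the sep argument:
with_dot <- "-" %>% paste("a", "b", sep = .)
with_dot
## [1] "a-b"

# Without the dot, the piped value becomes the first argument:
without_dot <- "-" %>% paste("a", "b")
without_dot
## [1] "- a b"
```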

Since subsetting is actually a function call under the hood, you can use the dot there, too:

# gets the third element of the sequence 1, 5, 9, ..., 97:
seq(1, 100, by = 4) %>% .[3]
## [1] 9
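
To make the "function call under the hood" concrete: the square brackets are a call to a function whose name is `[`. The sketch below, assuming library(magrittr) has been run, writes the same subsetting three ways; x is an illustrative name.

```r
library(magrittr)

x <- seq(1, 100, by = 4)

# Ordinary subsetting:
x[3]
## [1] 9

# The same thing written as an explicit call to the bracket function:
`[`(x, 3)
## [1] 9

# Since `[` is a function, it can also sit on the right of a pipe,
# where x is supplied as its first argument:
x %>% `[`(3)
## [1] 9
```

The dot form .[3] is usually easier to read, but the backtick form shows why it works.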

The pipe operator isn’t all that useful when you only use it once or twice in succession. Its true value becomes apparent when you chain together many manipulations of a data frame.
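
As a small taste of such a chain, here is a sketch that uses the built-in mtcars data frame (our choice for illustration, since it ships with R) and only base R functions on the right-hand sides. It assumes library(magrittr) has been run; nested and piped are illustrative names.

```r
library(magrittr)

# Nested calls: the steps must be read from the inside out.
nested <- head(subset(mtcars, cyl == 4)[, c("mpg", "wt")], n = 3)

# Piped: the steps read from top to bottom, in the order they are carried out.
piped <- mtcars %>%
  subset(cyl == 4) %>%       # keep the four-cylinder cars
  .[, c("mpg", "wt")] %>%    # keep two of the columns
  head(n = 3)                # keep the first three rows

identical(piped, nested)
## [1] TRUE
```

Both versions compute the same data frame; the piped one simply lays out one manipulation per line.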

10.2.1 Practice Exercises

  1. Rewrite the following call with the pipe operator, in three different ways:

    seq(2, 22, by = 4)
    ## [1]  2  6 10 14 18 22
  2. Consider mosaicData::CPS85:

    data("CPS85", package = "mosaicData")

    Use the pipe operator with subset() to find the row of mosaicData::CPS85 containing the worker who made more than 40 dollars per hour. Display only the sex, age and wage of the worker.

10.2.2 Solutions to the Practice Exercises

  1. Here are three ways:

    2 %>% seq(22, by = 4)
    22 %>% seq(2, ., by = 4)
    4 %>% seq(2, 22, by = .)
  2. Try this:

    CPS85 %>% 
      subset(wage > 40) %>% 
      .[, c("sex", "age", "wage")]
    ##     sex age wage
    ## 249   F  21 44.5
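
If you would like to check your answers to the first exercise, one possible approach is to compare each rewrite with the original call using identical(). This is a sketch assuming library(magrittr) has been run; target is an illustrative name.

```r
library(magrittr)

target <- seq(2, 22, by = 4)

# Each rewrite should reproduce the original call exactly:
identical(2 %>% seq(22, by = 4), target)
## [1] TRUE
identical(22 %>% seq(2, ., by = 4), target)
## [1] TRUE
identical(4 %>% seq(2, 22, by = .), target)
## [1] TRUE
```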

  26. magrittr is not attached when you load the tidyverse, but dplyr re-exports its pipe operator, so %>% is available once dplyr is attached.