3.6 A Note on Packages
We have seen that packages make up most of the search path when the active environment is the Global Environment. We have also mentioned a couple of packages explicitly: mosaicData and ggplot2, for example, back in Section 1.2.5. But what exactly is a package?
A package is a bundle of R code (usually functions) and data that is organized according to well-defined conventions and documented so that users can learn how to use the code and data. When someone bundles code into a package, it becomes easy to share it with others and to re-use it for one task after another.
3.6.1 Installed Packages
When you click on the Packages tab in the lower right-hand pane in R Studio, you can see a list of all the packages that are installed on the machine. You can get the same information by running the command:
installed.packages()[, c("Package", "Version")]
In fact R is really nothing but a collection of packages. Many of the R-functions you have been learning about come from the package base. This is one of a number of packages that are automatically attached to the search path when an R session begins. Other packages have to be attached by you if you want immediate access to all of the functions and data that they contain.
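To see which packages are attached at the moment, you can inspect the search path directly. The following is a minimal sketch that uses only functions that ship with R; the exact list you see depends on your session and on whether you are working in R Studio.

```r
# Show the search path: the Global Environment comes first,
# followed by the attached packages.
search_path <- search()
search_path

# Check whether a particular package is currently attached.
# The base package is always attached.
"package:base" %in% search_path

# Ask where R finds a given function (here, median).
find("median")
```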
In order to attach a package, you can click the little box next to its name in the Packages tab in R Studio, or you can attach it from the console with the command:
library(<name of package here>)
When you don’t want a package any more, you can detach it from the search path by un-clicking the little box, or by running this command:
detach("package:<name of package here>", unload=TRUE)
The package will still be installed, ready to be attached whenever you like.
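As a concrete illustration of attaching and detaching, here is a short sketch using splines, a package that ships with R but is not attached when a session begins. The choice of splines is only for illustration; any installed package would behave the same way.

```r
# Attach the splines package to the search path.
library(splines)
attached_after_library <- "package:splines" %in% search()
attached_after_library   # TRUE: the package is now on the search path

# Detach it again; unload = TRUE also unloads its namespace when possible.
detach("package:splines", unload = TRUE)
attached_after_detach <- "package:splines" %in% search()
attached_after_detach    # FALSE: no longer attached, but still installed
```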
3.6.2 Learning About a Package
You can learn about a package by clicking on its name in the Packages tab, or by using the command:
help(package = "<name of package>")
From the display that shows in the Help pane you can navigate to learn about each of the functions and data sets that come with the package.
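For instance, the sketch below explores datasets, a package that ships with R and is attached by default. The help calls are wrapped in a check for an interactive session, since help pages are only useful when there is a Help pane or pager to display them; that wrapper is a choice made for this example and is not required.

```r
# List the objects that an attached package makes available.
pkg_objects <- ls("package:datasets")
head(pkg_objects)

# Help pages are meant for interactive use, so only request them
# when someone is there to read them.
if (interactive()) {
  # Open the index page for the whole package.
  help(package = "datasets")

  # Open the help page for one item from the package.
  help("mtcars", package = "datasets")
}
```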
3.6.3 Installing Packages
You can also install additional packages on the computer. This can be done by clicking the Install button in R Studio and typing in the package name, or with the command:
install.packages("<name of package here>")
The package will be downloaded from the Comprehensive R Archive Network (CRAN) and installed in your Home directory.
As long as we are working on the R Studio server, it’s a good idea to refrain from installing packages yourself, unless they are packages that we don’t use in class and that you simply want to explore on your own. That’s because when you install your own packages on the server they go into a special directory in your Home folder and become part of your “User Library”. Packages that are installed by a system administrator for general use are in the “System Library”. If a package is in both your User Library and the System Library, then when you ask to attach it you will get the version that is in your User Library. Packages are updated from time to time, so the version in your User Library may come to differ from the one in the System Library. If that happens, the package might not work the same way for you as it does for the instructor and for other students, which can be confusing.
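If you are ever unsure which library a package is coming from, R can tell you. The following sketch uses only functions that ship with R, and uses the stats package as a stand-in; the directories printed will differ from one machine to another.

```r
# Directories where R looks for installed packages, in search order.
# If you have a User Library, it is typically listed before the System Library.
library_dirs <- .libPaths()
library_dirs

# The directory a package would be loaded from (the first match wins).
stats_location <- find.package("stats")
stats_location

# The version of the package found there.
packageVersion("stats")
```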
Eventually, though, you will install R and R Studio on your own computer, and then you will have to install many packages on your own.
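When that time comes, it can be convenient to install a package only if it is not already present. The sketch below shows one possible way to do this; the helper install_if_missing() is a hypothetical function written for this example and is not part of R or of any package.

```r
# Hypothetical helper (not part of R): install a package from CRAN
# only when it is not already installed.
install_if_missing <- function(pkg) {
  already_installed <- requireNamespace(pkg, quietly = TRUE)
  if (!already_installed) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package was already there.
  invisible(already_installed)
}

# stats ships with R, so nothing is downloaded here.
was_installed <- install_if_missing("stats")
was_installed
```

On the R Studio server the earlier advice still applies: a helper like this is best kept for your own computer.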
Not all packages come from CRAN: many useful packages exist on other repositories, including the very popular code repository known as GitHub. Special functions exist to install R-packages from GitHub. For example, you may eventually wish to install the package tigerData, which resides in a GitHub repository belonging to your instructor. In order to install it, you would use the install_github() function from the devtools package (Pruim, Kaplan, and Horton 2018), like this:
devtools::install_github(repo = "homerhanumat/tigerData")
There are a couple of things worth noting about the command above:
- The argument to repo has two parts: the word before the “/” is the username of the individual who owns the repository; the word after the “/” is the name of the repository itself. For R-packages on GitHub, the name of the repository is the same as the name of the package.
- The double-colon :: is used to access a function from a package, without having to attach the entire package. Thus devtools::install_github() refers to the function install_github() in package devtools. Similarly, if you want to access, say, just the Births78 data set from the mosaicData package then you could refer to it as mosaicData::Births78.
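The double-colon syntax can be tried out with packages that ship with R, so nothing needs to be installed first. The packages and inputs below are chosen only for illustration.

```r
# tools ships with R but is not attached by default;
# :: reaches into it without calling library(tools).
extension <- tools::file_ext("births.csv")
extension                       # "csv"

# The same syntax works for data sets, here mtcars from the datasets package.
first_rows <- head(datasets::mtcars, 3)
first_rows

# It also works for packages that are already attached; it is just more explicit.
middle_value <- stats::median(c(2, 7, 4))
middle_value                    # 4
```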
Pruim, Randall, Daniel Kaplan, and Nicholas Horton. 2018. mosaicData: Project MOSAIC Data Sets. https://CRAN.R-project.org/package=mosaicData.