We assume that the instructor is familiar with R, and has a basic understanding of the mosaic
package and its use in the teaching of statistics. Consult, e.g., the mosaic
vignette “Start Teaching Statistics Using R.”
In its current version package tigerstats
depends on the packages mosaic
, mosaicData
and abd
. All of these will be attached when one attaches tigerstats
:
library(tigerstats)
tigerstats
is used in an elementary classroom setting where students are taught to choose descriptive and inferential methods based upon an analysis of the variables that pertain to a question of interest. Thus if a student finds that she is interested in the relationship between two factor variables, then she knows that she can examine the question descriptively using bar charts, two-way tables, etc., whereas histograms, density plots and the like are not appropriate. The vignettes:
serve for students as a reminder of which procedures are appropriate for various types of questions about data (descriptive) or the population/processe from which the data is derived (inferential). Prospective instructors should consult these documents to get a sense of how the major package functions figure into an elementary course.
There are also vignettes to guide students on the use of specific functions that appear frequently in an elementary course. Some vignettes cover functions from R’s stats
package, others cover graphical functions from the lattice
package (although when tigerstats
is attached some of these, e.g., histogram()
actually call package mosaic
wrappers for the corresponding lattice
function). Of course some vignettes also cover functions from tigerstats
itself.
Wrapper-functions in tigerstats
always have names distinct from the function around which they wrap. For example, the tigerstats
function for t-procedures is ttestGC()
, not t.test()
. Hence no function from tigerstats
masks any other function from stats
, lattice
, or mosaic
. Thus if you call, for example, t.test()
when tigerstats
is attached, you will get mosaic
’s wrapper-function for the standard t-procedure function t.test()
in R’s package stats
.
As in package mosaic
, an attempt is made to provide a fairly uniform interface for all major functions:
myData
, the general format is:\[function(\sim var, data=myData,\ldots)\]
\[function(num \sim fac, data=myData,\ldots)\]
\[function(\sim fac1 + fac2, data=myData,\ldots)\]
This last format is in analogy with the formula interface used by R’s xtabs()
function, e.g.,
xtabs(~sex+seat,data=m111survey)
## seat
## sex 1_front 2_middle 3_back
## female 19 16 5
## male 8 16 7
\[function(y \sim x, data=myData,\ldots)\]
In any of the above cases, if the data
argument is not supplied or if a variable is not found in data
, then it will be searched for on a path that begins with the environment in which the function was called (typically the Global Environment, if the function is being used interactively). In particular, not every variable in the formula has to be present in the data frame that is typically supplied to data
. This mimics the behavior of R’s stats
functions.
Students meet the functions first in class or while reading Course Notes or studying slides associated with the course. As they become more familiar with the functions they are encouraged to used help features to refresh their memory quickly. As is typical with Help pages in an R package, the help for functions is often more helpful to an instructor or a developer than to a student. On the other hand, the examples for the major functions are intended to be fairly comprehensive, so calling example()
for a particular tigerstats
function is likely to be useful.
Students and instructors who desire a more extended refresher or a tutorial that covers most of the common uses of a given function should consult the vignette for that function. If you know the name of the function you are interested in, then you can access the vignette quickly using helpGC()
, thus:
helpGC(bwplot)
or
helpGC(ttestGC)
tigerstats
includes a number of instructional applets.
Some of the applets are Shiny apps:
CentralLiimit
: an exploration of the Central Limit Theorem.CIMean
: exploration of the coverage properties of t-interval for one population mean , when the standard deviation of the population is unknown.CoinFlip
: an exploration of inference in a binomial situation that can be used to introduce hypothesis testing early in a course.FindRegLine
: yet another game in which one tries to approximate the regression line. Score is kept.RandomExpBinom
: an exploration of a randomized experiments with two treatment groups, in which the response is Bernoulli. This can be used to explore inference as soon as randomized experimental designs are introduced in class.SamplingMethods
: This allows students to distinguish visually between simple random sampling, stratified sampling, and cluster sampling.ShallowReg
: Sometimes when students see a scatter plot that is based on a random sample from a bi-variate normal distribution, they comment that the regression line for the data looks “too shallow.” This app attempts to explain why the regression line is the preferred line for predicting \(y\)-values from \(x\)-values, even though the “SD line” (the line passing through \((\bar{x},\bar{y})\) and having slope \(s_y/s_x\)) sometimes appears to be the line that best summarizes the scatter plot.SlowGoodness
: This app introduces students to the chi-square test for goodness of fit.Type12Errors
: Exploration of Type-I and Type-II errors, in the context of inference for one population mean when the standard deviation is unknown.Any of the above apps may be run locally, using the shiny
package that is installed along with tigerstats
. For example, to run SlowGoodness
simply type:
shiny::runApp(system.file("SlowGoodness",package="tigerstats"))
In classroom settings one is often using the server version of the R Studio IDE. In this case, depending on how permissions are set by the sysadmin, it may not be possible to run Shiny apps locally. In that case you should install the apps on a Shiny server for remote access. Obtain the source code for the apps from the inst
directory of the tigerstats
package on Git Hub:
Some other apps make use of the manipulate
package that comes with R Studio, and can only be run in this environment. Some of the apps most likely to be used are:
BinomNorm()
and BinomSkew()
for exploring the binomial family of distributions;CIMean()
is a manipulate alternative to the Shiny app of the same nameCIProp()
explores coverage properties for confidence intervals for a single proportionDtrellHist()
and DtrellScat()
provide a dynamic form of trellis-graphics, for exploration of data conditioned upon the values of a external variable.EmpRule()
helps students understand when the 68-95 Rule can be trusted, and when it cannot.EmpRuleGC()
is a graphical calculator for the 68-95 Rule. Some students find it amusing to be able to use it.FindRegLine()
is the poor man’s version of the Shiny app of the same name.MeanSampler()
is a poor man’s version of the Shiny app CIMean
.SlowGoodness()
is the poor man’s version of the Shiny app of the same name.Type12Errors()
is the poor man’s version of the Shiny app of the same name.The function ChisqSimSlow()
is used to introduce the chi-square test for association between two factor variables. It does not require package manipulate
.
mosaic
and tigerstats
are used to teach elementary statistics at Georgetown College. The elementary course at Georgetown serves a dual purpose, as a recommended option for fulfilling the general education mathematics requirement and as a service course for several majors. Many of the course materials are online at:
http://statistics.georgetowncollege.edu
The reader may wish to consult this website, especially the Course Notes, to see how mosaic
and tigerstats
support the selection and arrangement of topics that are particular to this course.