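You use ttestGC() for inferential procedures regarding one population mean, the difference of two population means, or the mean of differences for paired measurements.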
The function comes from the tigerstats package, so make sure that tigerstats is loaded:
require(tigerstats)
Read this section carefully. It covers confidence levels, tests of significance, graphs of the \(P\)-value and the amount of printed output, in ways that apply to all uses of ttestGC().
In the m111survey data from the tigerstats package, suppose you want a 95%-confidence interval for:
\(\mu =\) the mean fastest speed ever driven, for all GC students.
Use the function:
ttestGC(~fastest,data=m111survey)
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## variable mean sd n
## fastest 105.9 20.88 71
##
##
## Inferential Results:
##
## Estimate of mu: 105.9
## SE(x.bar): 2.478
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## 100.959833 110.842984
To get any other level of confidence, use the conf.level argument, giving your desired level of confidence in decimal (not percentage) form.
For example, for a 90%-confidence interval for \(\mu\), use
ttestGC(~fastest,data=m111survey,conf.level=0.90)
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## variable mean sd n
## fastest 105.9 20.88 71
##
##
## Inferential Results:
##
## Estimate of mu: 105.9
## SE(x.bar): 2.478
##
## 90% Confidence Interval for mu:
##
## lower.bound upper.bound
## 101.771329 110.031488
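If you are curious where these bounds come from, here is a minimal sketch in base R. It is not the code of ttestGC(); it assumes the usual t-interval formula \(\bar{x} \pm t^* \cdot s/\sqrt{n}\) with \(n-1\) degrees of freedom, and it uses the rounded summary values printed above, so the results agree with the output only to about two decimal places. The helper name t_interval is made up for this illustration.

```r
# Summary statistics copied (rounded) from the ttestGC() output above.
xbar <- 105.9   # sample mean of fastest
s    <- 20.88   # sample standard deviation
n    <- 71      # sample size

# Hypothetical helper: two-sided t-interval for one mean.
t_interval <- function(xbar, s, n, conf.level = 0.95) {
  se    <- s / sqrt(n)                                # standard error of x-bar
  tstar <- qt(1 - (1 - conf.level) / 2, df = n - 1)   # t multiplier
  c(lower = xbar - tstar * se, upper = xbar + tstar * se)
}

ci95 <- t_interval(xbar, s, n)                     # default 95% level
ci90 <- t_interval(xbar, s, n, conf.level = 0.90)  # narrower 90% interval
print(ci95)
print(ci90)
```

Lowering the confidence level shrinks the multiplier returned by qt(), which is why the 90% interval is narrower than the 95% interval.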
If you want a test of significance as well as the confidence interval, then use:
- the mu argument to set the value of \(\mu\) under the Null Hypothesis;
- the alternative argument to specify the Alternative Hypothesis.
For example, if the hypotheses are:
\(H_0: \mu = 100\)
\(H_a: \mu > 100\)
Then use:
ttestGC(~fastest,data=m111survey,mu=100,alternative="greater")
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## variable mean sd n
## fastest 105.9 20.88 71
##
##
## Inferential Results:
##
## Estimate of mu: 105.9
## SE(x.bar): 2.478
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## 101.771329 Inf
##
## Test of Significance:
##
## H_0: mu = 100
## H_a: mu > 100
##
## Test Statistic: t = 2.382
## Degrees of Freedom: 70
## P-value: P = 0.009974
If the hypotheses are:
\(H_0: \mu = 100\)
\(H_a: \mu < 100\)
Then use:
ttestGC(~fastest,data=m111survey,mu=100,alternative="less")
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## variable mean sd n
## fastest 105.9 20.88 71
##
##
## Inferential Results:
##
## Estimate of mu: 105.9
## SE(x.bar): 2.478
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## -Inf 110.031488
##
## Test of Significance:
##
## H_0: mu = 100
## H_a: mu < 100
##
## Test Statistic: t = 2.382
## Degrees of Freedom: 70
## P-value: P = 0.99
If the hypotheses are:
\(H_0: \mu = 100\)
\(H_a: \mu \neq 100\)
Then use:
ttestGC(~fastest,data=m111survey,mu=100,
alternative="two.sided")
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## variable mean sd n
## fastest 105.9 20.88 71
##
##
## Inferential Results:
##
## Estimate of mu: 105.9
## SE(x.bar): 2.478
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## 100.959833 110.842984
##
## Test of Significance:
##
## H_0: mu = 100
## H_a: mu != 100
##
## Test Statistic: t = 2.382
## Degrees of Freedom: 70
## P-value: P = 0.01995
But note that the default value of alternative is “two.sided”, so you can leave it out and still get a two-sided test:
ttestGC(~fastest,data=m111survey,mu=100)
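The three \(P\)-values reported above differ only in which tail of the t-distribution is used. The following base R sketch illustrates this under the standard one-sample t-test formulas; it is not the internal code of ttestGC(), and it starts from the rounded summary values in the output, so expect agreement to roughly three significant figures.

```r
# Rounded summary statistics from the output above, and the null value.
xbar <- 105.9
s    <- 20.88
n    <- 71
mu0  <- 100

se    <- s / sqrt(n)         # standard error of x-bar
tstat <- (xbar - mu0) / se   # test statistic
df    <- n - 1               # degrees of freedom

# One P-value per choice of alternative.
p_greater   <- pt(tstat, df, lower.tail = FALSE)           # H_a: mu > 100
p_less      <- pt(tstat, df)                               # H_a: mu < 100
p_two_sided <- 2 * pt(abs(tstat), df, lower.tail = FALSE)  # H_a: mu != 100

print(c(t = tstat, greater = p_greater, less = p_less, two.sided = p_two_sided))
```

The "greater" and "less" \(P\)-values add to 1, and the two-sided \(P\)-value is twice the smaller of the two.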
Anytime you want, you can get a graph of the \(P\)-value for your test, simply by setting the argument graph to TRUE:
ttestGC(~fastest,data=m111survey,mu=100,
alternative="two.sided",graph=TRUE)
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## variable mean sd n
## fastest 105.9 20.88 71
##
##
## Inferential Results:
##
## Estimate of mu: 105.9
## SE(x.bar): 2.478
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## 100.959833 110.842984
##
## Test of Significance:
##
## H_0: mu = 100
## H_a: mu != 100
##
## Test Statistic: t = 2.382
## Degrees of Freedom: 70
## P-value: P = 0.01995
Sometimes you don’t need R to print so much information to the console. If you want only the basics (such as a confidence interval, the test statistic and \(P\)-value), then set the verbose argument to FALSE:
ttestGC(~fastest,data=m111survey,mu=100,
alternative="two.sided",verbose=FALSE)
##
##
## Inferential Procedures for One Mean mu:
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## 100.959833 110.842984
##
## Test Statistic: t = 2.382
## Degrees of Freedom: 70
## P-value: P = 0.01995
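Say that you have taken a simple random sample from some large population, and that you know only its summary statistics: for example, a sample mean of 30, a sample standard deviation of 4 and a sample size of 40.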
You don’t have the raw data present in a data frame, but you have enough summary data to use ttestGC(). You just have to set some new arguments:
- mean to the sample mean;
- sd to the sample standard deviation;
- n to the sample size.
So if you only want a 95%-confidence interval for \(\mu\), use:
ttestGC(mean=30,sd=4,n=40)
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## mean sd n
## 30 4 40
##
##
## Inferential Results:
##
## Estimate of mu: 30
## SE(x.bar): 0.6325
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## 28.720738 31.279262
If you also want to do a test of significance, again specify mu and alternative. For example, to test the hypotheses
\(H_0: \mu = 32\)
\(H_a: \mu < 32\)
use:
ttestGC(mean=30,sd=4,n=40,
mu=32,alternative="less")
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## mean sd n
## 30 4 40
##
##
## Inferential Results:
##
## Estimate of mu: 30
## SE(x.bar): 0.6325
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## -Inf 31.065609
##
## Test of Significance:
##
## H_0: mu = 32
## H_a: mu < 32
##
## Test Statistic: t = -3.162
## Degrees of Freedom: 39
## P-value: P = 0.001514
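Notice that the interval changed from two-sided to one-sided once alternative was set to "less". A plausible explanation, sketched below in base R, is that a one-sided bound puts the whole error probability in one tail, so it uses the quantile at conf.level itself rather than at 1 - (1 - conf.level)/2. This is an illustration under that assumption, not the source code of ttestGC().

```r
# Summary data from the example above.
xbar <- 30
s    <- 4
n    <- 40
conf.level <- 0.95

se <- s / sqrt(n)   # standard error of x-bar
df <- n - 1

# Two-sided interval: error probability split between both tails.
two_sided <- xbar + c(-1, 1) * qt(1 - (1 - conf.level) / 2, df) * se

# One-sided upper bound (matches alternative = "less"): all error in one tail.
upper_bound <- xbar + qt(conf.level, df) * se
one_sided   <- c(-Inf, upper_bound)

print(two_sided)
print(one_sided)
```

Because the one-sided multiplier is smaller, the finite bound sits closer to the sample mean than the corresponding two-sided bound.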
Suppose
\(\mu_1 =\) mean fastest speed ever driven, by all GC females
\(\mu_2 =\) mean fastest speed ever driven, by all GC males
If you desire, say, an 85%-confidence interval for \(\mu_1 - \mu_2\), then use:
ttestGC(fastest~sex,data=m111survey,
conf.level=0.85)
##
##
## Inferential Procedures for the Difference of Two Means mu1-mu2:
## (Welch's Approximation Used for Degrees of Freedom)
## fastest grouped by sex
##
##
## Descriptive Results:
##
## group mean sd n
## female 100.0 17.61 40
## male 113.5 22.57 31
##
##
## Inferential Results:
##
## Estimate of mu1-mu2: -13.4
## SE(x1.bar - x2.bar): 4.918
##
## 85% Confidence Interval for mu1-mu2:
##
## lower.bound upper.bound
## -20.579953 -6.223273
If you want a 95%-confidence interval for \(\mu_1 - \mu_2\) and you would like to test the hypotheses:
\(H_0: \mu_1 - \mu_2 = 0\)
\(H_a: \mu_1 - \mu_2 \neq 0\)
then use:
ttestGC(fastest~sex,data=m111survey,
mu=0)
##
##
## Inferential Procedures for the Difference of Two Means mu1-mu2:
## (Welch's Approximation Used for Degrees of Freedom)
## fastest grouped by sex
##
##
## Descriptive Results:
##
## group mean sd n
## female 100.0 17.61 40
## male 113.5 22.57 31
##
##
## Inferential Results:
##
## Estimate of mu1-mu2: -13.4
## SE(x1.bar - x2.bar): 4.918
##
## 95% Confidence Interval for mu1-mu2:
##
## lower.bound upper.bound
## -23.254640 -3.548586
##
## Test of Significance:
##
## H_0: mu1-mu2 = 0
## H_a: mu1-mu2 != 0
##
## Test Statistic: t = -2.725
## Degrees of Freedom: 55.49
## P-value: P = 0.008579
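Notice that this time:
- you had to set the mu argument;
- you did not have to set conf.level to 0.95 (the default value of the argument is 0.95 already);
- you did not have to set alternative (the default value of alternative is already “two.sided”).
Suppose that in the previous situation you had defined: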
\(\mu_1 =\) mean fastest speed ever driven, by all GC males
\(\mu_2 =\) mean fastest speed ever driven, by all GC females
Then for you, the first population is all GC males and the second population is all GC females. In order to guarantee that R abides by your choice, use the argument first:
ttestGC(fastest~sex,data=m111survey,
mu=0,first="male")
##
##
## Inferential Procedures for the Difference of Two Means mu1-mu2:
## (Welch's Approximation Used for Degrees of Freedom)
## fastest grouped by sex
##
##
## Descriptive Results:
##
## group mean sd n
## male 113.5 22.57 31
## female 100.0 17.61 40
##
##
## Inferential Results:
##
## Estimate of mu1-mu2: 13.4
## SE(x1.bar - x2.bar): 4.918
##
## 95% Confidence Interval for mu1-mu2:
##
## lower.bound upper.bound
## 3.548586 23.254640
##
## Test of Significance:
##
## H_0: mu1-mu2 = 0
## H_a: mu1-mu2 != 0
##
## Test Statistic: t = 2.725
## Degrees of Freedom: 55.49
## P-value: P = 0.008579
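Changing which group comes first flips the sign of the estimate, the test statistic and the interval bounds, but leaves the two-sided \(P\)-value unchanged. The sketch below shows the same effect with base R's t.test(), where group order follows the factor levels and relevel() plays a role loosely analogous to the first argument. The data are made up for illustration and are not taken from m111survey.

```r
# Hypothetical data, for illustration only (not from m111survey).
speed <- c(95, 100, 110, 90, 105, 120, 115, 130, 100, 125)
sex   <- factor(rep(c("female", "male"), each = 5))

# Levels are alphabetical, so this estimates female minus male.
res_female_first <- t.test(speed ~ sex)

# Make "male" the first level, so the difference becomes male minus female.
sex_male_first <- relevel(sex, ref = "male")
res_male_first <- t.test(speed ~ sex_male_first)

print(res_female_first$conf.int)
print(res_male_first$conf.int)
```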
Suppose that you have taken two independent samples from two populations (or performed a completely randomized experiment with two treatment groups), and you have the following summary data:
| Group | \(\bar{x}\) | \(s\) | \(n\) |
|---|---|---|---|
| group one | 32 | 4.2 | 33 |
| group two | 30 | 5.1 | 42 |
You need to provide the summary data to the arguments mean, sd and n, as vectors built with the c() function. In each vector, data from the first group should come first.
For a 95%-confidence interval for \(\mu_1 - \mu_2\), use:
ttestGC(mean=c(32,30),sd=c(4.2,5.1),n=c(33,42))
##
##
## Inferential Procedures for the Difference of Two Means mu1-mu2:
## (Welch's Approximation Used for Degrees of Freedom)
## Results from summary data.
##
##
## Descriptive Results:
##
## group mean sd n
## Group 1 32 4.2 33
## Group 2 30 5.1 42
##
##
## Inferential Results:
##
## Estimate of mu1-mu2: 2
## SE(x1.bar - x2.bar): 1.074
##
## 95% Confidence Interval for mu1-mu2:
##
## lower.bound upper.bound
## -0.140898 4.140898
Suppose that you want a 90%-confidence interval for \(\mu_1 - \mu_2\) and that you would like to test the hypotheses:
\(H_0: \mu_1 - \mu_2 = 0\)
\(H_a: \mu_1 - \mu_2 > 0\)
Suppose also that you want a graph of the \(P\)-value. Then use:
ttestGC(mean=c(32,30),sd=c(4.2,5.1),n=c(33,42),
mu=0,alternative="greater",
conf.level=0.90,graph=TRUE)
##
##
## Inferential Procedures for the Difference of Two Means mu1-mu2:
## (Welch's Approximation Used for Degrees of Freedom)
## Results from summary data.
##
##
## Descriptive Results:
##
## group mean sd n
## Group 1 32 4.2 33
## Group 2 30 5.1 42
##
##
## Inferential Results:
##
## Estimate of mu1-mu2: 2
## SE(x1.bar - x2.bar): 1.074
##
## 90% Confidence Interval for mu1-mu2:
##
## lower.bound upper.bound
## 0.610797 Inf
##
## Test of Significance:
##
## H_0: mu1-mu2 = 0
## H_a: mu1-mu2 > 0
##
## Test Statistic: t = 1.862
## Degrees of Freedom: 72.82
## P-value: P = 0.03333
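The fractional degrees of freedom come from Welch's approximation, which the output mentions. Here is a minimal base R sketch of the standard Welch formulas applied to the summary data above. It is meant to show where the printed numbers could come from, not to reproduce the internals of ttestGC().

```r
# Summary data for the two groups (group one first).
m <- c(32, 30)     # sample means
s <- c(4.2, 5.1)   # sample standard deviations
n <- c(33, 42)     # sample sizes

v  <- s^2 / n                          # per-group variance of the sample mean
se <- sqrt(sum(v))                     # SE(x1.bar - x2.bar)
df <- sum(v)^2 / sum(v^2 / (n - 1))    # Welch-Satterthwaite degrees of freedom

tstat     <- (m[1] - m[2] - 0) / se             # test statistic for H_0: mu1 - mu2 = 0
p_greater <- pt(tstat, df, lower.tail = FALSE)  # H_a: mu1 - mu2 > 0

# One-sided 90% lower bound for mu1 - mu2.
lower90 <- (m[1] - m[2]) - qt(0.90, df) * se

print(c(se = se, df = df, t = tstat, p = p_greater, lower = lower90))
```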
Suppose that
\(\mu_d =\) mean difference (ideal height minus actual height) for all Georgetown College students.
Both of the relevant variables, ideal_ht and height, are present in the m111survey data frame.
If you want a 95%-confidence interval for \(\mu_d\), then use:
ttestGC(~ideal_ht - height,data=m111survey)
##
##
## Inferential Procedures for the Difference of Means mu-d:
## ideal_ht minus height
##
##
## Descriptive Results:
##
## Difference mean.difference sd.difference n
## ideal_ht - height 1.946 3.206 69
##
##
## Inferential Results:
##
## Estimate of mu-d: 1.946
## SE(d.bar): 0.3859
##
## 95% Confidence Interval for mu-d:
##
## lower.bound upper.bound
## 1.175528 2.715776
Note how the “~” character signals the presence of a formula.
In order to test:
\(H_0: \mu_d = 0\)
\(H_a: \mu_d > 0\)
use:
ttestGC(~ideal_ht - height,data=m111survey,
mu=0,alternative="greater")
##
##
## Inferential Procedures for the Difference of Means mu-d:
## ideal_ht minus height
##
##
## Descriptive Results:
##
## Difference mean.difference sd.difference n
## ideal_ht - height 1.946 3.206 69
##
##
## Inferential Results:
##
## Estimate of mu-d: 1.946
## SE(d.bar): 0.3859
##
## 95% Confidence Interval for mu-d:
##
## lower.bound upper.bound
## 1.302075 Inf
##
## Test of Significance:
##
## H_0: mu-d = 0
## H_a: mu-d > 0
##
## Test Statistic: t = 5.041
## Degrees of Freedom: 68
## P-value: P = 1.826e-06
Sometimes the difference of the two relevant numerical variables is included in the data frame as another variable: such is the case for m111survey, where the difference is recorded as the variable diff.ideal.act. (the trailing period is part of the variable name).
If you would like to use this difference, then work just as if you were studying one population mean \(\mu\).
Thus, for a 95%-confidence interval for \(\mu_d\) we could also have done:
ttestGC(~diff.ideal.act.,data=m111survey)
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## variable mean sd n
## diff.ideal.act. 1.946 3.206 69
##
##
## Inferential Results:
##
## Estimate of mu: 1.946
## SE(x.bar): 0.3859
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## 1.175528 2.715776
For a test of
\(H_0: \mu_d = 0\)
\(H_a: \mu_d > 0\)
we could have done:
ttestGC(~diff.ideal.act.,data=m111survey,
mu=0,alternative="greater")
##
##
## Inferential Procedures for One Mean mu:
##
##
## Descriptive Results:
##
## variable mean sd n
## diff.ideal.act. 1.946 3.206 69
##
##
## Inferential Results:
##
## Estimate of mu: 1.946
## SE(x.bar): 0.3859
##
## 95% Confidence Interval for mu:
##
## lower.bound upper.bound
## 1.302075 Inf
##
## Test of Significance:
##
## H_0: mu = 0
## H_a: mu > 0
##
## Test Statistic: t = 5.041
## Degrees of Freedom: 68
## P-value: P = 1.826e-06
Note that in its statement of hypotheses, R identifies the parameter of interest as \(\mu\) rather than as \(\mu_d\). It had no way of knowing that diff.ideal.act. recorded the difference of a pair of measurements.
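The agreement between the two approaches is not a coincidence: a paired t-procedure is a one-sample t-procedure applied to the differences. The sketch below checks this with base R's t.test() on a small set of made-up heights (the vectors ideal and actual are hypothetical, not from m111survey).

```r
# Hypothetical paired measurements, for illustration only.
ideal  <- c(70, 68, 72, 65, 74, 69)
actual <- c(68, 67, 71, 66, 70, 69)

# Paired test on the two variables ...
paired_res <- t.test(ideal, actual, paired = TRUE, alternative = "greater")

# ... and a one-sample test on their differences.
diff_res <- t.test(ideal - actual, mu = 0, alternative = "greater")

# Both calls report the same statistic, degrees of freedom and P-value.
print(c(paired = unname(paired_res$statistic), diffs = unname(diff_res$statistic)))
print(c(paired = paired_res$p.value, diffs = diff_res$p.value))
```

The only thing that changes is the labeling of the parameter, which is the point made in the note above.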