In this post, we'll briefly learn how to run t-tests on given data sets in R. The tutorial covers:
- Usage of the t.test() command
- Null hypothesis
- T-distribution table
Usage of the t.test() command
We can run a t-test with the t.test() function in R. A simple call looks like this:
t.test(rnorm(10)+5, mu = 4)
One Sample t-test
data: rnorm(10) + 5
t = 2.1038739, df = 9, p-value = 0.06471015
alternative hypothesis: true mean is not equal to 4
95 percent confidence interval:
3.940511891 5.640899209
sample estimates:
mean of x
4.79070555
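Here, we've run a one-sample test on 10 randomly generated numbers against a hypothesized mean of mu = 4. Because no seed is set, the numbers differ on every run. The output fields are:
t - the value of the t statistic,
df - the degrees of freedom,
p-value - the probability of seeing a t statistic at least this extreme if the null hypothesis is true (about 6.5% here),
alternative hypothesis - a description of the alternative hypothesis,
95 percent confidence interval - the 95% confidence interval for the mean.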
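The fields described above can also be read programmatically, since t.test() returns a list-like object of class "htest". The sketch below is only an illustration: the seed value and the names x, res, and t_manual are assumptions made here for reproducibility and are not part of the original example.

```r
set.seed(1)                      # assumed seed, so the run is repeatable
x <- rnorm(10) + 5               # 10 random values centered near 5
res <- t.test(x, mu = 4)         # one-sample test against mu = 4

res$statistic                    # t value
res$parameter                    # degrees of freedom (n - 1 = 9)
res$p.value                      # two-sided p-value
res$conf.int                     # 95 percent confidence interval for the mean
res$estimate                     # sample mean

# The t value by hand: (sample mean - mu) / standard error
t_manual <- (mean(x) - 4) / (sd(x) / sqrt(length(x)))
all.equal(unname(res$statistic), t_manual)
```

Reading the values from the result object avoids copying numbers out of the printed output.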
Next, we'll generate two sets of data to compare.
set.seed(123)
a = rnorm(10)+10
print(a)
[1] 9.439524353 9.769822511 11.558708314 10.070508391
[5] 10.129287735 11.715064987 10.460916206 8.734938765
[9] 9.313147148 9.554338030
b = rnorm(10)+11
print(b)
[1] 12.224081797 11.359813827 11.400771451 11.110682716
[5] 10.444158865 12.786913137 11.497850478 9.033382843
[9] 11.701355902 10.527208592
Next, we compare a and b with the t.test() function, setting the var.equal (equal variances) parameter to TRUE.
t.test(a, b, var.equal = TRUE)
Two Sample t-test
data: a and b
t = -2.543782, df = 18, p-value = 0.02036269
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.0705694499 -0.1974231836
sample estimates:
mean of x mean of y
10.07462564 11.20862196
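For comparison, t.test() defaults to var.equal = FALSE, which runs Welch's test and does not assume equal variances. The sketch below regenerates the same a and b and runs both variants. The names pooled and welch are chosen here for illustration.

```r
set.seed(123)
a <- rnorm(10) + 10
b <- rnorm(10) + 11

pooled <- t.test(a, b, var.equal = TRUE)   # Student's test, df = n1 + n2 - 2
welch  <- t.test(a, b)                     # Welch's test (the default)

pooled$parameter   # 18
welch$parameter    # Welch-Satterthwaite df, never more than n1 + n2 - 2
welch$method       # name of the test that was actually run
```

The two variants can differ in the degrees of freedom and therefore in the p-value, so it helps to state which one was used when reporting a result.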
With var.equal = TRUE and equal sample sizes, as here, the t value can be calculated with the formula below.
t_value = (mean(a)-mean(b))/sqrt(sd(a)^2/length(a)+sd(b)^2/length(b))
print(t_value)
[1] -2.543781976
The result matches the t value reported by the t.test() function.
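For samples of unequal size, the equal-variance t value uses the pooled variance instead. A minimal sketch of that general form, together with the two-sided p-value computed from pt(), might look like this. The names na, nb, sp2, t_pooled, and p_manual are introduced here for illustration.

```r
set.seed(123)
a <- rnorm(10) + 10
b <- rnorm(10) + 11
na <- length(a)
nb <- length(b)

# Pooled variance: the sample variances averaged with weights (n - 1)
sp2 <- ((na - 1) * var(a) + (nb - 1) * var(b)) / (na + nb - 2)

# Equal-variance t statistic and its two-sided p-value
t_pooled <- (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
p_manual <- 2 * pt(-abs(t_pooled), df = na + nb - 2)

# Both should agree with what t.test() reports
res <- t.test(a, b, var.equal = TRUE)
all.equal(t_pooled, unname(res$statistic))
all.equal(p_manual, res$p.value)
```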
Null hypothesis
The null hypothesis is a central concept in statistical testing, and understanding it is necessary to interpret a t-test. The null hypothesis, H0, states that the means of the two populations are equal. The alternative hypothesis, HA or H1, states that they are not. The p-value of the t-test measures how compatible the data are with H0: the smaller it is, the stronger the evidence for the alternative.
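In practice, the choice between H0 and H1 is usually made by comparing the p-value with a chosen significance level. The sketch below assumes the conventional level alpha = 0.05, which is an assumption of this example rather than something fixed by the test, and reuses the a and b samples generated earlier.

```r
set.seed(123)
a <- rnorm(10) + 10
b <- rnorm(10) + 11

alpha <- 0.05                              # assumed significance level
res <- t.test(a, b, var.equal = TRUE)

# Reject H0 when the p-value falls below the chosen level
reject_h0 <- res$p.value < alpha
if (reject_h0) {
  cat("Reject H0: the means differ at the", alpha, "level\n")
} else {
  cat("Fail to reject H0 at the", alpha, "level\n")
}
```

With the p-value of about 0.02 reported for a and b, H0 is rejected at the 5 percent level.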
T-distribution table
In R, we can get the values of a t-distribution table with the qt() function by specifying a probability and the degrees of freedom. The following call returns the one-tailed critical t value at the five percent level (the 0.95 quantile) for 10 degrees of freedom.
qt(0.95, df=10)
[1] 1.812461123
The same critical value for degrees of freedom from 1 to 20:
qt(0.95, df=1:20)
[1] 6.313751515 2.919985580 2.353363435 2.131846786 2.015048373
[6] 1.943180281 1.894578605 1.859548038 1.833112933 1.812461123
[11] 1.795884819 1.782287556 1.770933396 1.761310136 1.753050356
[16] 1.745883676 1.739606726 1.734063607 1.729132812 1.724718243
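These critical values offer a second way to read a t-test result: compare the observed t value with the critical value instead of looking at the p-value. As a sketch, for the two-sample test of a and b (df = 18), a two-tailed test at the 5 percent level puts 2.5 percent in each tail, so the relevant quantile is 0.975. The 5 percent level and the names t_obs and t_crit are assumptions of this example.

```r
t_obs  <- -2.543782              # t value reported by t.test(a, b, var.equal = TRUE)
t_crit <- qt(0.975, df = 18)     # two-tailed critical value at the 5 percent level

abs(t_obs) > t_crit              # TRUE means H0 is rejected

# pt() is the inverse of qt(): it maps a t value back to a probability
pt(qt(0.95, df = 10), df = 10)
```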
In this tutorial, we've briefly learned the t-test with R. Thank you for reading!