## Have a language expert improve your writing

Run a free plagiarism check in 10 minutes, generate accurate citations for free.

• Knowledge Base

## An Introduction to t Tests | Definitions, Formula and Examples

Published on January 31, 2020 by Rebecca Bevans . Revised on June 22, 2023.

A t test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups are different from one another.

• The null hypothesis ( H 0 ) is that the true difference between these group means is zero.
• The alternate hypothesis ( H a ) is that the true difference is different from zero.

When to use a t test, what type of t test should i use, performing a t test, interpreting test results, presenting the results of a t test, other interesting articles, frequently asked questions about t tests.

A t test can only be used when comparing the means of two groups (a.k.a. pairwise comparison). If you want to compare more than two groups, or if you want to do multiple pairwise comparisons, use an   ANOVA test  or a post-hoc test.

The t test is a parametric test of difference, meaning that it makes the same assumptions about your data as other parametric tests. The t test assumes your data:

• are independent
• are (approximately) normally distributed
• have a similar amount of variance within each group being compared (a.k.a. homogeneity of variance)

If your data do not fit these assumptions, you can try a nonparametric alternative to the t test, such as the Wilcoxon Signed-Rank test for data with unequal variances .

## Here's why students love Scribbr's proofreading services

When choosing a t test, you will need to consider two things: whether the groups being compared come from a single population or two different populations, and whether you want to test the difference in a specific direction.

## One-sample, two-sample, or paired t test?

• If the groups come from a single population (e.g., measuring before and after an experimental treatment), perform a paired t test . This is a within-subjects design .
• If the groups come from two different populations (e.g., two different species, or people from two separate cities), perform a two-sample t test (a.k.a. independent t test ). This is a between-subjects design .
• If there is one group being compared against a standard value (e.g., comparing the acidity of a liquid to a neutral pH of 7), perform a one-sample t test .

## One-tailed or two-tailed t test?

• If you only care whether the two populations are different from one another, perform a two-tailed t test .
• If you want to know whether one population mean is greater than or less than the other, perform a one-tailed t test.
• Your observations come from two separate populations (separate species), so you perform a two-sample t test.
• You don’t care about the direction of the difference, only whether there is a difference, so you choose to use a two-tailed t test.

The t test estimates the true difference between two group means using the ratio of the difference in group means over the pooled standard error of both groups. You can calculate it manually using a formula, or use statistical analysis software.

## T test formula

The formula for the two-sample t test (a.k.a. the Student’s t-test) is shown below.

In this formula, t is the t value, x 1 and x 2 are the means of the two groups being compared, s 2 is the pooled standard error of the two groups, and n 1 and n 2 are the number of observations in each of the groups.

A larger t value shows that the difference between group means is greater than the pooled standard error, indicating a more significant difference between the groups.

You can compare your calculated t value against the values in a critical value chart (e.g., Student’s t table) to determine whether your t value is greater than what would be expected by chance. If so, you can reject the null hypothesis and conclude that the two groups are in fact different.

## T test function in statistical software

Most statistical software (R, SPSS, etc.) includes a t test function. This built-in function will take your raw data and calculate the t value. It will then compare it to the critical value, and calculate a p -value . This way you can quickly see whether your groups are statistically different.

In your comparison of flower petal lengths, you decide to perform your t test using R. The code looks like this:

Sample data set

If you perform the t test for your flower hypothesis in R, you will receive the following output:

The output provides:

• An explanation of what is being compared, called data in the output table.
• The t value : -33.719. Note that it’s negative; this is fine! In most cases, we only care about the absolute value of the difference, or the distance from 0. It doesn’t matter which direction.
• The degrees of freedom : 30.196. Degrees of freedom is related to your sample size, and shows how many ‘free’ data points are available in your test for making comparisons. The greater the degrees of freedom, the better your statistical test will work.
• The p value : 2.2e-16 (i.e. 2.2 with 15 zeros in front). This describes the probability that you would see a t value as large as this one by chance.
• A statement of the alternative hypothesis ( H a ). In this test, the H a is that the difference is not 0.
• The 95% confidence interval . This is the range of numbers within which the true difference in means will be 95% of the time. This can be changed from 95% if you want a larger or smaller interval, but 95% is very commonly used.
• The mean petal length for each group.

When reporting your t test results, the most important values to include are the t value , the p value , and the degrees of freedom for the test. These will communicate to your audience whether the difference between the two groups is statistically significant (a.k.a. that it is unlikely to have happened by chance).

You can also include the summary statistics for the groups being compared, namely the mean and standard deviation . In R, the code for calculating the mean and the standard deviation from the data looks like this:

flower.data %>% group_by(Species) %>% summarize(mean_length = mean(Petal.Length), sd_length = sd(Petal.Length))

In our example, you would report the results like this:

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

• Chi square test of independence
• Statistical power
• Descriptive statistics
• Degrees of freedom
• Pearson correlation
• Null hypothesis

Methodology

• Double-blind study
• Case-control study
• Research ethics
• Data collection
• Hypothesis testing
• Structured interviews

Research bias

• Hawthorne effect
• Unconscious bias
• Recall bias
• Halo effect
• Self-serving bias
• Information bias

A t-test is a statistical test that compares the means of two samples . It is used in hypothesis testing , with a null hypothesis that the difference in group means is zero and an alternate hypothesis that the difference in group means is different from zero.

A t-test measures the difference in group means divided by the pooled standard error of the two group means.

In this way, it calculates a number (the t-value) illustrating the magnitude of the difference between the two group means being compared, and estimates the likelihood that this difference exists purely by chance (p-value).

Your choice of t-test depends on whether you are studying one group or two groups, and whether you care about the direction of the difference in group means.

If you are studying one group, use a paired t-test to compare the group mean over time or after an intervention, or use a one-sample t-test to compare the group mean to a standard value. If you are studying two groups, use a two-sample t-test .

If you want to know only whether a difference exists, use a two-tailed test . If you want to know if one group mean is greater or less than the other, use a left-tailed or right-tailed one-tailed test .

A one-sample t-test is used to compare a single population to a standard value (for example, to determine whether the average lifespan of a specific town is different from the country average).

A paired t-test is used to compare a single population before and after some experimental intervention or at two different points in time (for example, measuring student performance on a test before and after being taught the material).

A t-test should not be used to measure differences among more than two groups, because the error structure for a t-test will underestimate the actual error when many groups are being compared.

If you want to compare the means of several groups at once, it’s best to use another statistical test such as ANOVA or a post-hoc test.

## Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

Bevans, R. (2023, June 22). An Introduction to t Tests | Definitions, Formula and Examples. Scribbr. Retrieved July 22, 2024, from https://www.scribbr.com/statistics/t-test/

## Rebecca Bevans

Other students also liked, choosing the right statistical test | types & examples, hypothesis testing | a step-by-step guide with easy examples, test statistics | definition, interpretation, and examples, what is your plagiarism score.

• Quality Improvement
• Talk To Minitab

## Understanding t-Tests: t-values and t-distributions

Topics: Hypothesis Testing , Data Analysis

T-tests are handy hypothesis tests in statistics when you want to compare means. You can compare a sample mean to a hypothesized or target value using a one-sample t-test. You can compare the means of two groups with a two-sample t-test. If you have two groups with paired observations (e.g., before and after measurements), use the paired t-test.

How do t-tests work? How do t-values fit in? In this series of posts, I’ll answer these questions by focusing on concepts and graphs rather than equations and numbers. After all, a key reason to use statistical software like Minitab  is so you don’t get bogged down in the calculations and can instead focus on understanding your results.

In this post, I will explain t-values, t-distributions, and how t-tests use them to calculate probabilities and assess hypotheses.

## What Are t-Values?

T-tests are called t-tests because the test results are all based on t-values. T-values are an example of what statisticians call test statistics. A test statistic is a standardized value that is calculated from sample data during a hypothesis test. The procedure that calculates the test statistic compares your data to what is expected under the null hypothesis .

Each type of t-test uses a specific procedure to boil all of your sample data down to one value, the t-value. The calculations behind t-values compare your sample mean(s) to the null hypothesis and incorporates both the sample size and the variability in the data. A t-value of 0 indicates that the sample results exactly equal the null hypothesis. As the difference between the sample data and the null hypothesis increases, the absolute value of the t-value increases.

Assume that we perform a t-test and it calculates a t-value of 2 for our sample data. What does that even mean? I might as well have told you that our data equal 2 fizbins! We don’t know if that’s common or rare when the null hypothesis is true.

By itself, a t-value of 2 doesn’t really tell us anything. T-values are not in the units of the original data, or anything else we’d be familiar with. We need a larger context in which we can place individual t-values before we can interpret them. This is where t-distributions come in.

## What Are t-Distributions?

When you perform a t-test for a single study, you obtain a single t-value. However, if we drew multiple random samples of the same size from the same population and performed the same t-test, we would obtain many t-values and we could plot a distribution of all of them. This type of distribution is known as a sampling distribution .

Fortunately, the properties of t-distributions are well understood in statistics, so we can plot them without having to collect many samples! A specific t-distribution is defined by its degrees of freedom (DF) , a value closely related to sample size. Therefore, different t-distributions exist for every sample size.  You can graph t-distributions u sing Minitab’s probability distribution plots .

T-distributions assume that you draw repeated random samples from a population where the null hypothesis is true. You place the t-value from your study in the t-distribution to determine how consistent your results are with the null hypothesis.

The graph above shows a t-distribution that has 20 degrees of freedom, which corresponds to a sample size of 21 in a one-sample t-test. It is a symmetric, bell-shaped distribution that is similar to the normal distribution, but with thicker tails. This graph plots the probability density function (PDF), which describes the likelihood of each t-value.

The peak of the graph is right at zero, which indicates that obtaining a sample value close to the null hypothesis is the most likely. That makes sense because t-distributions assume that the null hypothesis is true. T-values become less likely as you get further away from zero in either direction. In other words, when the null hypothesis is true, you are less likely to obtain a sample that is very different from the null hypothesis.

Our t-value of 2 indicates a positive difference between our sample data and the null hypothesis. The graph shows that there is a reasonable probability of obtaining a t-value from -2 to +2 when the null hypothesis is true. Our t-value of 2 is an unusual value, but we don’t know exactly how unusual. Our ultimate goal is to determine whether our t-value is unusual enough to warrant rejecting the null hypothesis. To do that, we'll need to calculate the probability.

## Using t-Values and t-Distributions to Calculate Probabilities

The foundation behind any hypothesis test is being able to take the test statistic from a specific sample and place it within the context of a known probability distribution. For t-tests, if you take a t-value and place it in the context of the correct t-distribution, you can calculate the probabilities associated with that t-value.

A probability allows us to determine how common or rare our t-value is under the assumption that the null hypothesis is true. If the probability is low enough, we can conclude that the effect observed in our sample is inconsistent with the null hypothesis. The evidence in the sample data is strong enough to reject the null hypothesis for the entire population.

Before we calculate the probability associated with our t-value of 2, there are two important details to address.

First, we’ll actually use the t-values of +2 and -2 because we’ll perform a two-tailed test. A two-tailed test is one that can test for differences in both directions. For example, a two-tailed 2-sample t-test can determine whether the difference between group 1 and group 2 is statistically significant in either the positive or negative direction. A one-tailed test can only assess one of those directions.

Second, we can only calculate a non-zero probability for a range of t-values. As you’ll see in the graph below, a range of t-values corresponds to a proportion of the total area under the distribution curve, which is the probability. The probability for any specific point value is zero because it does not produce an area under the curve.

With these points in mind, we’ll shade the area of the curve that has t-values greater than 2 and t-values less than -2.

The graph displays the probability for observing a difference from the null hypothesis that is at least as extreme as the difference present in our sample data while assuming that the null hypothesis is actually true. Each of the shaded regions has a probability of 0.02963, which sums to a total probability of 0.05926. When the null hypothesis is true, the t-value falls within these regions nearly 6% of the time.

This probability has a name that you might have heard of—it’s called the p-value!  While the probability of our t-value falling within these regions is fairly low, it’s not low enough to reject the null hypothesis using the common significance level of 0.05.

Learn how to correctly interpret the p-value.

## t-Distributions and Sample Size

As mentioned above, t-distributions are defined by the DF, which are closely associated with sample size. As the DF increases, the probability density in the tails decreases and the distribution becomes more tightly clustered around the central value. The graph below depicts t-distributions with 5 and 30 degrees of freedom.

The t-distribution with fewer degrees of freedom has thicker tails. This occurs because the t-distribution is designed to reflect the added uncertainty associated with analyzing small samples. In other words, if you have a small sample, the probability that the sample statistic will be further away from the null hypothesis is greater even when the null hypothesis is true.

Small samples are more likely to be unusual. This affects the probability associated with any given t-value. For 5 and 30 degrees of freedom, a t-value of 2 in a two-tailed test has p-values of 10.2% and 5.4%, respectively. Large samples are better!

I’ve showed how t-values and t-distributions work together to produce probabilities. To see how each type of t-test works and actually calculates the t-values, read the other post in this series, Understanding t-Tests: 1-sample, 2-sample, and Paired t-Tests .

If you'd like to learn how the ANOVA F-test works, read my post, Understanding Analysis of Variance (ANOVA) and the F-test .

• Trust Center

## t-test Calculator

Welcome to our t-test calculator! Here you can not only easily perform one-sample t-tests , but also two-sample t-tests , as well as paired t-tests .

Do you prefer to find the p-value from t-test, or would you rather find the t-test critical values? Well, this t-test calculator can do both! 😊

What does a t-test tell you? Take a look at the text below, where we explain what actually gets tested when various types of t-tests are performed. Also, we explain when to use t-tests (in particular, whether to use the z-test vs. t-test) and what assumptions your data should satisfy for the results of a t-test to be valid. If you've ever wanted to know how to do a t-test by hand, we provide the necessary t-test formula, as well as tell you how to determine the number of degrees of freedom in a t-test.

## When to use a t-test?

A t-test is one of the most popular statistical tests for location , i.e., it deals with the population(s) mean value(s).

There are different types of t-tests that you can perform:

• A one-sample t-test;
• A two-sample t-test; and
• A paired t-test.

In the next section , we explain when to use which. Remember that a t-test can only be used for one or two groups . If you need to compare three (or more) means, use the analysis of variance ( ANOVA ) method.

The t-test is a parametric test, meaning that your data has to fulfill some assumptions :

• The data points are independent; AND
• The data, at least approximately, follow a normal distribution .

If your sample doesn't fit these assumptions, you can resort to nonparametric alternatives. Visit our Mann–Whitney U test calculator or the Wilcoxon rank-sum test calculator to learn more. Other possibilities include the Wilcoxon signed-rank test or the sign test.

## Which t-test?

Your choice of t-test depends on whether you are studying one group or two groups:

One sample t-test

Choose the one-sample t-test to check if the mean of a population is equal to some pre-set hypothesized value .

The average volume of a drink sold in 0.33 l cans — is it really equal to 330 ml?

The average weight of people from a specific city — is it different from the national average?

## Two-sample t-test

Choose the two-sample t-test to check if the difference between the means of two populations is equal to some pre-determined value when the two samples have been chosen independently of each other.

In particular, you can use this test to check whether the two groups are different from one another .

The average difference in weight gain in two groups of people: one group was on a high-carb diet and the other on a high-fat diet.

The average difference in the results of a math test from students at two different universities.

This test is sometimes referred to as an independent samples t-test , or an unpaired samples t-test .

## Paired t-test

A paired t-test is used to investigate the change in the mean of a population before and after some experimental intervention , based on a paired sample, i.e., when each subject has been measured twice: before and after treatment.

In particular, you can use this test to check whether, on average, the treatment has had any effect on the population .

The change in student test performance before and after taking a course.

The change in blood pressure in patients before and after administering some drug.

## How to do a t-test?

So, you've decided which t-test to perform. These next steps will tell you how to calculate the p-value from t-test or its critical values, and then which decision to make about the null hypothesis.

Decide on the alternative hypothesis :

Use a two-tailed t-test if you only care whether the population's mean (or, in the case of two populations, the difference between the populations' means) agrees or disagrees with the pre-set value.

Use a one-tailed t-test if you want to test whether this mean (or difference in means) is greater/less than the pre-set value.

Formulas for the test statistic in t-tests include the sample size , as well as its mean and standard deviation . The exact formula depends on the t-test type — check the sections dedicated to each particular test for more details.

Determine the degrees of freedom for the t-test:

The degrees of freedom are the number of observations in a sample that are free to vary as we estimate statistical parameters. In the simplest case, the number of degrees of freedom equals your sample size minus the number of parameters you need to estimate . Again, the exact formula depends on the t-test you want to perform — check the sections below for details.

The degrees of freedom are essential, as they determine the distribution followed by your T-score (under the null hypothesis). If there are d degrees of freedom, then the distribution of the test statistics is the t-Student distribution with d degrees of freedom . This distribution has a shape similar to N(0,1) (bell-shaped and symmetric) but has heavier tails . If the number of degrees of freedom is large (>30), which generically happens for large samples, the t-Student distribution is practically indistinguishable from N(0,1).

💡 The t-Student distribution owes its name to William Sealy Gosset, who, in 1908, published his paper on the t-test under the pseudonym "Student". Gosset worked at the famous Guinness Brewery in Dublin, Ireland, and devised the t-test as an economical way to monitor the quality of beer. Cheers! 🍺🍺🍺

## p-value from t-test

Recall that the p-value is the probability (calculated under the assumption that the null hypothesis is true) that the test statistic will produce values at least as extreme as the T-score produced for your sample . As probabilities correspond to areas under the density function, p-value from t-test can be nicely illustrated with the help of the following pictures:

The following formulae say how to calculate p-value from t-test. By cdf t,d we denote the cumulative distribution function of the t-Student distribution with d degrees of freedom:

p-value from left-tailed t-test:

p-value = cdf t,d (t score )

p-value from right-tailed t-test:

p-value = 1 − cdf t,d (t score )

p-value from two-tailed t-test:

p-value = 2 × cdf t,d (−|t score |)

or, equivalently: p-value = 2 − 2 × cdf t,d (|t score |)

However, the cdf of the t-distribution is given by a somewhat complicated formula. To find the p-value by hand, you would need to resort to statistical tables, where approximate cdf values are collected, or to specialized statistical software. Fortunately, our t-test calculator determines the p-value from t-test for you in the blink of an eye!

## t-test critical values

Recall, that in the critical values approach to hypothesis testing, you need to set a significance level, α, before computing the critical values , which in turn give rise to critical regions (a.k.a. rejection regions).

Formulas for critical values employ the quantile function of t-distribution, i.e., the inverse of the cdf :

Critical value for left-tailed t-test: cdf t,d -1 (α)

critical region:

(-∞, cdf t,d -1 (α)]

Critical value for right-tailed t-test: cdf t,d -1 (1-α)

[cdf t,d -1 (1-α), ∞)

Critical values for two-tailed t-test: ±cdf t,d -1 (1-α/2)

(-∞, -cdf t,d -1 (1-α/2)] ∪ [cdf t,d -1 (1-α/2), ∞)

To decide the fate of the null hypothesis, just check if your T-score lies within the critical region:

If your T-score belongs to the critical region , reject the null hypothesis and accept the alternative hypothesis.

If your T-score is outside the critical region , then you don't have enough evidence to reject the null hypothesis.

## How to use our t-test calculator

Choose the type of t-test you wish to perform:

A one-sample t-test (to test the mean of a single group against a hypothesized mean);

A two-sample t-test (to compare the means for two groups); or

A paired t-test (to check how the mean from the same group changes after some intervention).

Two-tailed;

Left-tailed; or

Right-tailed.

This t-test calculator allows you to use either the p-value approach or the critical regions approach to hypothesis testing!

Enter your T-score and the number of degrees of freedom . If you don't know them, provide some data about your sample(s): sample size, mean, and standard deviation, and our t-test calculator will compute the T-score and degrees of freedom for you .

Once all the parameters are present, the p-value, or critical region, will immediately appear underneath the t-test calculator, along with an interpretation!

## One-sample t-test

The null hypothesis is that the population mean is equal to some value μ 0 \mu_0 μ 0 ​ .

The alternative hypothesis is that the population mean is:

• different from μ 0 \mu_0 μ 0 ​ ;
• smaller than μ 0 \mu_0 μ 0 ​ ; or
• greater than μ 0 \mu_0 μ 0 ​ .

One-sample t-test formula :

• μ 0 \mu_0 μ 0 ​ — Mean postulated in the null hypothesis;
• n n n — Sample size;
• x ˉ \bar{x} x ˉ — Sample mean; and
• s s s — Sample standard deviation.

Number of degrees of freedom in t-test (one-sample) = n − 1 n-1 n − 1 .

The null hypothesis is that the actual difference between these groups' means, μ 1 \mu_1 μ 1 ​ , and μ 2 \mu_2 μ 2 ​ , is equal to some pre-set value, Δ \Delta Δ .

The alternative hypothesis is that the difference μ 1 − μ 2 \mu_1 - \mu_2 μ 1 ​ − μ 2 ​ is:

• Different from Δ \Delta Δ ;
• Smaller than Δ \Delta Δ ; or
• Greater than Δ \Delta Δ .

In particular, if this pre-determined difference is zero ( Δ = 0 \Delta = 0 Δ = 0 ):

The null hypothesis is that the population means are equal.

The alternate hypothesis is that the population means are:

• μ 1 \mu_1 μ 1 ​ and μ 2 \mu_2 μ 2 ​ are different from one another;
• μ 1 \mu_1 μ 1 ​ is smaller than μ 2 \mu_2 μ 2 ​ ; and
• μ 1 \mu_1 μ 1 ​ is greater than μ 2 \mu_2 μ 2 ​ .

Formally, to perform a t-test, we should additionally assume that the variances of the two populations are equal (this assumption is called the homogeneity of variance ).

There is a version of a t-test that can be applied without the assumption of homogeneity of variance: it is called a Welch's t-test . For your convenience, we describe both versions.

## Two-sample t-test if variances are equal

Use this test if you know that the two populations' variances are the same (or very similar).

Two-sample t-test formula (with equal variances) :

where s p s_p s p ​ is the so-called pooled standard deviation , which we compute as:

• Δ \Delta Δ — Mean difference postulated in the null hypothesis;
• n 1 n_1 n 1 ​ — First sample size;
• x ˉ 1 \bar{x}_1 x ˉ 1 ​ — Mean for the first sample;
• s 1 s_1 s 1 ​ — Standard deviation in the first sample;
• n 2 n_2 n 2 ​ — Second sample size;
• x ˉ 2 \bar{x}_2 x ˉ 2 ​ — Mean for the second sample; and
• s 2 s_2 s 2 ​ — Standard deviation in the second sample.

Number of degrees of freedom in t-test (two samples, equal variances) = n 1 + n 2 − 2 n_1 + n_2 - 2 n 1 ​ + n 2 ​ − 2 .

## Two-sample t-test if variances are unequal (Welch's t-test)

Use this test if the variances of your populations are different.

Two-sample Welch's t-test formula if variances are unequal:

• s 1 s_1 s 1 ​ — Standard deviation in the first sample;
• s 2 s_2 s 2 ​ — Standard deviation in the second sample.

The number of degrees of freedom in a Welch's t-test (two-sample t-test with unequal variances) is very difficult to count. We can approximate it with the help of the following Satterthwaite formula :

Alternatively, you can take the smaller of n 1 − 1 n_1 - 1 n 1 ​ − 1 and n 2 − 1 n_2 - 1 n 2 ​ − 1 as a conservative estimate for the number of degrees of freedom.

🔎 The Satterthwaite formula for the degrees of freedom can be rewritten as a scaled weighted harmonic mean of the degrees of freedom of the respective samples: n 1 − 1 n_1 - 1 n 1 ​ − 1 and n 2 − 1 n_2 - 1 n 2 ​ − 1 , and the weights are proportional to the standard deviations of the corresponding samples.

As we commonly perform a paired t-test when we have data about the same subjects measured twice (before and after some treatment), let us adopt the convention of referring to the samples as the pre-group and post-group.

The null hypothesis is that the true difference between the means of pre- and post-populations is equal to some pre-set value, Δ \Delta Δ .

The alternative hypothesis is that the actual difference between these means is:

Typically, this pre-determined difference is zero. We can then reformulate the hypotheses as follows:

The null hypothesis is that the pre- and post-means are the same, i.e., the treatment has no impact on the population .

The alternative hypothesis:

• The pre- and post-means are different from one another (treatment has some effect);
• The pre-mean is smaller than the post-mean (treatment increases the result); or
• The pre-mean is greater than the post-mean (treatment decreases the result).

Paired t-test formula

In fact, a paired t-test is technically the same as a one-sample t-test! Let us see why it is so. Let x 1 , . . . , x n x_1, ... , x_n x 1 ​ , ... , x n ​ be the pre observations and y 1 , . . . , y n y_1, ... , y_n y 1 ​ , ... , y n ​ the respective post observations. That is, x i , y i x_i, y_i x i ​ , y i ​ are the before and after measurements of the i -th subject.

For each subject, compute the difference, d i : = x i − y i d_i := x_i - y_i d i ​ := x i ​ − y i ​ . All that happens next is just a one-sample t-test performed on the sample of differences d 1 , . . . , d n d_1, ... , d_n d 1 ​ , ... , d n ​ . Take a look at the formula for the T-score :

Δ \Delta Δ — Mean difference postulated in the null hypothesis;

n n n — Size of the sample of differences, i.e., the number of pairs;

x ˉ \bar{x} x ˉ — Mean of the sample of differences; and

s s s  — Standard deviation of the sample of differences.

Number of degrees of freedom in t-test (paired): n − 1 n - 1 n − 1

## t-test vs Z-test

We use a Z-test when we want to test the population mean of a normally distributed dataset, which has a known population variance . If the number of degrees of freedom is large, then the t-Student distribution is very close to N(0,1).

Hence, if there are many data points (at least 30), you may swap a t-test for a Z-test, and the results will be almost identical. However, for small samples with unknown variance, remember to use the t-test because, in such cases, the t-Student distribution differs significantly from the N(0,1)!

🙋 Have you concluded you need to perform the z-test? Head straight to our z-test calculator !

## What is a t-test?

A t-test is a widely used statistical test that analyzes the means of one or two groups of data. For instance, a t-test is performed on medical data to determine whether a new drug really helps.

## What are different types of t-tests?

Different types of t-tests are:

• One-sample t-test;
• Two-sample t-test; and
• Paired t-test.

## How to find the t value in a one sample t-test?

To find the t-value:

• Subtract the null hypothesis mean from the sample mean value.
• Divide the difference by the standard deviation of the sample.
• Multiply the resultant with the square root of the sample size.

Choose test type

t-test for the population mean, μ, based on one independent sample . Null hypothesis H 0 : μ = μ 0

Alternative hypothesis H 1

## Test details

Significance level α

The probability that we reject a true H 0 (type I error).

Degrees of freedom

Calculated as sample size minus one.

## T Test (Student’s T-Test): Definition and Examples

T Test: Contents :

• What is a T Test?
• The T Score
• T Values and P Values
• Calculating the T Test
• What is a Paired T Test (Paired Samples T Test)?

## What is a T test?

The t test tells you how significant the differences between group means are. It lets you know if those differences in means could have happened by chance. The t test is usually used when data sets follow a normal distribution but you don’t know the population variance .

For example, you might flip a coin 1,000 times and find the number of heads follows a normal distribution for all trials. So you can calculate the sample variance from this data, but the population variance is unknown. Or, a drug company may want to test a new cancer drug to find out if it improves life expectancy. In an experiment, there’s always a control group (a group who are given a placebo, or “sugar pill”). So while the control group may show an average life expectancy of +5 years, the group taking the new drug might have a life expectancy of +6 years. It would seem that the drug might work. But it could be due to a fluke. To test this, researchers would use a Student’s t-test to find out if the results are repeatable for an entire population.

In addition, a t test uses a t-statistic and compares this to t-distribution values to determine if the results are statistically significant .

However, note that you can only uses a t test to compare two means. If you want to compare three or more means, use an ANOVA instead.

## The T Score.

The t score is a ratio between the difference between two groups and the difference within the groups .

• Larger t scores = more difference between groups.
• Smaller t score = more similarity between groups.

A t score of 3 tells you that the groups are three times as different from each other as they are within each other. So when you run a t test, bigger t-values equal a greater probability that the results are repeatable.

## T-Values and P-values

How big is “big enough”? Every t-value has a p-value to go with it. A p-value from a t test is the probability that the results from your sample data occurred by chance. P-values are from 0% to 100% and are usually written as a decimal (for example, a p value of 5% is 0.05). Low p-values indicate your data did not occur by chance . For example, a p-value of .01 means there is only a 1% probability that the results from an experiment happened by chance.

## Calculating the Statistic / Test Types

There are three main types of t-test:

• An Independent Samples t-test compares the means for two groups.
• A Paired sample t-test compares means from the same group at different times (say, one year apart).
• A One sample t-test tests the mean of a single group against a known mean.

You can find the steps for an independent samples t test here . But you probably don’t want to calculate the test by hand (the math can get very messy. Use the following tools to calculate the t test:

• How to do a T test in Excel.
• T test in SPSS.
• T-distribution on the TI 89.
• T distribution on the TI 83.

## What is a Paired T Test (Paired Samples T Test / Dependent Samples T Test)?

A paired t test (also called a correlated pairs t-test , a paired samples t test or dependent samples t test ) is where you run a t test on dependent samples. Dependent samples are essentially connected — they are tests on the same person or thing. For example:

• Knee MRI costs at two different hospitals,
• Two tests on the same person before and after training,
• Two blood pressure measurements on the same person using different equipment.

## When to Choose a Paired T Test / Paired Samples T Test / Dependent Samples T Test

Choose the paired t-test if you have two measurements on the same item, person or thing. But you should also choose this test if you have two items that are being measured with a unique condition. For example, you might be measuring car safety performance in vehicle research and testing and subject the cars to a series of crash tests. Although the manufacturers are different, you might be subjecting them to the same conditions.

With a “regular” two sample t test , you’re comparing the means for two different samples . For example, you might test two different groups of customer service associates on a business-related test or testing students from two universities on their English skills. But if you take a random sample each group separately and they have different conditions, your samples are independent and you should run an independent samples t test (also called between-samples and unpaired-samples).

The null hypothesis for the independent samples t-test is μ 1 = μ 2 . So it assumes the means are equal. With the paired t test, the null hypothesis is that the pairwise difference between the two tests is equal (H 0 : µ d = 0).

## Paired Samples T Test By hand

• The “ΣD” is the sum of X-Y from Step 2.
• ΣD 2 : Sum of the squared differences (from Step 4).
• (ΣD) 2 : Sum of the differences (from Step 2), squared.

If you’re unfamiliar with the Σ notation used in the t test, it basically means to “add everything up”. You may find this article useful: summation notation .

Step 6: Subtract 1 from the sample size to get the degrees of freedom. We have 11 items. So 11 – 1 = 10.

Step 7: Find the p-value in the t-table , using the degrees of freedom in Step 6. But if you don’t have a specified alpha level , use 0.05 (5%).

So for this example t test problem, with df = 10, the t-value is 2.228.

Step 8: In conclusion, compare your t-table value from Step 7 (2.228) to your calculated t-value (-2.74). The calculated t-value is greater than the table value at an alpha level of .05. In addition, note that the p-value is less than the alpha level: p <.05. So we can reject the null hypothesis that there is no difference between means.

However, note that you can ignore the minus sign when comparing the two t-values as ± indicates the direction; the p-value remains the same for both directions.

In addition, check out our YouTube channel for more stats help and tips!

Goulden, C. H. Methods of Statistical Analysis, 2nd ed. New York: Wiley, pp. 50-55, 1956.

Teach yourself statistics

## Student's t Distribution

The t distribution (aka, Student’s t-distribution ) is a probability distribution that is used to estimate population parameters when the sample size is small and/or when the population variance is unknown.

## Why Use the t Distribution?

According to the central limit theorem , the sampling distribution of a statistic (like a sample mean) will follow a normal distribution , as long as the sample size is sufficiently large. Therefore, when we know the standard deviation of the population, we can compute a z-score , and use the normal distribution to evaluate probabilities with the sample mean.

But sample sizes are sometimes small, and often we do not know the standard deviation of the population. When either of these problems occur, statisticians rely on the distribution of the t statistic (also known as the t score ), whose values are given by:

t = [ x - μ ] / [ s / sqrt( n ) ]

where x is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size. The distribution of the t statistic is called the t distribution or the Student t distribution .

The t distribution allows us to conduct statistical analyses on certain data sets that are not appropriate for analysis, using the normal distribution.

## Degrees of Freedom

There are actually many different t distributions. The particular form of the t distribution is determined by its degrees of freedom . The degrees of freedom refers to the number of independent observations in a set of data.

When estimating a mean score or a proportion from a single sample, the number of independent observations is equal to the sample size minus one. Hence, the distribution of the t statistic from samples of size 8 would be described by a t distribution having 8 - 1 or 7 degrees of freedom. Similarly, the distribution of the t statistic from samples of size 16 would be described by a t distribution having 16 - 1 or 15 degrees of freedom.

For other applications, the degrees of freedom may be calculated differently. We will describe those computations as they come up.

## Properties of the t Distribution

The t distribution has the following properties:

• The mean of the distribution is equal to 0 .
• The variance is equal to v / ( v - 2 ), where v is the degrees of freedom (see last section) and v > 2.
• The variance is always greater than 1, although it is close to 1 when there are many degrees of freedom.
• With infinite degrees of freedom, the t distribution is the same as the standard normal distribution .

## When to Use the t Distribution

The t distribution can be used with any statistic having a bell-shaped distribution (i.e., approximately normal). It is reasonable to assume that the sampling distribution of a statistic will be bell-shaped when any of the following conditions apply.

• The population distribution is normal.
• The population distribution is symmetric , unimodal , without outliers , and the sample size is at least 30.
• The population distribution is moderately skewed , unimodal, without outliers, and the sample size is at least 40.
• The sample size is greater than 50, without outliers.

The t distribution should not be used with small samples from populations that are not approximately normal.

## Probability and the Student t Distribution

When a sample of size n is drawn from a population having a normal (or nearly normal) distribution, the sample mean can be transformed into a t statistic, using the equation presented at the beginning of this lesson. We repeat that equation below:

where x is the sample mean, μ is the population mean, s is the standard deviation of the sample, n is the sample size, and degrees of freedom are equal to n - 1.

The t statistic produced by this transformation can be associated with a unique cumulative probability . This cumulative probability represents the likelihood of finding a sample mean less than or equal to x , given a random sample of size n .

To find the probability associated with a t statistic, use a t distribution table (found in the appendix of most introductory statistics texts), a graphing calculator, an online t distribution calculator, like Stat Trek's T Distribution Calculator .

## T Distribution Calculator

The T Distribution Calculator solves common statistics problems, based on the t distribution. The calculator computes cumulative probabilities, based on simple inputs. Clear instructions guide you to an accurate solution, quickly and easily. If anything is unclear, frequently-asked questions and sample problems provide straightforward explanations. The calculator is free. It can found in the Stat Trek main menu under the Stat Tools tab. Or you can tap the button below.

## Notation and t Statistics

Statisticians use t α to represent the t statistic that has a cumulative probability of (1 - α). For example, suppose we were interested in the t statistic having a cumulative probability of 0.95. In this example, α would be equal to (1 - 0.95) or 0.05. We would refer to the t statistic as t 0.05

Of course, the value of t 0.05 depends on the number of degrees of freedom. For example, with 2 degrees of freedom, t 0.05 is equal to 2.92; but with 20 degrees of freedom, t 0.05 is equal to 1.725.

Note: Because the t distribution is symmetric about a mean of zero, the following is true.

t α = - t 1 - alpha       And       t 1 - alpha = - t α

Thus, if t 0.05 = 2.92, then t 0.95 = -2.92.

Acme Corporation manufactures light bulbs. The CEO claims that an average Acme light bulb lasts 300 days. A researcher randomly selects 15 bulbs for testing. The sampled bulbs last an average of 290 days, with a standard deviation of 50 days. If the CEO's claim were true, what is the probability that 15 randomly selected bulbs would have an average life of no more than 290 days?

Note: There are two ways to solve this problem, using the T Distribution Calculator . Both approaches are presented below. Solution A is the traditional approach. It requires you to compute the t statistic, based on data presented in the problem description. Then, you use the T Distribution Calculator to find the probability. Solution B is easier. You simply enter the problem data into the T Distribution Calculator. The calculator computes a t statistic "behind the scenes", and displays the probability. Both approaches come up with exactly the same answer.

The first thing we need to do is compute the t statistic, based on the following equation:

t = [ x - μ ] / [ s / sqrt( n ) ] t = ( 290 - 300 ) / [ 50 / sqrt( 15) ] t = -10 / 12.909945 = - 0.7745966

where x is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size.

Now, we are ready to use the T Distribution Calculator . Since we know the t statistic, we select "t score" from the Random Variable dropdown box. Then, we enter the following data:

• The degrees of freedom are equal to 15 - 1 = 14.
• The t statistic is equal to - 0.7745966.

The calculator displays the cumulative probability: 0.22573.

Hence, if the true bulb life were 300 days, there is about a 22.6% chance that the average bulb life for 15 randomly selected bulbs would be less than or equal to 290 days.

Solution B:

This time, we will not compute the t statistic manually; the T Distribution Calculator will do that work for us. We select "mean score" from the Random Variable dropdown box. Then, we enter the following data:

• Assuming the CEO's claim is true, the population mean equals 300.
• The sample mean equals 290.
• The standard deviation of the sample is 50.

Hence, there is a 22.6% chance that the average sampled light bulb will burn out within 290 days.

Suppose scores on an IQ test are normally distributed, with a population mean of 100. Suppose 20 people are randomly selected and tested. The standard deviation in the sample group is 15. What is the probability that the average test score in the sample group will be at most 110?

To solve this problem, we will not compute the t statistic; the T Distribution Calculator will do that work for us. We select "mean score" from the Random Variable dropdown box. Then, we enter the following data:

• The degrees of freedom are equal to 20 - 1 = 19.
• The population mean equals 100.
• The sample mean equals 110.
• The standard deviation of the sample is 15.

We enter these values into the T Distribution Calculator .

The calculator displays the cumulative probability: 0.99616. Hence, there is a 99.6% chance that the sample average will be no greater than 110.

• Math Formulas
• T Test Formula

## T-Test Formula

The t-test is any statistical hypothesis test in which the test statistic follows a Student’s t-distribution under the null hypothesis. It can be used to determine if two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known.

T-test uses means and standard deviations of two samples to make a comparison. The formula for T-test is given below:

\begin{array}{l}\qquad t=\frac{\bar{X}_{1}-\bar{X}_{2}}{s_{\bar{\Delta}}} \\ \text { where } \\ \qquad s_{\bar{\Delta}}=\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}} \\ \end{array}

Where, $$\begin{array}{l}\overline{x}\end{array}$$ = Mean of first set of values $$\begin{array}{l}\overline{x}_{2}\end{array}$$  = Mean of second set of values $$\begin{array}{l}S_{1}\end{array}$$   = Standard deviation of first set of values $$\begin{array}{l}S_{2}\end{array}$$   = Standard deviation of second set of values $$\begin{array}{l}n_{1}\end{array}$$   = Total number of values in first set $$\begin{array}{l}n_{2}\end{array}$$   = Total number of values in second set.

The formula for standard deviation is given by:

Where, x = Values given $$\begin{array}{l}\overline{x}\end{array}$$ = Mean n = Total number of values.

## T-Test Solved Examples

Question 1: Find the t-test value for the following two sets of values: 7, 2, 9, 8 and 1, 2, 3, 4?

Formula for standard deviation:  $$\begin{array}{l}S=\sqrt{\frac{\sum\left(x-\overline{x}\right)^{2}}{n-1}}\end{array}$$

Number of terms in first set:  $$\begin{array}{l}n_{1}\end{array}$$ = 4

Mean for first set of data: $$\begin{array}{l}\overline{x}_{1}\end{array}$$ = 6.5

Construct the following table for standard deviation:

 7 0.5 0.25 2 -4.5 20.25 9 2.5 6.25 8 1.5 2.25

Standard deviation for the first set of data: S 1 = 3.11

Number of terms in second set: n 2 = 4

 1 -1.5 2.25 2 -0.5 0.25 3 0.5 0.25 4 1.5 2.25

Standard deviation for first set of data: $$\begin{array}{l}S_{2}\end{array}$$ = 1.29

Formula for t-test value:

t = 2.3764 = 2.36 (approx)

 More topics in T Test Formula

Request OTP on Voice Call

Post My Comment

Register with byju's & watch live videos.

• Search Search Please fill out this field.

## What Is a T-Test?

Understanding the t-test, using a t-test, which t-test to use.

• T-Test FAQs
• Fundamental Analysis

## T-Test: What It Is With Multiple Formulas and When To Use Them

Read how this calculation can be used for hypothesis testing in statistics

Adam Hayes, Ph.D., CFA, is a financial writer with 15+ years Wall Street experience as a derivatives trader. Besides his extensive derivative trading expertise, Adam is an expert in economics and behavioral finance. Adam received his master's in economics from The New School for Social Research and his Ph.D. from the University of Wisconsin-Madison in sociology. He is a CFA charterholder as well as holding FINRA Series 7, 55 & 63 licenses. He currently researches and teaches economic sociology and the social studies of finance at the Hebrew University in Jerusalem.

A t-test is an inferential statistic used to determine if there is a significant difference between the means of two groups and how they are related. T-tests are used when the data sets follow a normal distribution and have unknown variances, like the data set recorded from flipping a coin 100 times.

The t-test is a test used for hypothesis testing in statistics and uses the t-statistic, the t-distribution values, and the degrees of freedom to determine statistical significance.

## Key Takeaways

• A t-test is an inferential statistic used to determine if there is a statistically significant difference between the means of two variables.
• The t-test is a test used for hypothesis testing in statistics.
• Calculating a t-test requires three fundamental data values including the difference between the mean values from each data set, the standard deviation of each group, and the number of data values.
• T-tests can be dependent or independent.

Investopedia / Sabrina Jiang

A t-test compares the average values of two data sets and determines if they came from the same population. In the above examples, a sample of students from class A and a sample of students from class B would not likely have the same mean and standard deviation. Similarly, samples taken from the placebo-fed control group and those taken from the drug prescribed group should have a slightly different mean and standard deviation.

Mathematically, the t-test takes a sample from each of the two sets and establishes the problem statement. It assumes a null hypothesis that the two means are equal.

Using the formulas, values are calculated and compared against the standard values. The assumed null hypothesis is accepted or rejected accordingly. If the null hypothesis qualifies to be rejected, it indicates that data readings are strong and are probably not due to chance.

The t-test is just one of many tests used for this purpose. Statisticians use additional tests other than the t-test to examine more variables and larger sample sizes. For a large sample size, statisticians use a  z-test . Other testing options include the chi-square test and the f-test.

Consider that a drug manufacturer tests a new medicine. Following standard procedure, the drug is given to one group of patients and a placebo to another group called the control group. The placebo is a substance with no therapeutic value and serves as a benchmark to measure how the other group, administered the actual drug, responds.

After the drug trial, the members of the placebo-fed control group reported an increase in average life expectancy of three years, while the members of the group who are prescribed the new drug reported an increase in average life expectancy of four years.

Initial observation indicates that the drug is working. However, it is also possible that the observation may be due to chance. A t-test can be used to determine if the results are correct and applicable to the entire population.

Four assumptions are made while using a t-test. The data collected must follow a continuous or ordinal scale, such as the scores for an IQ test, the data is collected from a randomly selected portion of the total population, the data will result in a normal distribution of a bell-shaped curve, and equal or homogenous variance exists when the standard variations are equal.

## T-Test Formula

Calculating a t-test requires three fundamental data values. They include the difference between the mean values from each data set, or the mean difference, the standard deviation of each group, and the number of data values of each group.

This comparison helps to determine the effect of chance on the difference, and whether the difference is outside that chance range. The t-test questions whether the difference between the groups represents a true difference in the study or merely a random difference.

The t-test produces two values as its output: t-value and degrees of freedom . The t-value, or t-score, is a ratio of the difference between the mean of the two sample sets and the variation that exists within the sample sets.

The numerator value is the difference between the mean of the two sample sets. The denominator is the variation that exists within the sample sets and is a measurement of the dispersion or variability.

This calculated t-value is then compared against a value obtained from a critical value table called the T-distribution table. Higher values of the t-score indicate that a large difference exists between the two sample sets. The smaller the t-value, the more similarity exists between the two sample sets.

A large t-score, or t-value, indicates that the groups are different while a small t-score indicates that the groups are similar.

Degrees of freedom refer to the values in a study that has the freedom to vary and are essential for assessing the importance and the validity of the null hypothesis. Computation of these values usually depends upon the number of data records available in the sample set.

## Paired Sample T-Test

The correlated t-test, or paired t-test, is a dependent type of test and is performed when the samples consist of matched pairs of similar units, or when there are cases of repeated measures. For example, there may be instances where the same patients are repeatedly tested before and after receiving a particular treatment. Each patient is being used as a control sample against themselves.

This method also applies to cases where the samples are related or have matching characteristics, like a comparative analysis involving children, parents, or siblings.

The formula for computing the t-value and degrees of freedom for a paired t-test is:

T = mean 1 − mean 2 s ( diff ) ( n ) where: mean 1  and  mean 2 = The average values of each of the sample sets s ( diff ) = The standard deviation of the differences of the paired data values n = The sample size (the number of paired differences) n − 1 = The degrees of freedom \begin{aligned}&T=\frac{\textit{mean}1 - \textit{mean}2}{\frac{s(\text{diff})}{\sqrt{(n)}}}\\&\textbf{where:}\\&\textit{mean}1\text{ and }\textit{mean}2=\text{The average values of each of the sample sets}\\&s(\text{diff})=\text{The standard deviation of the differences of the paired data values}\\&n=\text{The sample size (the number of paired differences)}\\&n-1=\text{The degrees of freedom}\end{aligned} ​ T = ( n ) ​ s ( diff ) ​ mean 1 − mean 2 ​ where: mean 1  and  mean 2 = The average values of each of the sample sets s ( diff ) = The standard deviation of the differences of the paired data values n = The sample size (the number of paired differences) n − 1 = The degrees of freedom ​

## Equal Variance or Pooled T-Test

The equal variance t-test is an independent t-test and is used when the number of samples in each group is the same, or the variance of the two data sets is similar.

The formula used for calculating t-value and degrees of freedom for equal variance t-test is:

T-value = mean 1 − mean 2 ( n 1 − 1 ) × var 1 2 + ( n 2 − 1 ) × var 2 2 n 1 + n 2 − 2 × 1 n 1 + 1 n 2 where: mean 1  and  mean 2 = Average values of each of the sample sets var 1  and  var 2 = Variance of each of the sample sets n 1  and  n 2 = Number of records in each sample set \begin{aligned}&\text{T-value}=\frac{\textit{mean}1-\textit{mean}2}{\sqrt{\frac{(n1-1)\times\textit{var}1^2+(n2-1)\times\textit{var}2^2}{n1+n2-2}\times\frac{1}{n1}+\frac{1}{n2}}}\\&\textbf{where:}\\&\textit{mean}1 \text{ and } \textit{mean}2=\text{Average values of each}\\&\text{of the sample sets}\\&\textit{var}1\text{ and }\textit{var}2=\text{Variance of each of the sample sets}\\&n1\text{ and }n2=\text{Number of records in each sample set}\end{aligned} ​ T-value = n 1 + n 2 − 2 ( n 1 − 1 ) × var 1 2 + ( n 2 − 1 ) × var 2 2 ​ × n 1 1 ​ + n 2 1 ​ ​ mean 1 − mean 2 ​ where: mean 1  and  mean 2 = Average values of each of the sample sets var 1  and  var 2 = Variance of each of the sample sets n 1  and  n 2 = Number of records in each sample set ​

Degrees of Freedom = n 1 + n 2 − 2 where: n 1  and  n 2 = Number of records in each sample set \begin{aligned} &\text{Degrees of Freedom} = n1 + n2 - 2 \\ &\textbf{where:}\\ &n1 \text{ and } n2 = \text{Number of records in each sample set} \\ \end{aligned} ​ Degrees of Freedom = n 1 + n 2 − 2 where: n 1  and  n 2 = Number of records in each sample set ​

## Unequal Variance T-Test

The unequal variance t-test is an independent t-test and is used when the number of samples in each group is different, and the variance of the two data sets is also different. This test is also called Welch's t-test.

The formula used for calculating t-value and degrees of freedom for an unequal variance t-test is:

T-value = m e a n 1 − m e a n 2 ( v a r 1 n 1 + v a r 2 n 2 ) where: m e a n 1  and  m e a n 2 = Average values of each of the sample sets v a r 1  and  v a r 2 = Variance of each of the sample sets n 1  and  n 2 = Number of records in each sample set \begin{aligned}&\text{T-value}=\frac{mean1-mean2}{\sqrt{\bigg(\frac{var1}{n1}{+\frac{var2}{n2}\bigg)}}}\\&\textbf{where:}\\&mean1 \text{ and } mean2 = \text{Average values of each} \\&\text{of the sample sets} \\&var1 \text{ and } var2 = \text{Variance of each of the sample sets} \\&n1 \text{ and } n2 = \text{Number of records in each sample set} \end{aligned} ​ T-value = ( n 1 v a r 1 ​ + n 2 v a r 2 ​ ) ​ m e an 1 − m e an 2 ​ where: m e an 1  and  m e an 2 = Average values of each of the sample sets v a r 1  and  v a r 2 = Variance of each of the sample sets n 1  and  n 2 = Number of records in each sample set ​

Degrees of Freedom = ( v a r 1 2 n 1 + v a r 2 2 n 2 ) 2 ( v a r 1 2 n 1 ) 2 n 1 − 1 + ( v a r 2 2 n 2 ) 2 n 2 − 1 where: v a r 1  and  v a r 2 = Variance of each of the sample sets n 1  and  n 2 = Number of records in each sample set \begin{aligned} &\text{Degrees of Freedom} = \frac{ \left ( \frac{ var1^2 }{ n1 } + \frac{ var2^2 }{ n2 } \right )^2 }{ \frac{ \left ( \frac{ var1^2 }{ n1 } \right )^2 }{ n1 - 1 } + \frac{ \left ( \frac{ var2^2 }{ n2 } \right )^2 }{ n2 - 1}} \\ &\textbf{where:}\\ &var1 \text{ and } var2 = \text{Variance of each of the sample sets} \\ &n1 \text{ and } n2 = \text{Number of records in each sample set} \\ \end{aligned} ​ Degrees of Freedom = n 1 − 1 ( n 1 v a r 1 2 ​ ) 2 ​ + n 2 − 1 ( n 2 v a r 2 2 ​ ) 2 ​ ( n 1 v a r 1 2 ​ + n 2 v a r 2 2 ​ ) 2 ​ where: v a r 1  and  v a r 2 = Variance of each of the sample sets n 1  and  n 2 = Number of records in each sample set ​

The following flowchart can be used to determine which t-test to use based on the characteristics of the sample sets. The key items to consider include the similarity of the sample records, the number of data records in each sample set, and the variance of each sample set.

Image by Julie Bang Â© Investopedia 2019

## Example of an Unequal Variance T-Test

Assume that the diagonal measurement of paintings received in an art gallery is taken. One group of samples includes 10 paintings, while the other includes 20 paintings. The data sets, with the corresponding mean and variance values, are as follows:

Set 1 Set 2
19.7 28.3
20.4 26.7
19.6 20.1
17.8 23.3
18.5 25.2
18.9 22.1
18.3 17.7
18.9 27.6
19.5 20.6
21.95 13.7
23.2
17.5
20.6
18
23.9
21.6
24.3
20.4
23.9
13.3
19.4 21.6
1.4 17.1

Though the mean of Set 2 is higher than that of Set 1, we cannot conclude that the population corresponding to Set 2 has a higher mean than the population corresponding to Set 1.

Is the difference from 19.4 to 21.6 due to chance alone, or do differences exist in the overall populations of all the paintings received in the art gallery? We establish the problem by assuming the null hypothesis that the mean is the same between the two sample sets and conduct a t-test to test if the hypothesis is plausible.

Since the number of data records is different (n1 = 10 and n2 = 20) and the variance is also different, the t-value and degrees of freedom are computed for the above data set using the formula mentioned in the Unequal Variance T-Test section.

The t-value is -2.24787. Since the minus sign can be ignored when comparing the two t-values, the computed value is 2.24787.

The degrees of freedom value is 24.38 and is reduced to 24, owing to the formula definition requiring rounding down of the value to the least possible integer value.

One can specify a level of probability (alpha level, level of significance,  p ) as a criterion for acceptance. In most cases, a 5% value can be assumed.

Using the degree of freedom value as 24 and a 5% level of significance, a look at the t-value distribution table gives a value of 2.064. Comparing this value against the computed value of 2.247 indicates that the calculated t-value is greater than the table value at a significance level of 5%. Therefore, it is safe to reject the null hypothesis that there is no difference between means. The population set has intrinsic differences, and they are not by chance.

## How Is the T-Distribution Table Used?

The T-Distribution Table is available in one-tail and two-tails formats. The former is used for assessing cases that have a fixed value or range with a clear direction, either positive or negative. For instance, what is the probability of the output value remaining below -3, or getting more than seven when rolling a pair of dice? The latter is used for range-bound analysis, such as asking if the coordinates fall between -2 and +2.

## What Is an Independent T-Test?

The samples of independent t-tests are selected independent of each other where the data sets in the two groups don’t refer to the same values. They may include a group of 100 randomly unrelated patients split into two groups of 50 patients each. One of the groups becomes the control group and is administered a placebo, while the other group receives a prescribed treatment. This constitutes two independent sample groups that are unpaired and unrelated to each other.

## What Does a T-Test Explain and How Are They Used?

A t-test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether a process or treatment has an effect on the population of interest, or whether two groups are different from one another.

• Editorial Policy

Statistics By Jim

Making statistics intuitive

## What is an Independent Samples T Test?

Use an independent samples t test when you want to compare the means of precisely two groups—no more and no less! Typically, you perform this test to determine whether two population means are different. This procedure is an inferential statistical hypothesis test, meaning it uses samples to draw conclusions about populations. The independent samples t test is also known as the two sample t test.

For an example of an independent t test, do students who learn using Method A have a different mean score than those who learn using Method B?

In this post, you’ll learn about the hypotheses, assumptions, and how to interpret the results for independent samples t tests.

Related post : Difference between Descriptive and Inferential Statistics

## Independent Samples T Tests Hypotheses

Independent samples t tests have the following hypotheses:

• Null hypothesis: The means for the two populations are equal.
• Alternative hypothesis : The means for the two populations are not equal.

If the p-value is less than your significance level (e.g., 0.05), you can reject the null hypothesis. The difference between the two means is statistically significant. Your sample provides strong enough evidence to conclude that the two population means are not equal.

Notice how the hypotheses for the two sample t test relate to independent populations. They do not contain the same subjects.

Learn how this analysis compares to the Z Test .

Related posts : How to Interpret P Values and Null Hypothesis: Definition, Rejecting & Examples .

## Independent Samples T Test Assumptions

For reliable independent samples t test results, your data should satisfy the following assumptions:

## You have a random sample

Drawing a random sample from the population you are studying helps ensure that your data represent the population. Representative samples are vital when you want to make inferences about the population. If your data do not represent the population, your analysis results will not be valid for that population.

You must draw a random sample from your population of interest. Each item or person in the population must have an equal probability of being selected.

Related posts : Populations, Parameters, and Samples in Inferential Statistics and Representative Samples: Definition, Uses & Examples .

## Your data must be continuous

T tests require continuous data . Continuous variables can take on any numeric value, and the scale can be meaningfully divided into smaller increments, including fractional and decimal values. There are an infinite number of possible values between any two values. Typically, you measure continuous variables on a scale. For example, when you measure temperature, weight, and height, you have continuous data.

Other hypothesis tests can handle different types of data. For more information, read Comparing Hypothesis Tests for Continuous, Binary, and Count Data .

## Your sample data should follow a normal distribution or each group has more than 15 observations

All t-tests assume that your data follow the normal distribution . However, your group distributions can be skewed if your sample size is large enough thanks to the central limit theorem.

For the independent samples t test, when each group is larger than 15, your data can be mildly skewed and the test results will still be valid. However, if your sample size is less than 15 per group, graph your data and determine whether the two distributions are skewed. In this case, you might need to use a nonparametric test . The Mann Whitney U test is the nonparametric test that corresponds to the independent samples t-test.

Fortunately, if you have more than 15 observations in each group for a two sample t test, you don’t have to worry about the normality assumption too much.

Be sure to check for outliers because they can throw off the results.

Related post : Central Limit Theorem and Skewed Distributions

## The groups are independent

Independent samples contain different sets of items in each sample. Independent samples t tests compare two distinct samples. Hence, it’s a two sample t test. If you have the same people or items in both groups, you can use the paired t-test .

Related post : Independent and Dependent Samples

## Groups can have equal or unequal variances but use the correct form of the test

Variance, and the closely related standard deviation, are measures of variability. Because the two sample t test uses two independent samples, each sample has its own variance. Consequently, the independent samples t test has two methods. One method assumes that the two groups have equal variances while the other does not assume they are equal. The form that does not assume equal variances is known as Welch’s t-test.

When the sample sizes for both groups are roughly equal, and you have a moderate sample size, t-tests are robust to unequal variances. If one group has twice the standard deviation of another group, it’s time to use Welch’s t-test! However, you don’t need to worry about smaller differences.

If you have unequal variances and unequal sample sizes, it’s vital to use the unequal variances version of the two sample t test!

Related post : Standard Deviations

## Example Independent Samples T Test

Let’s run an example independent sample t test! Our hypothetical scenario is that we are comparing scores from two teaching methods. We drew two random samples of students. Students in one group learned using Method A while the other group used Method B. These samples contain entirely separate students.

Now, we want to determine whether the two means are different. Download the CSV file that contains the independent samples t test example data: t-TestExamples .

Here is what the data look like in the datasheet.

Let’s assume that the variances are equal and use the Assuming Equal Variances version.

## Interpreting the Results

Here’s how to read and report the results for an independent samples t test.

The output indicates that the mean for Method A is 71.50 and for Method B it is 84.74. Looking in the Standard Deviation column, we can see that they are not exactly equal, but they are close enough to assume equal variances.

Because the p-value (0.000) for our independent samples t test is less than the standard significance level of 0.05, we can reject the null hypothesis. If the p-value is low, the null must go! Our sample data support the claim that the population means are different. Specifically, Method B’s mean is greater than Method A’s mean. If high scores are better, then Method B is significantly better than Method A.

The two sample t test estimates that the mean difference is -13.24. However, that estimate is based on 30 observations split between the two groups and it is unlikely to equal the population difference. The confidence interval indicates that the mean difference between these two methods for the entire population is likely between -19.89 and -6.59. Learn more about confidence intervals .

The negative values reflect the fact that Method A has a lower mean than Method B (i.e., Method A – Method B < 0). Because the confidence interval excludes zero (no difference), we can conclude that the population means are different.

• T Test Overview
• One-Sample T-Test
• Running T Tests in Excel
• T-Values and T-Distributions

June 15, 2022 at 12:30 am

Hi Jim. Just to say thank you. All I needed to learn was how to interpret “independent t test” results. and after reading this article, I am looking no further. Many thanks.

December 1, 2021 at 11:08 am

Lily, I don’t know if Jim will reply as he posted this in Oct. I am just now reading it too. From my work in education, I would look at combining the three tests (average score or total points) so that each student in each group has one test.

November 28, 2021 at 8:23 am

Hi, thanks for your articles about statistics and I would like to ask you some questions. How many test variables can a T-test analyse? I’ve selected 2 groups of students to test two different teaching methods and collected the results from three exams (Is it means I have 3 dependent variables?) Then I used an independent sample T-test to analyse the data. My research purpose is to find out which teaching method is more effective. Did I use the wrong statistical method? Look forward to your reply.

## Back to blog home

Hypothesis testing explained in 4 parts, yuzheng sun, phd.

As data scientists, Hypothesis Testing is expected to be well understood, but often not in reality. It is mainly because our textbooks blend two schools of thought – p-value and significance testing vs. hypothesis testing – inconsistently.

For example, some questions are not obvious unless you have thought through them before:

Are power or beta dependent on the null hypothesis?

Can we accept the null hypothesis? Why?

How does MDE change with alpha holding beta constant?

Why do we use standard error in Hypothesis Testing but not the standard deviation?

Why can’t we be specific about the alternative hypothesis so we can properly model it?

Why is the fundamental tradeoff of the Hypothesis Testing about mistake vs. discovery, not about alpha vs. beta?

Addressing this problem is not easy. The topic of Hypothesis Testing is convoluted. In this article, there are 10 concepts that we will introduce incrementally, aid you with visualizations, and include intuitive explanations. After this article, you will have clear answers to the questions above that you truly understand on a first-principle level and explain these concepts well to your stakeholders.

Set up the question properly using core statistical concepts, and connect them to Hypothesis Testing, while striking a balance between technically correct and simplicity. Specifically,

We emphasize a clear distinction between the standard deviation and the standard error, and why the latter is used in Hypothesis Testing

We explain fully when can you “accept” a hypothesis, when shall you say “failing to reject” instead of “accept”, and why

Introduce alpha, type I error, and the critical value with the null hypothesis

Introduce beta, type II error, and power with the alternative hypothesis

Introduce minimum detectable effects and the relationship between the factors with power calculations , with a high-level summary and practical recommendations

## Part 1 - Hypothesis Testing, the central limit theorem, population, sample, standard deviation, and standard error

In Hypothesis Testing, we begin with a null hypothesis , which generally asserts that there is no effect between our treatment and control groups. Commonly, this is expressed as the difference in means between the treatment and control groups being zero.

The central limit theorem suggests an important property of this difference in means — given a sufficiently large sample size, the underlying distribution of this difference in means will approximate a normal distribution, regardless of the population's original distribution. There are two notes:

1. The distribution of the population for the treatment and control groups can vary, but the observed means (when you observe many samples and calculate many means) are always normally distributed with a large enough sample. Below is a chart, where the n=10 and n=30 correspond to the underlying distribution of the sample means.

2. Pay attention to “the underlying distribution”. Standard deviation vs. standard error is a potentially confusing concept. Let’s clarify.

## Standard deviation vs. Standard error

Let’s declare our null hypothesis as having no treatment effect. Then, to simplify, let’s propose the following normal distribution with a mean of 0 and a standard deviation of 1 as the range of possible outcomes with probabilities associated with this null hypothesis.

The language around population, sample, group, and estimators can get confusing. Again, to simplify, let’s forget that the null hypothesis is about the mean estimator, and declare that we can either observe the mean hypothesis once or many times. When we observe it many times, it forms a sample*, and our goal is to make decisions based on this sample.

* For technical folks, the observation is actually about a single sample, many samples are a group, and the difference in groups is the distribution we are talking about as the mean hypothesis. The red curve represents the distribution of the estimator of this difference, and then we can have another sample consisting of many observations of this estimator. In my simplified language, the red curve is the distribution of the estimator, and the blue curve with sample size is the repeated observations of it. If you have a better way to express these concepts without causing confusiongs, please suggest.

This probability density function means if there is one realization from this distribution, the realitization can be anywhere on the x-axis, with the relative likelihood on the y-axis.

If we draw multiple observations , they form a sample . Each observation in this sample follows the property of this underlying distribution – more likely to be close to 0, and equally likely to be on either side, which makes the odds of positive and negative cancel each other out, so the mean of this sample is even more centered around 0.

We use the standard error to represent the error of our “sample mean” .

The standard error = the standard deviation of the observed sample / sqrt (sample size).

For a sample size of 30, the standard error is roughly 0.18. Compared with the underlying distribution, the distribution of the sample mean is much narrower.

In Hypothesis Testing, we try to draw some conclusions – is there a treatment effect or not? – based on a sample. So when we talk about alpha and beta, which are the probabilities of type I and type II errors , we are talking about the probabilities based on the plot of sample means and standard error .

## Part 2, The null hypothesis: alpha and the critical value

From Part 1, we stated that a null hypothesis is commonly expressed as the difference in means between the treatment and control groups being zero.

Without loss of generality*, let’s assume the underlying distribution of our null hypothesis is mean 0 and standard deviation 1

Then the sample mean of the null hypothesis is 0 and the standard error of 1/√ n, where n is the sample size.

When the sample size is 30, this distribution has a standard error of ≈0.18 looks like the below.

*: A note for the technical readers: The null hypothesis is about the difference in means, but here, without complicating things, we made the subtle change to just draw the distribution of this “estimator of this difference in means”. Everything below speaks to this “estimator”.

The reason we have the null hypothesis is that we want to make judgments, particularly whether a  treatment effect exists. But in the world of probabilities, any observation, and any sample mean can happen, with different probabilities. So we need a decision rule to help us quantify our risk of making mistakes.

The decision rule is, let’s set a threshold. When the sample mean is above the threshold, we reject the null hypothesis; when the sample mean is below the threshold, we accept the null hypothesis.

## Accepting a hypothesis vs. failing to reject a hypothesis

It’s worth noting that you may have heard of “we never accept a hypothesis, we just fail to reject a hypothesis” and be subconsciously confused by it. The deep reason is that modern textbooks do an inconsistent blend of Fisher’s significance testing and Neyman-Pearson’s Hypothesis Testing definitions and ignore important caveats ( ref ). To clarify:

First of all, we can never “prove” a particular hypothesis given any observations, because there are infinitely many true hypotheses (with different probabilities) given an observation. We will visualize it in Part 3.

Second, “accepting” a hypothesis does not mean that you believe in it, but only that you act as if it were true. So technically, there is no problem with “accepting” a hypothesis.

But, third, when we talk about p-values and confidence intervals, “accepting” the null hypothesis is at best confusing. The reason is that “the p-value above the threshold” just means we failed to reject the null hypothesis. In the strict Fisher’s p-value framework, there is no alternative hypothesis. While we have a clear criterion for rejecting the null hypothesis (p < alpha), we don't have a similar clear-cut criterion for "accepting" the null hypothesis based on beta.

So the dangers in calling “accepting a hypothesis” in the p-value setting are:

Many people misinterpret “accepting” the null hypothesis as “proving” the null hypothesis, which is wrong;

“Accepting the null hypothesis” is not rigorously defined, and doesn’t speak to the purpose of the test, which is about whether or not we reject the null hypothesis.

In this article, we will stay consistent within the Neyman-Pearson framework , where “accepting” a hypothesis is legal and necessary. Otherwise, we cannot draw any distributions without acting as if some hypothesis was true.

You don’t need to know the name Neyman-Pearson to understand anything, but pay attention to our language, as we choose our words very carefully to avoid mistakes and confusion.

So far, we have constructed a simple world of one hypothesis as the only truth, and a decision rule with two potential outcomes – one of the outcomes is “reject the null hypothesis when it is true” and the other outcome is “accept the null hypothesis when it is true”. The likelihoods of both outcomes come from the distribution where the null hypothesis is true.

Later, when we introduce the alternative hypothesis and MDE, we will gradually walk into the world of infinitely many alternative hypotheses and visualize why we cannot “prove” a hypothesis.

We save the distinction between the p-value/significance framework vs. Hypothesis Testing in another article where you will have the full picture.

## Type I error, alpha, and the critical value

We’re able to construct a distribution of the sample mean for this null hypothesis using the standard error. Since we only have the null hypothesis as the truth of our universe, we can only make one type of mistake – falsely rejecting the null hypothesis when it is true. This is the type I error , and the probability is called alpha . Suppose we want alpha to be 5%. We can calculate the threshold required to make it happen. This threshold is called the critical value . Below is the chart we further constructed with our sample of 30.

In this chart, alpha is the blue area under the curve. The critical value is 0.3. If our sample mean is above 0.3, we reject the null hypothesis. We have a 5% chance of making the type I error.

Type I error: Falsely rejecting the null hypothesis when the null hypothesis is true

Alpha: The probability of making a Type I error

Critical value: The threshold to determine whether the null hypothesis is to be rejected or not

## Part 3, The alternative hypothesis: beta and power

You may have noticed in part 2 that we only spoke to Type I error – rejecting the null hypothesis when it is true. What about the Type II error – falsely accepting the null hypothesis when it is not true?

But it is weird to call “accepting” false unless we know the truth. So we need an alternative hypothesis which serves as the alternative truth.

## Alternative hypotheses are theoretical constructs

There is an important concept that most textbooks fail to emphasize – that is, you can have infinitely many alternative hypotheses for a given null hypothesis, we just choose one. None of them are more special or “real” than the others.

Let’s visualize it with an example. Suppose we observed a sample mean of 0.51, what is the true alternative hypothesis?

With this visualization, you can see why we have “infinitely many alternative hypotheses” because, given the observation, there is an infinite number of alternative hypotheses (plus the null hypothesis) that can be true, each with different probabilities. Some are more likely than others, but all are possible.

Remember, alternative hypotheses are a theoretical construct. We choose one particular alternative hypothesis to calculate certain probabilities. By now, we should have more understanding of why we cannot “accept” the null hypothesis given an observation. We can’t prove that the null hypothesis is true, we just fail to accept it given the observation and our pre-determined decision rule.

We will fully reconcile this idea of picking one alternative hypothesis out of the world of infinite possibilities when we talk about MDE. The idea of “accept” vs. “fail to reject” is deeper, and we won’t cover it fully in this article. We will do so when we have an article about the p-value and the confidence interval.

## Type II error and Beta

For the sake of simplicity and easy comparison, let’s choose an alternative hypothesis with a mean of 0.5, and a standard deviation of

1. Again, with a sample size of 30, the standard error ≈0.18. There are now two potential “truths” in our simple universe.

Remember from the null hypothesis, we want alpha to be 5% so the corresponding critical value is 0.30. We modify our rule as follows:

If the observation is above 0.30, we reject the null hypothesis and accept the alternative hypothesis ;

If the observation is below 0.30, we accept the null hypothesis and reject the alternative hypothesis .

With the introduction of the alternative hypothesis, the alternative “(hypothesized) truth”, we can call “accepting the null hypothesis and rejecting the alternative hypothesis” a mistake – the Type II error. We can also calculate the probability of this mistake. This is called beta, which is illustrated by the red area below.

From the visualization, we can see that beta is conditional on the alternative hypothesis and the critical value. Let’s elaborate on these two relationships one by one, very explicitly, as both of them are important.

First, Let’s visualize how beta changes with the mean of the alternative hypothesis by setting another alternative hypothesis where mean = 1 instead of 0.5

Beta change from 13.7% to 0.0%. Namely, beta is the probability of falsely rejecting a particular alternative hypothesis when we assume it is true. When we assume a different alternative hypothesis is true, we get a different beta. So strictly speaking, beta only speaks to the probability of falsely rejecting a particular alternative hypothesis when it is true . Nothing else. It’s only under other conditions, that “rejecting the alternative hypothesis” implies “accepting” the null hypothesis or “failing to accept the null hypothesis”. We will further elaborate when we talk about p-value and confidence interval in another article. But what we talked about so far is true and enough for understanding power.

Second, there is a relationship between alpha and beta. Namely, given the null hypothesis and the alternative hypothesis, alpha would determine the critical value, and the critical value determines beta. This speaks to the tradeoff between mistake and discovery.

If we tolerate more alpha, we will have a smaller critical value, and for the same beta, we can detect a smaller alternative hypothesis

If we tolerate more beta, we can also detect a smaller alternative hypothesis.

In short, if we tolerate more mistakes (either Type I or Type II), we can detect a smaller true effect. Mistake vs. discovery is the fundamental tradeoff of Hypothesis Testing.

So tolerating more mistakes leads to more chance of discovery. This is the concept of MDE that we will elaborate on in part 4.

Finally, we’re ready to define power. Power is an important and fundamental topic in statistical testing, and we’ll explain the concept in three different ways.

## Three ways to understand power

First, the technical definition of power is 1−β. It represents that given an alternative hypothesis and given our null, sample size, and decision rule (alpha = 0.05), the probability is that we accept this particular hypothesis. We visualize the yellow area below.

Second, power is really intuitive in its definition. A real-world example is trying to determine the most popular car manufacturer in the world. If I observe one car and see one brand, my observation is not very powerful. But if I observe a million cars, my observation is very powerful. Powerful tests mean that I have a high chance of detecting a true effect.

Third, to illustrate the two concepts concisely, let’s run a visualization by just changing the sample size from 30 to 100 and see how power increases from 86.3% to almost 100%.

As the graph shows, we can easily see that power increases with sample size . The reason is that the distribution of both the null hypothesis and the alternative hypothesis became narrower as their sample means got more accurate. We are less likely to make either a type I error (which reduces the critical value) or a type II error.

Type II error: Failing to reject the null hypothesis when the alternative hypothesis is true

Beta: The probability of making a type II error

Power: The ability of the test to detect a true effect when it’s there

## Part 4, Power calculation: MDE

The relationship between mde, alternative hypothesis, and power.

Now, we are ready to tackle the most nuanced definition of them all: Minimum detectable effect (MDE). First, let’s make the sample mean of the alternative hypothesis explicit on the graph with a red dotted line.

What if we keep the same sample size, but want power to be 80%? This is when we recall the previous chapter that “alternative hypotheses are theoretical constructs”. We can have a different alternative that corresponds to 80% power. After some calculations, we discovered that when it’s the alternative hypothesis with mean = 0.45 (if we keep the standard deviation to be 1).

This is where we reconcile the concept of “infinitely many alternative hypotheses” with the concept of minimum detectable delta. Remember that in statistical testing, we want more power. The “ minimum ” in the “ minimum detectable effect”, is the minimum value of the mean of the alternative hypothesis that would give us 80% power. Any alternative hypothesis with a mean to the right of MDE gives us sufficient power.

In other words, there are indeed infinitely many alternative hypotheses to the right of this mean 0.45. The particular alternative hypothesis with a mean of 0.45 gives us the minimum value where power is sufficient. We call it the minimum detectable effect, or MDE.

## The complete definition of MDE from scratch

Let’s go through how we derived MDE from the beginning:

We fixed the distribution of sample means of the null hypothesis, and fixed sample size, so we can draw the blue distribution

For our decision rule, we require alpha to be 5%. We derived that the critical value shall be 0.30 to make 5% alpha happen

We fixed the alternative hypothesis to be normally distributed with a standard deviation of 1 so the standard error is 0.18, the mean can be anywhere as there are infinitely many alternative hypotheses

For our decision rule, we require beta to be 20% or less, so our power is 80% or more.

We derived that the minimum value of the observed mean of the alternative hypothesis that we can detect with our decision rule is 0.45. Any value above 0.45 would give us sufficient power.

## How MDE changes with sample size

Now, let’s tie everything together by increasing the sample size, holding alpha and beta constant, and see how MDE changes.

Narrower distribution of the sample mean + holding alpha constant -> smaller critical value from 0.3 to 0.16

+ holding beta constant -> MDE decreases from 0.45 to 0.25

This is the other key takeaway:  The larger the sample size, the smaller of an effect we can detect, and the smaller the MDE.

This is a critical takeaway for statistical testing. It suggests that even for companies not with large sample sizes if their treatment effects are large, AB testing can reliably detect it.

## Summary of Hypothesis Testing

Let’s review all the concepts together.

Assuming the null hypothesis is correct:

Alpha: When the null hypothesis is true, the probability of rejecting it

Critical value: The threshold to determine rejecting vs. accepting the null hypothesis

Assuming an alternative hypothesis is correct:

Beta: When the alternative hypothesis is true, the probability of rejecting it

Power: The chance that a real effect will produce significant results

Power calculation:

Minimum detectable effect (MDE): Given sample sizes and distributions, the minimum mean of alternative distribution that would give us the desired alpha and sufficient power (usually alpha = 0.05 and power >= 0.8)

Relationship among the factors, all else equal: Larger sample, more power; Larger sample, smaller MDE

Everything we talk about is under the Neyman-Pearson framework. There is no need to mention the p-value and significance under this framework. Blending the two frameworks is the inconsistency brought by our textbooks. Clarifying the inconsistency and correctly blending them are topics for another day.

## Practical recommendations

That’s it. But it’s only the beginning. In practice, there are many crafts in using power well, for example:

Why peeking introduces a behavior bias, and how to use sequential testing to correct it

Why having multiple comparisons affects alpha, and how to use Bonferroni correction

The relationship between sample size, duration of the experiment, and allocation of the experiment?

Treat your allocation as a resource for experimentation, understand when interaction effects are okay, and when they are not okay, and how to use layers to manage

Practical considerations for setting an MDE

Also, in the above examples, we fixed the distribution, but in reality, the variance of the distribution plays an important role. There are different ways of calculating the variance and different ways to reduce variance, such as CUPED, or stratified sampling.

Related resources:

How to calculate power with an uneven split of sample size: https://blog.statsig.com/calculating-sample-sizes-for-a-b-tests-7854d56c2646

Real-life applications: https://blog.statsig.com/you-dont-need-large-sample-sizes-to-run-a-b-tests-6044823e9992

2m events per month, free forever..

## Build fast?

Try statsig today.

## Recent Posts

Top 8 common experimentation mistakes and how to fix them.

I discussed 8 A/B testing mistakes with Allon Korem (Bell Statistics) and Tyler VanHaren (Statsig). Learn fixes to improve accuracy and drive better business outcomes.

## Introducing Differential Impact Detection

Introducing Differential Impact Detection: Identify how different user groups respond to treatments and gain useful insights from varied experiment results.

## Identifying and experimenting with Power Users using Statsig

Identify power users to drive growth and engagement. Learn to pinpoint and leverage these key players with targeted experiments for maximum impact.

## How to Ingest Data Into Statsig

Simplify data pipelines with Statsig. Use SDKs, third-party integrations, and Data Warehouse Native Solution for effortless data ingestion at any stage.

## A/B Testing performance wins on NestJS API servers

Learn how we use Statsig to enhance our NestJS API servers, reducing request processing time and CPU usage through performance experiments.

## An overview of making early decisions on experiments

Learn the risks vs. rewards of making early decisions in experiments and Statsig's techniques to reduce experimentation times and deliver trustworthy results.

## User Preferences

Content preview.

Arcu felis bibendum ut tristique et egestas quis:

• Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
• Duis aute irure dolor in reprehenderit in voluptate
• Excepteur sint occaecat cupidatat non proident

## Keyboard Shortcuts

10.2 - t-test: when population variance is unknown.

Now that, for purely pedagogical reasons, we have the unrealistic situation (of a known population variance) behind us, let's turn our attention to the realistic situation in which both the population mean and population variance are unknown.

## Example 10-2 Section

It is assumed that the mean systolic blood pressure is $$\mu$$ = 120 mm Hg. In the Honolulu Heart Study, a sample of $$n=100$$ people had an average systolic blood pressure of 130.1 mm Hg with a standard deviation of 21.21 mm Hg. Is the group significantly different (with respect to systolic blood pressure!) from the regular population?

The null hypothesis is $$H_0:\mu=120$$, and because there is no specific direction implied, the alternative hypothesis is $$H_A:\mu\ne 120$$. In general, we know that if the data are normally distributed, then:

$$T=\dfrac{\bar{X}-\mu}{S/\sqrt{n}}$$

follows a $$t$$-distribution with $$n-1$$ degrees of freedom. Therefore, it seems reasonable to use the test statistic:

$$T=\dfrac{\bar{X}-\mu_0}{S/\sqrt{n}}$$

for testing the null hypothesis $$H_0:\mu=\mu_0$$ against any of the possible alternative hypotheses $$H_A:\mu \neq \mu_0$$, $$H_A:\mu<\mu_0$$, and $$H_A:\mu>\mu_0$$. For the example in hand, the value of the test statistic is:

$$t=\dfrac{130.1-120}{21.21/\sqrt{100}}=4.762$$

The critical region approach tells us to reject the null hypothesis at the $$\alpha=0.05$$ level if $$t\ge t_{0.025, 99}=1.9842$$ or if $$t\le t_{0.025, 99}=-1.9842$$. Therefore, we reject the null hypothesis because $$t=4.762>1.9842$$, and therefore falls in the rejection region:

Again, as always, we draw the same conclusion by using the $$p$$-value approach. The $$p$$-value approach tells us to reject the null hypothesis at the $$\alpha=0.05$$ level if the $$p$$-value $$\le \alpha=0.05$$. In this case, the $$p$$-value is $$2 \times P(T_{99}>4.762)<2\times P(T_{99}>1.9842)=2(0.025)=0.05$$:

As expected, we reject the null hypothesis because $$p$$-value $$\le 0.01<\alpha=0.05$$.

Again, we'll learn how to ask Minitab to conduct the t -test for a mean $$\mu$$ in a bit, but this is what the Minitab output for this example looks like:

Test of mu = 120 vs not = 120
N Mean StDev SE Mean 95% CI T P
100 130.100 21.210 2.121 (125.891, 134.309) 4.76 0.000

By the way, the decision to reject the null hypothesis is consistent with the one you would make using a 95% confidence interval. Using the data, a 95% confidence interval for the mean $$\mu$$ is:

$$\bar{x}\pm t_{0.025,99}\left(\dfrac{s}{\sqrt{n}}\right)=130.1 \pm 1.9842\left(\dfrac{21.21}{\sqrt{100}}\right)$$

which simplifies to $$130.1\pm 4.21$$. That is, we can be 95% confident that the mean systolic blood pressure of the Honolulu population is between 125.89 and 134.31 mm Hg. How can a population living in a climate with consistently sunny 80 degree days have elevated blood pressure?!

Anyway, the critical region approach for the $$\alpha=0.05$$ hypothesis test tells us to reject the null hypothesis that $$\mu=120$$:

if $$t=\dfrac{\bar{x}-\mu_0}{s/\sqrt{n}}\geq 1.9842$$ or if $$t=\dfrac{\bar{x}-\mu_0}{s/\sqrt{n}}\leq -1.9842$$

which is equivalent to rejecting:

if $$\bar{x}-\mu_0 \geq 1.9842\left(\dfrac{s}{\sqrt{n}}\right)$$ or if $$\bar{x}-\mu_0 \leq -1.9842\left(\dfrac{s}{\sqrt{n}}\right)$$

if $$\mu_0 \leq \bar{x}-1.9842\left(\dfrac{s}{\sqrt{n}}\right)$$ or if $$\mu_0 \geq \bar{x}+1.9842\left(\dfrac{s}{\sqrt{n}}\right)$$

which, upon inserting the data for this particular example, is equivalent to rejecting:

if $$\mu_0 \leq 125.89$$ or if $$\mu_0 \geq 134.31$$

which just happen to be (!) the endpoints of the 95% confidence interval for the mean. Indeed, the results are consistent!

## Two Sample t-test: Definition, Formula, and Example

A two sample t-test is used to determine whether or not two population means are equal.

This tutorial explains the following:

• The motivation for performing a two sample t-test.
• The formula to perform a two sample t-test.
• The assumptions that should be met to perform a two sample t-test.
• An example of how to perform a two sample t-test.

## Two Sample t-test: Motivation

Suppose we want to know whether or not the mean weight between two different species of turtles is equal. Since there are thousands of turtles in each population, it would be too time-consuming and costly to go around and weigh each individual turtle.

Instead, we might take a simple random sample of 15 turtles from each population and use the mean weight in each sample to determine if the mean weight is equal between the two populations:

However, it’s virtually guaranteed that the mean weight between the two samples will be at least a little different. The question is whether or not this difference is statistically significant . Fortunately, a two sample t-test allows us to answer this question.

## Two Sample t-test: Formula

A two-sample t-test always uses the following null hypothesis:

• H 0 : μ 1  = μ 2 (the two population means are equal)

The alternative hypothesis can be either two-tailed, left-tailed, or right-tailed:

• H 1 (two-tailed): μ 1  ≠ μ 2 (the two population means are not equal)
• H 1 (left-tailed): μ 1  < μ 2  (population 1 mean is less than population 2 mean)
• H 1 (right-tailed):  μ 1 > μ 2  (population 1 mean is greater than population 2 mean)

We use the following formula to calculate the test statistic t:

Test statistic:  ( x 1  –  x 2 )  /  s p (√ 1/n 1  + 1/n 2 )

where  x 1  and  x 2 are the sample means, n 1 and n 2  are the sample sizes, and where s p is calculated as:

s p = √  (n 1 -1)s 1 2  +  (n 2 -1)s 2 2  /  (n 1 +n 2 -2)

where s 1 2  and s 2 2  are the sample variances.

If the p-value that corresponds to the test statistic t with (n 1 +n 2 -1) degrees of freedom is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01) then you can reject the null hypothesis.

## Two Sample t-test: Assumptions

For the results of a two sample t-test to be valid, the following assumptions should be met:

• The observations in one sample should be independent of the observations in the other sample.
• The data should be approximately normally distributed.
• The two samples should have approximately the same variance. If this assumption is not met, you should instead perform Welch’s t-test .
• The data in both samples was obtained using a random sampling method .

## Two Sample t-test : Example

Suppose we want to know whether or not the mean weight between two different species of turtles is equal. To test this, will perform a two sample t-test at significance level α = 0.05 using the following steps:

Step 1: Gather the sample data.

Suppose we collect a random sample of turtles from each population with the following information:

• Sample size n 1 = 40
• Sample mean weight  x 1  = 300
• Sample standard deviation s 1 = 18.5
• Sample size n 2 = 38
• Sample mean weight  x 2  = 305
• Sample standard deviation s 2 = 16.7

Step 2: Define the hypotheses.

We will perform the two sample t-test with the following hypotheses:

• H 0 :  μ 1  = μ 2 (the two population means are equal)
• H 1 :  μ 1  ≠ μ 2 (the two population means are not equal)

Step 3: Calculate the test statistic  t .

First, we will calculate the pooled standard deviation s p :

s p = √  (n 1 -1)s 1 2  +  (n 2 -1)s 2 2  /  (n 1 +n 2 -2)  = √  (40-1)18.5 2  +  (38-1)16.7 2  /  (40+38-2)  = 17.647

Next, we will calculate the test statistic  t :

t = ( x 1  –  x 2 )  /  s p (√ 1/n 1  + 1/n 2 ) =  (300-305) / 17.647(√ 1/40 + 1/38 ) =  -1.2508

Step 4: Calculate the p-value of the test statistic  t .

According to the T Score to P Value Calculator , the p-value associated with t = -1.2508 and degrees of freedom = n 1 +n 2 -2 = 40+38-2 = 76 is  0.21484 .

Step 5: Draw a conclusion.

Since this p-value is not less than our significance level α = 0.05, we fail to reject the null hypothesis. We do not have sufficient evidence to say that the mean weight of turtles between these two populations is different.

Note:  You can also perform this entire two sample t-test by simply using the Two Sample t-test Calculator .

The following tutorials explain how to perform a two-sample t-test using different statistical programs:

How to Perform a Two Sample t-test in Excel How to Perform a Two Sample t-test in SPSS How to Perform a Two Sample t-test in Stata How to Perform a Two Sample t-test in R How to Perform a Two Sample t-test in Python How to Perform a Two Sample t-test on a TI-84 Calculator

## Featured Posts

Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

## 2 Replies to “Two Sample t-test: Definition, Formula, and Example”

I like the detailed information and simplified in the way I can understand and relate easily. Thank you

It seems a couple of parenthesis is missed at the pooled standard deviation formula. Under square root you have (n1-1)s12 + (n2-1)s22 / (n1+n2-2) but it should be [(n1-1)s12 + (n2-1)s22] / (n1+n2-2) I used square bracket

## Join the Statology Community

Sign up to receive Statology's exclusive study resource: 100 practice problems with step-by-step solutions. Plus, get our latest insights, tutorials, and data analysis tips straight to your inbox!

• Data Science
• Data Analysis
• Data Visualization
• Machine Learning
• Deep Learning
• Computer Vision
• Artificial Intelligence
• AI ML DS Interview Series
• AI ML DS Projects series
• Data Engineering
• Web Scrapping
• Machine Learning Mathematics

## Linear Algebra and Matrix

• Scalar and Vector
• Python Program to Add Two Matrices
• Python program to multiply two matrices
• Vector Operations
• Product of Vectors
• Scalar Product of Vectors
• Dot and Cross Products on Vectors
• Transpose a matrix in Single line in Python
• Transpose of a Matrix
• Adjoint and Inverse of a Matrix
• How to inverse a matrix using NumPy
• Determinant of a Matrix
• Program to find Normal and Trace of a matrix
• Data Science | Solving Linear Equations
• Data Science - Solving Linear Equations with Python
• System of Linear Equations
• System of Linear Equations in three variables using Cramer's Rule
• Eigenvalues and Eigenvectors
• Applications of Eigenvalues and Eigenvectors
• How to compute the eigenvalues and right eigenvectors of a given square array using NumPY?

## Statistics for Machine Learning

• Descriptive Statistic
• Measures of Central Tendency
• Measures of Dispersion | Types, Formula and Examples
• Mean, Variance and Standard Deviation
• Calculate the average, variance and standard deviation in Python using NumPy
• Random Variable
• Difference between Parametric and Non-Parametric Methods
• Probability Distribution - Function, Formula, Table
• Confidence Interval
• Covariance and Correlation
• Program to find correlation coefficient
• Robust Correlation
• Normal Probability Plot
• Quantile Quantile plots
• True Error vs Sample Error
• Bias-Variance Trade Off - Machine Learning
• Understanding Hypothesis Testing
• Paired T-Test - A Detailed Overview
• P-value in Machine Learning
• F-Test in Statistics
• Z-test : Formula, Types, Examples
• Residual Leverage Plot (Regression Diagnostic)
• Difference between Null and Alternate Hypothesis
• Mann and Whitney U test
• Wilcoxon Signed Rank Test
• Kruskal Wallis Test
• Friedman Test
• Mathematics | Probability with Solved Examples

## Probability and Probability Distributions

• Mathematics - Law of Total Probability
• Bayes's Theorem for Conditional Probability
• Mathematics | Probability Distributions Set 1 (Uniform Distribution)
• Mathematics | Probability Distributions Set 4 (Binomial Distribution)
• Mathematics | Probability Distributions Set 5 (Poisson Distribution)
• Uniform Distribution Formula
• Mathematics | Probability Distributions Set 2 (Exponential Distribution)
• Mathematics | Probability Distributions Set 3 (Normal Distribution)
• Mathematics | Beta Distribution Model
• Gamma Distribution Model in Mathematics
• Chi-Square Test for Feature Selection - Mathematical Explanation
• Student's t-distribution in Statistics
• Python - Central Limit Theorem
• Limits, Continuity and Differentiability
• Implicit Differentiation

## Calculus for Machine Learning

• Partial Derivatives in Engineering Mathematics
• How to find Gradient of a Function using Python?
• Optimization techniques for Gradient Descent
• Higher Order Derivatives
• Taylor Series
• Application of Derivative - Maxima and Minima | Mathematics
• Absolute Minima and Maxima
• Optimization for Data Science
• Unconstrained Multivariate Optimization
• Lagrange Multipliers
• Lagrange's Interpolation
• Linear Regression in Machine learning
• Ordinary Least Squares (OLS) using statsmodels

## Regression in Machine Learning

In statistics, various tests are used to compare different samples or groups and draw conclusions about populations. These tests, known as statistical tests, focus on analyzing the likelihood or probability of obtaining the observed data under specific assumptions or hypotheses. They provide a framework for assessing evidence in support of or against a particular hypothesis.

A statistical test begins by formulating a null hypothesis (H 0 ) and an alternative hypothesis (H a ). The null hypothesis represents the default assumption, typically stating no effect or no difference, while the alternative hypothesis suggests a specific relationship or effect.

There are different statistical tests like Z-test , T-test, Chi-squared tests , ANOVA , Z-test , and F-test , etc. which are used to compute the p-value. In this article, we will learn about the T-test.

Table of Content

## What is T-Test?

Assumptions in t-test, prerequisites for t-test, types of t-tests, one sample t-test, independent sample t-test, paired two-sample t-test, frequently asked questions on t-test.

The t-test is named after William Sealy Gosset’s Student’s t-distribution, created while he was writing under the pen name “Student.”

A t-test is a type of inferential statistic test used to determine if there is a significant difference between the means of two groups. It is often used when data is normally distributed and population variance is unknown.

The t-test is used in hypothesis testing to assess whether the observed difference between the means of the two groups is statistically significant or just due to random variation.

• Independence : The observations within each group must be independent of each other. This means that the value of one observation should not influence the value of another observation. Violations of independence can occur with repeated measures, paired data, or clustered data.
• Normality : The data within each group should be approximately normally distributed i.e the distribution of the data within each group being compared should resemble a normal (bell-shaped) distribution. This assumption is crucial for small sample sizes (n < 30).
• Homogeneity of Variances (for independent samples t-test) : The variances of the two groups being compared should be equal. This assumption ensures that the groups have a similar spread of values. Unequal variances can affect the standard error of the difference between means and, consequently, the t-statistic.
• Absence of Outliers: There should be no extreme outliers in the data as outliers can disproportionately influence the results, especially when sample sizes are small.

Let’s quickly review some related terms before digging deeper into the specifics of the t-test.

A t-test is a statistical method used to compare the means of two groups to determine if there is a significant difference between them. The t-test is a parametric test, meaning it makes certain assumptions about the data. Here are the key prerequisites for conducting a t-test.

## Hypothesis Testing :

Hypothesis testing is a statistical method used to make inferences about a population based on a sample of data.

The p-value is the probability of observing a test statistic (or something more extreme) given that the null hypothesis is true.

• A small p-value (typically less than the chosen significance level) suggests that the observed data is unlikely to have occurred by random chance alone, leading to the rejection of the null hypothesis.
• A large p-value suggests that the observed data is likely to have occurred by random chance, and there is not enough evidence to reject the null hypothesis.

## Significance Level :

The significance level is the predetermined threshold that is used to decide whether to reject the null hypothesis. Commonly used significance levels are 0.05, 0.01, or 0.10. A significance level of 0.05 indicates that the researcher is willing to accept a 5% chance of making a Type I error (incorrectly rejecting a true null hypothesis).

## T-statistic :

The t-statistic is a measure of the difference between the means of two groups relative to the variability within each group. It is calculated as the difference between the sample means divided by the standard error of the difference. It is also known as the t-value or t-score.

• If the t-value is large => the two groups belong to different groups.
• If the t-value is small => the two groups belong to the same group.

## T-Distribution

The t-distribution , commonly known as the Student’s t-distribution, is a probability distribution with tails that are thicker than those of the normal distribution.

## Statistical Significance

Statistical significance is determined by comparing the p-value to the chosen significance level.

• If the p-value is less than or equal to the significance level, the result is considered statistically significant, and the null hypothesis is rejected.
• If the p-value is greater than the significance level, the result is not statistically significant, and there is insufficient evidence to reject the null hypothesis.

In the context of a t-test, these concepts are applied to compare means between two groups. The t-test assesses whether the means are significantly different from each other, taking into account the variability within the groups. The p-value from the t-test is then compared to the significance level to make a decision about the null hypothesis.

A t-table, or a t-distribution table, is a reference table that provides critical values for the t-test. The table is organized by degrees of freedom and significance levels (usually 0.05 or 0.01). The t-table is used to find the critical t-value corresponding to their specific degrees of freedom and chosen significance level. If the calculated t-value is greater than the critical value from the table, it suggests that the observed difference is statistically significant.

There are three types of t-tests, and they are categorized as dependent and independent t-tests.

• One sample t-test test: The mean of a single group against a known mean.
• Independent samples t-test: compares the means for two groups.
• Paired sample t-test: compares means from the same group at different times (say, one year apart).

One sample t-test is one of the widely used t-tests for comparison of the sample mean of the data to a particularly given value. Used for comparing the sample mean to the true/population mean.

We can use this when the sample size is small. (under 30) data is collected randomly and it is approximately normally distributed. It can be calculated as:

• t = t-value
• x_bar = sample mean
• μ = true/population mean
• σ = standard deviation
• n = sample size

Example Problem

Consider the following example. The weights of 25 obese people were taken before enrolling them into the nutrition camp. The population mean weight is found to be 45 kg before starting the camp. After finishing the camp, for the same 25 people, the sample mean was found to be 75 with a standard deviation of 25. Did the fitness camp work?

## One-Sample T-test in Python

The T-value of 6.0 is significantly greater than the critical t-value, leading to rejection of the null hypothesis therefore, we can conclude there is a significant difference in weight before and after the fitness camp. The fitness camp had an effect on the weights of the participants.

The results strongly suggest that the fitness camp was effective in producing a statistically significant change in weight for the participants.

• The T-value and p-value both provide consistent evidence for rejecting the null hypothesis.
• The practical significance should also be considered to understand the real-world impact of this weight change.

An Independent sample t-test, commonly known as an unpaired sample t-test is used to find out if the differences found between two groups is actually significant or just a random occurrence.

We can use this when:

• the population mean or standard deviation is unknown. (information about the population is unknown)
• the two samples are separate/independent. For eg. boys and girls (the two are independent of each other)

It can be calculated using:

Researchers are investigating whether there is a significant difference in the exam scores of two different teaching methods, A and B. Two independent samples, each representing a different teaching method, have been collected. The objective is to determine if there is enough evidence to suggest that one teaching method leads to higher exam scores compared to the other. Suppose, two independent sample data A and B are given, with the following values. We have to perform the Independent samples t-test for this data.

## Two-Sample t-test in Python (Independent)

With T-Value, of 0.989 is less than the critical t-value of 2.1009. Therefore, No significant difference is found between the exam scores of Teaching Method A and Teaching Method B based on the T-value.

With P-Value, of 0.336 is greater than the significance level of 0.05. There is no evidence to reject the null hypothesis, indicating no significant difference between the two teaching methods based on the P-value.

In conclusion, The results suggest that, statistically, there is no significant difference in exam scores between Teaching Method A and Teaching Method B. Therefore, based on this analysis, there is no clear evidence to suggest that one teaching method leads to higher exam scores compared to the other.

Paired sample t-test, commonly known as dependent sample t-test is used to find out if the difference in the mean of two samples is 0. The test is done on dependent samples, usually focusing on a particular group of people or things. In this, each entity is measured twice, resulting in a pair of observations.

We can use this when :

• Two similar (twin like) samples are given. [Eg, Scores obtained in English and Math (both subjects)]
• The dependent variable (data) is continuous.
• The observations are independent of one another.
• The dependent variable is approximately normally distributed.

It can be calculated using,

• (s_d) is the standard deviation of the differences.
• (n) is the number of paired observations.

Consider the following example. Scores (out of 25) of the subjects Math1 and Math2 are taken for a sample of 10 students. We have to perform the paired sample t-test for this data.

## Paired Two-Sample T-test in Python

The paired sample t-test suggests that there is a statistically significant difference in scores between Math1 and Math2 as T-value of -4.95 is less than the critical t-value of -2.2622 and P-value of 0.00079 is less than the significance level of 0.05. Therefore, based on this analysis, it can be concluded that there is evidence to support the claim that the two sets of scores are different, and the difference is not due to random chance.

The above-discussed types of t-tests are widely used in the fields of research in hospitals by experts to gain important information about the medical data given to them about the effects of various medicines and drugs on the population and help them draw out important inferences regarding the same. However, it is the responsibility of the person to see to it that which t-test would bring out the best results and that all the assumptions of that t-test are adhered to. For any doubt/query, comment below.

In conclusion, t-test, play a crucial role in hypothesis testing, comparing means, and drawing conclusions about populations. The test can be one-sample, independent two-sample, or paired two-sample, each with specific use cases and assumptions. Interpretation of results involves considering T-values, P-values, and critical values.

These tests aid researchers in making informed decisions based on statistical evidence.

## Q. What is the t-test for mean in Python?

The t-test for mean in Python is a statistical method used to determine if there is a significant difference between the means of two groups.

## Q. What is the t-test function?

The t-test function is a statistical tool used to compare means and assess the significance of differences between groups, considering factors like sample size and variability.

## Q. What is the p-value in t-test Python?

The p-value in a t-test Python indicates the probability of observing the data or more extreme results assuming the null hypothesis is true. A small p-value suggests evidence against the null hypothesis.

## Q. Why is it called t-test?

The t-test is named after William Sealy Gosset, who published under the pseudonym “Student.” The name “t” refers to the t-distribution used in the test, particularly applicable for small sample sizes.

• Mathematical

## What kind of Experience do you want to share?

#### IMAGES

1. One Sample T Test

2. t-Test Formula

3. One Sample T Test

4. Hypothesis testing, T-Distribution

5. Student's t-Test (t0, te & H0) Calculator, Formulas & Examples

6. Hypothesis Testing with Two Samples

#### VIDEO

1. Z-Test for Hypothesis Testing

2. Distribution Of Prizes

3. T test Part 1 Hypothesis Set Up and Formula Discussion MBS First Semester Statistics Solution

4. t test for hypothesis testing

5. Statistics 101: Single Sample Hypothesis t-test Examples

6. Excel Statistical Analysis 48: Hypothesis Testing with T Distribution, Two Tail Example

1. T-Distribution

This means that the difference in group means is 12.79 standard deviations away from the mean of the distribution of the null hypothesis. The degrees of freedom is 38 (n-1 for each group). ... Formula and Examples A t test is a statistical test used to compare the means of two groups. The type of t test you use depends on what you want to ...

2. An Introduction to t Tests

An Introduction to t Tests | Definitions, Formula and Examples. Published on January 31, 2020 by Rebecca Bevans.Revised on June 22, 2023. A t test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups are different from ...

3. How t-Tests Work: t-Values, t-Distributions, and Probabilities

Hypothesis tests work by taking the observed test statistic from a sample and using the sampling distribution to calculate the probability of obtaining that test statistic if the null hypothesis is correct. In the context of how t-tests work, you assess the likelihood of a t-value using the t-distribution.

4. T Test Overview: How to Use & Examples

We'll use a two-sample t test to evaluate if the difference between the two group means is statistically significant. The t test output is below. In the output, you can see that the treatment group (Sample 1) has a mean of 109 while the control group's (Sample 2) average is 100. The p-value for the difference between the groups is 0.112.

5. Understanding t-Tests: t-values and t-distributions

The foundation behind any hypothesis test is being able to take the test statistic from a specific sample and place it within the context of a known probability distribution. For t-tests, if you take a t-value and place it in the context of the correct t-distribution, you can calculate the probabilities associated with that t-value.

6. t-test Calculator

Recall, that in the critical values approach to hypothesis testing, you need to set a significance level, α, before computing the critical values, which in turn give rise to critical regions (a.k.a. rejection regions). Formulas for critical values employ the quantile function of t-distribution, i.e., the inverse of the cdf:. Critical value for left-tailed t-test:

7. 8.2.3.1

For the test of one group mean we will be using a t test statistic: Test Statistic: One Group Mean. t = x ― − μ 0 s n. x ― = sample mean. μ 0 = hypothesized population mean. s = sample standard deviation. n = sample size. Note that structure of this formula is similar to the general formula for a test statistic: s a m p l e s t a t i s ...

8. Student's t-test

Student's t-test is a statistical test used to test whether the difference between the response of two groups is statistically significant or not. It is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis.It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in ...

9. T Test (Student's T-Test): Definition and Examples

The t test tells you how significant the differences between group means are. It lets you know if those differences in means could have happened by chance. The t test is usually used when data sets follow a normal distribution but you don't know the population variance.. For example, you might flip a coin 1,000 times and find the number of heads follows a normal distribution for all trials.

10. T Distribution: Definition & Uses

The essential uses for the t distribution are for finding: P-values for t-tests when testing the mean and for the coefficients in regression analysis. Critical values that define the upper and lower bounds of a confidence interval. Use the t distribution when you need to assess the mean and do not know the population standard deviation.

11. Student's t-distribution

The Student's t distribution plays a role in a number of widely used statistical analyses, including Student's t test for assessing the statistical significance of the difference between two sample means, the construction of confidence intervals for the difference between two population means, and in linear regression analysis .

12. 8.2: Hypothesis Testing with t

As shown in Figure 8.2.1: our critical value is t∗ = 2.353. We can then shade this region on our t -distribution to visualize our rejection region. Step 3: Compute the Test Statistic The four wait times you experienced for your oil changes are the new shop were 46 minutes, 58 minutes, 40 minutes, and 71 minutes.

13. 8.2.3.1

When testing hypotheses about a mean or mean difference, a $$t$$ distribution is used to find the $$p$$-value. These $$t$$ distributions are indexed by a quantity called degrees of freedom, calculated as $$df = n - 1$$ for the situation involving a test of one mean or test of mean difference. The $$p$$-value can be found using Minitab.

14. Student's t Distribution

The t distribution has the following properties: The mean of the distribution is equal to 0 . The variance is equal to v / ( v - 2 ), where v is the degrees of freedom (see last section) and v > 2. The variance is always greater than 1, although it is close to 1 when there are many degrees of freedom. With infinite degrees of freedom, the t ...

15. T Test Formula with Solved Examples

t = 2.3764 = 2.36 (approx) T-Test Formula a statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis. For more formulas and derivation, visit BYJU'S.

16. T-Test: What It Is With Multiple Formulas and When To Use Them

T-Test: A t-test is an analysis of two populations means through the use of statistical examination; a t-test with two samples is commonly used with small sample sizes, testing the difference ...

17. One Sample t-test: Definition, Formula, and Example

If the p-value that corresponds to the test statistic t with (n-1) degrees of freedom is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01) then you can reject the null hypothesis. One Sample t-test: Assumptions. For the results of a one sample t-test to be valid, the following assumptions should be met:

18. 9.1: Two Sample Mean T-Test for Dependent Groups

The t-test for dependent samples is a statistical test for comparing the means from two dependent populations (or the difference between the means from two populations). The t-test is used when the differences are normally distributed. The samples also must be dependent. The formula for the t-test statistic is: t = D¯−μD (SD n√) t = D ...

19. Independent Samples T Test: Definition, Using & Interpreting

Independent Samples T Tests Hypotheses. Independent samples t tests have the following hypotheses: Null hypothesis: The means for the two populations are equal. Alternative hypothesis: The means for the two populations are not equal.; If the p-value is less than your significance level (e.g., 0.05), you can reject the null hypothesis. The difference between the two means is statistically ...

20. Hypothesis Testing explained in 4 parts

Hypothesis Testing often confuses data scientists due to mixed teachings. This guide clarifies 10 key concepts with visuals and simple explanations for better understanding. ... Without loss of generality*, let's assume the underlying distribution of our null hypothesis is mean 0 and standard deviation 1.

21. 10.2

For the example in hand, the value of the test statistic is: The critical region approach tells us to reject the null hypothesis at the α = 0.05 level if t ≥ t 0.025, 99 = 1.9842 or if t ≤ t 0.025, 99 = − 1.9842. Therefore, we reject the null hypothesis because t = 4.762 > 1.9842, and therefore falls in the rejection region: 1.9842 -1. ...

22. Two Sample t-test: Definition, Formula, and Example

If the p-value that corresponds to the test statistic t with (n 1 +n 2-1) degrees of freedom is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01) then you can reject the null hypothesis. Two Sample t-test: Assumptions. For the results of a two sample t-test to be valid, the following assumptions should be met:

23. T-test

A t-test is a statistical method used to compare the means of two groups to determine if there is a significant difference between them. The t-test is a parametric test, meaning it makes certain assumptions about the data. Here are the key prerequisites for conducting a t-test. Hypothesis Testing: