
How to Perform a t-test with Unequal Sample Sizes

One question students often have in statistics is:

Is it possible to perform a t-test when the sample sizes of each group are not equal?

The short answer:

Yes, you can perform a t-test when the sample sizes are not equal. Having equal sample sizes is not one of the assumptions of a t-test.

The real issue arises when the two samples do not have equal variances, since equal variance is one of the assumptions of a t-test.

When this occurs, it’s recommended that you use Welch’s t-test instead, which does not make the assumption of equal variances.

The following examples demonstrate how to perform t-tests with unequal sample sizes when the variances are equal and when they’re not equal.

Example 1: Unequal Sample Sizes and Equal Variances

Suppose we administer two programs designed to help students score higher on some exam.

The results are as follows:

Program 1:

  • n (sample size): 500
  • x (sample mean): 80
  • s (sample standard deviation): 5

Program 2:

  • n (sample size): 20
  • x (sample mean): 85
  • s (sample standard deviation): 5

We can visualize the distribution of exam scores for each program with side-by-side boxplots.
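As an illustrative sketch (in Python rather than the article's R), we can simulate scores consistent with the summary statistics above and draw the boxplots; the simulated scores are hypothetical stand-ins for the article's raw data:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical scores consistent with the summary statistics above
prog1 = rng.normal(loc=80, scale=5, size=500)
prog2 = rng.normal(loc=85, scale=5, size=20)

fig, ax = plt.subplots()
ax.boxplot([prog1, prog2])
ax.set_xticklabels(["Program 1", "Program 2"])
ax.set_ylabel("Exam score")
fig.savefig("exam_scores_boxplot.png")
```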

The mean exam score for Program 2 appears to be higher, but the variances of exam scores for the two programs are roughly equal.

We then run both an independent samples t-test and Welch's t-test on the two samples.
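From the summary statistics alone, both tests can be sketched in Python with scipy's `ttest_ind_from_stats`; because this uses only the summary statistics rather than the raw scores, the exact p-values will differ from those quoted below, though the conclusions agree:

```python
from scipy.stats import ttest_ind_from_stats

# Summary statistics for the two programs (from the table above)
stats1 = dict(mean1=80, std1=5, nobs1=500)
stats2 = dict(mean2=85, std2=5, nobs2=20)

# Independent samples t-test (assumes equal variances)
t_student, p_student = ttest_ind_from_stats(**stats1, **stats2, equal_var=True)

# Welch's t-test (does not assume equal variances)
t_welch, p_welch = ttest_ind_from_stats(**stats1, **stats2, equal_var=False)

print(p_student, p_welch)  # both fall below 0.05
```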

The independent samples t-test returns a p-value of 0.0009, and Welch's t-test returns a p-value of 0.0029.

Since the p-value of each test is less than .05, we would reject the null hypothesis in each test and conclude that there is a statistically significant difference in mean exam scores between the two programs.

Even though the sample sizes are unequal, the independent samples t-test and Welch’s t-test both return similar results since the two samples had equal variances.

Example 2: Unequal Sample Sizes and Unequal Variances

Now suppose the two programs produce the same results as before, except that exam scores in Program 1 are far more spread out:

  • s (sample standard deviation): 25

Boxplots of the two samples show that the mean exam score for Program 2 appears to be higher, but the variance of exam scores for Program 1 is much higher than that of Program 2.

The independent samples t-test returns a p-value of 0.5496, and Welch's t-test returns a p-value of 0.0361.

The independent samples t-test is not able to detect a difference in mean exam scores, but Welch's t-test detects a statistically significant difference.

Because the two samples had unequal variances, only Welch's t-test, which does not assume equal variances between samples, was able to detect the statistically significant difference in mean exam scores.
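A summary-statistic sketch in Python illustrates the divergence, assuming Program 2 keeps its Example 1 values (n = 20, mean 85, s = 5) while Program 1's standard deviation rises to 25; as before, the exact p-values differ from the article's because only summary statistics are used:

```python
from scipy.stats import ttest_ind_from_stats

# Program 1 now has a much larger spread; Program 2's summary values
# (n = 20, mean = 85, s = 5) are assumed unchanged from Example 1
t_s, p_s = ttest_ind_from_stats(80, 25, 500, 85, 5, 20, equal_var=True)
t_w, p_w = ttest_ind_from_stats(80, 25, 500, 85, 5, 20, equal_var=False)

print(p_s, p_w)  # only Welch's test is significant at the 0.05 level
```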

Additional Resources

The following tutorials provide additional information about t-tests:

  • Introduction to the One Sample t-test
  • Introduction to the Two Sample t-test
  • Introduction to the Paired Samples t-test


Published by Zach



Choosing the Right Statistical Test | Types & Examples

Published on January 28, 2020 by Rebecca Bevans . Revised on June 22, 2023.

Statistical tests are used in hypothesis testing . They can be used to:

  • determine whether a predictor variable has a statistically significant relationship with an outcome variable.
  • estimate the difference between two or more groups.

Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.

If you already know what types of variables you’re dealing with, you can use the flowchart to choose the right statistical test for your data.

Statistical tests flowchart

Table of contents

  • What does a statistical test do?
  • When to perform a statistical test
  • Choosing a parametric test: regression, comparison, or correlation
  • Choosing a nonparametric test
  • Flowchart: choosing a statistical test
  • Other interesting articles
  • Frequently asked questions about statistical tests

Statistical tests work by calculating a test statistic – a number that describes how much the relationship between variables in your test differs from the null hypothesis of no relationship.

It then calculates a p value (probability value). The p -value estimates how likely it is that you would see the difference described by the test statistic if the null hypothesis of no relationship were true.

If the value of the test statistic is more extreme than the statistic calculated from the null hypothesis, then you can infer a statistically significant relationship between the predictor and outcome variables.

If the value of the test statistic is less extreme than the one calculated from the null hypothesis, then you can infer no statistically significant relationship between the predictor and outcome variables.
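As a concrete (hypothetical) illustration of this procedure, here is how a simple z statistic and its two-sided p value could be computed in Python; the sample and population values are invented for the example:

```python
from scipy import stats

# Hypothetical example: a sample of n = 50 with mean 103, drawn from a
# population assumed N(100, 15) under the null hypothesis.
n, xbar, mu0, sigma = 50, 103, 100, 15
z = (xbar - mu0) / (sigma / n ** 0.5)  # test statistic
p = 2 * stats.norm.sf(abs(z))          # two-sided p value
print(z, p)  # a more extreme statistic yields a smaller p value
```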


You can perform statistical tests on data that have been collected in a statistically valid manner – either through an experiment , or through observations made using probability sampling methods .

For a statistical test to be valid , your sample size needs to be large enough to approximate the true distribution of the population being studied.

To determine which statistical test to use, you need to know:

  • whether your data meets certain assumptions.
  • the types of variables that you’re dealing with.

Statistical assumptions

Statistical tests make some common assumptions about the data they are testing:

  • Independence of observations (a.k.a. no autocorrelation): The observations/variables you include in your test are not related (for example, multiple measurements of a single test subject are not independent, while measurements of multiple different test subjects are independent).
  • Homogeneity of variance : the variance within each group being compared is similar among all groups. If one group has much more variation than others, it will limit the test’s effectiveness.
  • Normality of data : the data follows a normal distribution (a.k.a. a bell curve). This assumption applies only to quantitative data .

If your data do not meet the assumptions of normality or homogeneity of variance, you may be able to perform a nonparametric statistical test , which allows you to make comparisons without any assumptions about the data distribution.

If your data do not meet the assumption of independence of observations, you may be able to use a test that accounts for structure in your data (repeated-measures tests or tests that include blocking variables).

Types of variables

The types of variables you have usually determine what type of statistical test you can use.

Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types of quantitative variables include:

  • Continuous (aka ratio variables): represent measures and can usually be divided into units smaller than one (e.g. 0.75 grams).
  • Discrete (aka integer variables): represent counts and usually can’t be divided into units smaller than one (e.g. 1 tree).

Categorical variables represent groupings of things (e.g. the different tree species in a forest). Types of categorical variables include:

  • Ordinal : represent data with an order (e.g. rankings).
  • Nominal : represent group names (e.g. brands or species names).
  • Binary : represent data with a yes/no or 1/0 outcome (e.g. win or lose).

Choose the test that fits the types of predictor and outcome variables you have collected (if you are doing an experiment , these are the independent and dependent variables ). Consult the tables below to see which test best matches your variables.

Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.

The most common types of parametric test include regression tests, comparison tests, and correlation tests.

Regression tests

Regression tests look for cause-and-effect relationships . They can be used to estimate the effect of one or more continuous variables on another variable.

Comparison tests

Comparison tests look for differences among group means . They can be used to test the effect of a categorical variable on the mean value of some other characteristic.

T-tests are used when comparing the means of precisely two groups (e.g., the average heights of men and women). ANOVA and MANOVA tests are used when comparing the means of more than two groups (e.g., the average heights of children, teenagers, and adults).
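For instance, both cases can be sketched in Python with scipy, using hypothetical height data: a t-test for exactly two groups and a one-way ANOVA for three.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical heights (cm) for three age groups
children = rng.normal(140, 10, 30)
teens = rng.normal(165, 10, 30)
adults = rng.normal(172, 10, 30)

# Exactly two groups: independent samples t-test
t_stat, p_two = stats.ttest_ind(teens, adults)

# More than two groups: one-way ANOVA
f_stat, p_three = stats.f_oneway(children, teens, adults)
```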

Correlation tests

Correlation tests check whether variables are related without hypothesizing a cause-and-effect relationship.

These can be used to test whether two variables you want to use in (for example) a multiple regression test are autocorrelated.

Non-parametric tests don’t make as many assumptions about the data, and are useful when one or more of the common statistical assumptions are violated. However, the inferences they make aren’t as strong as with parametric tests.


This flowchart helps you choose among parametric tests. For nonparametric alternatives, check the table above.

Choosing the right statistical test

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Descriptive statistics
  • Measures of central tendency
  • Correlation coefficient
  • Null hypothesis

Methodology

  • Cluster sampling
  • Stratified sampling
  • Types of interviews
  • Cohort study
  • Thematic analysis

Research bias

  • Implicit bias
  • Cognitive bias
  • Survivorship bias
  • Availability heuristic
  • Nonresponse bias
  • Regression to the mean

Statistical tests commonly assume that:

  • the data are normally distributed
  • the groups that are being compared have similar variance
  • the data are independent

If your data does not meet these assumptions, you might still be able to use a nonparametric statistical test, which has fewer requirements but also makes weaker inferences.

A test statistic is a number calculated by a statistical test. It describes how far your observed data is from the null hypothesis of no relationship between variables or no difference among sample groups.

The test statistic tells you how different two or more groups are from the overall population mean , or how different a linear slope is from the slope predicted by a null hypothesis . Different test statistics are used in different statistical tests.

Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test . Significance is usually denoted by a p -value , or probability value.

Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that the data is likely to occur less than 5% of the time under the null hypothesis .

When the p -value falls below the chosen alpha value, then we say the result of the test is statistically significant.

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).



25.3 - Calculating Sample Size

Before we learn how to calculate the sample size that is necessary to achieve a hypothesis test with a certain power, it might behoove us to understand the effect that sample size has on power. Let's investigate by returning to our IQ example.

Example 25-3

Let \(X\) denote the IQ of a randomly selected adult American. Assume, a bit unrealistically again, that \(X\) is normally distributed with unknown mean \(\mu\) and (a strangely known) standard deviation of 16. This time, instead of taking a random sample of \(n=16\) students, let's increase the sample size to \(n=64\). And, while setting the probability of committing a Type I error to \(\alpha=0.05\), test the null hypothesis \(H_0:\mu=100\) against the alternative hypothesis that \(H_A:\mu>100\).

What is the power of the hypothesis test when \(\mu=108\), \(\mu=112\), and \(\mu=116\)?

Setting \(\alpha\), the probability of committing a Type I error, to 0.05, implies that we should reject the null hypothesis when the test statistic \(Z\ge 1.645\), or equivalently, when the observed sample mean is 103.29 or greater:

\( \bar{x} = \mu + z \left(\dfrac{\sigma}{\sqrt{n}} \right) = 100 +1.645\left(\dfrac{16}{\sqrt{64}} \right) = 103.29\)

Therefore, the power function \(K(\mu)\), when \(\mu>100\) is the true value, is:

\( K(\mu) = P(\bar{X} \ge 103.29 | \mu) = P \left(Z \ge \dfrac{103.29 - \mu}{16 / \sqrt{64}} \right) = 1 - \Phi \left(\dfrac{103.29 - \mu}{2} \right)\)

Therefore, the probability of rejecting the null hypothesis at the \(\alpha=0.05\) level when \(\mu=108\) is 0.9907, as calculated here:

\(K(108) = 1 - \Phi \left( \dfrac{103.29-108}{2} \right) = 1- \Phi(-2.355) = 0.9907 \)

And, the probability of rejecting the null hypothesis at the \(\alpha=0.05\) level when \(\mu=112\) is greater than 0.9999, as calculated here:

\( K(112) = 1 - \Phi \left( \dfrac{103.29-112}{2} \right) = 1- \Phi(-4.355) = 0.9999\ldots \)

And, the probability of rejecting the null hypothesis at the \(\alpha=0.05\) level when \(\mu=116\) is greater than 0.999999, as calculated here:

\( K(116) = 1 - \Phi \left( \dfrac{103.29-116}{2} \right) = 1- \Phi(-6.355) = 0.999999... \)

In summary, in the various examples throughout this lesson, we have calculated the power of testing \(H_0:\mu=100\) against \(H_A:\mu>100\) for two sample sizes ( \(n=16\) and \(n=64\)) and for three possible values of the mean ( \(\mu=108\), \(\mu=112\), and \(\mu=116\)). Here's a summary of our power calculations:

As you can see, our work suggests that for a given value of the mean \(\mu\) under the alternative hypothesis, the larger the sample size \(n\), the greater the power \(K(\mu)\) . Perhaps there is no better way to see this than graphically by plotting the two power functions simultaneously, one when \(n=16\) and the other when \(n=64\):
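The two power curves can be computed directly from the power function derived above; a sketch in Python with scipy:

```python
from scipy import stats

def power(mu, n, mu0=100, sigma=16, alpha=0.05):
    # Critical value: reject H0 when the sample mean exceeds c
    c = mu0 + stats.norm.ppf(1 - alpha) * sigma / n ** 0.5
    # K(mu) = P(Xbar >= c | true mean mu)
    return stats.norm.sf((c - mu) / (sigma / n ** 0.5))

for n in (16, 64):
    print(n, [round(power(mu, n), 4) for mu in (108, 112, 116)])
```

For every value of \(\mu\) under the alternative, the \(n=64\) curve sits above the \(n=16\) curve.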

As this plot suggests, if we are interested in increasing our chance of rejecting the null hypothesis when the alternative hypothesis is true, we can do so by increasing our sample size \(n\). This benefit is perhaps even greatest for values of the mean that are close to the value of the mean assumed under the null hypothesis. Let's take a look at two examples that illustrate the kind of sample size calculation we can make to ensure our hypothesis test has sufficient power.

Example 25-4

Let \(X\) denote the crop yield of corn measured in the number of bushels per acre. Assume (unrealistically) that \(X\) is normally distributed with unknown mean \(\mu\) and standard deviation \(\sigma=6\). An agricultural researcher is working to increase the current average yield from 40 bushels per acre. Therefore, he is interested in testing, at the \(\alpha=0.05\) level, the null hypothesis \(H_0:\mu=40\) against the alternative hypothesis that \(H_A:\mu>40\). Find the sample size \(n\) that is necessary to achieve 0.90 power at the alternative \(\mu=45\).

As is always the case, we need to start by finding a threshold value \(c\), such that if the sample mean is larger than \(c\), we'll reject the null hypothesis:

That is, in order for our hypothesis test to be conducted at the \(\alpha=0.05\) level, the following statement must hold (using our typical \(Z\) transformation):

\(c = 40 + 1.645 \left( \dfrac{6}{\sqrt{n}} \right) \) (**)

But, that's not the only condition that \(c\) must meet, because \(c\) also needs to be defined to ensure that our power is 0.90 or, alternatively, that the probability of a Type II error is 0.10. That would happen if there was a 10% chance that our test statistic fell short of \(c\) when \(\mu=45\), as the following drawing illustrates in blue:

This illustration suggests that in order for our hypothesis test to have 0.90 power, the following statement must hold (using our usual \(Z\) transformation):

\(c = 45 - 1.28 \left( \dfrac{6}{\sqrt{n}} \right) \) (**)

Aha! We have two (asterisked (**)) equations and two unknowns! All we need to do is equate the equations, and solve for \(n\). Doing so, we get:

\(40+1.645\left(\frac{6}{\sqrt{n}}\right)=45-1.28\left(\frac{6}{\sqrt{n}}\right)\) \(\Rightarrow 5=(1.645+1.28)\left(\frac{6}{\sqrt{n}}\right), \qquad \Rightarrow 5=\frac{17.55}{\sqrt{n}}, \qquad n=(3.51)^2=12.3201\approx 13\)

Now that we know we will set \(n=13\), we can solve for our threshold value c :

\( c = 40 + 1.645 \left( \dfrac{6}{\sqrt{13}} \right)=42.737 \)

So, in summary, if the agricultural researcher collects data on \(n=13\) corn plots, and rejects his null hypothesis \(H_0:\mu=40\) if the average crop yield of the 13 plots is greater than 42.737 bushels per acre, he will have a 5% chance of committing a Type I error and a 10% chance of committing a Type II error if the population mean \(\mu\) were actually 45 bushels per acre.
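The same calculation can be scripted by equating the two expressions for \(c\); a sketch in Python:

```python
from math import ceil, sqrt

mu0, mu1, sigma = 40, 45, 6
z_alpha, z_beta = 1.645, 1.28   # alpha = 0.05, power = 0.90

# Equate the two (**) expressions for the threshold c and solve for n
n = ceil(((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2)
c = mu0 + z_alpha * sigma / sqrt(n)
print(n, round(c, 3))  # 13 42.737
```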

Example 25-5

Consider \(p\), the true proportion of voters who favor a particular political candidate. A pollster is interested in testing, at the \(\alpha=0.01\) level, the null hypothesis \(H_0:p=0.5\) against the alternative hypothesis that \(H_A:p>0.5\). Find the sample size \(n\) that is necessary to achieve 0.80 power at the alternative \(p=0.55\).

In this case, because we are interested in performing a hypothesis test about a population proportion \(p\), we use the \(Z\)-statistic:

\(Z = \dfrac{\hat{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} \)

Again, we start by finding a threshold value \(c\), such that if the observed sample proportion is larger than \(c\), we'll reject the null hypothesis:

That is, in order for our hypothesis test to be conducted at the \(\alpha=0.01\) level, the following statement must hold:

\(c = 0.5 + 2.326 \sqrt{ \dfrac{(0.5)(0.5)}{n}} \) (**)

But, again, that's not the only condition that c must meet, because \(c\) also needs to be defined to ensure that our power is 0.80 or, alternatively, that the probability of a Type II error is 0.20. That would happen if there was a 20% chance that our test statistic fell short of \(c\) when \(p=0.55\), as the following drawing illustrates in blue:

This illustration suggests that in order for our hypothesis test to have 0.80 power, the following statement must hold:

\(c = 0.55 - 0.842 \sqrt{ \dfrac{(0.55)(0.45)}{n}} \) (**)

Again, we have two (asterisked (**)) equations and two unknowns! All we need to do is equate the equations, and solve for \(n\). Doing so, we get:

\(0.5+2.326\sqrt{\dfrac{0.5(0.5)}{n}}=0.55-0.842\sqrt{\dfrac{0.55(0.45)}{n}} \\ 2.326\dfrac{\sqrt{0.25}}{\sqrt{n}}+0.842\dfrac{\sqrt{0.2475}}{\sqrt{n}}=0.55-0.5 \\ \dfrac{1}{\sqrt{n}}(1.5818897)=0.05 \qquad \Rightarrow n\approx \left(\dfrac{1.5818897}{0.05}\right)^2 = 1000.95 \approx 1001 \)

Now that we know we will set \(n=1001\), we can solve for our threshold value \(c\):

\(c = 0.5 + 2.326 \sqrt{\dfrac{(0.5)(0.5)}{1001}}= 0.5367 \)

So, in summary, if the pollster collects data on \(n=1001\) voters, and rejects his null hypothesis \(H_0:p=0.5\) if the proportion of sampled voters who favor the political candidate is greater than 0.5367, he will have a 1% chance of committing a Type I error and a 20% chance of committing a Type II error if the population proportion \(p\) were actually 0.55.
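The two-equation approach for the proportion case can be scripted the same way; a sketch in Python:

```python
from math import ceil, sqrt

p0, p1 = 0.5, 0.55
z_alpha, z_beta = 2.326, 0.842   # alpha = 0.01, power = 0.80

# Equate the two (**) expressions for the threshold c and solve for n
num = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
n = ceil((num / (p1 - p0)) ** 2)
c = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
print(n, round(c, 4))
```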

Incidentally, we can always check our work! Conducting the survey and subsequent hypothesis test as described above, the probability of committing a Type I error is:

\(\alpha= P(\hat{p} >0.5367 \text { if } p = 0.50) = P(Z > 2.3257) = 0.01 \)

and the probability of committing a Type II error is:

\(\beta = P(\hat{p} <0.5367 \text { if } p = 0.55) = P(Z < -0.846) = 0.199 \)

just as the pollster had desired.

We've illustrated several sample size calculations. Now, let's summarize the information that goes into a sample size calculation. In order to determine a sample size for a given hypothesis test, you need to specify:

  • The desired \(\alpha\) level, that is, your willingness to commit a Type I error.
  • The desired power or, equivalently, the desired \(\beta\) level, that is, your willingness to commit a Type II error.
  • A meaningful difference from the value of the parameter that is specified in the null hypothesis.
  • The standard deviation of the sample statistic or, at least, an estimate of the standard deviation (the "standard error") of the sample statistic.

Indian J Anaesth. 2016 Sep; 60(9)

Sample size calculation: Basic principles

Sabyasachi Das

Department of Anaesthesiology and Critical Care, Medical College, Kolkata, West Bengal, India

Mohanchandra Mandal

Department of Anaesthesiology and Critical Care, North Bengal Medical College, Sushrutanagar, Darjeeling, West Bengal, India

Addressing a sample size is a practical issue that has to be solved during planning and designing stage of the study. The aim of any clinical research is to detect the actual difference between two groups (power) and to provide an estimate of the difference with a reasonable accuracy (precision). Hence, researchers should do a priori estimate of sample size well ahead, before conducting the study. Post hoc sample size computation is not encouraged conventionally. Adequate sample size minimizes the random error or in other words, lessens something happening by chance. Too small a sample may fail to answer the research question and can be of questionable validity or provide an imprecise answer while too large a sample may answer the question but is resource-intensive and also may be unethical. More transparency in the calculation of sample size is required so that it can be justified and replicated while reporting.

INTRODUCTION

Besides scientific justification and validity, the calculation of sample size (‘just large enough’) helps a medical researcher to assess the cost, time and feasibility of a project.[ 1 ] Although frequently reported in anaesthesia journals, the details or the elements of the calculation of sample size are not consistently provided by the authors. In many studies, the reported sample size cannot be replicated from the details given.[ 2 ] Most trials with negative results do not have a large enough sample size. Hence, reporting of statistical power and sample size needs to be improved.[ 3 , 4 ] There is a belief that studies with a small sample size are unethical if they do not ensure adequate power. However, the truth is that for a study to be ethical in its design, its predicted value must outweigh the projected risks to its participants. In studies where the risks and inconvenience borne by the participants outweigh the benefits received as a result of participation, that excess constitutes the projected burden. A study may still be valid if the projected benefit to society outweighs the burden to the participants. If there is no burden, then any sample size may be ideal.[ 5 ] Many different approaches to sample size design exist, depending on the study design and research question. Moreover, each study design can have multiple sub-designs resulting in different sample size calculations.[ 6 ] Addressing the sample size is a practical issue that has to be solved during the planning and designing stage of the study. It may be an important issue in the approval or rejection of clinical trial results, irrespective of the efficacy.[ 7 ]

By the end of this article, the reader will be able to enumerate the prerequisite for sample size estimation, to describe the common lapses of sample size calculation and the importance of a priori sample size estimation. The readers will be able to define the common terminologies related to sample size calculation.

IMPORTANCE OF PILOT STUDY IN SAMPLE SIZE ESTIMATION

In published literature, relevant data for calculating the sample size can be gleaned from prevalence estimates or event rates, standard deviation (SD) of the continuous outcome, sample size of similar studies with similar outcomes. The idea of approximate ‘effect’ estimates can be obtained by reviewing meta-analysis and clinically meaningful effect. Small pilot study, personal experience, expert opinion, educated guess, hospital registers, unpublished reports support researcher when we have insufficient information in the existing/available literature. A pilot study not only helps in the estimation of sample size but also its primary purpose is to check the feasibility of the study.

The pilot study is a small scale trial run as a pre-test, and it tries out for the proposed major trial. It allows preliminary testing of the hypotheses and may suggest some change, dropping some part or developing new hypotheses so that it can be tested more precisely.[ 8 ] It may address many logistic issues such as checking that instructions are comprehensive, and the investigators are adequately skilled for the trial. The pilot study almost always provides enough data for the researcher to decide whether to go ahead with the main study or to abandon. Many research ideas that seem to show great promise are unproductive when actually carried out. From the findings of pilot study, the researcher may abandon the main study involving large logistic resources, and thus can save a lot of time and money.[ 8 ]

METHODS FOR SAMPLE SIZE CALCULATION

Sample size can be calculated either using confidence interval method or hypothesis testing method. In the former, the main objective is to obtain narrow intervals with high reliability. In the latter, the hypothesis is concerned with testing whether the sample estimate is equal to some specific value.

Null hypothesis

This hypothesis states that there is no difference between the control and the study group, as in a randomized controlled trial (RCT). Rejecting or disproving the null hypothesis – and thus concluding that there are grounds for believing that there is a difference between the two groups – is a central task in the modern practice of science, and gives a precise criterion for rejecting a hypothesis.[ 9 , 10 ]

Alternative hypothesis

This hypothesis is contradictory to null hypothesis, i.e., it assumes that there is a difference among the groups, or there is some association between the predictor and the outcome [ Figure 1 ].[ 9 , 10 ] Sometimes, it is accepted by exclusion if the test of significance rejects the null hypothesis. It may be one-sided (specifies the difference in one direction only) or two-sided (specifies the difference in both directions).

Figure 1: Result possibilities during hypothesis testing. H 0 – Null hypothesis; H 1 – The alternative hypothesis

Type I error (α error) occurs if the null hypothesis is rejected when it is true. It represents the chance that the researcher detects a difference between two groups when in reality no difference exists. In other words, it is the chance of a false-positive conclusion. A value of 0.05 is most commonly used.

Type II error (β error) is the chance of a false-negative result. The researcher does not detect the difference between the two groups when in reality the difference exists. Conventionally, it is set at a level of 0.20, which translates into <20% chance of a false-negative conclusion. Power is the complement of beta, i.e., (1-beta). In other words, power is 0.80 or 80% when beta is set at 0.20. The power represents the chance of avoiding a false-negative conclusion, or the chance of detecting an effect if it really exists.[ 11 ]

TYPES OF TRIALS

Parallel arm RCTs are most commonly used; all participants are randomized to two or more arms of different interventions treated concurrently. Various types of parallel RCTs are used in accordance with the need:

  • Superiority trials verify whether a new approach is more effective than the conventional one, from a statistical or clinical point of view. Here, the corresponding null hypothesis is that the new approach is not more effective than the conventional approach.
  • Equivalence trials are designed to ascertain that the new approach and the standard approach are equally effective. The corresponding null hypothesis states that the difference between the two approaches is clinically relevant.
  • Non-inferiority trials are designed to ascertain that the new approach is equal, if not superior, to the conventional approach. The corresponding null hypothesis is that the new approach is inferior to the conventional one.

PREREQUISITES FOR SAMPLE SIZE ESTIMATION

At the outset, the primary objectives (descriptive/analytical) and primary outcome measure (mean/proportion/rates) should be defined. Often there is a primary research question that the researcher wants to investigate. It is important to choose a primary outcome and lock it in for the study. The minimum difference that the investigator wants to detect between the groups is the effect size for the sample size calculation.[ 7 ] Hence, if the researcher changes the planned outcome after the start of the study, the reported P value and inference become invalid.[ 11 ] The acceptable levels of Type I error (α) and Type II error (β) should also be determined. The rate of Type I error (alpha) is customarily set lower than that of Type II error (beta). The philosophy behind this is that the impact of a false-positive (Type I) error is more detrimental than that of a false-negative (Type II) error, so it is protected against more rigidly.

In addition, the researcher needs to know the control arm mean/event rate/proportion, and the smallest clinically important effect that one is trying to detect.

THE RELATION BETWEEN PRIMARY OBJECTIVE AND THE SAMPLE SIZE

The type of primary outcome measure, with its clear definition, helps in computing the correct sample size, as there are definite ways to arrive at a sample size for each outcome measure. This needs special attention, as it principally influences how convincingly the research question is answered. The type of primary outcome measure is also the basis for how the population variance is estimated. For a continuous variable (e.g., mean arterial pressure [MAP]), the population SD is incorporated in the formula, whereas for a binomial variable (e.g., hypotension - yes/no) the SD has to be worked out from the proportion of outcomes. In the literature, there can be several outcomes for each study design; it is the responsibility of the researcher to identify the primary outcome of the study. Mostly, the sample size is estimated based on the primary outcome. It is possible to estimate the sample size taking all outcome measures, both primary and secondary, into consideration, but at the cost of a much larger sample size.

ESSENTIAL COMPONENTS OF SAMPLE SIZE ESTIMATION

The sample size for any study depends on certain factors such as the acceptable level of significance ( P value), the power (1 − β) of the study, the expected ‘clinically relevant’ effect size, the underlying event rate in the population, etc.[ 7 ] Primarily, three factors govern an appropriate sample size: the P value (which depends on α), the power (related to β) and the effect size (the clinically relevant assumption).[ 12 , 13 , 14 ] The ‘effect size’ is the magnitude of the clinically relevant effect under the alternative hypothesis. It quantifies the difference in outcomes between the study and control groups and refers to the smallest difference that would be of clinical importance. Ideally, the effect size should be selected on the basis of clinical judgement; it varies between clinical trials, and the researcher has to determine it with scientific knowledge and wisdom. Previous publications on related topics may be helpful in this regard. The ‘minimal clinically important difference’ is the smallest difference that would be worth testing. Sample size varies inversely with effect size.

The ideal study is one where the power is high; in other words, the study has a high chance of reaching a conclusion with reasonable confidence, be it accepting or rejecting the null hypothesis.[ 9 ] A sample size matrix includes different values of the sample size for varying values of alpha, power (1-β), and effect size. It is often useful for the research team to choose, from the matrix, the sample size that best fits the needs of the study [ Table 1 ].

The matrix showing changes of sample size with varying dimensions of alpha, power (1-β), and effect size


FORMULAE AND SOFTWARE

Once these three factors are fixed, there are many ways (formulae, nomograms, tables and software) of estimating the optimum sample size. At present, a good number of software packages are available on the internet. It is prudent to be familiar with the instructions of any software package, including whether the reported number refers to one arm of the study. Perhaps the most important step is to check that the most appropriate formula is used, so as to get a correct sample size. Websites of some commonly used software packages are provided in Table 2 .[ 2 , 6 ]

Websites for some useful statistical software


There are no fewer than 100 formulae for calculating the sample size and power needed to answer precisely the questions posed by different study designs. It is wise to check that the appropriate formula is used even when relying on software. For RCTs, however, the number of relevant formulae is limited; the choice essentially depends on the primary outcome measure, such as a mean ± SD, rate or proportion.[ 6 ] Interested readers may access all relevant sample size estimation formulae using the relevant links.

Calculating the sample size by comparing two means

A study to see the effect of phenylephrine on MAP as a continuous variable after spinal anaesthesia to counteract hypotension.

MAP as continuous variable:

  • n = Sample size in each of the groups
  • μ1 = Population mean in treatment Group 1
  • μ2 = Population mean in treatment Group 2
  • μ1−μ2 = The difference the investigator wishes to detect
  • σ = Population standard deviation
  • a = Conventional multiplier for alpha = 0.05
  • b = Conventional multiplier for power = 0.80

n = 2 × ([a + b]² × σ²)/(μ1 − μ2)²

Value of a = 1.96, b = 0.842 [ Table 3 ]. If a difference of 15 mmHg in MAP between the phenylephrine and the placebo groups is considered clinically significant (μ1−μ2), the population SD is taken as 20 mmHg, and the difference is to be detected with 80% power at a significance level alpha of 0.05,[ 7 ] then n = 2 × ([1.96 + 0.842]² × 20²)/15² = 27.9. That means 28 subjects per group is the sample size.
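This arithmetic can be checked with a short script. The following is an illustrative sketch (not from the article), hard-coding the conventional multipliers a = 1.96 and b = 0.842 from Table 3; the function name is ours:

```python
import math

def n_per_group_means(sigma, delta, a=1.96, b=0.842):
    """Sample size per group when comparing two means:
    n = 2 * (a + b)^2 * sigma^2 / delta^2."""
    return 2 * (a + b) ** 2 * sigma ** 2 / delta ** 2

# MAP example: SD = 20 mmHg, clinically relevant difference = 15 mmHg
n = n_per_group_means(sigma=20, delta=15)
print(round(n, 1))          # 27.9
print(math.ceil(n))         # round up to 28 subjects per group
```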

The constant Z values for conventional α and β values


Calculating the sample size by comparing two proportions

A study to see the effect of phenylephrine on MAP as a binary variable after spinal anaesthesia to counteract hypotension.

MAP as a binary outcome, below or above 60 mmHg (hypotension – yes/no):

  • n = The sample size in each of the groups
  • p 1 = Proportion of subjects with hypotension in treatment Group 1
  • q 1 = Proportion of subjects without hypotension in treatment Group 1 (1 − p1)
  • p 2 = Proportion of subjects with hypotension in treatment Group 2
  • q 2 = Proportion of subjects without hypotension in treatment Group 2 (1 − p2)
  • x = The difference the investigator wishes to detect
  • a = Conventional multiplier for alpha = 0.05
  • b = Conventional multiplier for power = 0.8.

n = ([a + b]² × [p1q1 + p2q2])/x²

Consider a difference of 10% as clinically relevant; from a recent publication, the proportion of subjects with hypotension in the treated group is taken as 20% ( p 1 = 0.2) and in the control group as 30% ( p 2 = 0.3), so that q 1 and q 2 are 0.80 and 0.70, respectively.[ 7 ] Assuming a power of 80% and an alpha of 0.05, i.e., 1.96 for a and 0.842 for b [ Table 3 ], we get:

n = ([1.96 + 0.842]² × [0.20 × 0.80 + 0.30 × 0.70])/0.10² = 290.5. Thus, 291 subjects per group is the sample size.
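This calculation, too, can be verified with a small sketch (ours, not the article's), using the same conventional multipliers from Table 3:

```python
import math

def n_per_group_props(p1, p2, x, a=1.96, b=0.842):
    """Sample size per group when comparing two proportions:
    n = (a + b)^2 * (p1*q1 + p2*q2) / x^2."""
    q1, q2 = 1 - p1, 1 - p2
    return (a + b) ** 2 * (p1 * q1 + p2 * q2) / x ** 2

# Hypotension example: 20% vs 30%, difference to detect x = 10%
n = n_per_group_props(p1=0.20, p2=0.30, x=0.10)
print(round(n, 1))          # 290.5
print(math.ceil(n))         # round up to 291 subjects per group
```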

The researcher may adopt measures that keep the required sample size manageable, such as using a continuous variable as the primary outcome, measuring the outcome precisely, or choosing outcomes that can be measured reliably. Using a more common outcome, or framing a one-sided hypothesis, may also help. Published literature and pilot studies are the basis of sample size calculation; at times, expert opinion, personal experience with event rates and educated guesses are helpful. Variance, effect size or event rates may be underestimated when the sample size is calculated at the design phase. If the investigator realises that this underestimation has led to ‘too small a sample size’, recalculation can be attempted based on interim data.[ 15 ]

Sample size calculation can be guided by previous literature, pilot studies and past clinical experience. The collaborative effort of the researcher and the statistician is required at this stage. The estimated sample size is not an absolute truth, but our best guess. Issues such as anticipated loss to follow-up, large subgroup analyses and complicated study designs demand a larger sample size to ensure adequate power throughout the trial. The required sample size is proportional to the variance (square of the SD) and inversely proportional to the square of the difference to be detected.

Conflicts of interest.

There are no conflicts of interest.


Statistics LibreTexts

8.4: Small Sample Tests for a Population Mean


Learning Objectives

  • To learn how to apply the five-step test procedure for test of hypotheses concerning a population mean when the sample size is small.

In the previous section hypothesis testing for population means was described in the case of large samples. The statistical validity of the tests was ensured by the Central Limit Theorem, with essentially no assumptions on the distribution of the population. When sample sizes are small, as is often the case in practice, the Central Limit Theorem does not apply. One must then impose stricter assumptions on the population to give statistical validity to the test procedure. One common assumption is that the population from which the sample is taken has a normal probability distribution to begin with. Under such circumstances, if the population standard deviation is known, then the test statistic

\[\frac{(\bar{x}-\mu _0)}{\sigma /\sqrt{n}} \nonumber \]

still has the standard normal distribution, as in the previous two sections. If \(\sigma\) is unknown and is approximated by the sample standard deviation \(s\), then the resulting test statistic

\[\dfrac{(\bar{x}-\mu _0)}{s/\sqrt{n}} \nonumber \]

follows Student’s \(t\)-distribution with \(n-1\) degrees of freedom.

Standardized Test Statistics for Small Sample Hypothesis Tests Concerning a Single Population Mean

 If \(\sigma\) is known: \[Z=\frac{\bar{x}-\mu _0}{\sigma /\sqrt{n}} \nonumber \]

If \(\sigma\) is unknown: \[T=\frac{\bar{x}-\mu _0}{s /\sqrt{n}} \nonumber \]

  • The first test statistic (\(\sigma\) known) has the standard normal distribution.
  • The second test statistic (\(\sigma\) unknown) has Student’s \(t\)-distribution with \(n-1\) degrees of freedom.
  • The population must be normally distributed.

The distribution of the second standardized test statistic (the one containing \(s\)) and the corresponding rejection region for each form of the alternative hypothesis (left-tailed, right-tailed, or two-tailed), is shown in Figure \(\PageIndex{1}\). This is just like Figure 8.2.1 except that now the critical values are from the \(t\)-distribution. Figure 8.2.1 still applies to the first standardized test statistic (the one containing (\(\sigma\)) since it follows the standard normal distribution.


The \(p\)-value of a test of hypotheses for which the test statistic has Student’s \(t\)-distribution can be computed using statistical software, but it is impractical to do so using tables, since that would require \(30\) tables analogous to Figure 7.1.5, one for each degree of freedom from \(1\) to \(30\). Figure 7.1.6 can be used to approximate the \(p\)-value of such a test, and this is typically adequate for making a decision using the \(p\)-value approach to hypothesis testing, although not always. For this reason the tests in the two examples in this section will be made following the critical value approach to hypothesis testing summarized at the end of Section 8.1, but after each one we will show how the \(p\)-value approach could have been used.

Example \(\PageIndex{1}\)

The price of a popular tennis racket at a national chain store is \(\$179\). Portia bought five of the same racket at an online auction site for the following prices:

\[155\; 179\; 175\; 175\; 161 \nonumber \]

Assuming that the auction prices of rackets are normally distributed, determine whether there is sufficient evidence in the sample, at the \(5\%\) level of significance, to conclude that the average price of the racket is less than \(\$179\) if purchased at an online auction.

  • Step 1 . The assertion for which evidence must be provided is that the average online price \(\mu\) is less than the average price in retail stores, so the hypothesis test is \[H_0: \mu =179\\ \text{vs}\\ H_a: \mu <179\; @\; \alpha =0.05 \nonumber \]
  • Step 2 . The sample is small and the population standard deviation is unknown. Thus the test statistic is \[T=\frac{\bar{x}-\mu _0}{s /\sqrt{n}} \nonumber \] and has the Student \(t\)-distribution with \(n-1=5-1=4\) degrees of freedom.
  • Step 3 . From the data we compute \(\bar{x}=169\) and \(s=10.39\). Inserting these values into the formula for the test statistic gives \[T=\frac{\bar{x}-\mu _0}{s /\sqrt{n}}=\frac{169-179}{10.39/\sqrt{5}}=-2.152 \nonumber \]
  • Step 4 . Since the symbol in \(H_a\) is “\(<\)” this is a left-tailed test, so there is a single critical value, \(-t_\alpha =-t_{0.05}[df=4]\). Reading from the row labeled \(df=4\) in Figure 7.1.6 its value is \(-2.132\). The rejection region is \((-\infty ,-2.132]\).
  • Step 5 . As shown in Figure \(\PageIndex{2}\) the test statistic falls in the rejection region. The decision is to reject \(H_0\). In the context of the problem our conclusion is:

The data provide sufficient evidence, at the \(5\%\) level of significance, to conclude that the average price of such rackets purchased at online auctions is less than \(\$179\).

Rejection Region and Test Statistic

To perform the test in Example \(\PageIndex{1}\) using the \(p\)-value approach, look in the row in Figure 7.1.6 with the heading \(df=4\) and search for the two \(t\)-values that bracket the unsigned value \(2.152\) of the test statistic. They are \(2.132\) and \(2.776\), in the columns with headings \(t_{0.050}\) and \(t_{0.025}\). They cut off right tails of area \(0.050\) and \(0.025\), so because \(2.152\) is between them it must cut off a tail of area between \(0.050\) and \(0.025\). By symmetry \(-2.152\) cuts off a left tail of area between \(0.050\) and \(0.025\), hence the \(p\)-value corresponding to \(t=-2.152\) is between \(0.025\) and \(0.05\). Although its precise value is unknown, it must be less than \(\alpha =0.05\), so the decision is to reject \(H_0\).
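The arithmetic in Steps 3 and 4 can be reproduced with Python's standard library. This is a sketch for checking the computations, not part of the original text:

```python
import math
import statistics

prices = [155, 179, 175, 175, 161]
mu0 = 179

xbar = statistics.mean(prices)           # sample mean: 169
s = statistics.stdev(prices)             # sample standard deviation: 10.39...
t = (xbar - mu0) / (s / math.sqrt(len(prices)))
print(round(t, 3))                       # -2.152

# Left-tailed test at alpha = 0.05 with df = 4: critical value -2.132
print(t <= -2.132)                       # True, so reject H0
```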

Example \(\PageIndex{2}\)

A small component in an electronic device has two small holes where another tiny part is fitted. In the manufacturing process the average distance between the two holes must be tightly controlled at \(0.02\) mm, else many units would be defective and wasted. Many times throughout the day quality control engineers take a small sample of the components from the production line, measure the distance between the two holes, and make adjustments if needed. Suppose at one time four units are taken and the distances are measured as

Determine, at the \(1\%\) level of significance, if there is sufficient evidence in the sample to conclude that an adjustment is needed. Assume the distances of interest are normally distributed.

  • Step 1 . The assumption is that the process is under control unless there is strong evidence to the contrary. Since a deviation of the average distance to either side is undesirable, the relevant test is \[H_0: \mu =0.02\\ \text{vs}\\ H_a: \mu \neq 0.02\; @\; \alpha =0.01 \nonumber \] where \(\mu\) denotes the mean distance between the holes.
  • Step 2 . The sample is small and the population standard deviation is unknown. Thus the test statistic is \[T=\frac{\bar{x}-\mu _0}{s /\sqrt{n}} \nonumber \] and has the Student \(t\)-distribution with \(n-1=4-1=3\) degrees of freedom.
  • Step 3 . From the data we compute \(\bar{x}=0.02075\) and \(s=0.00171\). Inserting these values into the formula for the test statistic gives \[T=\frac{\bar{x}-\mu _0}{s /\sqrt{n}}=\frac{0.02075-0.02}{0.00171/\sqrt{4}}=0.877 \nonumber \]
  • Step 4 . Since the symbol in \(H_a\) is “\(\neq\)” this is a two-tailed test, so there are two critical values, \(\pm t_{\alpha/2} =\pm t_{0.005}[df=3]\). Reading from the row in Figure 7.1.6 labeled \(df=3\) their values are \(\pm 5.841\). The rejection region is \((-\infty ,-5.841]\cup [5.841,\infty )\).
  • Step 5 . As shown in Figure \(\PageIndex{3}\) the test statistic does not fall in the rejection region. The decision is not to reject \(H_0\). In the context of the problem our conclusion is:

The data do not provide sufficient evidence, at the \(1\%\) level of significance, to conclude that the mean distance between the holes in the component differs from \(0.02\) mm.

Rejection Region and Test Statistic

To perform the test in "Example \(\PageIndex{2}\)" using the \(p\)-value approach, look in the row in Figure 7.1.6 with the heading \(df=3\) and search for the two \(t\)-values that bracket the value \(0.877\) of the test statistic. Actually \(0.877\) is smaller than the smallest number in the row, which is \(0.978\), in the column with heading \(t_{0.200}\). The value \(0.978\) cuts off a right tail of area \(0.200\), so because \(0.877\) is to its left it must cut off a tail of area greater than \(0.200\). Thus the \(p\)-value, which is the double of the area cut off (since the test is two-tailed), is greater than \(0.400\). Although its precise value is unknown, it must be greater than \(\alpha =0.01\), so the decision is not to reject \(H_0\).
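As with the first example, the test statistic can be recomputed from the reported summary statistics. This is a sketch of ours, not part of the text:

```python
import math

# Summary statistics reported in Step 3 of the example
xbar, mu0, s, n = 0.02075, 0.02, 0.00171, 4
t = (xbar - mu0) / (s / math.sqrt(n))
print(round(t, 3))          # 0.877

# Two-tailed test at alpha = 0.01 with df = 3: critical values +/- 5.841
print(abs(t) >= 5.841)      # False, so do not reject H0
```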

Key Takeaway

  • There are two formulas for the test statistic in testing hypotheses about a population mean with small samples. One test statistic follows the standard normal distribution, the other Student’s \(t\)-distribution.
  • The population standard deviation is used if it is known, otherwise the sample standard deviation is used.
  • Either five-step procedure, critical value or \(p\)-value approach, is used with either test statistic.
Nebraska Medicine

Understanding Hypothesis Testing, Significance Level, Power and Sample Size Calculation

  • Written by Steph Langel
  • Published Apr 4, 2024


This e-module offers an in-depth discussion of the essential components of hypothesis testing: significance levels, statistical power, and sample size calculations, which are fundamental to rigorous research methodology. Learners will develop a comprehensive understanding of designing, interpreting, and evaluating research findings through interactive content and real-world case studies. This will enable them to make well-informed decisions based on statistical best practices. The module’s framework allows a thorough learning experience, starting from fundamental definitions and progressing to the hands-on implementation of statistical ideas. This ensures that learners acquire the essential abilities to conduct ethically appropriate and scientifically valid research.


Hypothesis Testing and Small Sample Sizes

  • First Online: 01 January 2010


  • Rand R. Wilcox

One of the biggest breakthroughs during the last 40 years has been the derivation of inferential methods that perform well when sample sizes are small. Indeed, some practical problems that seemed insurmountable only a few years ago have been solved. But to appreciate this remarkable achievement, we must first describe the shortcomings of conventional techniques developed during the first half of the 20th century—methods that are routinely used today. At one time it was generally thought that these standard methods are insensitive to violations of assumptions, but a more accurate statement is that they seem to perform reasonably well (in terms of Type I errors) when groups have identical probability curves, or when performing regression with variables that are independent. If, for example, we compare groups that happen to have different probability curves, extremely serious problems can arise. Perhaps the most striking problem is described in Chapter 7, but the problems described here are very serious as well and are certainly relevant to applied work.



Author information

Authors and affiliations.

Department of Psychology, University of Southern California College of Letters, Arts & Sciences, 3620 S. McClintock Ave., Los Angeles, CA, 90089-1061 SGM 501, USA

Rand R. Wilcox


Corresponding author

Correspondence to Rand R. Wilcox .


Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Wilcox, R.R. (2010). Hypothesis Testing and Small Sample Sizes. In: Fundamentals of Modern Statistical Methods. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-5525-8_5


Published : 15 February 2010

Publisher Name : Springer, New York, NY

Print ISBN : 978-1-4419-5524-1

Online ISBN : 978-1-4419-5525-8

eBook Packages : Mathematics and Statistics (R0)




Power and Sample Size Determination


Issues in Estimating Sample Size for Hypothesis Testing


In the module on hypothesis testing for means and proportions, we introduced techniques for means, proportions, differences in means, and differences in proportions. While each test involved details that were specific to the outcome of interest (e.g., continuous or dichotomous) and to the number of comparison groups (one, two, more than two), there were common elements to each test. For example, in each test of hypothesis, there are two errors that can be committed. The first is called a Type I error and refers to the situation where we incorrectly reject H 0 when in fact it is true. In the first step of any test of hypothesis, we select a level of significance, α, and α = P(Type I error) = P(Reject H 0 | H 0 is true). Because we purposely select a small value for α, we control the probability of committing a Type I error. The second type of error is called a Type II error, defined as the probability that we do not reject H 0 when it is false. The probability of a Type II error is denoted β, and β = P(Type II error) = P(Do not Reject H 0 | H 0 is false). In hypothesis testing, we usually focus on power, which is defined as the probability that we reject H 0 when it is false, i.e., power = 1 - β = P(Reject H 0 | H 0 is false). Power is the probability that a test correctly rejects a false null hypothesis. A good test is one with a low probability of committing a Type I error (i.e., small α) and high power (i.e., small β).

Here we present formulas to determine the sample size required to ensure that a test has high power. The sample size computations depend on the level of significance, α; the desired power of the test (equivalent to 1 − β); the variability of the outcome; and the effect size. The effect size is the difference in the parameter of interest that represents a clinically meaningful difference. Similar to the margin of error in confidence interval applications, the effect size is determined based on clinical or practical criteria and not statistical criteria.

The concept of statistical power can be difficult to grasp. Before presenting the formulas to determine the sample sizes required to ensure high power in a test, we will first discuss power from a conceptual point of view.  

Suppose we want to test the following hypotheses at α = 0.05: H 0 : μ = 90 versus H 1 : μ ≠ 90. To test the hypotheses, suppose we select a sample of size n=100. For this example, assume that the standard deviation of the outcome is σ=20. We compute the sample mean and then must decide whether the sample mean provides evidence to support the alternative hypothesis or not. This is done by computing a test statistic and comparing the test statistic to an appropriate critical value. If the null hypothesis is true (μ=90), then we are likely to select a sample whose mean is close in value to 90. However, it is also possible to select a sample whose mean is much larger or much smaller than 90. Recall from the Central Limit Theorem (see page 11 in the module on Probability ), that for large n (here n=100 is sufficiently large), the distribution of the sample means is approximately normal with a mean of μ = 90 and a standard deviation of σ/√n = 20/√100 = 2.

If the null hypothesis is true, it is possible to observe any sample mean shown in the figure below; all are possible under H 0 : μ = 90.  

Figure: Normal distribution of the sample mean when μ = 90, a bell-shaped curve centered at 90.

Rejection Region for Test H 0 : μ = 90 versus H 1 : μ ≠ 90 at α =0.05

Figure: Sampling distribution centered at 90, with the rejection areas in the two tails at the extremes above and below the mean. If the alpha level is 0.05, each tail accounts for an area of 0.025.

The areas in the two tails of the curve represent the probability of a Type I Error, α= 0.05. This concept was discussed in the module on Hypothesis Testing .  

Now, suppose that the alternative hypothesis, H 1 , is true (i.e., μ ≠ 90) and that the true mean is actually 94. The figure below shows the distributions of the sample mean under the null and alternative hypotheses. The values of the sample mean are shown along the horizontal axis.

Two overlapping normal distributions, one depicting the null hypothesis with a mean of 90 and the other showing the alternative hypothesis with a mean of 94. A more complete explanation of the figure is provided in the text below the figure.

If the true mean is 94, then the alternative hypothesis is true. In our test, we selected α = 0.05 and reject H 0 if the observed sample mean exceeds 93.92 (focusing on the upper tail of the rejection region for now). The critical value (93.92) is indicated by the vertical line. The probability of a Type II error is denoted β, and β = P(Do not Reject H 0 | H 0 is false), i.e., the probability of not rejecting the null hypothesis when it is in fact false. β is shown in the figure above as the area under the rightmost curve (H 1 ) to the left of the vertical line (where we do not reject H 0 ). Power is defined as 1- β = P(Reject H 0 | H 0 is false) and is shown in the figure as the area under the rightmost curve (H 1 ) to the right of the vertical line (where we reject H 0 ).

Note that β and power are related to α, the variability of the outcome and the effect size. From the figure above we can see what happens to β and power if we increase α. Suppose, for example, we increase α to α=0.10. The upper critical value would then be 93.29 instead of 93.92. The vertical line would shift to the left, increasing α, decreasing β and increasing power. While a better test is one with higher power, it is not advisable to increase α as a means to increase power. Nonetheless, there is a direct relationship between α and power (as α increases, so does power).
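As a numeric check (ours, not part of the original module), the critical value and the power illustrated in these figures can be reproduced with Python's standard library under the stated assumptions (μ0 = 90, σ = 20, n = 100, two-sided α = 0.05; only the upper rejection tail is counted, as in the text):

```python
from statistics import NormalDist

mu0, sigma, n = 90, 20, 100
se = sigma / n ** 0.5        # standard error of the sample mean: 2.0

alpha = 0.05
z = NormalDist()             # standard normal distribution
upper_crit = mu0 + z.inv_cdf(1 - alpha / 2) * se
print(round(upper_crit, 2))  # 93.92

# Power against the alternative mu = 94 (upper tail only)
power = 1 - NormalDist(94, se).cdf(upper_crit)
print(round(power, 2))       # 0.52

# A larger effect size (alternative mu = 98) gives much higher power
power98 = 1 - NormalDist(98, se).cdf(upper_crit)
print(round(power98, 2))     # 0.98
```

The jump from 0.52 to 0.98 mirrors the text's point that power rises sharply as the distance between the null and alternative means grows.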

β and power are also related to the variability of the outcome and to the effect size. The effect size is the difference in the parameter of interest (e.g., μ) that represents a clinically meaningful difference. The figure above graphically displays α, β, and power when the difference in the mean under the null as compared to the alternative hypothesis is 4 units (i.e., 90 versus 94). The figure below shows the same components for the situation where the mean under the alternative hypothesis is 98.

Overlapping bell-shaped distributions - one with a mean of 90 and the other with a mean of 98

Notice that there is much higher power when there is a larger difference between the mean under H 0 as compared to H 1 (i.e., 90 versus 98). A statistical test is much more likely to reject the null hypothesis in favor of the alternative if the true mean is 98 than if the true mean is 94. Notice also in this case that there is little overlap in the distributions under the null and alternative hypotheses. If a sample mean of 97 or higher is observed it is very unlikely that it came from a distribution whose mean is 90. In the previous figure for H 0 : μ = 90 and H 1 : μ = 94, if we observed a sample mean of 93, for example, it would not be as clear as to whether it came from a distribution whose mean is 90 or one whose mean is 94.

In designing studies most people consider power of 80% or 90% (just as we generally use 95% as the confidence level for confidence interval estimates). The inputs for the sample size formulas include the desired power, the level of significance and the effect size. The effect size is selected to represent a clinically meaningful or practically important difference in the parameter of interest, as we will illustrate.  

The formulas we present below produce the minimum sample size to ensure that the test of hypothesis will have a specified probability of rejecting the null hypothesis when it is false (i.e., a specified power). In planning studies, investigators again must account for attrition or loss to follow-up. The formulas shown below produce the number of participants needed with complete data, and we will illustrate how attrition is addressed in planning studies.
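The formulas themselves follow on later pages of the module; as a preview, the widely used textbook formula for the one-sample mean case, n = ((z_{1-α/2} + z_{1-β}) · σ / ES)², can be sketched as follows. The function name is ours, and the example values (σ = 20, effect size of 4 units) are taken from the discussion above, so this is an assumption-laden illustration rather than the module's own worked example:

```python
from math import ceil
from statistics import NormalDist

def n_for_one_mean(sigma, effect_size, alpha=0.05, power=0.80):
    """Minimum n for a two-sided one-sample test of a mean:
    n = ((z_{1-alpha/2} + z_{power}) * sigma / ES)^2, rounded up."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # 0.84 for 80% power
    return ceil(((z_alpha + z_beta) * sigma / effect_size) ** 2)

# Detecting a 4-unit shift (90 vs 94) with sigma = 20, 80% power
print(n_for_one_mean(sigma=20, effect_size=4))   # 197
```

Note how the required n shrinks with a larger effect size and grows with higher desired power, consistent with the figures above.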


Content ©2020. All Rights Reserved. Date last modified: March 13, 2020. Wayne W. LaMorte, MD, PhD, MPH

Hypothesis Testing

Hypothesis testing is a tool for making statistical inferences about the population data. It is an analysis tool that tests assumptions and determines how likely something is within a given standard of accuracy. Hypothesis testing provides a way to verify whether the results of an experiment are valid.

A null hypothesis and an alternative hypothesis are set up before performing the hypothesis testing. This helps to arrive at a conclusion regarding the sample obtained from the population. In this article, we will learn more about hypothesis testing, its types, steps to perform the testing, and associated examples.

What is Hypothesis Testing in Statistics?

Hypothesis testing uses sample data from the population to draw useful conclusions regarding the population probability distribution . It tests an assumption made about the data using different types of hypothesis testing methodologies. The hypothesis testing results in either rejecting or not rejecting the null hypothesis.

Hypothesis Testing Definition

Hypothesis testing can be defined as a statistical tool that is used to identify if the results of an experiment are meaningful or not. It involves setting up a null hypothesis and an alternative hypothesis. These two hypotheses will always be mutually exclusive. This means that if the null hypothesis is true then the alternative hypothesis is false and vice versa. An example of hypothesis testing is setting up a test to check if a new medicine works on a disease in a more efficient manner.

Null Hypothesis

The null hypothesis is a concise mathematical statement that is used to indicate that there is no difference between two possibilities. In other words, there is no difference between certain characteristics of data. This hypothesis assumes that the outcomes of an experiment are based on chance alone. It is denoted as \(H_{0}\). Hypothesis testing is used to conclude if the null hypothesis can be rejected or not. Suppose an experiment is conducted to check if girls are shorter than boys at the age of 5. The null hypothesis will say that they are the same height.

Alternative Hypothesis

The alternative hypothesis is an alternative to the null hypothesis. It is used to show that the observations of an experiment are due to some real effect. It indicates that there is a statistical significance between two possible outcomes and can be denoted as \(H_{1}\) or \(H_{a}\). For the above-mentioned example, the alternative hypothesis would be that girls are shorter than boys at the age of 5.

Hypothesis Testing P Value

In hypothesis testing, the p value is used to indicate whether the results obtained after conducting a test are statistically significant. It also indicates the probability of making an error in rejecting or not rejecting the null hypothesis. This value is always a number between 0 and 1. The p value is compared to an alpha level, \(\alpha\), also called the significance level. The alpha level can be defined as the acceptable risk of incorrectly rejecting the null hypothesis, and it is usually chosen between 1% and 5%.

Hypothesis Testing Critical Region

All sets of values that lead to rejecting the null hypothesis lie in the critical region. Furthermore, the value that separates the critical region from the non-critical region is known as the critical value.

Hypothesis Testing Formula

Depending upon the type and size of the data available, different types of hypothesis tests are used to determine whether the null hypothesis can be rejected. The formulas for some important test statistics are given below:

  • z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\). \(\overline{x}\) is the sample mean, \(\mu\) is the population mean, \(\sigma\) is the population standard deviation and n is the size of the sample.
  • t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\). s is the sample standard deviation.
  • \(\chi ^{2} = \sum \frac{(O_{i}-E_{i})^{2}}{E_{i}}\). \(O_{i}\) is the observed value and \(E_{i}\) is the expected value.

We will learn more about these test statistics in the upcoming section.
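As an illustrative sketch, the three formulas above can be computed directly. The helper functions below are hypothetical, not from any particular library:

```python
from math import sqrt

def z_statistic(x_bar, mu, sigma, n):
    """One-sample z statistic: (x̄ - μ) / (σ / √n), with population SD σ."""
    return (x_bar - mu) / (sigma / sqrt(n))

def t_statistic(x_bar, mu, s, n):
    """One-sample t statistic: same form, but uses the sample SD s."""
    return (x_bar - mu) / (s / sqrt(n))

def chi_square_statistic(observed, expected):
    """χ² = Σ (Oᵢ - Eᵢ)² / Eᵢ over paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

For instance, `z_statistic(112.5, 100, 15, 30)` gives roughly 4.56, matching the worked example later in this article.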

Types of Hypothesis Testing

Selecting the correct test for performing hypothesis testing can be confusing. These tests are used to determine a test statistic on the basis of which the null hypothesis can either be rejected or not rejected. Some of the important tests used for hypothesis testing are given below.

Hypothesis Testing Z Test

A z test is a method of hypothesis testing that is used for a large sample size (n ≥ 30). It is used to determine whether there is a difference between the population mean and the sample mean when the population standard deviation is known, and it can also be used to compare the means of two samples. The formulas for the z test statistic are as follows:

  • One sample: z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\).
  • Two samples: z = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}}}\).
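A minimal sketch of the two-sample formula, assuming the population standard deviations are known (the function name and sample values below are hypothetical):

```python
from math import sqrt

def two_sample_z(x1, x2, sigma1, sigma2, n1, n2, mu_diff=0):
    """Two-sample z statistic; mu_diff is the hypothesized μ₁ - μ₂ (often 0)."""
    standard_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((x1 - x2) - mu_diff) / standard_error

# Hypothetical example: sample means 80 and 85, both σ = 5, both n = 50
z = two_sample_z(80, 85, 5, 5, 50, 50)   # -5.0
```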

Hypothesis Testing t Test

The t test is another method of hypothesis testing, used for a small sample size (n < 30). It also compares the sample mean with the population mean; however, the population standard deviation is not known, so the sample standard deviation is used instead. The means of two samples can also be compared using the t test.

  • One sample: t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\).
  • Two samples: t = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}}\).
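Given raw data rather than summary statistics, the two-sample t statistic above can be computed directly. This is a sketch with made-up sample values; in practice a library routine such as SciPy's `ttest_ind` is the usual choice:

```python
from math import sqrt
from statistics import mean, stdev

def two_sample_t(sample1, sample2, mu_diff=0):
    """Unpooled two-sample t statistic, matching the formula above,
    with s₁ and s₂ the sample standard deviations."""
    n1, n2 = len(sample1), len(sample2)
    s1, s2 = stdev(sample1), stdev(sample2)
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean(sample1) - mean(sample2) - mu_diff) / standard_error

# Hypothetical exam scores for two groups
t = two_sample_t([82, 86, 90, 88, 84], [78, 80, 76, 82, 79])   # ≈ 4.04
```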

Hypothesis Testing Chi Square

The chi square test is a hypothesis testing method that is used to check whether categorical variables in a population are independent of each other, or whether observed frequencies match expected frequencies. Under the null hypothesis, its test statistic follows a chi-squared distribution.
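As a sketch, the χ² statistic for a test of independence can be computed from a contingency table of observed counts, with expected counts \(E_{ij}\) = (row total × column total) / grand total. The function and data below are hypothetical:

```python
def chi_square_independence(table):
    """χ² statistic for a test of independence on a contingency table
    (a list of rows of observed counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical 2×2 table: every expected count is 25, so χ² = 4 · (5² / 25) = 4
chi2 = chi_square_independence([[20, 30], [30, 20]])
```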

One Tailed Hypothesis Testing

One tailed hypothesis testing is done when the rejection region lies in only one direction. It is also known as directional hypothesis testing because the effect can be tested in one direction only. This type of testing is further classified into the right tailed test and the left tailed test.

Right Tailed Hypothesis Testing

The right tail test is also known as the upper tail test. This test is used to check whether the population parameter is greater than some value. The null and alternative hypotheses for this test are given as follows:

\(H_{0}\): The population parameter is ≤ some value

\(H_{1}\): The population parameter is > some value.

If the test statistic has a greater value than the critical value, the null hypothesis is rejected.

Right Tail Hypothesis Testing

Left Tailed Hypothesis Testing

The left tail test is also known as the lower tail test. It is used to check whether the population parameter is less than some value. The hypotheses for this hypothesis testing can be written as follows:

\(H_{0}\): The population parameter is ≥ some value

\(H_{1}\): The population parameter is < some value.

The null hypothesis is rejected if the test statistic has a value lesser than the critical value.

Left Tail Hypothesis Testing

Two Tailed Hypothesis Testing

In this hypothesis testing method, the critical region lies on both sides of the sampling distribution. It is also known as non-directional hypothesis testing. The two-tailed test is used when we need to determine whether the population parameter differs from some value. The hypotheses can be set up as follows:

\(H_{0}\): the population parameter = some value

\(H_{1}\): the population parameter ≠ some value

The null hypothesis is rejected if the test statistic falls in either rejection region, that is, if its absolute value is greater than the critical value.

Two Tail Hypothesis Testing
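The three rejection rules above can be summarized in a small helper function (a sketch; here `critical` is always the positive critical value, e.g. 1.645 or 1.96):

```python
def reject_null(statistic, critical, tail):
    """Decision rule; tail is 'right', 'left', or 'two'.
    critical is the positive critical value from the appropriate table."""
    if tail == "right":
        return statistic > critical
    if tail == "left":
        return statistic < -critical
    return abs(statistic) > critical   # two-tailed: either rejection region
```

For example, `reject_null(4.56, 1.645, "right")` returns `True`.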

Hypothesis Testing Steps

Hypothesis testing can be easily performed in five simple steps. The most important step is to correctly set up the hypotheses and identify the right method for hypothesis testing. The basic steps to perform hypothesis testing are as follows:

  • Step 1: Set up the null hypothesis by correctly identifying whether it is the left-tailed, right-tailed, or two-tailed hypothesis testing.
  • Step 2: Set up the alternative hypothesis.
  • Step 3: Choose the correct significance level, \(\alpha\), and find the critical value.
  • Step 4: Calculate the appropriate test statistic (z, t, or \(\chi^{2}\)) and the p-value.
  • Step 5: Compare the test statistic with the critical value or compare the p-value with \(\alpha\) to arrive at a conclusion. In other words, decide if the null hypothesis is to be rejected or not.
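The five steps can be sketched end to end for a right-tailed z test. Here `statistics.NormalDist` (Python 3.8+) stands in for the normal distribution table, and the numbers are those of the worked example that follows:

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-2: H0: μ = 100 versus H1: μ > 100 (right-tailed)
mu, sigma = 100, 15
x_bar, n = 112.5, 30

# Step 3: significance level and critical value
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha)   # ≈ 1.645

# Step 4: test statistic and right-tail p-value
z = (x_bar - mu) / (sigma / sqrt(n))         # ≈ 4.56
p_value = 1 - NormalDist().cdf(z)

# Step 5: compare and conclude
reject = z > critical                        # equivalently, p_value < alpha
```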

Hypothesis Testing Example

The best way to solve a problem on hypothesis testing is by applying the 5 steps mentioned in the previous section. Suppose a researcher claims that the average weight of men is greater than 100 kg, with a standard deviation of 15 kg. 30 men are chosen, with an average weight of 112.5 kg. Using hypothesis testing, check if there is enough evidence to support the researcher's claim. The confidence level is 95%.

Step 1: This is an example of a right-tailed test. Set up the null hypothesis as \(H_{0}\): \(\mu\) = 100.

Step 2: The alternative hypothesis is given by \(H_{1}\): \(\mu\) > 100.

Step 3: As this is a one-tailed test, \(\alpha\) = 100% - 95% = 5%. This can be used to determine the critical value.

1 - \(\alpha\) = 1 - 0.05 = 0.95

0.95 gives the required area under the curve. Using a normal distribution table, an area of 0.95 corresponds to z = 1.645. A similar process can be followed for a t test; the only additional requirement is to calculate the degrees of freedom, given by n - 1.

Step 4: Calculate the z test statistic. The z test applies because the sample size is at least 30 and the population standard deviation is known.

z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\).

\(\mu\) = 100, \(\overline{x}\) = 112.5, n = 30, \(\sigma\) = 15

z = \(\frac{112.5-100}{\frac{15}{\sqrt{30}}}\) = 4.56

Step 5: Conclusion. As 4.56 > 1.645 thus, the null hypothesis can be rejected.

Hypothesis Testing and Confidence Intervals

Confidence levels are closely connected to hypothesis testing, because the alpha level can be determined from a given confidence level. Suppose the confidence level is 95%. Subtracting it from 100% gives 100% - 95% = 5%, or 0.05. This is the alpha value for one-tailed hypothesis testing. To obtain the per-tail alpha value for two-tailed hypothesis testing, divide this value by 2: 0.05 / 2 = 0.025.
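This arithmetic is simple but easy to get backwards, so a small (hypothetical) helper makes the convention explicit:

```python
def alpha_level(confidence, two_tailed=False):
    """α = 1 - confidence level; for a two-tailed test the
    per-tail area is α / 2."""
    alpha = 1 - confidence
    return alpha / 2 if two_tailed else alpha
```

For example, a 95% confidence level gives `alpha_level(0.95)` = 0.05 one-tailed and `alpha_level(0.95, two_tailed=True)` = 0.025 per tail.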

Related Articles:

  • Probability and Statistics
  • Data Handling

Important Notes on Hypothesis Testing

  • Hypothesis testing is a technique that is used to verify whether the results of an experiment are statistically significant.
  • It involves the setting up of a null hypothesis and an alternate hypothesis.
  • There are three types of tests that can be conducted under hypothesis testing - z test, t test, and chi square test.
  • Hypothesis testing can be classified as right tail, left tail, and two tail tests.

Examples on Hypothesis Testing

  • Example 1: The average weight of a dumbbell in a gym is 90 lbs. However, a physical trainer believes that the average weight might be higher. A random sample of 5 dumbbells has an average weight of 110 lbs and a standard deviation of 18 lbs. Using hypothesis testing, check if the physical trainer's claim can be supported at a 95% confidence level. Solution: As the sample size is less than 30, the t test is used. \(H_{0}\): \(\mu\) = 90, \(H_{1}\): \(\mu\) > 90. Here \(\overline{x}\) = 110, \(\mu\) = 90, n = 5, s = 18, and \(\alpha\) = 0.05. Using the t distribution table with n - 1 = 4 degrees of freedom, the critical value is 2.132. t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\) = 2.48. As 2.48 > 2.132, the null hypothesis is rejected. Answer: The average weight of the dumbbells may be greater than 90 lbs.
  • Example 2: The average score on a test is 80 with a standard deviation of 10. After a new teaching curriculum is introduced, it is believed that this score will change. The scores of 36 randomly chosen students have a mean of 88. With a 0.05 significance level, is there any evidence to support this claim? Solution: This is an example of two-tailed hypothesis testing, and the z test is used. \(H_{0}\): \(\mu\) = 80, \(H_{1}\): \(\mu\) ≠ 80. Here \(\overline{x}\) = 88, \(\mu\) = 80, n = 36, \(\sigma\) = 10, and the per-tail area is \(\alpha\) / 2 = 0.05 / 2 = 0.025. The critical value from the normal distribution table is 1.96. z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\) = \(\frac{88-80}{\frac{10}{\sqrt{36}}}\) = 4.8. As 4.8 > 1.96, the null hypothesis is rejected. Answer: There is a difference in the scores after the new curriculum was introduced.
  • Example 3: The average score of a class is 90. However, a teacher believes that the average score might be lower. The scores of 6 students were randomly measured; the mean was 82 with a standard deviation of 18. With a 0.05 significance level, use hypothesis testing to check if this claim is true. Solution: As the sample size is less than 30, the t test is used. \(H_{0}\): \(\mu\) = 90, \(H_{1}\): \(\mu\) < 90. Here \(\overline{x}\) = 82, \(\mu\) = 90, n = 6, s = 18. The critical value from the t table (5 degrees of freedom) is -2.015. t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\) = \(\frac{82-90}{\frac{18}{\sqrt{6}}}\) = -1.09. As -1.09 > -2.015, we fail to reject the null hypothesis. Answer: There is not enough evidence to support the claim.
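The arithmetic in the three examples can be checked in a few lines (a sketch reusing the one-sample formulas; the critical values still come from the tables):

```python
from math import sqrt

def z_stat(x_bar, mu, sigma, n):
    """One-sample z statistic (known population SD σ)."""
    return (x_bar - mu) / (sigma / sqrt(n))

def t_stat(x_bar, mu, s, n):
    """One-sample t statistic (sample SD s)."""
    return (x_bar - mu) / (s / sqrt(n))

t1 = t_stat(110, 90, 18, 5)   # Example 1: ≈ 2.48, above 2.132 → reject
z2 = z_stat(88, 80, 10, 36)   # Example 2: 4.8, above 1.96 → reject
t3 = t_stat(82, 90, 18, 6)    # Example 3: ≈ -1.09, above -2.015 → fail to reject
```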


FAQs on Hypothesis Testing

What is Hypothesis Testing?

Hypothesis testing in statistics is a tool that is used to make inferences about the population data. It is also used to check if the results of an experiment are valid.

What is the z Test in Hypothesis Testing?

The z test in hypothesis testing is used to find the z test statistic for normally distributed data. The z test is used when the population standard deviation is known and the sample size is greater than or equal to 30.

What is the t Test in Hypothesis Testing?

The t test in hypothesis testing is used when the data follows a Student's t distribution. It is used when the sample size is less than 30 and the population standard deviation is not known.

What is the formula for z test in Hypothesis Testing?

The formula for a one sample z test in hypothesis testing is z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\) and for two samples is z = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}}}\).

What is the p Value in Hypothesis Testing?

The p value helps to determine if the test results are statistically significant or not. In hypothesis testing, the null hypothesis can either be rejected or not rejected based on the comparison between the p value and the alpha level.

What is One Tail Hypothesis Testing?

When the rejection region is only on one side of the distribution curve then it is known as one tail hypothesis testing. The right tail test and the left tail test are two types of directional hypothesis testing.

What is the Alpha Level in Two Tail Hypothesis Testing?

To get the per-tail alpha level in two-tailed hypothesis testing, divide \(\alpha\) by 2. This is done because there are two rejection regions in the curve.
