Introduction to Statistics with R

6.2 Hypothesis Tests

6.2.1 Illustrating a Hypothesis Test

Let’s say we have a batch of chocolate bars, and we’re not sure if they are from Theo’s. What can the weight of these bars tell us about the probability that these are Theo’s chocolate?

Now, let’s perform a hypothesis test on this chocolate of an unknown origin.

What is the sampling distribution of the bar weight under the null hypothesis that the bars from Theo’s weigh 40 grams on average? We’ll need to specify the standard deviation to obtain the sampling distribution, and here we’ll use \(\sigma_X = 2\) (since that’s the value we used for the distribution we sampled from).

The null hypothesis is \[H_0: \mu = 40\] since we know the mean weight of Theo’s chocolate bars is 40 grams.

The sampling distribution of the sample mean is: \[ \overline{X} \sim {\cal N}\left(\mu, \frac{\sigma}{\sqrt{n}}\right) = {\cal N}\left(40, \frac{2}{\sqrt{20}}\right). \] We can visualize the situation by plotting the p.d.f. of the sampling distribution under \(H_0\) along with the location of our observed sample mean.
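A minimal sketch of this plot, assuming the 20 observed weights are stored in a vector called choc.batch (the name used later in this section):

```r
# Density of the sampling distribution of the mean under H0: N(40, (2/sqrt(20))^2),
# with the observed sample mean marked by a dashed line.
x <- seq(37, 43, length.out = 500)
plot(x, dnorm(x, mean = 40, sd = 2 / sqrt(20)), type = "l",
     xlab = "Sample mean weight (g)", ylab = "Density",
     main = "Sampling distribution of the mean under H0")
abline(v = mean(choc.batch), col = "red", lty = 2)  # observed sample mean
```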


6.2.2 Hypothesis Tests for Means

6.2.2.1 Known Standard Deviation

It is simple to carry out a hypothesis test in R (in fact, we already implicitly did this in the previous section). When we know the population standard deviation, we use a hypothesis test based on the standard normal distribution, known as a \(z\)-test. Here, let's assume \(\sigma_X = 2\) (because that is the standard deviation of the distribution we simulated from above) and specify the alternative hypothesis to be \[ H_A: \mu \neq 40. \] We will use the z.test() function from the BSDA package for our test, specifying the confidence level via conf.level, which is \(1 - \alpha = 1 - 0.05 = 0.95\):
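A sketch of the call, assuming choc.batch holds the 20 observed weights:

```r
library(BSDA)

# Two-sided z-test of H0: mu = 40 with known sigma = 2,
# at confidence level 1 - alpha = 0.95.
z.test(choc.batch, mu = 40, sigma.x = 2,
       alternative = "two.sided", conf.level = 0.95)
```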

6.2.2.2 Unknown Standard Deviation

If we do not know the population standard deviation, we typically use the t.test() function included in base R. We know that: \[\frac{\overline{X} - \mu}{\frac{s_x}{\sqrt{n}}} \sim t_{n-1},\] where \(t_{n-1}\) denotes Student’s \(t\) distribution with \(n - 1\) degrees of freedom. We only need to supply the confidence level here:
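Assuming the same choc.batch vector, the call might look like this:

```r
# One-sample t-test of H0: mu = 40 against the two-sided alternative;
# the standard deviation is now estimated from the sample.
t.test(choc.batch, mu = 40, conf.level = 0.95)
```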

We note that the \(p\)-value here (rounded to 4 decimal places) is 0.0031, so again we conclude it is unlikely that these bars are from Theo's. Even with a very small sample, the difference is large enough (and the standard deviation small enough) that the \(t\)-test can detect it.

6.2.3 Two-sample Tests

6.2.3.1 Unpooled Two-sample t-test

Now suppose we have two batches of chocolate bars, one of size 40 and one of size 45. We want to test whether they come from the same factory. However, we have no information about the distributions of the chocolate bar weights, so we cannot conduct a one-sample t-test as above: that would require some knowledge of \(\mu_0\), the population mean weight of the chocolate bars.

We will generate the samples from normal distributions with means 45 and 47, respectively. However, let's assume we do not know this information. The population standard deviations of the distributions we are sampling from are both 2, but we will assume we do not know that either. Let us denote the unknown true population means by \(\mu_1\) and \(\mu_2\).
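A sketch of the data-generating step; the object names and seed are illustrative, not the originals:

```r
set.seed(1)
batch1 <- rnorm(40, mean = 45, sd = 2)  # first batch, n = 40
batch2 <- rnorm(45, mean = 47, sd = 2)  # second batch, n = 45
```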


Consider the test \(H_0:\mu_1=\mu_2\) versus \(H_1:\mu_1\neq\mu_2\). We can use the R function t.test() again, since it handles two-sample as well as one-sample tests. t.test() assumes a two-sided alternative by default, so we do not have to specify that here.
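With the batch1 and batch2 vectors assumed above, the call is simply:

```r
# Unpooled (Welch) two-sample t-test of H0: mu1 = mu2 vs H1: mu1 != mu2.
# var.equal defaults to FALSE, so equal variances are not assumed.
t.test(batch1, batch2)
```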

The p-value is much less than .05, so we can quite confidently reject the null hypothesis. Indeed, we know from simulating the data that \(\mu_1\neq\mu_2\) , so our test led us to the correct conclusion!

Consider instead testing \(H_0:\mu_1=\mu_2\) versus \(H_1:\mu_1<\mu_2\).
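Using the same assumed vectors, the one-sided version specifies the alternative explicitly:

```r
# One-sided test: H1 is mu1 < mu2, i.e. the first batch has the smaller mean.
t.test(batch1, batch2, alternative = "less")
```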

As we would expect, this test also rejects the null hypothesis. One-sided tests are more common in practice as they provide a more principled description of the relationship between the datasets. For example, if you are comparing your new drug’s performance to a “gold standard”, you really only care if your drug’s performance is “better” (a one-sided alternative), and not that your drug’s performance is merely “different” (a two-sided alternative).

6.2.3.2 Pooled Two-sample t-test

Suppose you know that the samples come from distributions with the same standard deviation. Then it makes sense to carry out a pooled two-sample t-test. You specify this in the t.test() function as follows:
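With the same assumed batch1 and batch2 vectors:

```r
# Pooled two-sample t-test: var.equal = TRUE pools the two sample variances
# into a single estimate of the common variance.
t.test(batch1, batch2, var.equal = TRUE)
```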

6.2.3.3 Paired t-test

Suppose we take a batch of chocolate bars and stamp the Theo’s logo on them. We want to know if the stamping process significantly changes the weight of the chocolate bars. Let’s suppose that the true change in weight is distributed as a \({\cal N}(-0.3, 0.2^2)\) random variable:
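A sketch of the stamping step, assuming choc.batch holds the pre-stamping weights; the object name choc.after matches its use later in this section:

```r
# Each bar's weight change is drawn from N(-0.3, 0.2^2).
choc.after <- choc.batch + rnorm(length(choc.batch), mean = -0.3, sd = 0.2)
```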


Let \(\mu_1\) and \(\mu_2\) be the true means of the distributions of chocolate bar weights before and after the stamping process. Suppose we want to test \(H_0:\mu_1=\mu_2\) versus \(H_A:\mu_1\neq\mu_2\). We can use the R function t.test() for this by setting paired = TRUE, which indicates that we are looking at pairs of observations corresponding to the same experimental subject and testing whether or not the difference in the distribution means is zero.
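Assuming the choc.batch and choc.after vectors sketched above:

```r
# Paired t-test: each "after" weight is matched to the same bar's "before" weight.
t.test(choc.batch, choc.after, paired = TRUE)
```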

We can also perform the same test as a one-sample t-test on the differences choc.after - choc.batch.
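That is:

```r
# Equivalent one-sample t-test on the per-bar weight differences.
t.test(choc.after - choc.batch, mu = 0)
```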

Notice that we get the exact same \(p\)-value for these two tests.

Since the p-value is less than .05, we reject the null hypothesis at level .05. Hence, we have enough evidence in the data to claim that stamping a chocolate bar significantly reduces its weight.

6.2.4 Tests for Proportions

Let’s look at the proportion of Theo’s chocolate bars with a weight exceeding 38g:
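One way this might look in R; the vector name theos.bars is illustrative and assumed to hold a (simulated) sample of Theo's bar weights:

```r
# Empirical proportion of Theo's bars weighing more than 38 g.
mean(theos.bars > 38)

# Theoretical proportion under the N(40, 2^2) model for Theo's bars.
1 - pnorm(38, mean = 40, sd = 2)
```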

Going back to that first batch of 20 chocolate bars of unknown origin, let’s see if we can test whether they’re from Theo’s based on the proportion weighing > 38g.

Recall from our test on the means that we rejected the null hypothesis that the two batches have equal mean weight. In this case, a one-sided test is appropriate, and our hypotheses are:

Null hypothesis: \(H_0: p = 0.85\). Alternative: \(H_A: p > 0.85\).

We want to test this hypothesis at a level \(\alpha = 0.05\) .

In R, there is a function called prop.test() that you can use to perform tests for proportions. Note that prop.test() only gives you an approximate result.
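A sketch of the call, assuming choc.batch holds the 20 weights of the unknown batch:

```r
# Number of bars in the unknown batch weighing more than 38 g.
x <- sum(choc.batch > 38)

# Approximate one-sided test of H0: p = 0.85 vs HA: p > 0.85.
prop.test(x, n = 20, p = 0.85, alternative = "greater")
```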

Similarly, you can use the binom.test() function for an exact result.
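Using the same count x as above:

```r
# Exact one-sided binomial test of the same hypotheses.
binom.test(x, n = 20, p = 0.85, alternative = "greater")
```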

The \(p\)-value for both tests is around 0.18, which is much greater than 0.05. So, we cannot reject the hypothesis that the unknown bars come from Theo's. This is not because these tests are less accurate than the ones we ran before, but because we are testing a less sensitive measure: the proportion weighing > 38 grams, rather than the mean weight. Also, note that this doesn't mean that we can conclude that these bars do come from Theo's – why not?

The prop.test() function is the more versatile of the two: it can deal with contingency tables, a larger number of groups, and so on. The binom.test() function gives you exact results, but you can only apply it to one-sample questions.

6.2.5 Power

Let's think about when we reject the null hypothesis. We reject the null hypothesis if we observe data with a sufficiently small \(p\)-value. Equivalently, we can calculate the critical value: the cutoff such that we reject the null whenever we observe data more extreme than it.

Suppose we take a sample of chocolate bars of size n = 20, and our null hypothesis is that the bars come from Theo's (\(H_0\): mean = 40, sd = 2). Then, for a one-sided test (versus larger alternatives), we can calculate the critical value by using the quantile function in R, specifying the mean and sd of the sampling distribution of \(\overline X\) under \(H_0\):
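A minimal sketch of that calculation:

```r
n <- 20
# Critical value for a one-sided (greater) test at alpha = 0.05:
# the 95th percentile of the sampling distribution of the mean under H0.
crit <- qnorm(0.95, mean = 40, sd = 2 / sqrt(n))
crit
```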

Now suppose we want to calculate the power of our hypothesis test: the probability of rejecting the null hypothesis when the null hypothesis is false. In order to do so, we need to compare the null to a specific alternative, so we choose \(H_A\): mean = 42, sd = 2. Then the probability that we reject the null under this specific alternative is:
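Continuing from the crit value computed above:

```r
# Power against HA: mean = 42 is the probability that the sample mean
# exceeds the critical value when the true mean is 42.
1 - pnorm(crit, mean = 42, sd = 2 / sqrt(n))
```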

We can perform the same calculation in R using the power.z.test() function from the asbio package:
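A sketch of the call; the argument names below reflect my understanding of asbio::power.z.test() and should be checked against ?power.z.test:

```r
library(asbio)

# sigma = population sd, effect = |mu_A - mu_0| = 42 - 40 = 2, one-tailed test.
power.z.test(sigma = 2, n = 20, effect = 2, alpha = 0.05, test = "one.tail")
```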

Test a General Linear Hypothesis for a Regression Model

Description.

Test, print, or summarize a general linear hypothesis for a regression model

Tests the general linear hypothesis \(C\hat{\beta} = d\) for the regression model reg.

The test statistic is obtained from the formula: \[ F = \frac{(C\hat{\beta} - d)' \left( C (X'X)^{-1} C' \right)^{-1} (C\hat{\beta} - d) / r}{\mathrm{SSE} / (n - p)}, \] where \(r\) is the number of contrasts (rows) contained in \(C\) and \(n - p\) is the error degrees of freedom of the model. Under the null hypothesis, \(F\) follows an \(F\)-distribution with \(r\) and \(n - p\) degrees of freedom.
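An illustrative use of glh.test() from the gmodels package; the model and contrast matrix here are assumptions for demonstration, not taken from the original examples:

```r
library(gmodels)

fit <- lm(mpg ~ wt + hp, data = mtcars)   # coefficients: (Intercept), wt, hp

# One row of C per contrast; columns follow the order of coef(fit).
# Jointly test H0: beta_wt = 0 and beta_hp = 0 (d defaults to a zero vector).
C <- rbind(c(0, 1, 0),
           c(0, 0, 1))
glh.test(fit, C)
```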

The function returns an object of class c("glh.test", "htest").

When using treatment contrasts (the default), the first level of each factor is subsumed into the intercept term. The estimated model coefficients are then contrasts versus that first level. This should be taken into account when forming contrast matrices, particularly when computing contrasts that involve the 'baseline' level of a factor.

See the comparison with fit.contrast in the examples below.

Gregory R. Warnes [email protected]

R.H. Myers, Classical and Modern Regression with Applications, 2nd Ed, 1990, p. 105

fit.contrast(), estimable(), stats::contrasts()
