Whether you’re in high school or in an MBA program, hypothesis testing can be a confusing concept to grasp. In this article, we break down hypothesis testing in statistics and show you how it works step by step.
How to Conduct Hypothesis Testing in Statistics
Step 1: Formulate the null and alternative hypotheses
The null hypothesis, often denoted H0, is the hypothesis that states there is no difference between two measured phenomena or no association between two variables.
The alternative hypothesis, denoted H1 or Ha, is the hypothesis that states there is a difference between two measured phenomena or an association between two variables. Formulating hypotheses can be tricky and may require some thinking.
Consider this example:
Does playing video games have any effect on grades?
The null hypothesis might be “Playing video games has no effect on grades,” while the alternative might be “Playing video games has an effect on grades.”
Either way, these are just examples of possible formulations for each side of the argument.
Once you have decided on your formulation, write both hypotheses out clearly in words so that others can understand them as well.
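If you prefer symbols, the video game example could also be written in notation. This is only a sketch: it assumes that an “effect on grades” means a difference in average grades, and the symbols for the mean grade of students who play video games and of students who do not are introduced here purely for illustration.

```latex
H_0:\ \mu_{\text{games}} = \mu_{\text{no games}}
\qquad\text{versus}\qquad
H_1:\ \mu_{\text{games}} \neq \mu_{\text{no games}}
```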
Step 2: State the assumptions necessary for you to carry out your test
In order for a hypothesis test to work, certain assumptions must be made about the population from which the sample was drawn.
Common assumptions are that the values in the population follow a normal distribution, that observations are independent of one another, and that there is no selection bias in how the sample was chosen. You also need to know whether the samples you are comparing are dependent (for example, the same people measured twice) or independent.
It can be difficult to know all of these things when designing experiments, but there are ways to reduce risks when carrying out tests. For instance, researchers use control groups when they don’t know everything about their populations beforehand.
A control group is a group whose members were assigned at random and were not exposed to the treatment or experimental variable that the other participants received.
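Random assignment to a control group can be sketched in a few lines of Python. This is a minimal illustration and not a full experimental design: the function name assign_groups, the participant IDs, and the seed are all hypothetical choices made for this example.

```python
import random

def assign_groups(participants, seed=None):
    """Randomly split participants into a treatment group and a control group."""
    rng = random.Random(seed)      # a fixed seed makes the split reproducible
    shuffled = list(participants)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2      # with an odd count, control gets the extra person
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs, invented for this example.
participants = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = assign_groups(participants, seed=42)

print("Treatment:", sorted(treatment))
print("Control:  ", sorted(control))
```

Because chance alone decides who lands in which group, any difference that existed before the experiment should be spread roughly evenly across both groups.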
There are many more steps involved in performing a good experiment than simply stating hypotheses; making sure your assumptions are sound is crucial if you want accurate results.
Step 3: Select which test you will use
Many common hypothesis tests fall into one of three categories: single-sample t-tests, paired-samples t-tests, and nonparametric methods.
Single-sample t-tests compare the mean of a single sample to a hypothesized population mean to see if the sample is significantly different from the hypothesized mean.
Paired-samples t-tests compare two measurements taken on the same subjects, or on matched pairs of subjects, under different treatments, whereas nonparametric methods use techniques such as rank-order and sign tests instead of traditional parametric methods like t-tests.
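To make the two t-tests concrete, here is a rough sketch that computes their test statistics using only Python’s standard library. All of the scores are invented for illustration, and the function names are our own. The two critical values are the usual two-sided 0.05 entries from a t-table for 9 and 4 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a single-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t = (mean - hypothesized_mean) / (sd / math.sqrt(n))
    return t, n - 1

def paired_t(before, after):
    """A paired-samples t-test is a single-sample t-test on the differences."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, 0.0)

# Two-sided critical values at the 0.05 level from a standard t-table,
# keyed by degrees of freedom (only the two needed below).
T_CRITICAL_05 = {4: 2.776, 9: 2.262}

# Hypothetical data: ten test scores compared with a hypothesized mean of 70.
scores = [72, 75, 78, 80, 69, 74, 77, 81, 73, 76]
t_one, df_one = one_sample_t(scores, 70)
print(f"single-sample: t = {t_one:.3f}, df = {df_one}, "
      f"reject H0: {abs(t_one) > T_CRITICAL_05[df_one]}")

# Hypothetical data: the same five students measured before and after tutoring.
before = [70, 68, 75, 80, 72]
after = [74, 70, 79, 83, 75]
t_pair, df_pair = paired_t(before, after)
print(f"paired:        t = {t_pair:.3f}, df = {df_pair}, "
      f"reject H0: {abs(t_pair) > T_CRITICAL_05[df_pair]}")
```

Note that the paired test simply reuses the single-sample function on the before-and-after differences, which is why pairing only makes sense when each value in one list has a natural partner in the other.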
When deciding which test to use, it’s important to consider your data set. If each unit is measured twice (for example, before and after a treatment) or the units are matched in pairs, then a paired-samples t-test would be appropriate. If you have one sample and want to compare its mean with a known or hypothesized value, then you should go with a single-sample t-test.
On the other hand, if there are many statistical assumptions that you cannot validate, then you should use nonparametric methods. Nonparametric methods allow for distributions that do not follow parametric criteria. They usually work with the ranks or order of the observations rather than their raw values.
For example, the Friedman test is a nonparametric method used to compare three or more treatments measured on the same subjects. It ranks the values within each subject and then compares the rank totals across treatments.
Step 4: State your decision rule based on your decision criteria
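Your decision rule states, before you run the test, which results will lead you to reject the null hypothesis.
For example, you might decide that if a P-value is less than 0.05, then you will reject your null hypothesis, or you might decide that if a chi-square statistic is greater than the critical value for your significance level and degrees of freedom (about 3.84 at the 0.05 level with one degree of freedom), then you will reject your null hypothesis.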
Again, this process depends on the nature of the research question.
Remember, once you’ve determined how to proceed, always state why you chose your decision rule.
Decision rules are specific to the researcher and the type of problem being studied.
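A decision rule is easy to write down as code, which also forces you to fix it before you see the data. The sketch below is illustrative only: the function names are our own, the 0.05 level is just the conventional default, and the chi-square critical values are the standard upper-tail table entries for one to three degrees of freedom.

```python
# Upper-tail chi-square critical values at the 0.05 level, from a standard
# table, keyed by degrees of freedom (only 1 to 3 are included here).
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}

def decide_by_p_value(p_value, alpha=0.05):
    """Reject H0 when the P-value falls below the significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"

def decide_by_chi_square(statistic, df):
    """Reject H0 when the chi-square statistic exceeds its critical value."""
    critical = CHI2_CRITICAL_05[df]
    return "reject H0" if statistic > critical else "fail to reject H0"

print(decide_by_p_value(0.03))           # P-value below 0.05
print(decide_by_chi_square(2.7, df=1))   # statistic below 3.841
```

Both rules lead to one of only two outcomes. Failing to reject the null hypothesis is not the same as proving it true; it only means the evidence was not strong enough.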
For example, if you are interested in determining if there is a correlation between drinking tea and being healthy, then you could hypothesize that people who drink tea are healthier than people who don’t drink tea.
Then, you could run an experiment where you gather 100 adults and split them into two equally sized groups: one half drinks tea every day for 1 month and the other half does not drink tea at all during that time period. After 1 month has passed, both groups take a health survey to determine if there is any difference in health outcomes between the two groups.
The idea behind this experiment is to find out if drinking tea improves health or not.
The P-value of this study tells us how likely it would be to see a difference at least as large as the one we observed if drinking tea had no effect at all. A small P-value means the result would be unlikely to occur by chance alone, which gives us evidence against the null hypothesis.
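One way to see what “by chance alone” means is a permutation test: shuffle the group labels many times and count how often the shuffled difference is at least as large as the real one. The article does not prescribe this method; it is used here only because it can be written with the standard library. The survey scores are invented, and the groups are much smaller than the 50 people per group described above.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    # Add one to both counts so the estimate can never be exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical health survey scores (higher means healthier).
tea = [78, 82, 75, 90, 85, 80, 88, 79]
no_tea = [72, 70, 76, 68, 74, 71, 77, 69]

p_value = permutation_p_value(tea, no_tea)
print(f"difference in means: {mean(tea) - mean(no_tea):.2f}")
print(f"estimated P-value:   {p_value:.4f}")
```

If the labels did not matter, shuffling them would often produce a gap as large as the real one and the P-value would be large. When almost no shuffle matches the real gap, the P-value is small.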
Step 5: Conduct your test and interpret the results
Hypothesis testing in statistics differs depending on the situation.
The best choice of test depends on three factors: whether your sampled units really belong to the population you want to draw conclusions about, how many samples you are comparing and whether they are of equal size, and which assumptions you can reasonably make about your data set.
Once you select a test, it’s important to state why you made that decision. For example, if you are analyzing a dataset that is clearly skewed or contains extreme values, then a nonparametric analysis may be more appropriate.
This is because, in some situations, you need to work around the assumption that your data follow a particular parametric distribution, such as the normal distribution. An example of this is a study looking at poverty and income levels.
If everyone in your sample lives in the same geographical location and has a similar income, the data may be close enough to normal for a parametric test. If you were looking at people living in different parts of the world, where wealth is much more variable, incomes would likely be heavily skewed and you would need to account for that.
Nonparametric analyses often rely on rankings or ordinal scores instead of quantitative scores.
For example, rather than using statistical tests like t-tests or ANOVA, which work with the numerical values themselves, a nonparametric analysis might use the Mann-Whitney U test, which uses ranks.
To run it, the scores from both groups are pooled and ranked from lowest to highest, so the lowest score gets rank 1 and the highest score gets the last rank. The Mann-Whitney U test then determines whether the ranks in Group A tend to be significantly higher or lower than the ranks in Group B.
It gives one final statistic called U, which is calculated from the rank sums of the two groups (some software reports the equivalent Wilcoxon rank-sum statistic, W, instead). There are also many other types of tests outside of those described here, including Friedman’s test and the Kolmogorov-Smirnov test.
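The ranking behind the Mann-Whitney U test can be sketched by hand in Python. This example only computes the U statistic for each group and does not turn it into a P-value. The data and function names are invented for illustration, and tied scores share the average of the ranks they cover, which is the usual convention.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) to highest; ties share an average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U for group A, U for group B) from the pooled ranks."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a  # the two U values always add up to n_a * n_b
    return u_a, u_b

# Hypothetical scores for two small groups.
group_a = [3, 5, 8]
group_b = [1, 2, 4, 6]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(f"U for Group A = {u_a}, U for Group B = {u_b}")
```

Here U for Group A counts the pairs in which a Group A score beats a Group B score. A value close to zero or close to the total number of pairs suggests that one group tends to score higher than the other.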
Hypothesis Testing in Statistics: Final Remarks
Effective hypothesis testing in statistics starts with a well-defined problem. Is your goal to answer “Does A cause B?” or “Is there a relationship between A and B?” You must know the point of your investigation in order to design the correct test.
Depending on your goals, you can then select the appropriate test.
Ultimately, choosing the right test depends on your available resources and what information you have access to. If you are unsure of what type of test should be used, consult a professional statistician who will be able to guide you through the process as well as advise whether or not your chosen test is suitable for the task at hand.