How to minimize errors and maximize results in your hypothesis tests
Knowing how to set up and conduct a hypothesis test is a critical skill for any aspiring data scientist. Making sense of alpha, beta, power, and type I and II errors can feel confusing at first. My goal in this article is to help you build intuition and provide some visual references.
First, let’s envision setting up a standard A/B experiment where the A group is the control and B is the experimental group. Our null hypothesis is that the two groups are equal and the change applied to group B did not have a significant effect (A = B). Our alternative hypothesis is that the two groups are not the same and that the change applied to group B did, in fact, cause a significant difference (A ≠ B). We could visualize the sampling distributions to look something like this:
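The A/B setup above can be sketched in code. This is a minimal example, not a full analysis: the group sizes, means, and spread below are made-up numbers, and a two-sample z-statistic stands in for whichever test fits your data.

```python
import math
import random
import statistics

random.seed(42)

# Hypothetical metric values for each group (sizes, means, and spread are assumptions).
a = [random.gauss(10.0, 2.0) for _ in range(500)]  # A: control
b = [random.gauss(10.5, 2.0) for _ in range(500)]  # B: experimental

# Two-sample z-statistic for H0: A = B versus H1: A != B (two-sided).
se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
z = (statistics.mean(b) - statistics.mean(a)) / se

# Two-sided p-value from the standard normal CDF.
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

The p-value produced here is what gets compared against the alpha threshold discussed next: a small p-value is evidence against the null hypothesis.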
The confidence level (CL) refers to how sure we want to be before we reject the null hypothesis: how confident we need to be before concluding that the experiment had a significant impact and implementing the changes from group B. It is chosen beforehand and is expressed as a probability. Do we want to be 95% sure before rejecting the null? Maybe we need to be 99% sure. The right confidence level depends on your test and on how serious the consequences of being wrong would be. Generally, the standard starting confidence level is 95% (0.95).
The alpha value is defined as 1 − CL. If the confidence level is 0.95, then alpha is 0.05, or 5%. This represents the probability that we are willing to reject the null hypothesis when it is actually true. In other words, with an alpha of 5% we are willing to live with a 5% chance of concluding that there is a difference when there really isn’t. Making an error like this is called a false positive, or type I error. Let’s look at our picture again to get a visual intuition.