An Examination of Common Misconceptions about Statistical Significance and How to Interpret P-Values Correctly

Introduction

Suppose you have passed an examination. Statistical significance helps you understand whether you passed by luck or because of your knowledge; that is the basic idea, and we will explore it in more detail below. Now suppose you were confident you would pass: the P-value measures how likely a result at least as extreme as yours would be if only chance were at work. In this blog, we will examine common misconceptions about statistical significance and learn how to interpret P-values correctly.

Define statistical significance

A set of observed data is said to be statistically significant if it is unlikely to have arisen by chance alone, so the result can be attributed to an underlying explanation rather than random variation. Statistical significance is crucial for academics and practitioners in fields such as economics, finance, investing, medicine, physics, and biology.

Statistical significance is often described as strong or weak. When examining a data set and performing the relevant tests to determine whether one or more factors affect an outcome, strong statistical significance gives confidence that the results are real and not the product of chance or luck.

There is some room for error when calculating statistical significance (significance testing). Even when data appear strongly correlated, researchers must consider the possibility that the apparent correlation is the result of sampling error or pure chance.

Because larger samples are less likely to be flukes, sample size is an essential factor in the statistical significance of an observation. Only representative samples that were selected at random should be used for significance testing. The threshold at which one agrees that an event is statistically significant is called the significance level.

Figure 1: Statistical significance


Common misconceptions about statistical significance

These misconceptions not only reduce the quality of the research but also cause researchers many problems while conducting it. The common misconceptions about statistical significance are described below:

P-Hacking 

Statistical results can be taken at face value only when every decision made during data analysis was carried out exactly as planned and documented as part of the experimental design. My interactions with scientists lead me to believe that basic-research publications frequently violate this guideline. The pattern looks like this: gather and examine some data first. If the results do not reach statistical significance but reveal a difference or trend in the desired direction, collect additional data and run the analysis again. Alternatively, try a different method of data analysis: eliminate a few outliers, log-transform the data, switch to a nonparametric test, or normalize the results. The options are virtually limitless. Continue until you get a statistically significant result, or run out of money or time.
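To see why this pattern is dangerous, here is a small Python simulation (an illustrative sketch with made-up parameters, not drawn from any real study) of the "collect more data until significant" habit. The null hypothesis is true by construction, yet peeking at the p-value after every batch pushes the false positive rate well above the nominal 5%:

```python
import math
import random

def z_test_p(sample):
    """Two-sided z-test p-value for mean == 0, assuming a known sd of 1."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))

def peeking_experiment(rng, batch=10, max_n=50):
    """Collect data in batches, re-testing after each batch (the p-hacking pattern)."""
    data = []
    while len(data) < max_n:
        data += [rng.gauss(0, 1) for _ in range(batch)]
        if z_test_p(data) < 0.05:
            return True   # declared "significant" even though the null is true
    return False

rng = random.Random(42)
hits = sum(peeking_experiment(rng) for _ in range(10_000))
print(f"false positive rate with peeking: {hits / 10_000:.3f}")
```

With five "looks" at the data per experiment, the observed false positive rate lands well above the 5% that a single pre-planned test would give.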

P values provide information on the size of the effect 

To calculate a P value, you must first specify the null hypothesis, often that two means are equal. The P value represents the likelihood that, if the null hypothesis were true, an effect would be as large as or larger than the one you observed in the current experiment. Note, however, that the P value does not tell you the size of the difference (or effect).
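A quick numeric sketch makes this concrete. Using a two-sided z-test with a known standard deviation (purely illustrative numbers, not from any real experiment), a tiny effect in a huge sample and a large effect in a small sample can both come out highly "significant":

```python
import math

def two_sided_p(mean_diff, n, sd=1.0):
    """Two-sided z-test p-value for an observed mean difference from 0, known sd."""
    z = mean_diff / (sd / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

# A tiny effect with a huge sample...
print(two_sided_p(0.01, n=200_000))
# ...and a large effect with a small sample both yield very small p-values.
print(two_sided_p(0.70, n=32))
```

Both calls print p-values far below 0.001, yet the underlying effects differ by a factor of 70. The p-value alone cannot distinguish them.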

In experimental research, statistical hypothesis testing and reports of "statistical significance" are required

Statistical hypothesis testing turns a single investigation into a clear decision. You take one action if the result is deemed "statistically significant", that is, if the P value is less than a predetermined threshold (typically 0.05). Otherwise, the outcome is termed "not statistically significant" and you choose another course of action. This is useful in much clinical research and in quality control. It is also helpful when you carefully assess how well two realistic scientific models fit your data and select one to guide data interpretation and experiment planning.

Figure 2: Forms of statistical significance


Define P-value

The probability value, or p-value, indicates how likely your observed data would be if the null hypothesis were true. It is found by determining the probability of the test statistic, which is the result of a statistical analysis of your data.

If the null hypothesis of your statistical test were true, the p-value indicates how frequently you could expect to see a test statistic as extreme as, or more extreme than, the one your test produced. The p-value decreases as the test statistic computed from your data deviates further from the distribution of test statistics predicted by the null hypothesis.
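This definition can be illustrated directly by Monte Carlo simulation (a pedagogical sketch with arbitrary settings: the test statistic here is the absolute sample mean, and the null distribution is standard normal): simulate many data sets under the null hypothesis and count how often the statistic is at least as extreme as the observed one.

```python
import random
import statistics

def simulated_p_value(observed_stat, rng, n=20, trials=10_000):
    """Monte-Carlo sketch of a p-value: the fraction of null-hypothesis
    simulations whose test statistic (|sample mean|) is at least as
    extreme as the observed one."""
    extreme = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        if abs(statistics.fmean(sample)) >= abs(observed_stat):
            extreme += 1
    return extreme / trials

rng = random.Random(0)
print(simulated_p_value(0.60, rng))  # a mean of 0.60 from n=20 is rare under the null
```

An observed mean of 0.60 yields a small p-value (the statistic rarely gets that extreme by chance), while a mean of 0.05 yields a large one, exactly as the definition above predicts.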

Figure 3: P-value

Interpreting P-values correctly

Before interpreting the P-value, let us consider its limitations and the other factors that can help us interpret it correctly.

Limitations of p-value

  • The justification of the specified comparison (i.e., whether the compared groups were comparable to begin with) cannot be determined by a p-value. This basic requirement must be ensured by the study design. For instance, stratified or confounder-adjusted observational studies can somewhat imitate randomized controlled experiments in pursuing the ideal of unbiased comparability between study groups.

  • The appropriateness of the statistical test that was chosen is disregarded by a p-value. The proper option depends on the information being evaluated, the sample size, the notion of comparison, and the outcome format—all of which must be verified during critical appraisal.


Consideration of other factors

The other factors that are to be taken into consideration are effect size, precision, and prior probability which are described below:

Effect size

The effect size reveals the strength of the association between variables or the magnitude of the difference between groups. It shows how a research finding applies in the real world. A large effect size indicates a finding of practical importance, whereas a small effect size suggests only minor practical ramifications.

While statistical significance demonstrates the existence of an effect in a study, practical significance demonstrates that the effect is significant enough to have real-world implications. P values stand for statistical significance whereas effect sizes stand for practical importance.
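One widely used effect-size measure is Cohen's d, the difference between two group means standardized by their pooled standard deviation. Here is a minimal sketch with made-up measurements (the data are purely illustrative):

```python
import math

def cohens_d(group_a, group_b):
    """Cohen's d: standardized mean difference using the pooled sd."""
    na, nb = len(group_a), len(group_b)
    ma = sum(group_a) / na
    mb = sum(group_b) / nb
    va = sum((x - ma) ** 2 for x in group_a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in group_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled_sd

a = [5.1, 5.3, 4.9, 5.2, 5.0]  # hypothetical treatment group
b = [4.6, 4.8, 4.5, 4.9, 4.7]  # hypothetical control group
print(f"Cohen's d = {cohens_d(a, b):.2f}")
```

By common rules of thumb, d around 0.2 is small, 0.5 medium, and 0.8 or above large; the effect size speaks to practical importance in a way the p-value cannot.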

Precision

Precision is the ability of a set of measurements to be highly repeatable, or the quality of an estimate with minimal random error. It is to be contrasted with accuracy, which is the quality of being close to a target or true value.

In a slightly more specific sense, precision is a measure of the dispersion of the observations, whether or not the mean value around which the dispersion is measured is close to the "actual" value. Precision is thus a property of a set of measurements: the ability of repeated observations to agree with one another.
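The contrast between precision and accuracy can be shown with two hypothetical sets of repeated measurements of a quantity whose true value is 10.0 (the numbers are invented for illustration):

```python
import statistics

# Hypothetical repeated measurements; the true value is 10.0.
precise_but_biased = [9.01, 9.02, 8.99, 9.00, 9.01]  # low spread, far from truth
accurate_but_noisy = [10.4, 9.6, 10.2, 9.8, 10.1]    # close to truth, high spread

for name, xs in [("precise_but_biased", precise_but_biased),
                 ("accurate_but_noisy", accurate_but_noisy)]:
    print(name, "mean:", round(statistics.fmean(xs), 2),
          "sd:", round(statistics.stdev(xs), 3))
```

The first set is precise (tiny standard deviation) but inaccurate (its mean sits near 9.0), while the second is accurate on average but imprecise.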

Prior probability

In Bayesian statistics, the prior probability is the probability assigned to an event before fresh data are gathered. It is the most logical estimate of the likelihood of an outcome, based on the information available before an experiment.

As additional information or data becomes available, the prior probability of an event is updated to yield a more precise measure of a probable outcome. This refined value, the posterior probability, is computed using Bayes' theorem.
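The update from prior to posterior can be sketched in a few lines. The screening-test numbers below (1% prior, 95% sensitivity, 5% false positive rate) are invented for illustration, not taken from any real test:

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' theorem: P(H | positive evidence) from a prior and the
    probabilities of the evidence under H and under not-H."""
    p_evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_evidence

# A 1% prior probability, updated after observing positive evidence:
print(f"posterior = {posterior(0.01, 0.95, 0.05):.3f}")
```

Even strong evidence moves a 1% prior only to about a 16% posterior, which is why the prior probability matters so much when interpreting a "significant" result.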


Now, let us come to interpreting the P-value itself. The main points are described below:

  • In scientific parlance, a P value is the probability, assuming the null hypothesis is true, of observing an effect at least as strong as the one in your sample data.

  • Assume, for instance, that a vaccine study had a P value of 0.04. This P value means that, even if the vaccine had no effect, random sampling error would produce the observed difference, or a larger one, in 4% of studies.

  • P values answer only one question: given a true null hypothesis, how likely are your data? They do not assess how well the alternative hypothesis is supported, which is the root of many of the misconceptions discussed above.

Figure 4: Null hypothesis in P-value


Conclusion

Finally, we can conclude that statistical significance helps us interpret p values correctly only if the limitations of p values and the other factors discussed above are taken into account. We at Regent Statistics can help you conduct statistical significance testing using P-values at an affordable cost.
