For most of my professional life I have thought that “statistically significant” results are marked by a p-value of less than 0.05, and that, if the test is used correctly, everything is cool and dandy and can be published. Well… “if the test is used correctly” is the important consideration here, and it always has been.
The recent warning about the use of p-values by the American Statistical Association (ASA) made things a bit more difficult. The warning was issued because so many misinterpretations had been published in high-impact journals. In my view, it triggered some reasonable reactions and some over-reactions. Some journals apparently no longer even accept papers containing p-values. Digging a little deeper, we find that The Royal Society, as early as 2014, recommended not using the word “significant”. What now?
Let’s put things into perspective: Working in the field of medical device technology, we know that a medical diagnosis should never be based on a single number. In particular, if a single number crosses a threshold, it is not the end of the world. If a patient’s diastolic blood pressure was 89 mmHg yesterday and is 90 mmHg today, it does not mean that the patient was healthy yesterday and that she is sick and needs treatment today. Applying the same principle to the p-value means carefully considering the circumstances, carefully describing the methods used, and carefully using and interpreting the p-value itself. And that is all the ASA warning tells us: Apply your methods carefully, consider the circumstances, and disclose what you did.
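To make the threshold point concrete, here is a minimal Python sketch. It assumes a simple two-sided z-test, and the two test statistics are hypothetical values chosen for illustration, not results from any real study. Like the 89 and 90 mmHg readings, they are practically indistinguishable, yet they land on opposite sides of 0.05.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic z."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2.0))


ALPHA = 0.05  # the conventional threshold discussed above

# Hypothetical test statistics from two nearly identical experiments.
for z in (1.95, 1.97):
    p = two_sided_p(z)
    side = "below" if p < ALPHA else "above"
    print(f"z = {z:.2f} -> p = {p:.4f} ({side} {ALPHA})")
```

Nothing meaningful separates the two experiments, which is why the methods and circumstances deserve more attention than which side of the line the number falls on.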
Personally, I am now more careful when it comes to the word “significant”, but I still use it. I am likewise more careful with p-values, but I keep using them.