Multiple t-tests
Question: Say you are testing hundreds of hypotheses, each with a t-test. What considerations would you take into account when doing this?
Running many hypothesis tests increases the chance of Type I errors (false positives).
For example, with m tests at significance level α, we expect around m × α false positives just by chance if every null hypothesis is true.
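The arithmetic behind the false-positive count can be sketched as follows. The values m = 100 and alpha = 0.05 are illustrative assumptions, not figures taken from the answer, and the family-wise error rate formula additionally assumes the tests are independent.

```python
def expected_false_positives(m, alpha):
    """Expected number of false positives when all m null hypotheses are true."""
    return m * alpha


def familywise_error_rate(m, alpha):
    """Probability of at least one false positive across m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


# Illustrative values (assumed, not from the original answer).
m, alpha = 100, 0.05
print(expected_false_positives(m, alpha))  # expected count of false positives
print(familywise_error_rate(m, alpha))     # chance of at least one false positive
```

With these assumed values the expected count is 5, and the chance of at least one false positive is above 99%, which is why an uncorrected threshold is hard to defend at this scale.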
To control for this, we can apply the Bonferroni correction, where the significance threshold becomes α/m, with m the number of tests. While this reduces Type I error, it increases Type II error (false negatives), especially when many hypotheses are tested.
Bonferroni is best used when only a few comparisons are expected to be significant. Other corrections that control the False Discovery Rate (FDR), such as the Benjamini-Hochberg procedure, may balance sensitivity and specificity better.
I'd recommend adjusting the p-values because of the increased chance of Type I errors when conducting a large number of hypothesis tests. My recommended adjustment would be Benjamini-Hochberg (BH) over Bonferroni: BH strikes a balance between controlling false positives and maintaining statistical power, whereas Bonferroni, while it does control false positives, is overly conservative and leads to a higher chance of missing true effects (high Type II error).
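To make the comparison concrete, here is a small sketch of the BH step-up procedure next to Bonferroni. The function names, the target FDR level q = 0.05, and the sample p-values are assumptions made for illustration. In practice a library routine such as `multipletests` in statsmodels would likely be used instead of hand-written code.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a null only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, q=0.05):
    """BH step-up: find the largest rank k with p_(k) <= k * q / m,
    then reject the k smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            max_k = rank
    reject = [False] * m
    for idx in order[:max_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values from eight t-tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(sum(bonferroni_reject(p_values)))          # number rejected by Bonferroni
print(sum(benjamini_hochberg_reject(p_values)))  # number rejected by BH
```

On this made-up input BH rejects two hypotheses while Bonferroni rejects only one, which illustrates the extra power BH offers in exchange for controlling the false discovery rate rather than the family-wise error rate.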