Multiple Comparisons

Multiple Comparisons

While you’re waiting

What is the expected number of false positives if you conduct 20 hypothesis tests, each at a significance level of \(\alpha = .05\)?

Said another way, if in reality nothing is going on in all 20 tests, what’s the expected number of tests that would detect something going on?

1

Making Paper

strength hard time pressure
196.6 2 3hrs 400
196.0 2 3hrs 400
198.5 4 3hrs 400
197.2 4 3hrs 400
197.5 8 3hrs 400
196.6 8 3hrs 400
197.7 2 3hrs 500
196.0 2 3hrs 500
196.0 4 3hrs 500
196.9 4 3hrs 500
195.6 8 3hrs 500
196.2 8 3hrs 500
199.8 2 3hrs 650
199.4 2 3hrs 650
198.4 4 3hrs 650
197.6 4 3hrs 650
197.4 8 3hrs 650
198.1 8 3hrs 650
198.4 2 4hrs 400
198.6 2 4hrs 400
197.5 4 4hrs 400
198.1 4 4hrs 400
197.6 8 4hrs 400
198.4 8 4hrs 400
199.6 2 4hrs 500
200.4 2 4hrs 500
198.7 4 4hrs 500
198.0 4 4hrs 500
197.0 8 4hrs 500
197.8 8 4hrs 500
200.6 2 4hrs 650
200.9 2 4hrs 650
199.6 4 4hrs 650
199.0 4 4hrs 650
198.5 8 4hrs 650
199.8 8 4hrs 650

EDA

Three way ANOVA

anova <- aov(strength ~ time * pressure * hard, data = paper)
anova_tab <- summary(anova)
anova_tab
                   Df Sum Sq Mean Sq F value   Pr(>F)    
time                1 20.250  20.250  55.395 6.75e-07 ***
pressure            2 19.374   9.687  26.499 4.33e-06 ***
hard                2  7.764   3.882  10.619   0.0009 ***
time:pressure       2  2.195   1.097   3.002   0.0750 .  
time:hard           2  2.082   1.041   2.847   0.0843 .  
pressure:hard       4  6.091   1.523   4.166   0.0146 *  
time:pressure:hard  4  1.973   0.493   1.350   0.2903    
Residuals          18  6.580   0.366                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

How many hypothesis tests are being conducted in this ANOVA table?

7

Bonferroni

Original p-values:

anova_results <- data.frame(estimate = row.names(anova_tab[[1]])[1:7],
                            p_vals = anova_tab[[1]]$`Pr(>F)`[1:7])
anova_results
            estimate       p_vals
1 time               6.745340e-07
2 pressure           4.327241e-06
3 hard               8.995614e-04
4 time:pressure      7.495643e-02
5 time:hard          8.425969e-02
6 pressure:hard      1.462624e-02
7 time:pressure:hard 2.903053e-01

Bonferroni

Adjust p-values by multiplying them by \(K\).

anova_results <- anova_results |>
  mutate(bon_adjusted = pmin(p_vals * 7, 1))
anova_results
            estimate       p_vals bon_adjusted
1 time               6.745340e-07 4.721738e-06
2 pressure           4.327241e-06 3.029069e-05
3 hard               8.995614e-04 6.296930e-03
4 time:pressure      7.495643e-02 5.246950e-01
5 time:hard          8.425969e-02 5.898178e-01
6 pressure:hard      1.462624e-02 1.023837e-01
7 time:pressure:hard 2.903053e-01 1.000000e+00

Holm’s Procedure

Original sorted p-values:

sorted_pvals <- sort(anova_results$p_vals)
sorted_pvals
[1] 6.745340e-07 4.327241e-06 8.995614e-04 1.462624e-02 7.495643e-02
[6] 8.425969e-02 2.903053e-01

Holm’s Procedure

Working from smallest to largest p-values (call the rank \(j\)), adjust p-values to max of \(K - j + 1\) times the original p-value or the max of previous p-values.

holm <- rep(NA, length(sorted_pvals)) # initialize
previous_max <- sorted_pvals[1]
K <- length(sorted_pvals)
for (j in 1:K) {
  holm[j] <- max(sorted_pvals[j] * (K - j + 1), previous_max, na.rm = TRUE)
  holm[j] <- max(sorted_pvals[j] * (K - j + 1), previous_max, na.rm = TRUE)
  previous_max <- max(holm, na.rm = TRUE)
}

Holm’s Procedure

anova_results <- anova_results |>
  arrange(p_vals) |>
  mutate(holm)
anova_results
            estimate       p_vals bon_adjusted         holm
1 time               6.745340e-07 4.721738e-06 4.721738e-06
2 pressure           4.327241e-06 3.029069e-05 2.596345e-05
3 hard               8.995614e-04 6.296930e-03 4.497807e-03
4 pressure:hard      1.462624e-02 1.023837e-01 5.850495e-02
5 time:pressure      7.495643e-02 5.246950e-01 2.248693e-01
6 time:hard          8.425969e-02 5.898178e-01 2.248693e-01
7 time:pressure:hard 2.903053e-01 1.000000e+00 2.903053e-01

R Shortcut

p.adjust() can adjust p-values using many different methods including:

  • Bonferroni and Holm
  • False discovery rate: Benjamini & Hochberg, Benjamini & Yekutieli
sapply(c("none", "holm", "bonferroni"), p.adjust, p = sorted_pvals)
             none         holm   bonferroni
[1,] 6.745340e-07 4.721738e-06 4.721738e-06
[2,] 4.327241e-06 2.596345e-05 3.029069e-05
[3,] 8.995614e-04 4.497807e-03 6.296930e-03
[4,] 1.462624e-02 5.850495e-02 1.023837e-01
[5,] 7.495643e-02 2.248693e-01 5.246950e-01
[6,] 8.425969e-02 2.248693e-01 5.898178e-01
[7,] 2.903053e-01 2.903053e-01 1.000000e+00