The Seventh Annual P-Hacking Championship concluded last weekend with record attendance, record prize money, and, according to the organizing committee, “record statistical power, though we’re still running the analysis.” Three thousand researchers from 47 countries gathered in a convention center whose name the organizers declined to disclose “for reproducibility reasons” to compete in a 72-hour endurance event centered on the extraction of significance from inherently null datasets.
This year’s datasets were distributed at the opening ceremony in sealed envelopes, each containing a comma-separated file of 500 observations drawn from a standard normal distribution. Contestants were permitted to use any statistical technique available in R, Python, SPSS, or “creative Excel formatting.” Judges evaluated entries on p-value achieved, elegance of the methodological justification, and what the official rubric called “confidence of abstract language,” which in practice meant avoiding hedging phrases such as “may suggest” or “is consistent with” and replacing them with “demonstrates” and “proves conclusively.”
This year’s winner, a third-year doctoral student who asked to be identified only as “Participant 1847,” achieved a p-value of 0.0499 by subsetting the data to 43 observations with even-numbered row indices, applying a log transformation, comparing means across an undocumented grouping variable discovered through exploratory analysis, and running 312 tests over 67 hours before one crossed the threshold. Her submitted abstract described the result as a “robust and replicable finding,” a characterization that earned full marks for confidence from the judges. When asked whether she had corrected for multiple comparisons, she said she “considered it,” which the rulebook classifies as sufficient.
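For readers who would rather do the arithmetic than "consider it," here is a minimal sketch of why 312 tests on null data all but guarantee a result below 0.05. It assumes the tests are independent, which tests run on the same 500 observations are not, so the figure is a rough illustration and not a reconstruction of the winning entry. The function names are hypothetical, and Bonferroni is only one of several possible corrections.

```python
# Illustrative only: assumes independent tests, each with a true null hypothesis.

def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n_tests null tests yields p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test cutoff that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_tests = 312  # the number of tests reported for the winning entry
    fwer = familywise_error_rate(n_tests)
    cutoff = bonferroni_threshold(n_tests)
    print(f"Chance of at least one p < 0.05 in {n_tests} tests: {fwer:.7f}")
    print(f"Bonferroni-corrected per-test threshold: {cutoff:.6f}")
```

Under these assumptions, a p-value of 0.0499 falls well short of the corrected threshold, which may explain the rulebook's preference for consideration over correction.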
Several entrants were disqualified for achieving significance too early, which the organizing committee considers “suspicious.” One team was eliminated for submitting a preregistered design, which violates competition rules on grounds of sportsmanship. The runner-up, who produced a p-value of 0.0487 using a technique he described as “you don’t want to know,” received a cash prize and publication in this journal pending peer review. Reviewers have already been assigned; all three have previously competed in the championship.
Comments coming soon. In the meantime, please direct all grievances to Reviewer #2.