Unsolved

Finite-sample optimal FDR-FNR frontier under the two-group model

Sourced from the work of Yutong Nie and Yihong Wu

§ Problem Statement

Setup

Fix a finite integer $n \ge 1$. For each $i \in \{1,\dots,n\}$, let $\theta_i \in \{0,1\}$ be the latent hypothesis state, where $\theta_i=0$ means null and $\theta_i=1$ means non-null. Assume $(\theta_i)_{i=1}^n$ are i.i.d. with $\mathbb P(\theta_i=1)=\pi_1 \in (0,1)$ and $\pi_0=1-\pi_1$. Conditional on $\theta_i$, the test statistic $X_i$ takes values in a measurable space $(\mathcal X,\mathcal A)$ and has distribution

$$X_i \mid (\theta_i=0) \sim P_0,\qquad X_i \mid (\theta_i=1) \sim P_1,$$

where $P_0,P_1$ are known and dominated by a common measure (densities $f_0,f_1$ may be used). Assume conditional independence across $i$: given $(\theta_1,\dots,\theta_n)$, the variables $X_1,\dots,X_n$ are independent.

This setup follows Nie & Wu (2023).

A (possibly compound, randomized) multiple-testing rule is any measurable map $\delta_n$ that, from $(X_1,\dots,X_n)$ and an auxiliary random seed independent of the data, outputs decisions $D_i \in \{0,1\}$ ($D_i=1$ means reject $H_i$). Define

$$R=\sum_{i=1}^n D_i,\quad V=\sum_{i=1}^n (1-\theta_i)D_i,\quad T=\sum_{i=1}^n \theta_i(1-D_i).$$

The false discovery rate and false non-discovery rate are

$$\mathrm{FDR}(\delta_n)=\mathbb E\!\left[\frac{V}{R\vee 1}\right],\qquad \mathrm{FNR}(\delta_n)=\mathbb E\!\left[\frac{T}{(n-R)\vee 1}\right],$$

where expectations are under the full two-group model (including rule randomization).
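
The definitions above can be checked numerically with a small Monte Carlo sketch. Everything specific in it is an assumption made for illustration and is not taken from the source: the Gaussian choice $P_0=N(0,1)$, $P_1=N(\mu,1)$, the function name `simulate_fdr_fnr`, and the simple separable rule $D_i=\mathbf 1\{X_i>\text{threshold}\}$, which is only one feasible candidate and is not claimed to be optimal.

```python
import random


def simulate_fdr_fnr(n, pi1, mu, threshold, n_rep=2000, seed=0):
    """Monte Carlo estimate of (FDR, FNR) for the rule D_i = 1{X_i > threshold}.

    Illustrative assumption: P0 = N(0, 1) and P1 = N(mu, 1).
    """
    rng = random.Random(seed)
    fdp_sum = 0.0  # running sum of V / (R v 1)
    fnp_sum = 0.0  # running sum of T / ((n - R) v 1)
    for _ in range(n_rep):
        R = V = T = 0
        for _ in range(n):
            theta = 1 if rng.random() < pi1 else 0   # latent state
            x = rng.gauss(mu * theta, 1.0)           # draw from P0 or P1
            d = 1 if x > threshold else 0            # decision D_i
            R += d
            V += (1 - theta) * d                     # false discovery
            T += theta * (1 - d)                     # false non-discovery
        fdp_sum += V / max(R, 1)
        fnp_sum += T / max(n - R, 1)
    return fdp_sum / n_rep, fnp_sum / n_rep


if __name__ == "__main__":
    fdr, fnr = simulate_fdr_fnr(n=20, pi1=0.2, mu=2.0, threshold=1.5)
    print(f"FDR ~ {fdr:.3f}, FNR ~ {fnr:.3f}")
```

Sweeping `threshold` traces an achievable (FDR, FNR) curve for this one family of rules, which upper-bounds the optimal frontier but need not coincide with it.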

For $\alpha\in(0,1)$, define the finite-sample constrained objective

$$\Psi_n(\alpha)=\inf_{\delta_n:\,\mathrm{FDR}(\delta_n)\le \alpha}\mathrm{FNR}(\delta_n).$$
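
As a sanity check on this definition, the degenerate case $n=1$ can be worked out by hand. The sketch below uses only the definitions above together with the Neyman-Pearson lemma; the symbol $\beta^\star$ is introduced here for illustration and is not notation from the source. With $n=1$ we have $R=D_1$, so $V/(R\vee 1)=(1-\theta_1)D_1$ and $T/((n-R)\vee 1)=\theta_1(1-D_1)$, which gives

$$\mathrm{FDR}(\delta_1)=\pi_0\,P_0(D_1=1),\qquad \mathrm{FNR}(\delta_1)=\pi_1\,P_1(D_1=0).$$

Writing $\beta^\star(t)=\inf\{P_1(D_1=0) : P_0(D_1=1)\le t\}$ for the smallest type II error of a (possibly randomized) level-$t$ test of $P_0$ against $P_1$, this yields

$$\Psi_1(\alpha)=\pi_1\,\beta^\star\!\big(\min\{\alpha/\pi_0,\,1\}\big),$$

attained by a randomized likelihood-ratio test based on $f_1/f_0$. For $n\ge 2$ the ratios $V/(R\vee 1)$ and $T/((n-R)\vee 1)$ couple the decisions through $R$, so this reduction to a single Neyman-Pearson problem no longer applies.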

Unsolved Problem

Determine $\Psi_n(\alpha)$ exactly (or derive non-asymptotic minimax-sharp upper and lower bounds) as an explicit function of $(n,\alpha,\pi_1,P_0,P_1)$, and characterize all optimal rules $\delta_n^\star$ that attain (or provably approximate sharply) this infimum, allowing fully compound and randomized procedures.

§ Significance & Implications

Nie and Wu's 2023 preprint establishes asymptotic limits as $n\to\infty$, but does not provide a complete, exact characterization of the finite-$n$ frontier. Treating the finite-sample frontier as an open objective is therefore the conservative reading; resolving it would quantify finite-vs-asymptotic gaps and guide practical procedure design.

§ Known Partial Results

  • Nie & Wu (2023): The cited work characterizes asymptotically optimal FDR-FNR tradeoffs under the two-group random-mixture model and shows that compound rules are necessary for asymptotic optimality (in contrast to the mFDR-mFNR setting). The exact finite-sample frontier characterization is treated as open.

§ References

[1]

Large-Scale Multiple Testing: Fundamental Limits of False Discovery Rate Control and Compound Oracle

Yutong Nie, Yihong Wu (2023)

arXiv preprint

📍 arXiv:2302.06809v3 PDF, Section 1.1 (Background and problem formulation), p. 3, immediately after Eq. (2) defining $FNR_n^*(\alpha)$: "it still remains open how to find the optimal decision rule to characterize the finite-sample tradeoff between FDR and FNR."

Primary preprint source for the asymptotic frontier and explicit finite-sample open-direction wording.

[2]

Large-Scale Multiple Testing: Fundamental Limits of False Discovery Rate Control and Compound Oracle

Yutong Nie, Yihong Wu (2026)

Annals of Statistics 54(1):232-264

📍 Project Euclid article metadata/citation page for Annals of Statistics, Vol. 54, No. 1 (2026), pp. 232-264.

Final journal publication metadata, kept separate from the 2023 preprint record.
