Unsolved

Finite-sample error-rate control and power guarantees under the linear subspace model

Mathematical Statistics · Learning Theory · Information Theory

Sourced from the work of Amitay Eldar, Keren Mor Waknin, Samuel Davenport, Tamir Bendory, Armin Schwartzman, Yoel Shkolnisky

§ Problem Statement

Setup

Let $n,d,r\in\mathbb N$ with $r\le d$. Let $\mathcal S\subset\mathbb R^d$ be a known $r$-dimensional linear subspace, and fix an orthonormal basis matrix $U\in\mathbb R^{d\times r}$ so that $\mathcal S=\{Ua:a\in\mathbb R^r\}$. Observe a 1D signal $Y\in\mathbb R^n$ generated by

$$Y=\sum_{j=1}^K P_{\tau_j}x_j+\varepsilon,$$

where $K\ge 0$ is unknown, $\tau_j\in\{1,\dots,n-d+1\}$ are unknown object locations, $x_j\in\mathcal S\setminus\{0\}$ are unknown object shapes, $P_{\tau}$ inserts a length-$d$ vector into coordinates $\tau,\dots,\tau+d-1$ (zero elsewhere), and $\varepsilon\sim N(0,\sigma^2 I_n)$. Assume a non-overlap/separation condition such as $|\tau_j-\tau_\ell|\ge d$ for $j\ne \ell$, and a minimum signal strength condition $\|x_j\|_2\ge \mu_{\min}$ for all $j$.
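To make the observation model concrete, the following is a minimal simulation sketch using only the Python standard library. The function names (`orthonormal_basis`, `simulate`), the random choice of $U$, and the way shapes and locations are drawn are illustrative assumptions, not part of the source; indices are 0-based, so locations range over $0,\dots,n-d$.

```python
import math
import random

def orthonormal_basis(d, r, rng):
    """Gram-Schmidt on random Gaussian vectors: returns the r columns of U (hypothetical helper)."""
    basis = []
    while len(basis) < r:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for u in basis:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # skip (measure-zero) degenerate draws
            basis.append([a / norm for a in v])
    return basis

def simulate(n, d, r, K, sigma, mu_min, seed=0):
    """Draw Y = sum_j P_{tau_j} x_j + eps with |tau_j - tau_l| >= d and ||x_j|| >= mu_min."""
    if not (1 <= r <= d <= n) or K * d > n or mu_min <= 0:
        raise ValueError("need 1 <= r <= d <= n, K*d <= n, mu_min > 0")
    rng = random.Random(seed)
    U = orthonormal_basis(d, r, rng)
    # Separated starts: pick K distinct slots, then spread them by (d - 1) each.
    m = n - d + 1
    picks = sorted(rng.sample(range(m - (K - 1) * (d - 1)), K))
    taus = [s + j * (d - 1) for j, s in enumerate(picks)]
    # Gaussian noise eps ~ N(0, sigma^2 I_n).
    Y = [rng.gauss(0.0, sigma) for _ in range(n)]
    shapes = []
    for tau in taus:
        # x = U a with ||x|| = ||a|| in [mu_min, 2 * mu_min) (assumed shape law).
        a = [rng.gauss(0.0, 1.0) for _ in range(r)]
        scale = mu_min * (1.0 + rng.random()) / math.sqrt(sum(c * c for c in a))
        x = [sum(scale * a[k] * U[k][i] for k in range(r)) for i in range(d)]
        shapes.append(x)
        for i in range(d):  # P_tau inserts x into coordinates tau .. tau + d - 1
            Y[tau + i] += x[i]
    return Y, taus, shapes, U
```

For example, `simulate(200, 10, 3, 5, 1.0, 4.0)` returns a length-200 signal with five non-overlapping objects, each of norm at least 4.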

For each candidate location $t\in\{1,\dots,n-d+1\}$, define the local hypothesis

$$H_{0,t}:\text{ no object starts at }t \quad\text{vs}\quad H_{1,t}:\text{ an object }x\in\mathcal S\setminus\{0\}\text{ starts at }t.$$

Let $\mathcal T^\star=\{\tau_1,\dots,\tau_K\}$ be the set of true object locations and let $\widehat{\mathcal D}(Y)\subset\{1,\dots,n-d+1\}$ be the detection set returned by the specific procedure of Eldar et al. (2024), in either its FWER-control or its mFDR-control mode. Define $R=|\widehat{\mathcal D}(Y)|$, $V=|\widehat{\mathcal D}(Y)\setminus\mathcal T^\star|$, $\mathrm{FWER}=\Pr(V\ge 1)$, and

$$\mathrm{mFDR}=\frac{\mathbb E[V]}{\mathbb E[\max(R,1)]}.$$
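As a sketch of how these quantities could be estimated in a simulation study, the snippet below computes $R$ and $V$ for one run and forms Monte Carlo estimates of FWER, mFDR, and the all-objects-detected probability over many runs. The function names and the input layout (pairs of detection and truth lists) are assumptions made for illustration; the detector itself is not implemented here.

```python
def error_counts(detections, truth):
    """Return (R, V, covered) for one run, matching locations exactly."""
    D, T = set(detections), set(truth)
    R = len(D)          # number of detections
    V = len(D - T)      # detections that are not true object locations
    covered = T <= D    # True when every true location was detected
    return R, V, covered

def monte_carlo_rates(runs):
    """runs: iterable of (detections, truth) pairs from independent replications."""
    n_runs = sum_v = sum_rmax = any_false = all_found = 0
    for detections, truth in runs:
        R, V, covered = error_counts(detections, truth)
        n_runs += 1
        sum_v += V
        sum_rmax += max(R, 1)
        any_false += V >= 1
        all_found += covered
    if n_runs == 0:
        raise ValueError("need at least one run")
    return {
        "FWER": any_false / n_runs,   # estimate of Pr(V >= 1)
        "mFDR": sum_v / sum_rmax,     # estimate of E[V] / E[max(R, 1)]
        "power": all_found / n_runs,  # estimate of Pr(T* subset of D)
    }
```

Note that the mFDR estimate is a ratio of sample means, mirroring the ratio of expectations in the definition, rather than a mean of per-run ratios (which would estimate the FDR instead).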

Unsolved Problem

Problem 2024. Prove non-asymptotic, explicit finite-sample guarantees for this concrete procedure, namely bounds of the form

$$\mathrm{FWER}\le \alpha+\delta_n \quad\text{or}\quad \mathrm{mFDR}\le \alpha+\delta_n,$$

together with an explicit finite-sample power guarantee

$$\Pr\!\big(\mathcal T^\star\subseteq \widehat{\mathcal D}(Y)\big)\ge 1-\beta_n,$$

where $\delta_n,\beta_n$ are given quantitatively (ideally $\delta_n=0$) as functions of $(n,d,r,K,\sigma,\mu_{\min},\text{separation})$, under fully stated assumptions on the Gaussian noise law and signal geometry.
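For orientation only, here is the kind of exact finite-sample statement that is available for a much simpler baseline, which is not the procedure of Eldar et al. (2024). The statistic $T_t$, the threshold $c_\alpha$, and the assumption that $\sigma$ is known are illustrative choices that do not appear in the source; $Y_{t:t+d-1}$ denotes coordinates $t,\dots,t+d-1$ of $Y$. Under the global null $K=0$, orthonormality of $U$ gives $U^\top Y_{t:t+d-1}\sim N(0,\sigma^2 I_r)$, so a union bound yields

$$T_t=\sigma^{-2}\big\|U^\top Y_{t:t+d-1}\big\|_2^2\sim\chi^2_r,\qquad \Pr\Big(\max_{1\le t\le n-d+1}T_t>c_\alpha\Big)\le (n-d+1)\,\Pr\big(\chi^2_r>c_\alpha\big)=\alpha \quad\text{when}\quad \Pr\big(\chi^2_r>c_\alpha\big)=\frac{\alpha}{n-d+1}.$$

At a true location $\tau_j$, the separation condition means the window contains exactly $x_j$ plus noise, and $\|U^\top x_j\|_2=\|x_j\|_2$ because $x_j\in\mathcal S$. Hence $T_{\tau_j}$ is noncentral $\chi^2_r$ with noncentrality $\lambda_j=\|x_j\|_2^2/\sigma^2\ge\mu_{\min}^2/\sigma^2$, and

$$\Pr\big(\exists j:\ T_{\tau_j}\le c_\alpha\big)\le K\,\Pr\big(\chi^2_r(\mu_{\min}^2/\sigma^2)\le c_\alpha\big).$$

This baseline does not settle the problem: when $K\ge 1$, windows that only partially overlap an object can also exceed $c_\alpha$, so plain thresholding does not control $V$ as defined above. The open question concerns the actual procedure and the general case.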

See Eldar et al. (2024) for further context.

§ Significance & Implications

Resolving this problem would close the gap between the asymptotic theory and practical operating regimes, where cryo-EM datasets are finite and often highly noisy. Quantitative finite-sample guarantees are essential for principled parameter tuning and for comparing detection methods in realistic experimental settings. See Eldar et al. (2024) for details.

§ Known Partial Results

Eldar et al. (2024): According to the abstract, the method is asymptotically guaranteed to detect all objects while controlling FWER or mFDR. Numerical simulations indicate strong non-asymptotic behavior, but no explicit finite-sample theorem is stated in the abstract.

§ References

[1] Amitay Eldar, Keren Mor Waknin, Samuel Davenport, Tamir Bendory, Armin Schwartzman, Yoel Shkolnisky (2024). Object detection under the linear subspace model with application to cryo-EM images. Annals of Statistics (to appear).

📍 Section 5 (Conclusions and future research), paragraph outlining open problems on finite-sample error-rate control.

Source paper where this problem appears.


§ Tags

finite-sample-theory fdr-control fwer-control multiple-testing linear-subspace-model cryo-em
