Unsolved

Finite-sample error-rate control and power guarantees under the linear subspace model

Sourced from the work of Amitay Eldar, Keren Mor Waknin, Samuel Davenport, Tamir Bendory, Armin Schwartzman, Yoel Shkolnisky

§ Problem Statement

Setup

Let $n,d,r\in\mathbb N$ with $r\le d$. Let $\mathcal S\subset\mathbb R^d$ be a known $r$-dimensional linear subspace, and fix an orthonormal basis matrix $U\in\mathbb R^{d\times r}$ so that $\mathcal S=\{Ua:a\in\mathbb R^r\}$. Observe a 1D signal $Y\in\mathbb R^n$ generated by

$$Y=\sum_{j=1}^K P_{\tau_j}x_j+\varepsilon,$$

where $K\ge 0$ is unknown, $\tau_j\in\{1,\dots,n-d+1\}$ are unknown object locations, $x_j\in\mathcal S\setminus\{0\}$ are unknown object shapes, $P_{\tau}$ inserts a length-$d$ vector into coordinates $\tau,\dots,\tau+d-1$ (zero elsewhere), and $\varepsilon\sim N(0,\sigma^2 I_n)$. Assume a non-overlap/separation condition such as $|\tau_j-\tau_\ell|\ge d$ for $j\ne \ell$, and a minimum signal strength condition $\|x_j\|_2\ge \mu_{\min}$ for all $j$.
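The generative model above can be made concrete with a small simulation. The following is a minimal sketch in pure Python, not code from the source paper: the function name `simulate_signal`, the row-list representation of $U$, and the argument names are all illustrative assumptions. It draws the noise, then adds each $x_j=Ua_j$ at its 1-based start position $\tau_j$, rejecting inputs that violate the range or separation conditions.

```python
import random


def simulate_signal(n, U, taus, coeffs, sigma, seed=0):
    """Draw Y = sum_j P_{tau_j} x_j + eps, with x_j = U a_j and eps ~ N(0, sigma^2 I_n).

    Hypothetical helper (not from the source paper).
    U      : d x r basis matrix as a list of d rows (columns assumed orthonormal)
    taus   : 1-based start positions tau_j in {1, ..., n - d + 1}
    coeffs : coefficient vectors a_j (length r), one per object
    """
    d, r = len(U), len(U[0])
    # Enforce the range constraint on every location.
    for tau in taus:
        if not 1 <= tau <= n - d + 1:
            raise ValueError(f"location {tau} outside 1..{n - d + 1}")
    # Enforce the separation condition |tau_j - tau_l| >= d.
    ordered = sorted(taus)
    for left, right in zip(ordered, ordered[1:]):
        if right - left < d:
            raise ValueError("objects overlap: separation condition violated")
    rng = random.Random(seed)
    # Start from pure Gaussian noise.
    y = [rng.gauss(0.0, sigma) for _ in range(n)]
    # Add each object x_j = U a_j at coordinates tau_j, ..., tau_j + d - 1.
    for tau, a in zip(taus, coeffs):
        for i in range(d):
            y[tau - 1 + i] += sum(U[i][k] * a[k] for k in range(r))
    return y
```

For example, with $d=3$, $r=2$, and $U$ formed by the first two standard basis vectors of $\mathbb R^3$, calling `simulate_signal(10, U, [2, 6], [[1, 2], [3, 0]], sigma=1.0)` places the shapes $(1,2,0)$ and $(3,0,0)$ at positions 2 and 6 on top of unit-variance noise.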

For each candidate location $t\in\{1,\dots,n-d+1\}$, define the local hypothesis

$$H_{0,t}:\text{ no object starts at }t \quad\text{vs}\quad H_{1,t}:\text{ an object }x\in\mathcal S\setminus\{0\}\text{ starts at }t.$$
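To illustrate what a local test at location $t$ might look at, the sketch below computes a generic projection-energy statistic $T_t=\|U^\top Y_{t:t+d-1}\|_2^2/\sigma^2$ for every window. This is an assumption chosen for illustration and is not claimed to be the statistic used by Eldar et al.; the name `projection_energy` is hypothetical. When a window contains only noise, $U^\top\varepsilon_{t:t+d-1}\sim N(0,\sigma^2 I_r)$ because $U$ has orthonormal columns, so $T_t\sim\chi^2_r$. A window that partially overlaps an object is not pure noise even though $H_{0,t}$ holds there, which is one reason finite-sample error control is delicate.

```python
def projection_energy(y, U, sigma):
    """Return [T_1, ..., T_{n-d+1}] with T_t = ||U^T y[t..t+d-1]||^2 / sigma^2.

    Illustrative statistic only (an assumption, not the source paper's procedure).
    U is a d x r matrix as a list of d rows with orthonormal columns.
    """
    d, r = len(U), len(U[0])
    stats = []
    for start in range(len(y) - d + 1):  # 0-based start; location t = start + 1
        window = y[start:start + d]
        # Coordinates of the window in the basis U (orthogonal projection onto S).
        proj = [sum(U[i][k] * window[i] for i in range(d)) for k in range(r)]
        stats.append(sum(p * p for p in proj) / sigma ** 2)
    return stats
```

The returned list is indexed from 0, so entry `stats[t - 1]` corresponds to candidate location $t$.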

Let $\mathcal T^\star=\{\tau_1,\dots,\tau_K\}$ be the true object set and $\widehat{\mathcal D}(Y)\subset\{1,\dots,n-d+1\}$ be the detection set returned by the specific procedure of Eldar-Mor Waknin-Davenport-Bendory-Schwartzman-Shkolnisky (with either its FWER-control or mFDR-control mode). Define $R=|\widehat{\mathcal D}(Y)|$, $V=|\widehat{\mathcal D}(Y)\setminus\mathcal T^\star|$, $\mathrm{FWER}=\Pr(V\ge 1)$, and

$$\mathrm{mFDR}=\frac{\mathbb E[V]}{\mathbb E[\max(R,1)]}.$$
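The error criteria can be estimated by Monte Carlo once a detector's output is available. The sketch below is an illustration under stated assumptions: the names `error_counts` and `empirical_rates` are hypothetical, detections are counted as true only on an exact match with an element of $\mathcal T^\star$ (as in the definition of $V$ above), and each trial is an independent replication. Note that the mFDR estimate is a ratio of averages, not an average of per-trial ratios.

```python
def error_counts(detected, truth):
    """Return (R, V): number of detections and number of false detections."""
    detected, truth = set(detected), set(truth)
    return len(detected), len(detected - truth)


def empirical_rates(trials):
    """Estimate (FWER, mFDR) from an iterable of (detected, truth) pairs.

    Hypothetical helper: FWER is the fraction of trials with V >= 1, and
    mFDR is mean(V) / mean(max(R, 1)), matching the definitions in the text.
    """
    counts = [error_counts(detected, truth) for detected, truth in trials]
    m = len(counts)
    fwer = sum(1 for _, v in counts if v >= 1) / m
    mean_v = sum(v for _, v in counts) / m
    mean_r1 = sum(max(r, 1) for r, _ in counts) / m
    return fwer, mean_v / mean_r1
```

Such estimates describe behavior at the simulated parameter values only; they do not substitute for the bounds requested below.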

Unsolved Problem

Problem 2024. Prove non-asymptotic, explicit finite-sample guarantees for this concrete procedure, namely bounds of the form

$$\mathrm{FWER}\le \alpha+\delta_n \quad\text{or}\quad \mathrm{mFDR}\le \alpha+\delta_n,$$

together with an explicit finite-sample power guarantee

$$\Pr\!\big(\mathcal T^\star\subseteq \widehat{\mathcal D}(Y)\big)\ge 1-\beta_n,$$

where $\delta_n,\beta_n$ are given quantitatively (ideally $\delta_n=0$) as functions of $(n,d,r,K,\sigma,\mu_{\min},\text{separation})$, under fully stated assumptions on the Gaussian noise law and signal geometry.
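For contrast with the theorem being asked for, the sketch below shows what simulation alone can deliver for the power event $\mathcal T^\star\subseteq\widehat{\mathcal D}(Y)$: a Monte Carlo estimate with a Hoeffding confidence interval. The function name `empirical_power` and the confidence parameter `eta` are illustrative assumptions. With $m$ independent replications, the estimate lies within $\sqrt{\log(2/\eta)/(2m)}$ of the true probability with probability at least $1-\eta$, but only at the single parameter setting that was simulated.

```python
import math


def empirical_power(trials, eta=0.05):
    """Estimate Pr(truth is a subset of detected) with a Hoeffding interval.

    Hypothetical helper. trials is a sequence of (detected, truth) pairs from
    independent replications; returns (estimate, lower, upper) at level 1 - eta.
    """
    m = len(trials)
    hits = sum(1 for detected, truth in trials if set(truth) <= set(detected))
    p_hat = hits / m
    # Hoeffding's inequality for the mean of m independent [0, 1]-valued variables.
    half_width = math.sqrt(math.log(2.0 / eta) / (2.0 * m))
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)
```

An explicit $\beta_n$ would instead hold uniformly over all configurations satisfying the stated assumptions, which no finite set of simulations can certify.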

See Eldar et al. (2024) for further context.

§ Significance & Implications

Solving this problem would close the gap between the asymptotic theory and practical operating regimes, in which cryo-EM datasets are finite and often highly noisy. Quantitative finite-sample guarantees are essential for principled parameter tuning and for comparing detection methods in realistic experimental settings. See Eldar et al. (2024) for details.

§ Known Partial Results

  • Eldar et al. (2024): According to the abstract, the method is asymptotically guaranteed to detect all objects while controlling FWER or mFDR. Numerical simulations indicate strong non-asymptotic behavior, but no explicit finite-sample theorem is stated in the abstract.

§ References

[1]

Object detection under the linear subspace model with application to cryo-EM images

Amitay Eldar, Keren Mor Waknin, Samuel Davenport, Tamir Bendory, Armin Schwartzman, Yoel Shkolnisky (2024)

Annals of Statistics (to appear)

📍 Section 5 (Conclusions and future research), paragraph outlining open problems on finite-sample error-rate control.

Source paper where this problem appears.
