Unsolved

Uniform risk dominance of transformed Moore-Penrose estimators

Sourced from the work of Taras Bodnar, Nestor Parolya

§ Problem Statement

Setup

For each $n\in\mathbb{N}$, let $p_n\in\mathbb{N}$ satisfy $p_n/n\to c$ for some constant $c\in(1,\infty)$. Observe independent random vectors $X_{1,n},\dots,X_{n,n}\in\mathbb{R}^{p_n}$ of the form $X_{i,n}=\Sigma_n^{1/2}Z_{i,n}$, where $\Sigma_n\in\mathbb{R}^{p_n\times p_n}$ is symmetric positive definite, $\mathbb{E}[Z_{i,n}]=0$, $\mathbb{E}[Z_{i,n}Z_{i,n}^\top]=I_{p_n}$, and moments are uniformly bounded (for example, $\sup_{n,i,j}\mathbb{E}|(Z_{i,n})_j|^{4+\delta}<\infty$ for some $\delta>0$). Define

$$S_n=\frac1n\sum_{i=1}^n X_{i,n}X_{i,n}^\top.$$

If $p_n>n$, then $\operatorname{rank}(S_n)\le n<p_n$ deterministically, so $S_n$ is singular for every sample realization; let $S_n^\dagger$ denote its Moore-Penrose pseudoinverse.
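The rank deficiency described above can be checked numerically. The following is a minimal sketch, assuming Gaussian entries for $Z_{i,n}$ and a hypothetical diagonal $\Sigma_n$ with eigenvalues in $[1,3]$; neither choice comes from the source, and the sizes `n` and `p` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50  # p > n, so p/n = 2.5 plays the role of c > 1

# Hypothetical covariance (assumption): diagonal with eigenvalues in [1, 3].
sigma_eigs = np.linspace(1.0, 3.0, p)

# Rows of Z are Z_i^T; Gaussian entries are an assumption (the setup only
# requires mean zero, identity covariance, and bounded moments).
Z = rng.standard_normal((n, p))
X = Z * np.sqrt(sigma_eigs)  # rows are X_i^T = (Sigma^{1/2} Z_i)^T

# Sample covariance without centering, as in the definition of S_n.
S = X.T @ X / n

# Moore-Penrose pseudoinverse; rcond is set explicitly so that numerically
# zero eigenvalues are not inverted.
S_pinv = np.linalg.pinv(S, rcond=1e-10, hermitian=True)

rank_S = np.linalg.matrix_rank(S, hermitian=True)
print("rank of S_n:", rank_S, "dimension p_n:", p)
```

Because $S_n$ is a sum of $n$ rank-one matrices, the reported rank cannot exceed `n`, which is why an ordinary inverse is unavailable in this regime.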

This setup follows Bodnar & Parolya (2024).

Let $\mathcal{C}$ be a prescribed spectral class of covariance sequences $\Sigma=(\Sigma_n)_{n\ge1}$, e.g. eigenvalues uniformly bounded away from $0$ and $\infty$: there exist constants $0<m<M<\infty$ such that $m\le\lambda_{\min}(\Sigma_n)\le\lambda_{\max}(\Sigma_n)\le M$ for all $n$. For any estimator $\widehat\Theta_n$ of $\Sigma_n^{-1}$, define the Frobenius risk

$$R_n(\widehat\Theta_n;\Sigma_n)=\mathbb{E}_{\Sigma_n}\!\left[\|\widehat\Theta_n-\Sigma_n^{-1}\|_F^2\right].$$
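The Frobenius risk is an expectation over the data, so at a fixed $n$ it can be approximated by Monte Carlo. The sketch below assumes Gaussian $Z_{i,n}$; the helper names `sample_cov` and `mc_frobenius_risk` are hypothetical, and an estimator is passed in as a function of $S_n$.

```python
import numpy as np

def sample_cov(sigma_sqrt, n, rng):
    """Draw n Gaussian vectors (assumption) and return S_n = (1/n) sum X_i X_i^T."""
    p = sigma_sqrt.shape[0]
    Z = rng.standard_normal((n, p))
    # sigma_sqrt is symmetric, so row i of X is (Sigma^{1/2} Z_i)^T.
    X = Z @ sigma_sqrt
    return X.T @ X / n

def mc_frobenius_risk(estimator, sigma, n, reps=200, seed=0):
    """Monte Carlo average of ||estimator(S_n) - Sigma^{-1}||_F^2 over reps draws."""
    rng = np.random.default_rng(seed)
    # Symmetric square root and inverse of Sigma via its eigendecomposition.
    w, V = np.linalg.eigh(sigma)
    sigma_sqrt = (V * np.sqrt(w)) @ V.T
    precision = (V / w) @ V.T
    losses = []
    for _ in range(reps):
        S = sample_cov(sigma_sqrt, n, rng)
        losses.append(np.linalg.norm(estimator(S) - precision, "fro") ** 2)
    return float(np.mean(losses))

# Example: risk of the untransformed pseudoinverse for Sigma = I with p = 30, n = 12.
plain_pinv = lambda S: np.linalg.pinv(S, rcond=1e-10, hermitian=True)
risk_plain = mc_frobenius_risk(plain_pinv, np.eye(30), n=12, reps=50)
```

A quick sanity check on such a helper is the zero estimator, whose risk is exactly $\|\Sigma_n^{-1}\|_F^2$ regardless of the data.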

Fix a benchmark class $\mathcal{B}$ of measurable estimators $\widehat\Theta_n^{(b)}$, $b\in\mathcal{B}$ (e.g. ridge/linear-shrinkage families such as $(S_n+\lambda_n I_{p_n})^{-1}$ or $\alpha_n S_n^\dagger+\beta_n I_{p_n}$, with deterministic or data-driven tuning). A transformed Moore-Penrose estimator is of the form

$$\widehat\Theta_n=T_n(S_n^\dagger),$$

with measurable data-driven $T_n$.
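The two example benchmark families can be written down directly. The sketch below also computes oracle coefficients for the linear family by least squares in the Frobenius inner product. These coefficients use the unknown $\Sigma_n^{-1}$, so they are not data-driven and are not the estimator of Bodnar and Parolya; they only illustrate the per-sample loss that a data-driven choice of $(\alpha_n,\beta_n)$ would aim to approach. All function names, the Gaussian data, and the diagonal covariance are assumptions made for illustration.

```python
import numpy as np

def ridge_inverse(S, lam):
    """Ridge-type inverse (S + lam * I)^{-1} with lam > 0."""
    return np.linalg.inv(S + lam * np.eye(S.shape[0]))

def linear_shrinkage_pinv(S, alpha, beta):
    """Linear shrinkage alpha * S^+ + beta * I of the Moore-Penrose inverse."""
    A = np.linalg.pinv(S, rcond=1e-10, hermitian=True)
    return alpha * A + beta * np.eye(S.shape[0])

def oracle_coefficients(S, precision):
    """Least-squares (alpha, beta) minimising ||alpha*S^+ + beta*I - precision||_F^2.

    Uses the true precision matrix, so this is an oracle, not an estimator.
    """
    A = np.linalg.pinv(S, rcond=1e-10, hermitian=True)
    p = S.shape[0]
    # Normal equations in the Frobenius inner product <U, V> = tr(U^T V).
    gram = np.array([[np.trace(A @ A), np.trace(A)],
                     [np.trace(A), float(p)]])
    rhs = np.array([np.trace(A @ precision), np.trace(precision)])
    alpha, beta = np.linalg.solve(gram, rhs)
    return alpha, beta

# Illustrative data: Gaussian, diagonal Sigma with eigenvalues in [1, 3].
rng = np.random.default_rng(1)
n, p = 20, 50
eigs = np.linspace(1.0, 3.0, p)
X = rng.standard_normal((n, p)) * np.sqrt(eigs)
S = X.T @ X / n
precision = np.diag(1.0 / eigs)

def loss(theta):
    return np.linalg.norm(theta - precision, "fro") ** 2

alpha, beta = oracle_coefficients(S, precision)
loss_oracle = loss(linear_shrinkage_pinv(S, alpha, beta))
loss_plain = loss(linear_shrinkage_pinv(S, 1.0, 0.0))  # untransformed S^+
loss_ridge = loss(ridge_inverse(S, 1.0))                # hypothetical lam = 1
```

By construction the oracle loss is no larger than the loss of any other $(\alpha,\beta)$ pair on the same sample; nothing in this sketch orders it relative to the ridge family.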

Source-established result (Bodnar and Parolya, arXiv:2403.15792v2): asymptotic trace-moment formulas are derived and used to construct specific fully data-driven shrinkage estimators, which are shown to be asymptotically optimal under quadratic loss within those constructions.

Unsolved Problem

Determine whether there exists $(T_n)$ such that

$$\limsup_{n\to\infty}\ \sup_{\Sigma\in\mathcal{C}}\ \sup_{b\in\mathcal{B}} \Big\{R_n\!\big(T_n(S_n^\dagger);\Sigma_n\big)-R_n\!\big(\widehat\Theta_n^{(b)};\Sigma_n\big)\Big\}\le 0.$$

Equivalently: can a fully data-driven transformation of $S_n^\dagger$ uniformly match or beat every estimator in $\mathcal{B}$ over $\mathcal{C}$ when $p_n/n\to c>1$? This strengthened uniform-domination statement remains open.
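To make the quantity inside the $\limsup$ concrete, the following sketch evaluates a finite-sample proxy: the worst Monte Carlo risk gap over a small finite set of covariances and a small finite grid of ridge benchmarks. It assumes Gaussian data and diagonal covariances, uses a hypothetical fixed candidate `pinv_shrink(0.1, 0.5)` in place of a data-driven $T_n$, and says nothing about the asymptotic or uniform statement itself.

```python
import numpy as np

def risk(estimator, eigs, n, reps, seed):
    """Monte Carlo Frobenius risk for Sigma = diag(eigs) with Gaussian data (assumption)."""
    rng = np.random.default_rng(seed)
    precision = np.diag(1.0 / eigs)
    total = 0.0
    for _ in range(reps):
        X = rng.standard_normal((n, eigs.size)) * np.sqrt(eigs)
        S = X.T @ X / n
        total += np.linalg.norm(estimator(S) - precision, "fro") ** 2
    return total / reps

def pinv_shrink(alpha, beta):
    """Transformed Moore-Penrose estimator alpha * S^+ + beta * I with fixed weights."""
    return lambda S: (alpha * np.linalg.pinv(S, rcond=1e-10, hermitian=True)
                      + beta * np.eye(S.shape[0]))

def ridge(lam):
    """Ridge-type benchmark (S + lam * I)^{-1}."""
    return lambda S: np.linalg.inv(S + lam * np.eye(S.shape[0]))

n, p = 10, 25
covariances = [np.ones(p), np.linspace(0.5, 2.0, p)]  # finite stand-in for the class C
benchmarks = [ridge(lam) for lam in (0.5, 1.0, 2.0)]   # finite stand-in for the class B
candidate = pinv_shrink(0.1, 0.5)                      # hypothetical, not data-driven

# Same seed on both sides, so each gap is computed on common random draws.
worst_gap = max(
    risk(candidate, eigs, n, reps=50, seed=0) - risk(b, eigs, n, reps=50, seed=0)
    for eigs in covariances
    for b in benchmarks
)
print("worst finite-sample risk gap:", worst_gap)
```

A non-positive `worst_gap` would mean the candidate matched or beat every listed benchmark on every listed covariance at this one $(n,p)$; the open problem asks for this in the limit, uniformly over the full classes.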

§ Significance & Implications

The statement that a properly transformed Moore-Penrose estimator "seems" to perform similarly to or better than existing benchmarks appears in the Linkoping University seminar abstract for this work [2]; it is an empirical observation there, not a proved theorem. Turning that empirical/heuristic claim into a uniform asymptotic dominance theorem would clarify when pseudo-inverse-based precision estimation is provably preferable in high dimensions.

§ Known Partial Results

  • Bodnar & Parolya (2024, arXiv:2403.15792v2) derive high-dimensional asymptotics for weighted trace moments (using partial exponential Bell polynomials) and construct data-driven shrinkage estimators with asymptotic quadratic-loss optimality for their specified targets. These results support strong practical performance, but they do not by themselves establish uniform risk dominance over an arbitrary benchmark class $\mathcal{B}$.

§ References

[1]

Reviving pseudo-inverses: Asymptotic properties of large dimensional Moore-Penrose and Ridge-type inverses with applications

Taras Bodnar, Nestor Parolya (2024)

arXiv preprint

📍 arXiv:2403.15792v2 (version-specific source for the technical asymptotic and shrinkage results).

Primary technical source; citation is explicitly version-specific (v2).

[2]

Seminarier i matematisk statistik (Resurrecting pseudo-inverses: Asymptotic properties of large dimensional Moore-Penrose and Ridge-type inverses with applications)

Taras Bodnar (2023)

Linkoping University seminar webpage

📍 Seminar abstract text under the listed talk title (sentence containing 'it seems that its proper transformation (shrinkage) performs similarly to or even outperforms the existing benchmarks ...').

Verifiable location of the 'it seems ... outperforms the existing benchmarks' wording used for significance context.
