Non-asymptotic oracle inequalities for fully data-driven generalized-inverse shrinkage
Sourced from the work of Taras Bodnar, Nestor Parolya
§ Problem Statement
Setup
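Let \(x_1,\dots,x_n\in\mathbb{R}^p\) be i.i.d. with \(\mathbb{E}[x_i]=0\) and
\[ \operatorname{Cov}(x_i)=\Sigma_n, \]
where \(\Sigma_n\) is symmetric positive definite. Work in the high-dimensional regime with \(p>n\). Assume a uniform moment bound for some \(\varepsilon>0\), e.g.
\[ \sup_{n}\max_{1\le j\le p}\mathbb{E}\bigl|(\Sigma_n^{-1/2}x_1)_j\bigr|^{4+\varepsilon}<\infty. \]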
(If one imposes stronger tails such as sub-Gaussianity, treat that as a deliberate strengthening.)
This setup follows Bodnar & Parolya (2024).
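To make the setup concrete, the following minimal sketch simulates mean-zero data with more variables than observations and heavier-than-Gaussian tails. Everything specific in it is an assumption for illustration, not taken from the source: the moment condition is read as being of order \(4+\varepsilon\), the entries are Student-t with 5 degrees of freedom (which has finite moments of every order below 5), the covariance is diagonal, and `simulate_sample` is a hypothetical helper name.

```python
import numpy as np

def simulate_sample(n, p, sigma_diag, df=5.0, seed=0):
    """Return an (n, p) matrix with i.i.d. mean-zero rows and covariance diag(sigma_diag).

    Entries are Student-t with `df` degrees of freedom, rescaled to unit
    variance; t_df has finite moments of every order below df.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_t(df, size=(n, p)) / np.sqrt(df / (df - 2.0))
    return z * np.sqrt(sigma_diag)  # scales column j by sqrt(sigma_diag[j])

n, p = 20, 40                          # high-dimensional regime: p > n
sigma_diag = np.linspace(0.5, 2.0, p)  # illustrative diagonal covariance
X = simulate_sample(n, p, sigma_diag)
S = X.T @ X / n                        # p x p sample covariance (mean known to be zero)
rank_S = np.linalg.matrix_rank(S)      # at most n, so S is singular when p > n
```

Because the rank of the sample covariance cannot exceed the number of observations, the ordinary inverse does not exist here, which is why generalized inverses enter the problem.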
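Define the sample covariance matrix
\[ S_n=\frac{1}{n}\sum_{i=1}^{n}x_i x_i^\top, \]
its Moore-Penrose inverse \(S_n^\dagger\), and ridge inverse \(G_n(\lambda)=(S_n+\lambda I_p)^{-1}\) for \(\lambda>0\). Let \(\Theta_n=\Sigma_n^{-1}\) denote the population precision matrix and fix a deterministic symmetric target \(T_n\) with bounded operator norm. For \(\alpha\in[0,1]\) and \(\lambda>0\), define
\[ \widehat\Theta_n^{\mathrm{MP}}(\alpha)=\alpha S_n^\dagger+(1-\alpha)T_n,\qquad \widehat\Theta_n^{\mathrm{R}}(\alpha,\lambda)=\alpha G_n(\lambda)+(1-\alpha)T_n. \]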
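The two shrinkage families can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the source's implementation: the function names, the Gaussian test data, the identity target, and the explicit `rcond` cutoff passed to `np.linalg.pinv` are all assumptions made here. The sample covariance is taken as the uncentered second-moment matrix, which presumes mean-zero data.

```python
import numpy as np

def sample_covariance(X):
    """Uncentered sample covariance X^T X / n for an (n, p) mean-zero data matrix."""
    return X.T @ X / X.shape[0]

def mp_shrinkage(S, T, alpha, rcond=1e-10):
    """alpha * S^dagger + (1 - alpha) * T, with S^dagger the Moore-Penrose inverse."""
    # Explicit cutoff so that numerically-zero eigenvalues of the
    # rank-deficient S are not inverted.
    S_dagger = np.linalg.pinv(S, rcond=rcond, hermitian=True)
    return alpha * S_dagger + (1.0 - alpha) * T

def ridge_shrinkage(S, T, alpha, lam):
    """alpha * (S + lam * I)^{-1} + (1 - alpha) * T, for lam > 0."""
    G = np.linalg.inv(S + lam * np.eye(S.shape[0]))
    return alpha * G + (1.0 - alpha) * T

rng = np.random.default_rng(0)
n, p = 20, 50                        # p > n, so S is singular
X = rng.standard_normal((n, p))      # illustrative data with identity covariance
S = sample_covariance(X)
T = np.eye(p)                        # illustrative deterministic symmetric target
theta_mp = mp_shrinkage(S, T, alpha=0.5)
theta_r = ridge_shrinkage(S, T, alpha=0.5, lam=0.1)
```

Both estimators are symmetric whenever the target is, and setting `alpha=0` returns the target unchanged, so the target acts as the fallback when the generalized inverse is unreliable.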
Unsolved Problem
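With loss \(L_n(\widehat\Theta,\Theta_n)\) and risk \(R_n(\widehat\Theta)=\mathbb{E}\,L_n(\widehat\Theta,\Theta_n)\), seek fully data-driven selectors \(\hat\alpha_{\mathrm{MP}}\) and \((\hat\alpha_{\mathrm{R}},\hat\lambda)\) (measurable in the sample, no population oracle input) such that finite-sample oracle inequalities hold:
\[ R_n\bigl(\widehat\Theta_n^{\mathrm{MP}}(\hat\alpha_{\mathrm{MP}})\bigr)\le C_{\mathrm{MP}}\inf_{\alpha\in[0,1]}R_n\bigl(\widehat\Theta_n^{\mathrm{MP}}(\alpha)\bigr)+r_n^{\mathrm{MP}} \]
and
\[ R_n\bigl(\widehat\Theta_n^{\mathrm{R}}(\hat\alpha_{\mathrm{R}},\hat\lambda)\bigr)\le C_{\mathrm{R}}\inf_{\alpha\in[0,1],\,\lambda>0}R_n\bigl(\widehat\Theta_n^{\mathrm{R}}(\alpha,\lambda)\bigr)+r_n^{\mathrm{R}}, \]
with explicit non-asymptotic remainders \(r_n^{\mathrm{MP}}\), \(r_n^{\mathrm{R}}\) and transparent constants \(C_{\mathrm{MP}}\), \(C_{\mathrm{R}}\) under the stated moment/regime assumptions.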
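The oracle benchmark on the right-hand side of such an inequality can be approximated by simulation when the population precision matrix is known. The sketch below does this for the ridge-type family on a finite grid. Its ingredients are assumptions for illustration only: a normalized squared Frobenius loss stands in for the loss, the data are Gaussian, the covariance is diagonal, and `frobenius_loss` and `mc_risk_ridge` are hypothetical helper names. It computes only the oracle risk and does not propose a data-driven selector.

```python
import numpy as np

def frobenius_loss(theta_hat, theta):
    """Normalized squared Frobenius distance (assumed loss, for illustration)."""
    return np.sum((theta_hat - theta) ** 2) / theta.shape[0]

def mc_risk_ridge(sigma_sqrt, theta, T, alphas, lams, n, reps, rng):
    """Monte Carlo estimate of the risk of alpha*G(lam) + (1-alpha)*T on a grid."""
    p = theta.shape[0]
    risk = np.zeros((len(alphas), len(lams)))
    for _ in range(reps):
        X = rng.standard_normal((n, p)) @ sigma_sqrt   # rows have covariance Sigma
        S = X.T @ X / n
        for j, lam in enumerate(lams):
            G = np.linalg.inv(S + lam * np.eye(p))
            for i, a in enumerate(alphas):
                risk[i, j] += frobenius_loss(a * G + (1.0 - a) * T, theta)
    return risk / reps

rng = np.random.default_rng(1)
n, p = 15, 30
sigma_diag = np.linspace(0.5, 2.0, p)
sigma_sqrt = np.diag(np.sqrt(sigma_diag))
theta = np.diag(1.0 / sigma_diag)      # population precision matrix
T = np.eye(p)                          # misspecified target
alphas = np.linspace(0.0, 1.0, 5)
lams = np.array([0.1, 1.0, 10.0])

risk = mc_risk_ridge(sigma_sqrt, theta, T, alphas, lams, n, reps=20, rng=rng)
i_star, j_star = np.unravel_index(np.argmin(risk), risk.shape)
oracle_risk = risk[i_star, j_star]     # grid approximation of the oracle risk
```

The excess of a selector's risk over `oracle_risk` is the quantity that a remainder term in an oracle inequality would have to control. The oracle itself depends on the unknown population precision matrix, which is why it cannot be used directly for tuning.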
§ Discussion
§ Significance & Implications
The source establishes asymptotic behavior in large-dimensional settings, but practical tuning still needs explicit finite-sample guarantees. Non-asymptotic oracle inequalities would quantify the finite-sample gap between data-driven and oracle tuning for pseudo-inverse and ridge-type precision estimation in the \(p>n\) regime.
§ Known Partial Results
Bodnar & Parolya (2024): This problem remains open in the specific scope above: fully data-driven tuning with explicit finite-sample oracle inequalities for both Moore-Penrose and ridge-type shrinkage under the \(p>n\) and bounded \((4+\varepsilon)\)-moment framework. Nearby 2025 non-asymptotic ridge-related results (risk/concentration bounds in adjacent models) reduce technical uncertainty but do not by themselves close this exact oracle-inequality target.
§ References
Taras Bodnar, Nestor Parolya (2024)
arXiv preprint
📍 Section 1 (Introduction), p. 2, second paragraph: “No other results have been derived either for the Moore-Penrose inverse or for the ridge-type inverse in the non-asymptotic setting…”
Primary source motivating this synthesized open problem.