Unsolved

Uniform Inference for High-Dimensional Linear Models

§ Problem Statement

Setup

For each sample size $n$, observe i.i.d. pairs $\{(y_i, x_i)\}_{i=1}^n$ generated by the linear model

$$y_i = x_i^\top \beta^* + \varepsilon_i,$$

where $x_i \in \mathbb{R}^{p_n}$, $\beta^* \in \mathbb{R}^{p_n}$ is unknown, and $\varepsilon_i$ is independent noise. Let $X \in \mathbb{R}^{n \times p_n}$ be the design matrix with rows $x_i^\top$, and $Y = (y_1, \dots, y_n)^\top$. The regime $p_n \gg n$ is allowed. Assume $\mathbb{E}[x_i] = 0$, $\mathrm{Cov}(x_i) = \Sigma$ with eigenvalues bounded away from $0$ and $\infty$, $x_i$ is sub-Gaussian, and $\varepsilon_i$ is mean-zero sub-Gaussian (often Gaussian with variance $\sigma^2$). Let the parameter space be the sparse class

$$\mathcal{B}_0(s_n) = \{\beta \in \mathbb{R}^{p_n} : \|\beta\|_0 \le s_n\}.$$

Let $\hat\beta$ be the Lasso estimator

$$\hat\beta \in \arg\min_{b \in \mathbb{R}^{p_n}} \Big\{ \frac{1}{2n} \|Y - Xb\|_2^2 + \lambda_n \|b\|_1 \Big\},$$

and define the debiased estimator

$$\hat\beta^d = \hat\beta + \frac{1}{n} M X^\top (Y - X\hat\beta),$$

where $M \in \mathbb{R}^{p_n \times p_n}$ is a data-dependent approximate inverse of $\hat\Sigma = X^\top X / n$ (for example, row $m_j^\top$ solves $\min_m m^\top \hat\Sigma m$ subject to $\|\hat\Sigma m - e_j\|_\infty \le \mu_n$). For coordinate $j$, define the studentized pivot

$$T_{n,j}(\beta^*) = \frac{\sqrt{n}\,(\hat\beta_j^d - \beta_j^*)}{\hat\tau_j},$$

with $\hat\tau_j^2$ a consistent estimator of the asymptotic variance (e.g., $\hat\tau_j^2 = \hat\sigma^2\, m_j^\top \hat\Sigma m_j$).
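The construction above can be sketched end to end. The following is a minimal NumPy simulation, not from the source: it uses a hand-rolled coordinate-descent Lasso, an identity-covariance Gaussian design, and, as a deliberate simplification, a ridge-regularized inverse of $\hat\Sigma$ in place of the row-wise constrained program defining $M$; all tuning constants are illustrative.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent Lasso for (1/(2n))||y - Xb||_2^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n)||X_j||^2 for each column
    r = y.copy()                           # residual y - Xb (b = 0 initially)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]            # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]            # restore with the updated value
    return b

rng = np.random.default_rng(0)
n, p, s = 400, 60, 3
beta_star = np.zeros(p)
beta_star[:s] = 1.0                        # s-sparse truth
X = rng.standard_normal((n, p))            # Sigma = I (assumption of the sketch)
y = X @ beta_star + rng.standard_normal(n)

b_hat = lasso_cd(X, y, lam=2.0 * np.sqrt(np.log(p) / n))

Sigma_hat = X.T @ X / n
# Simplification: ridge-regularized inverse instead of the constrained
# program min m' Sigma_hat m s.t. ||Sigma_hat m - e_j||_inf <= mu_n.
M = np.linalg.inv(Sigma_hat + 0.05 * np.eye(p))

b_deb = b_hat + M @ X.T @ (y - X @ b_hat) / n      # debiased estimator
resid = y - X @ b_hat
sigma2_hat = resid @ resid / n                     # noise-variance estimate
tau = np.sqrt(sigma2_hat * np.einsum('ij,jk,ik->i', M, Sigma_hat, M))
T = np.sqrt(n) * (b_deb - beta_star) / tau         # studentized pivots T_{n,j}
```

In this favorable, low-aspect-ratio setting the pivots behave roughly like standard normals; the open question is precisely how far such behavior extends uniformly as $(n, p_n, s_n)$ scale.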

A central unresolved question is to determine sharp conditions on $(n, p_n, s_n)$ (and on design/noise regularity) for regimes not already covered by existing positive results, under which inference based on $\hat\beta^d$ is uniformly valid over the whole sparse class and over all coordinates, namely

$$\sup_{\beta^* \in \mathcal{B}_0(s_n)} \ \max_{1 \le j \le p_n} \ \sup_{t \in \mathbb{R}} \left| \mathbb{P}_{\beta^*}\!\left( T_{n,j}(\beta^*) \le t \right) - \Phi(t) \right| \to 0 \quad \text{as } n \to \infty.$$

Several subcases are known (including 2017-era advances under stronger structural assumptions), but the exact frontier of achievable vs. non-achievable scaling remains open in general.

For fixed $\alpha \in (0,1)$, one related target is uniform coverage of coordinatewise intervals

$$\mathrm{CI}_{j,\alpha} = \Big[ \hat\beta_j^d - z_{1-\alpha/2} \frac{\hat\tau_j}{\sqrt{n}},\ \hat\beta_j^d + z_{1-\alpha/2} \frac{\hat\tau_j}{\sqrt{n}} \Big]$$

in the sense

$$\sup_{\beta^* \in \mathcal{B}_0(s_n)} \ \max_{1 \le j \le p_n} \left| \mathbb{P}_{\beta^*}\!\left( \beta_j^* \in \mathrm{CI}_{j,\alpha} \right) - (1 - \alpha) \right| \to 0.$$

This fixed-$\alpha$ coverage criterion is weaker than full Kolmogorov convergence of the pivot law and does not by itself establish the full uniform distributional approximation above.
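The fixed-$\alpha$ criterion lends itself to a direct Monte Carlo check. The sketch below, again under simplifying assumptions not from the source (identity-covariance Gaussian design, a hand-rolled Lasso, and a ridge-regularized stand-in for $M$), estimates the empirical coverage of the interval for one active coordinate at $\alpha = 0.05$:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=60):
    # Coordinate-descent Lasso for (1/(2n))||y - Xb||_2^2 + lam*||b||_1.
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

rng = np.random.default_rng(1)
n, p, s = 200, 30, 2
z = 1.959964                                # z_{1-alpha/2} for alpha = 0.05
beta_star = np.zeros(p)
beta_star[:s] = 1.0
reps, hits = 100, 0
for _ in range(reps):
    X = rng.standard_normal((n, p))
    y = X @ beta_star + rng.standard_normal(n)
    b_hat = lasso_cd(X, y, lam=2.0 * np.sqrt(np.log(p) / n))
    Sigma_hat = X.T @ X / n
    M = np.linalg.inv(Sigma_hat + 0.05 * np.eye(p))   # simplified stand-in for M
    b_deb = b_hat + M @ X.T @ (y - X @ b_hat) / n
    resid = y - X @ b_hat
    tau0 = np.sqrt((resid @ resid / n) * (M[0] @ Sigma_hat @ M[0]))
    # Does the interval for the first (active) coordinate cover its truth?
    hits += abs(b_deb[0] - beta_star[0]) <= z * tau0 / np.sqrt(n)
coverage = hits / reps                      # nominal level is 1 - alpha = 0.95
```

Such a check probes only one $(\beta^*, j)$ pair at one sample size; the open problem concerns the supremum over the entire sparse class, all coordinates, and all pivot thresholds simultaneously.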

Unsolved Problem

Identify precise scaling thresholds (involving $s_n$, $\log p_n$, and $n$) across unresolved regimes.


§ Significance & Implications

High-dimensional inference is a cornerstone of modern statistics. While debiased/desparsified estimators (Javanmard & Montanari (2014); van de Geer et al. (2014); Zhang & Zhang (2014)) provide pointwise asymptotic normality, the uniformity question, crucial for honest confidence intervals, is subtle. This connects to impossibility phenomena in post-model-selection inference (Leeb & Pötscher) and to practically used uniform-inference procedures such as post-double-selection.

§ Known Partial Results

  • Cai & Guo (2017) establish rigorous uniform-coverage/minimax results for specific high-dimensional CI targets under structured assumptions; this resolves important subcases rather than the full general regime.

  • Leeb & Pötscher (2006) prove strong non-uniformity/impossibility phenomena for post-model-selection distributional inference, motivating limits on universally honest procedures.

  • Belloni et al. (2014) provide uniformly valid inference for low-dimensional treatment effects via post-double-selection in high-dimensional sparse designs.

§ References

[1]

Confidence intervals and hypothesis testing for high-dimensional regression

Adel Javanmard, Andrea Montanari (2014)

Journal of Machine Learning Research

📍 Section 1.1, paragraph beginning “It is currently an open question whether successful hypothesis testing can be performed under the weaker assumption $s_0=o(n/\log p)$”; JMLR journal pagination p. 2872, corresponding to PDF page 4 in the arXiv/JMLR manuscript layout.

[2]

Confidence intervals for low dimensional parameters in high dimensional linear models

Cun-Hui Zhang, S. S. Zhang (2014)

Journal of the Royal Statistical Society Series B

[3]

On asymptotically optimal confidence regions and tests for high-dimensional models

Sara van de Geer, Peter Bühlmann, Ya'acov Ritov, Ruben Dezeure (2014)

Annals of Statistics

📍 Section 2 (Main results), Theorem 2.1 (desparsified Lasso asymptotic normality), Annals of Statistics 42(3):1166-1202 (journal pagination; theorem appears in the early Section 2 pages).

[4]

Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity

T. Tony Cai, Zijian Guo (2017)

Annals of Statistics

📍 Main theorems on minimax expected length and coverage for confidence intervals in sparse high-dimensional linear regression (published 2017 version; see theorem statements in the main results section).

[5]

Can one estimate the unconditional distribution of post-model-selection estimators?

Hannes Leeb, Benedikt M. Pötscher (2006)

Annals of Statistics

📍 Impossibility/non-uniformity results for post-model-selection distribution estimation (main impossibility theorems).

[6]

Inference on treatment effects after selection among high-dimensional controls

Alexandre Belloni, Victor Chernozhukov, Christian Hansen (2014)

Review of Economic Studies

📍 Post-double-selection construction and uniform validity claims for treatment-effect inference after high-dimensional control selection (main theorem section).
