Unsolved

Uniform Inference for High-Dimensional Linear Models

§ Problem Statement

Setup

For each sample size $n$, observe i.i.d. pairs $\{(y_i, x_i)\}_{i=1}^n$ generated by the linear model

$$y_i = x_i^\top \beta^* + \varepsilon_i,$$

where $x_i \in \mathbb{R}^{p_n}$, $\beta^* \in \mathbb{R}^{p_n}$ is unknown, and $\varepsilon_i$ is independent noise. Let $X \in \mathbb{R}^{n \times p_n}$ be the design matrix with rows $x_i^\top$, and $Y = (y_1, \dots, y_n)^\top$. The regime $p_n \gg n$ is allowed. Assume $\mathbb{E}[x_i] = 0$, $\mathrm{Cov}(x_i) = \Sigma$ with eigenvalues bounded away from $0$ and $\infty$, $x_i$ is sub-Gaussian, and $\varepsilon_i$ is mean-zero sub-Gaussian (often Gaussian with variance $\sigma^2$). Let the parameter space be the sparse class

$$\mathcal{B}_0(s_n) = \{\beta \in \mathbb{R}^{p_n} : \|\beta\|_0 \le s_n\}.$$

Let $\hat\beta$ be the Lasso estimator

$$\hat\beta \in \arg\min_{b \in \mathbb{R}^{p_n}} \Big\{ \frac{1}{2n} \|Y - Xb\|_2^2 + \lambda_n \|b\|_1 \Big\},$$

and define the debiased estimator

$$\hat\beta^d = \hat\beta + \frac{1}{n} M X^\top (Y - X\hat\beta),$$

where $M \in \mathbb{R}^{p_n \times p_n}$ is a data-dependent approximate inverse of $\hat\Sigma = X^\top X / n$ (for example, row $m_j^\top$ solves $\min_m m^\top \hat\Sigma m$ subject to $\|\hat\Sigma m - e_j\|_\infty \le \mu_n$). For coordinate $j$, define the studentized pivot

$$T_{n,j}(\beta^*) = \frac{\sqrt{n}\,(\hat\beta_j^d - \beta_j^*)}{\hat\tau_j},$$

with $\hat\tau_j^2$ a consistent estimator of the asymptotic variance (e.g., $\hat\tau_j^2 = \hat\sigma^2\, m_j^\top \hat\Sigma m_j$).
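The construction above can be sketched end to end. The following is a minimal NumPy simulation, not from the source: it uses a hand-rolled coordinate-descent Lasso, an identity-covariance Gaussian design, and, as a deliberate simplification, a ridge-regularized inverse of $\hat\Sigma$ in place of the row-wise constrained program defining $M$; all tuning constants are illustrative.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent Lasso for (1/(2n))||y - Xb||_2^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n)||X_j||^2 for each column
    r = y.copy()                           # residual y - Xb (b = 0 initially)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]            # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]            # restore with the updated value
    return b

rng = np.random.default_rng(0)
n, p, s = 400, 60, 3
beta_star = np.zeros(p)
beta_star[:s] = 1.0                        # s-sparse truth
X = rng.standard_normal((n, p))            # Sigma = I (assumption of the sketch)
y = X @ beta_star + rng.standard_normal(n)

b_hat = lasso_cd(X, y, lam=2.0 * np.sqrt(np.log(p) / n))

Sigma_hat = X.T @ X / n
# Simplification: ridge-regularized inverse instead of the constrained
# program min m' Sigma_hat m s.t. ||Sigma_hat m - e_j||_inf <= mu_n.
M = np.linalg.inv(Sigma_hat + 0.05 * np.eye(p))

b_deb = b_hat + M @ X.T @ (y - X @ b_hat) / n      # debiased estimator
resid = y - X @ b_hat
sigma2_hat = resid @ resid / n                     # noise-variance estimate
tau = np.sqrt(sigma2_hat * np.einsum('ij,jk,ik->i', M, Sigma_hat, M))
T = np.sqrt(n) * (b_deb - beta_star) / tau         # studentized pivots T_{n,j}
```

In this favorable, low-aspect-ratio setting the pivots behave roughly like standard normals; the open question is precisely how far such behavior extends uniformly as $(n, p_n, s_n)$ scale.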

A central unresolved question is to determine sharp conditions on $(n, p_n, s_n)$ (and on design/noise regularity) for regimes not already covered by existing positive results, under which inference based on $\hat\beta^d$ is uniformly valid over the whole sparse class and over all coordinates, namely

$$\sup_{\beta^* \in \mathcal{B}_0(s_n)} \ \max_{1 \le j \le p_n} \ \sup_{t \in \mathbb{R}} \left| \mathbb{P}_{\beta^*}\!\left( T_{n,j}(\beta^*) \le t \right) - \Phi(t) \right| \to 0 \quad \text{as } n \to \infty.$$

Several subcases are known (including 2017-era advances under stronger structural assumptions), but the exact frontier of achievable vs. non-achievable scaling remains open in general.

For fixed $\alpha \in (0,1)$, one related target is uniform coverage of coordinatewise intervals

$$\mathrm{CI}_{j,\alpha} = \Big[ \hat\beta_j^d - z_{1-\alpha/2} \frac{\hat\tau_j}{\sqrt{n}},\ \hat\beta_j^d + z_{1-\alpha/2} \frac{\hat\tau_j}{\sqrt{n}} \Big]$$

in the sense

$$\sup_{\beta^* \in \mathcal{B}_0(s_n)} \ \max_{1 \le j \le p_n} \left| \mathbb{P}_{\beta^*}\!\left( \beta_j^* \in \mathrm{CI}_{j,\alpha} \right) - (1 - \alpha) \right| \to 0.$$

This fixed-$\alpha$ coverage criterion is weaker than full Kolmogorov convergence of the pivot law and does not by itself establish the full uniform distributional approximation above.
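The fixed-$\alpha$ criterion lends itself to a direct Monte Carlo check. The sketch below, again under simplifying assumptions not from the source (identity-covariance Gaussian design, a hand-rolled Lasso, and a ridge-regularized stand-in for $M$), estimates the empirical coverage of the interval for one active coordinate at $\alpha = 0.05$:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=60):
    # Coordinate-descent Lasso for (1/(2n))||y - Xb||_2^2 + lam*||b||_1.
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

rng = np.random.default_rng(1)
n, p, s = 200, 30, 2
z = 1.959964                                # z_{1-alpha/2} for alpha = 0.05
beta_star = np.zeros(p)
beta_star[:s] = 1.0
reps, hits = 100, 0
for _ in range(reps):
    X = rng.standard_normal((n, p))
    y = X @ beta_star + rng.standard_normal(n)
    b_hat = lasso_cd(X, y, lam=2.0 * np.sqrt(np.log(p) / n))
    Sigma_hat = X.T @ X / n
    M = np.linalg.inv(Sigma_hat + 0.05 * np.eye(p))   # simplified stand-in for M
    b_deb = b_hat + M @ X.T @ (y - X @ b_hat) / n
    resid = y - X @ b_hat
    tau0 = np.sqrt((resid @ resid / n) * (M[0] @ Sigma_hat @ M[0]))
    # Does the interval for the first (active) coordinate cover its truth?
    hits += abs(b_deb[0] - beta_star[0]) <= z * tau0 / np.sqrt(n)
coverage = hits / reps                      # nominal level is 1 - alpha = 0.95
```

Such a check probes only one $(\beta^*, j)$ pair at one sample size; the open problem concerns the supremum over the entire sparse class, all coordinates, and all pivot thresholds simultaneously.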

Unsolved Problem

Identify precise scaling thresholds (involving $s_n$, $\log p_n$, and $n$) across unresolved regimes.


§ Significance & Implications

High-dimensional inference is a cornerstone of modern statistics. While debiased/desparsified estimators (Javanmard & Montanari (2014); van de Geer et al. (2014); Zhang & Zhang (2014)) provide pointwise asymptotic normality, the uniformity question, crucial for honest confidence intervals, is subtle. This connects to impossibility phenomena in post-model-selection inference (Leeb & Pötscher) and to practically used uniform-inference procedures such as post-double-selection.

§ Known Partial Results

  • Cai & Guo (2017) establish rigorous uniform-coverage/minimax results for specific high-dimensional CI targets under structured assumptions; this resolves important subcases rather than the full general regime.

  • Leeb & Pötscher (2006) prove strong non-uniformity/impossibility phenomena for post-model-selection distributional inference, motivating limits on universally honest procedures.

  • Belloni et al. (2014) provide uniformly valid inference for low-dimensional treatment effects via post-double-selection in high-dimensional sparse designs.

§ References

[1]

Confidence intervals and hypothesis testing for high-dimensional regression

Adel Javanmard, Andrea Montanari (2014)

Journal of Machine Learning Research

📍 Section 1.1, paragraph beginning “It is currently an open question whether successful hypothesis testing can be performed under the weaker assumption $s_0=o(n/\log p)$”; JMLR journal pagination p. 2872, corresponding to PDF page 4 in the arXiv/JMLR manuscript layout.

[2]

Confidence intervals for low dimensional parameters in high dimensional linear models

Cun-Hui Zhang, S. S. Zhang (2014)

Journal of the Royal Statistical Society Series B

[3]

On asymptotically optimal confidence regions and tests for high-dimensional models

Sara van de Geer, Peter Bühlmann, Ya'acov Ritov, Ruben Dezeure (2014)

Annals of Statistics

📍 Section 2 (Main results), Theorem 2.1 (desparsified Lasso asymptotic normality), Annals of Statistics 42(3):1166-1202 (journal pagination; theorem appears in the early Section 2 pages).

[4]

Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity

T. Tony Cai, Zijian Guo (2017)

Annals of Statistics

📍 Main theorems on minimax expected length and coverage for confidence intervals in sparse high-dimensional linear regression (published 2017 version; see theorem statements in the main results section).

[5]

Can one estimate the unconditional distribution of post-model-selection estimators?

Hannes Leeb, Benedikt M. Pötscher (2006)

Annals of Statistics

📍 Impossibility/non-uniformity results for post-model-selection distribution estimation (main impossibility theorems).

[6]

Inference on treatment effects after selection among high-dimensional controls

Alexandre Belloni, Victor Chernozhukov, Christian Hansen (2014)

Review of Economic Studies

📍 Post-double-selection construction and uniform validity claims for treatment-effect inference after high-dimensional control selection (main theorem section).
