Critical SNR for outlier emergence at fixed summary statistics

§ Problem Statement

Setup

Fix integers $k,C\ge 1$, a mixture-weight vector $p=(p_1,\dots,p_k)$ with $p_b>0$ and $\sum_{b=1}^k p_b=1$, a class map $c:[k]\to[C]$, and an aspect ratio $\phi\in(0,\infty)$. For $d,n\to\infty$ with $n/d\to\phi$, consider i.i.d. samples $(y_\ell,Y_\ell)_{\ell=1}^n$ generated by

J_\ell\sim \mathrm{Cat}(p),\qquad Y_\ell\mid(J_\ell=b)\sim \mathcal N(\mu_b,\beta^{-1}I_d),\qquad y_\ell=c(J_\ell)\in[C],

where $\beta>0$ is the signal-to-noise ratio (inverse covariance scale). Let $X=(x_1,\dots,x_C)\in\mathbb R^{d\times C}$ be the parameter matrix for multiclass logistic regression with loss

\ell(X;(y,Y))=-\sum_{a=1}^C \mathbf 1_{\{y=a\}}\,x_a^\top Y+\log\!\Big(\sum_{a=1}^C e^{x_a^\top Y}\Big).
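As a minimal sketch of the data model and loss above, the following NumPy code draws samples and evaluates the per-sample loss. The function names, the 0-based labels, and the particular means, weights, and class map are illustrative assumptions, not part of the source.

```python
import numpy as np

def sample_mixture(n, M, p, c, beta, rng):
    """Draw (y_l, Y_l): J ~ Cat(p), Y | J=b ~ N(mu_b, I/beta), y = c(J).

    M is d x k (means as columns); c holds 0-based class labels per component.
    """
    d, k = M.shape
    J = rng.choice(k, size=n, p=p)
    Y = M[:, J].T + rng.standard_normal((n, d)) / np.sqrt(beta)
    return np.asarray(c)[J], Y

def logistic_loss(X, y, Y):
    """Per-sample loss -x_y^T Y + log sum_a exp(x_a^T Y); X is d x C."""
    Z = Y @ X                                   # n x C logits
    zmax = Z.max(axis=1, keepdims=True)         # stabilise the log-sum-exp
    lse = zmax[:, 0] + np.log(np.exp(Z - zmax).sum(axis=1))
    return lse - Z[np.arange(len(y)), y]

# Hypothetical small instance (values chosen only for illustration).
rng = np.random.default_rng(0)
d, k, C = 50, 3, 2
M = rng.standard_normal((d, k)) / np.sqrt(d)
p = np.array([0.5, 0.3, 0.2])
c = np.array([0, 1, 1])
y, Y = sample_mixture(200, M, p, c, beta=4.0, rng=rng)
X0 = np.zeros((d, C))
losses = logistic_loss(X0, y, Y)
```

At $X=0$ all logits vanish, so every per-sample loss equals $\log C$, which gives a quick sanity check on the implementation.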

For each class index $\alpha\in[C]$, define the empirical Hessian and information (gradient-second-moment) blocks

\widehat H_{\alpha\alpha}(X)=\frac1n\sum_{\ell=1}^n \phi_\alpha(X^\top Y_\ell)\bigl(1-\phi_\alpha(X^\top Y_\ell)\bigr)\,Y_\ell Y_\ell^\top,

\widehat I_{\alpha\alpha}(X)=\frac1n\sum_{\ell=1}^n \bigl(\mathbf 1_{\{y_\ell=\alpha\}}-\phi_\alpha(X^\top Y_\ell)\bigr)^2\,Y_\ell Y_\ell^\top,

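where $\phi_\alpha(z)=e^{z_\alpha}/\sum_{a=1}^C e^{z_a}$ is the softmax probability of class $\alpha$ (the subscripted $\phi_\alpha$ is distinct from the aspect ratio $\phi$). Set $q:=C+k$. Let $M=(\mu_1,\dots,\mu_k)\in\mathbb R^{d\times k}$ and define the summary statistic (Gram) matrix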

G=(X,M)^\top(X,M)\in\mathbb R^{q\times q}.
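The empirical blocks and the Gram matrix can be computed directly from their definitions. The sketch below is one way to do so in NumPy. The helper names and the synthetic inputs are assumptions made for illustration, and labels are 0-based.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def hessian_and_information_blocks(X, y, Y, alpha):
    """Return (H_aa, I_aa) for class alpha; X is d x C, Y is n x d."""
    n = Y.shape[0]
    P = softmax(Y @ X)[:, alpha]                 # phi_alpha(X^T Y_l)
    w_H = P * (1.0 - P)                          # Hessian weights
    w_I = ((y == alpha).astype(float) - P) ** 2  # information weights
    H = (Y * w_H[:, None]).T @ Y / n
    I_blk = (Y * w_I[:, None]).T @ Y / n
    return H, I_blk

def gram(X, M):
    """Summary statistic G = (X, M)^T (X, M), of size (C + k) x (C + k)."""
    XM = np.hstack([X, M])
    return XM.T @ XM

# Hypothetical inputs, used only to exercise the functions.
rng = np.random.default_rng(1)
n, d, C, k = 300, 20, 2, 2
Y = rng.standard_normal((n, d))
y = rng.integers(0, C, size=n)
X = np.zeros((d, C))
M = rng.standard_normal((d, k))
H, I_blk = hessian_and_information_blocks(X, y, Y, alpha=0)
G = gram(X, M)
```

With $X=0$ and $C=2$ the softmax is identically $1/2$, so both weight sequences equal $1/4$ and the two blocks coincide with $\frac{1}{4n}\sum_\ell Y_\ell Y_\ell^\top$.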

Call $G$ feasible if $G\succeq 0$ and it is realizable as such a Gram matrix for some $d$ and vectors $(x_1,\dots,x_C,\mu_1,\dots,\mu_k)$. Fix feasible $G$ and $\alpha\in[C]$. Write $m_b(G)=G_{[C],\,C+b}\in\mathbb R^C$, $A(G)=G_{[C],[C]}^{1/2}$, and let $g\sim\mathcal N(0,I_q)$, with $g_{[C]}\in\mathbb R^C$ denoting its first $C$ coordinates. Define

f^{H}_{b,\alpha}(u;G)=\phi_\alpha(m_b(G)+A(G)u)\bigl(1-\phi_\alpha(m_b(G)+A(G)u)\bigr),

f^{I}_{b,\alpha}(u;G)=\bigl(\mathbf 1_{\{c(b)=\alpha\}}-\phi_\alpha(m_b(G)+A(G)u)\bigr)^2.
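The weight functions $f^{H}_{b,\alpha}$ and $f^{I}_{b,\alpha}$ depend on the data only through $G$, which the following sketch makes explicit. It assumes 0-based indices for $b$ and $\alpha$, takes $c(b)$ as an argument named `class_of_b`, and uses a symmetric PSD square root for $A(G)$. These names and conventions are illustrative choices.

```python
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def psd_sqrt(S):
    """Symmetric square root of a PSD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def f_weights(G, C, b, alpha, class_of_b, u):
    """Evaluate f^H_{b,alpha}(u; G) and f^I_{b,alpha}(u; G) for each row of u (N x C)."""
    m_b = G[:C, C + b]                  # m_b(G) = G_{[C], C+b}
    A = psd_sqrt(G[:C, :C])             # A(G) = G_{[C],[C]}^{1/2}
    Z = m_b + u @ A                     # rows are m_b + A u (A is symmetric)
    P = softmax(Z)[:, alpha]
    fH = P * (1.0 - P)
    fI = (float(class_of_b == alpha) - P) ** 2
    return fH, fI

# Hypothetical feasible G built from random vectors, plus draws of beta^{-1/2} g_[C].
rng = np.random.default_rng(2)
d, C, k, beta = 30, 3, 2, 4.0
XM = rng.standard_normal((d, C + k)) / np.sqrt(d)
G = XM.T @ XM
u = rng.standard_normal((1000, C)) / np.sqrt(beta)
fH, fI = f_weights(G, C, b=0, alpha=1, class_of_b=1, u=u)
```

The arrays `fH` and `fI` are Monte Carlo draws of the random weights that enter the expectations in the fixed-point equations below; $f^H$ always lies in $(0,1/4]$ and $f^I$ in $[0,1]$.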

As in the source model, one may keep an explicit scale parameter $\lambda>0$; the formulas below are the $\lambda=1$ specialization used here. For $T\in\{H,I\}$, define $S^T_{\alpha}(z)$ as the Stieltjes transform solving

1+zS(z)=\phi\sum_{b=1}^k p_b\,\mathbb E\!\left[\frac{S(z)\,f^{T}_{b,\alpha}(\beta^{-1/2}g_{[C]};G)}{\beta\phi+S(z)\,f^{T}_{b,\alpha}(\beta^{-1/2}g_{[C]};G)}\right],

and let $\nu^T_{\alpha,\beta,G}$ be the corresponding effective bulk measure, with right edge $r^T_{\alpha}(\beta,G)=\sup\operatorname{supp}(\nu^T_{\alpha,\beta,G})$. For $z\in\mathbb C\setminus\operatorname{supp}(\nu^T_{\alpha,\beta,G})$, define

F^{T}_{\alpha,ij}(z;\beta,G)=\beta\phi\sum_{b=1}^k p_b\,\mathbb E\!\left[ \frac{f^{T}_{b,\alpha}(\beta^{-1/2}g_{[C]};G)}{\beta\phi+S^T_{\alpha}(z)\,f^{T}_{b,\alpha}(\beta^{-1/2}g_{[C]};G)} \,v_{b,i}\,v_{b,j}\right],

with $v_b=\beta^{-1/2}g+(\sqrt G)e_{C+b}\in\mathbb R^q$. Define effective outliers by

\mathcal Z^T_{\alpha}(\beta,G)=\Big\{z\in\mathbb R\setminus\operatorname{supp}(\nu^T_{\alpha,\beta,G}):\det\!\big(zI_q-F^T_{\alpha}(z;\beta,G)\big)=0\Big\},

counting roots with multiplicity, and define effective right-outliers

\mathcal O^T_{\alpha}(\beta,G)=\mathcal Z^T_{\alpha}(\beta,G)\cap\big(r^T_{\alpha}(\beta,G),\infty\big),\qquad \mathcal O_T(\beta,G)=\bigcup_{\alpha=1}^C \mathcal O^T_{\alpha}(\beta,G).
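As a consistency check on the fixed-point equation for $S^T_\alpha$, consider the degenerate case $X=0$, $C=2$, where both weights are constant, $f\equiv 1/4$. The equation then reduces to a quadratic whose relevant root is the Marchenko-Pastur Stieltjes transform with variance $\sigma^2=f/\beta$ and ratio $\gamma=1/\phi$, so the bulk edge is $\sigma^2(1+\sqrt\gamma)^2$. The sketch below verifies this numerically. It assumes the sign convention $S(z)=\int \nu(dx)/(x-z)$, which is inferred from this reduction rather than stated in the entry, and the function names are hypothetical.

```python
import numpy as np

def stieltjes_residual(S, z, f_samples, p, beta, phi):
    """Residual of 1 + z S = phi * sum_b p_b E[S f_b / (beta phi + S f_b)].

    f_samples[b] holds Monte Carlo draws of f^T_{b,alpha}(beta^{-1/2} g_[C]; G).
    """
    rhs = 0.0
    for p_b, f in zip(p, f_samples):
        rhs += p_b * np.mean(S * f / (beta * phi + S * f))
    return 1.0 + z * S - phi * rhs

def mp_stieltjes(z, sigma2, gamma):
    """Marchenko-Pastur transform m(z) = int nu(dx)/(x - z) for real z right of the bulk.

    Solves gamma*sigma2*z*m^2 + (z - sigma2*(1 - gamma))*m + 1 = 0,
    taking the root with m ~ -1/z as z -> infinity.
    """
    a = gamma * sigma2 * z
    b = z - sigma2 * (1.0 - gamma)
    return (-b + np.sqrt(b * b - 4.0 * a)) / (2.0 * a)

# Degenerate case X = 0, C = 2: constant weights f = 1/4 for every component.
beta, phi, f0 = 2.0, 3.0, 0.25
p = [0.5, 0.5]
f_samples = [np.full(10, f0), np.full(10, f0)]
sigma2, gamma = f0 / beta, 1.0 / phi
right_edge = sigma2 * (1.0 + np.sqrt(gamma)) ** 2
z = 1.0                                   # a point to the right of the bulk
S = mp_stieltjes(z, sigma2, gamma)
res = stieltjes_residual(S, z, f_samples, p, beta, phi)
```

For a general feasible $G$ the weights are random, and `f_samples` would instead hold Monte Carlo draws of $f^T_{b,\alpha}$; the residual function can then be passed to a root-finder in $S$ for each fixed $z$.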

This setup follows Ben Arous et al. (2025).

Unsolved Problem

For fixed $\phi$ and fixed feasible $G$, determine whether there is a sharp critical value $\beta_c(G)\in[0,\infty]$ for outlier emergence.

The source proves the high-SNR existence direction: for each fixed feasible pair $(\lambda,G)$ there exists $\beta_0(\lambda,G)<\infty$ such that effective outliers are present for all $\beta>\beta_0(\lambda,G)$. (This entry uses the $\lambda=1$ specialization.) Hence, in the present notation, one has the proved direction

\exists\,\beta_0(G)<\infty\ \text{such that}\ \beta>\beta_0(G)\Rightarrow \mathcal O_H(\beta,G)\cup\mathcal O_I(\beta,G)\neq\varnothing.

The unproved complementary direction is a low-SNR exclusion theorem:

\exists\,\beta_1(G)>0\ \text{such that}\ \beta<\beta_1(G)\Rightarrow \mathcal O_H(\beta,G)=\mathcal O_I(\beta,G)=\varnothing.
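A finite-size experiment can give a rough feel for the two regimes, although it proves nothing about the effective quantities. The sketch below holds $X$ and $M$ (hence $G$) fixed, scans $\beta$, and records the largest eigenvalue of $\widehat H_{\alpha\alpha}$ next to the $(q+1)$-th largest one. Using the latter as a proxy for the bulk edge is an assumption of this illustration, as are the function name and all parameter values.

```python
import numpy as np

def top_spectrum_scan(X, M, p, alpha, betas, n, rng):
    """For each beta, return (beta, largest eigenvalue, (q+1)-th largest) of H_aa."""
    d, C = X.shape
    k = M.shape[1]
    q = C + k                                   # requires d > q
    rows = []
    for beta in betas:
        J = rng.choice(k, size=n, p=p)
        Y = M[:, J].T + rng.standard_normal((n, d)) / np.sqrt(beta)
        Z = Y @ X
        Z -= Z.max(axis=1, keepdims=True)       # stabilised softmax
        P = np.exp(Z)
        P /= P.sum(axis=1, keepdims=True)
        w = P[:, alpha] * (1.0 - P[:, alpha])
        H = (Y * w[:, None]).T @ Y / n
        ev = np.linalg.eigvalsh(H)[::-1]        # eigenvalues in descending order
        rows.append((beta, ev[0], ev[q]))
    return rows

# Hypothetical instance; G = (X, M)^T (X, M) is the same for every beta.
rng = np.random.default_rng(3)
d, n, C, k = 40, 400, 2, 2
X = rng.standard_normal((d, C)) / np.sqrt(d)
M = rng.standard_normal((d, k))
rows = top_spectrum_scan(X, M, [0.5, 0.5], alpha=0, betas=[0.5, 2.0, 8.0], n=n, rng=rng)
```

A large ratio between the second and third entries of a row is suggestive of separated eigenvalues at that $\beta$, but at finite $n$ and $d$ it cannot distinguish a true effective outlier from edge fluctuations.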

A full resolution would identify whether these two regimes meet at a single threshold $\beta_c(G)$ (and under what additional assumptions).

§ Significance & Implications

A sharp threshold would turn the paper's qualitative spectral-transition picture into a sharp phase diagram indexed by $(\beta,G)$. It would also connect the geometry of the loss landscape to a single order parameter controlling when informative low-dimensional directions appear. See Ben Arous et al. (2025) for details.

§ Known Partial Results

Ben Arous et al. (2025, arXiv:2502.15655v3): Corollary 1.8 proves large-$\beta$ outlier emergence (for each fixed $(\lambda,G)$ in the source model), and the paper also provides fixed-point characterizations of bulk and outliers. The complementary small-$\beta$ no-outlier direction and a single sharp critical threshold remain unproved. Open as of arXiv v3 (January 22, 2026).

§ References

[1]

Local geometry of high-dimensional mixture models: Effective spectral theory and dynamical transitions

Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath (2025)

Annals of Statistics (to appear)

📍 Section 1.3.2 ("The effective spectrum at initialization and along the training trajectory"), Corollary 1.8 (large-$\beta$ outlier existence for fixed $(\lambda,G)$) and the immediately following paragraph (explicitly noting the missing complementary small-$\beta$ no-outlier result) in arXiv:2502.15655v3.

Source paper where this problem appears.

§ Tags

spectral-transition outliers high-dimensional-asymptotics gaussian-mixtures hessian fisher-information