Unsolved

Critical SNR for outlier emergence at fixed summary statistics

Sourced from the work of Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath

§ Problem Statement

Setup

Fix integers $k,C\ge 1$, a mixture-weight vector $p=(p_1,\dots,p_k)$ with $p_b>0$ and $\sum_{b=1}^k p_b=1$, a class map $c:[k]\to[C]$, and an aspect ratio $\phi\in(0,\infty)$. For $d,n\to\infty$ with $n/d\to\phi$, consider i.i.d. samples $(y_\ell,Y_\ell)_{\ell=1}^n$ generated by

$$J_\ell\sim \mathrm{Cat}(p),\qquad Y_\ell\mid(J_\ell=b)\sim \mathcal N(\mu_b,\beta^{-1}I_d),\qquad y_\ell=c(J_\ell)\in[C],$$

where $\beta>0$ is the signal-to-noise ratio (inverse covariance scale). Let $X=(x_1,\dots,x_C)\in\mathbb R^{d\times C}$ be the parameter matrix for multiclass logistic regression with loss

$$\ell(X;(y,Y))=-\sum_{a=1}^C \mathbf 1_{\{y=a\}}\,x_a^\top Y+\log\!\Big(\sum_{a=1}^C e^{x_a^\top Y}\Big).$$
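The data model and loss above can be simulated directly. The following is a minimal NumPy sketch, not code from the source: the function names `sample_mixture` and `logistic_loss`, the 0-based class labels, and the toy dimensions are all illustrative assumptions.

```python
import numpy as np

def sample_mixture(n, means, p, class_map, beta, rng):
    """Draw n labelled samples from the k-component Gaussian mixture.

    means: (k, d) array whose rows are mu_b; p: (k,) mixture weights;
    class_map: (k,) ints in {0..C-1} (0-based version of c); beta: SNR.
    """
    k, d = means.shape
    J = rng.choice(k, size=n, p=p)                               # J_l ~ Cat(p)
    Y = means[J] + rng.standard_normal((n, d)) / np.sqrt(beta)   # N(mu_b, I_d / beta)
    y = class_map[J]                                             # y_l = c(J_l)
    return y, Y

def logistic_loss(X, y, Y):
    """Per-sample loss -x_y^T Y + log sum_a exp(x_a^T Y); X has shape (d, C)."""
    logits = Y @ X                                               # (n, C)
    m = logits.max(axis=1, keepdims=True)                        # stabilise log-sum-exp
    lse = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return lse - logits[np.arange(len(y)), y]

# Toy instance (all values hypothetical): n/d = 2 plays the role of phi.
rng = np.random.default_rng(0)
d, n, k, C, beta = 50, 100, 3, 2, 4.0
means = rng.standard_normal((k, d)) / np.sqrt(d)
p = np.array([0.5, 0.3, 0.2])
class_map = np.array([0, 1, 1])
X = rng.standard_normal((d, C)) / np.sqrt(d)
y, Y = sample_mixture(n, means, p, class_map, beta, rng)
losses = logistic_loss(X, y, Y)
```

At $X=0$ every sample has loss $\log C$, which gives a quick sanity check on the implementation.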

For each class index $\alpha\in[C]$, define the empirical Hessian and information (gradient-second-moment) blocks

$$\widehat H_{\alpha\alpha}(X)=\frac1n\sum_{\ell=1}^n \phi_\alpha(X^\top Y_\ell)\bigl(1-\phi_\alpha(X^\top Y_\ell)\bigr)\,Y_\ell Y_\ell^\top,$$

$$\widehat I_{\alpha\alpha}(X)=\frac1n\sum_{\ell=1}^n \bigl(\mathbf 1_{\{y_\ell=\alpha\}}-\phi_\alpha(X^\top Y_\ell)\bigr)^2\,Y_\ell Y_\ell^\top,$$

where $\phi_\alpha(z)=e^{z_\alpha}/\sum_{a=1}^C e^{z_a}$. Set $q:=C+k$. Let $M=(\mu_1,\dots,\mu_k)\in\mathbb R^{d\times k}$ and define the summary statistic (Gram) matrix

$$G=(X,M)^\top(X,M)\in\mathbb R^{q\times q}.$$
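For finite $n$ and $d$, the two empirical blocks and the Gram matrix can be formed explicitly. The sketch below is an illustration under assumptions: the helper names, the equal mixture weights, and the identity class map (so $k=C$) are choices made here for brevity, not taken from the source.

```python
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)   # stabilise the exponentials
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def hessian_and_information_blocks(X, y, Y, alpha):
    """Return (H_hat, I_hat), each (d, d), for the 0-based class index alpha."""
    n = Y.shape[0]
    phi_a = softmax(Y @ X)[:, alpha]                   # phi_alpha(X^T Y_l)
    w_H = phi_a * (1.0 - phi_a)                        # Hessian weights
    w_I = ((y == alpha).astype(float) - phi_a) ** 2    # information weights
    H_hat = (Y * w_H[:, None]).T @ Y / n               # (1/n) sum_l w_l Y_l Y_l^T
    I_hat = (Y * w_I[:, None]).T @ Y / n
    return H_hat, I_hat

def gram_matrix(X, M):
    """G = (X, M)^T (X, M) with X of shape (d, C) and M of shape (d, k)."""
    XM = np.hstack([X, M])
    return XM.T @ XM

# Toy instance (hypothetical values): equal weights, identity class map.
rng = np.random.default_rng(1)
d, n, k, C, beta = 40, 80, 2, 2, 2.0
M = rng.standard_normal((d, k)) / np.sqrt(d)
X = rng.standard_normal((d, C)) / np.sqrt(d)
J = rng.integers(0, k, size=n)
Y = M[:, J].T + rng.standard_normal((n, d)) / np.sqrt(beta)
y = J
H_hat, I_hat = hessian_and_information_blocks(X, y, Y, alpha=0)
G = gram_matrix(X, M)
top_H = np.linalg.eigvalsh(H_hat)[-5:]                 # five largest eigenvalues
```

Both blocks are positive semidefinite by construction, and the off-diagonal block `G[:C, C:]` equals $X^\top M$, whose columns are the inner products of the parameter vectors with each mean $\mu_b$.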

Call $G$ feasible if $G\succeq 0$ and it is realizable as such a Gram matrix for some $d$ and vectors $(x_1,\dots,x_C,\mu_1,\dots,\mu_k)$. Fix feasible $G$ and $\alpha\in[C]$. Write $m_b(G)=G_{[C],\,C+b}\in\mathbb R^C$, $A(G)=G_{[C],[C]}^{1/2}$, and let $g\sim\mathcal N(0,I_q)$ with $g_{[C]}\in\mathbb R^C$. Define

$$f^{H}_{b,\alpha}(u;G)=\phi_\alpha(m_b(G)+A(G)u)\bigl(1-\phi_\alpha(m_b(G)+A(G)u)\bigr),$$

$$f^{I}_{b,\alpha}(u;G)=\bigl(\mathbf 1_{\{c(b)=\alpha\}}-\phi_\alpha(m_b(G)+A(G)u)\bigr)^2.$$

As in the source model, one may keep an explicit scale parameter $\lambda>0$; the formulas below are the $\lambda=1$ specialization used here. For $T\in\{H,I\}$, define $S^T_{\alpha}(z)$ as the Stieltjes transform solving

$$1+zS(z)=\phi\sum_{b=1}^k p_b\,\mathbb E\!\left[\frac{S(z)\,f^{T}_{b,\alpha}(\beta^{-1/2}g_{[C]};G)}{\beta\phi+S(z)\,f^{T}_{b,\alpha}(\beta^{-1/2}g_{[C]};G)}\right],$$

and let $\nu^T_{\alpha,\beta,G}$ be the corresponding effective bulk measure, with right edge $r^T_{\alpha}(\beta,G)=\sup\operatorname{supp}(\nu^T_{\alpha,\beta,G})$. For $z\in\mathbb C\setminus\operatorname{supp}(\nu^T_{\alpha,\beta,G})$, define

$$F^{T}_{\alpha,ij}(z;\beta,G)=\beta\phi\sum_{b=1}^k p_b\,\mathbb E\!\left[ \frac{f^{T}_{b,\alpha}(\beta^{-1/2}g_{[C]};G)}{\beta\phi+S^T_{\alpha}(z)\,f^{T}_{b,\alpha}(\beta^{-1/2}g_{[C]};G)} \,v_{b,i}\,v_{b,j}\right],$$

with $v_b=\beta^{-1/2}g+(\sqrt G)e_{C+b}\in\mathbb R^q$. Define effective outliers by

$$\mathcal Z^T_{\alpha}(\beta,G)=\Big\{z\in\mathbb R\setminus\operatorname{supp}(\nu^T_{\alpha,\beta,G}):\det\!\big(zI_q-F^T_{\alpha}(z;\beta,G)\big)=0\Big\},$$

counting roots with multiplicity, and define effective right-outliers

$$\mathcal O^T_{\alpha}(\beta,G)=\mathcal Z^T_{\alpha}(\beta,G)\cap\big(r^T_{\alpha}(\beta,G),\infty\big),\qquad \mathcal O_T(\beta,G)=\bigcup_{\alpha=1}^C \mathcal O^T_{\alpha}(\beta,G).$$

This setup follows Ben Arous et al. (2025).
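The fixed-point equation for $S^T_\alpha(z)$ can be explored numerically at a single point $z$ away from the real axis. The sketch below is one possible approach under stated assumptions, not the source's method: the expectation over $g_{[C]}$ is replaced by a Monte Carlo average, the equation is solved by a damped fixed-point iteration started from $-1/z$, and the names `f_samples`, `rhs`, `solve_stieltjes` and `ratio` (standing for the aspect ratio $\phi$) are hypothetical. Convergence is only expected for $z$ well separated from the bulk, and the sketch does not verify which solution branch is reached.

```python
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)   # stabilise the exponentials
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def f_samples(G, C, class_map, alpha, beta, g, kind):
    """Monte Carlo draws of f^T_{b,alpha}(beta^{-1/2} g_[C]; G); returns (k, N)."""
    k = G.shape[0] - C
    evals, evecs = np.linalg.eigh(G[:C, :C])
    A = (evecs * np.sqrt(np.clip(evals, 0.0, None))) @ evecs.T  # A(G) = G_[C],[C]^{1/2}
    out = np.empty((k, g.shape[0]))
    for b in range(k):
        logits = G[:C, C + b] + (g @ A) / np.sqrt(beta)  # m_b(G) + A(G) u, A symmetric
        phi_a = softmax(logits)[:, alpha]
        if kind == "H":
            out[b] = phi_a * (1.0 - phi_a)
        else:  # kind == "I"
            out[b] = (float(class_map[b] == alpha) - phi_a) ** 2
    return out

def rhs(S, f, p, beta, ratio):
    """ratio * sum_b p_b E[S f / (beta*ratio + S f)], expectation as a sample mean."""
    return ratio * np.sum(p * np.mean(S * f / (beta * ratio + S * f), axis=1))

def solve_stieltjes(z, f, p, beta, ratio, iters=500, damping=0.5):
    """Damped iteration on S = (rhs(S) - 1) / z, a rearrangement of 1 + z S = rhs(S)."""
    S = -1.0 / z                           # large-|z| behaviour as the starting guess
    for _ in range(iters):
        S = (1.0 - damping) * S + damping * (rhs(S, f, p, beta, ratio) - 1.0) / z
    return S

# Toy instance (hypothetical values): a feasible G built from random vectors.
rng = np.random.default_rng(2)
d, k, C, beta, ratio = 30, 2, 2, 2.0, 2.0
XM = rng.standard_normal((d, C + k)) / np.sqrt(d)
G = XM.T @ XM
p = np.array([0.5, 0.5])
class_map = np.array([0, 1])
g = rng.standard_normal((20000, C))        # samples of g_[C]
f_H = f_samples(G, C, class_map, alpha=0, beta=beta, g=g, kind="H")
z = 5.0 + 1.0j
S = solve_stieltjes(z, f_H, p, beta, ratio)
residual = abs(1.0 + z * S - rhs(S, f_H, p, beta, ratio))
```

The `residual` measures how well the returned value satisfies the equation for the sampled expectation. Passing `kind="I"` gives the information-block weights instead.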

Unsolved Problem

For fixed $\phi$ and fixed feasible $G$, determine whether there is a sharp critical value $\beta_c(G)\in[0,\infty]$ for outlier emergence.

The source proves the high-SNR existence direction: for each fixed feasible pair $(\lambda,G)$ there exists $\beta_0(\lambda,G)<\infty$ such that effective outliers are present for all $\beta>\beta_0(\lambda,G)$. (This entry uses the $\lambda=1$ specialization.) Hence, in the present notation, one has the proved direction

$$\exists\,\beta_0(G)<\infty\ \text{such that}\ \beta>\beta_0(G)\Rightarrow \mathcal O_H(\beta,G)\cup\mathcal O_I(\beta,G)\neq\varnothing.$$

The unproved complementary direction is a low-SNR exclusion theorem:

$$\exists\,\beta_1(G)>0\ \text{such that}\ \beta<\beta_1(G)\Rightarrow \mathcal O_H(\beta,G)=\mathcal O_I(\beta,G)=\varnothing.$$

A full resolution would identify whether these two regimes meet at a single threshold $\beta_c(G)$ (and under what additional assumptions).

§ Significance & Implications

A sharp threshold would turn the paper's qualitative spectral-transition picture into a sharp phase diagram indexed by $(\beta,G)$. It would also connect the geometry of the loss landscape to a single order parameter controlling when informative low-dimensional directions appear. See Ben Arous et al. (2025) for details.

§ Known Partial Results

  • Ben Arous et al. (2025): Corollary 1.8 (arXiv:2502.15655v3) proves large-$\beta$ outlier emergence for each fixed $(\lambda,G)$ in the source model, and the paper also provides fixed-point characterizations of the bulk and the outliers. The complementary small-$\beta$ no-outlier direction and a single sharp critical threshold remain unproved. Open as of arXiv v3 (January 22, 2026).

§ References

[1] Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath (2025). Local geometry of high-dimensional mixture models: Effective spectral theory and dynamical transitions. Annals of Statistics (to appear).

📍 Section 1.3.2 ("The effective spectrum at initialization and along the training trajectory"), Corollary 1.8 (large-$\beta$ outlier existence for fixed $(\lambda,G)$) and the immediately following paragraph (explicitly noting the missing complementary small-$\beta$ no-outlier result) in arXiv:2502.15655v3.

Source paper where this problem appears.

§ Tags