Critical SNR for outlier emergence at fixed summary statistics
Sourced from the work of Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath
§ Problem Statement
Setup
Fix integers , a mixture-weight vector with and , a class map , and an aspect ratio . For with , consider i.i.d. samples generated by
where is the signal-to-noise ratio (inverse covariance scale). Let be the parameter matrix for multiclass logistic regression with loss
For each class index , define the empirical Hessian and information (gradient-second-moment) blocks
where . Set . Let and define the summary statistic (Gram) matrix
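As a rough numerical illustration of the objects just defined, the following sketch builds one diagonal Hessian block and one information block for softmax regression on synthetic mixture data. It assumes the standard softmax cross-entropy loss and a mixture of the form "component mean plus Gaussian noise scaled by 1/sqrt(SNR)"; the source's exact normalization and notation are not reproduced here, and the names `simulate_mixture` and `hessian_and_information_blocks` are hypothetical.

```python
import numpy as np

def simulate_mixture(n, d, means, weights, snr, rng):
    """Draw n points from a Gaussian mixture with covariance I/snr (assumed form)."""
    comp = rng.choice(len(weights), size=n, p=weights)  # hidden component index
    x = means[comp] + rng.standard_normal((n, d)) / np.sqrt(snr)
    return x, comp

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def hessian_and_information_blocks(x, labels, theta, a):
    """Diagonal (a, a) blocks for softmax cross-entropy (assumed loss).

    x: (n, d) data, labels: (n,) class indices, theta: (C, d) parameters.
    Returns the empirical Hessian block and the gradient-second-moment block.
    """
    n = x.shape[0]
    p = softmax(x @ theta.T)[:, a]          # predicted probability of class a
    y = (labels == a).astype(float)         # one-hot indicator for class a
    hess_w = p * (1.0 - p)                  # per-sample Hessian weight
    info_w = (y - p) ** 2                   # per-sample squared-gradient weight
    H = (x * hess_w[:, None]).T @ x / n
    G = (x * info_w[:, None]).T @ x / n
    return H, G
```

Feeding either block to `np.linalg.eigvalsh` for a range of SNR values gives a finite-size view of the spectrum whose limiting bulk and outliers this entry concerns.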
Call feasible if and it is realizable as such a Gram matrix for some and vectors . Fix feasible and . Write , , and let with . Define
As in the source model, one may keep an explicit scale parameter ; the formulas below are the specialization used here. For , define as the Stieltjes transform solving
and let be the corresponding effective bulk measure, with right edge . For , define
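The passage from a Stieltjes transform to a bulk density and its right edge can be sketched numerically. Because the fixed-point equation of this entry is not reproduced here, the example swaps in the Marchenko-Pastur law as a stand-in: its Stieltjes transform solves a known quadratic, and its right edge is known in closed form. The function names and the parameter `gamma` are assumptions made for illustration only.

```python
import numpy as np

def mp_stieltjes(z, gamma):
    """Stieltjes transform of the Marchenko-Pastur law (unit variance, ratio gamma).

    Solves gamma*z*m^2 + (z + gamma - 1)*m + 1 = 0 and keeps the root with the
    largest imaginary part, which is the valid branch for Im z > 0.
    """
    roots = np.roots([gamma * z, z + gamma - 1.0, 1.0])
    return roots[np.argmax(roots.imag)]

def density(stieltjes, x, eta=1e-8):
    """Stieltjes inversion: density(x) is approximately Im m(x + i*eta) / pi."""
    return stieltjes(x + 1j * eta).imag / np.pi

def right_edge(stieltjes, grid, eta=1e-8, tol=1e-4):
    """Largest grid point where the inverted density exceeds tol (None if empty)."""
    dens = np.array([density(stieltjes, x, eta) for x in grid])
    support = grid[dens > tol]
    return support.max() if support.size else None

gamma = 0.5  # hypothetical aspect ratio for the stand-in law
edge = right_edge(lambda z: mp_stieltjes(z, gamma), np.linspace(0.01, 4.0, 4000))
```

For the stand-in law, `edge` should land within one grid step of `(1 + sqrt(gamma))**2`. The same inversion would apply to the effective bulk measure once its own fixed-point solver is substituted for `mp_stieltjes`.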
with . Define effective outliers by
counting roots with multiplicity, and define effective right-outliers
This setup follows Ben Arous et al. (2025).
Unsolved Problem
For fixed and fixed feasible , determine whether there is a sharp critical value for outlier emergence.
A proved high-SNR existence direction in the source is: for each fixed feasible pair there exists such that effective outliers are present for all . (This entry uses the specialization.) Hence, in the present notation, one has the proved direction
The unproved complementary direction is a low-SNR exclusion theorem:
A full resolution would identify whether these two regimes meet at a single threshold (and under what additional assumptions).
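A numerical exploration of this question might scan a grid of SNR values and record where outliers are detected, then check whether the detections form a single "off then on" pattern. The sketch below assumes a user-supplied predicate `has_outlier(snr)` (hypothetical; in practice it would come from solving the effective outlier equation or from a finite-size eigenvalue computation). It makes no claim that the pattern is monotone, since that is exactly what is unproved.

```python
def scan_for_threshold(has_outlier, snr_grid):
    """Scan SNR values and test whether outlier presence is monotone on the grid.

    Returns the smallest grid SNR with an outlier (or None) and a flag saying
    whether the detections are consistent with a single threshold.
    """
    grid = sorted(snr_grid)
    flags = [bool(has_outlier(s)) for s in grid]
    first = next((s for s, f in zip(grid, flags) if f), None)
    # False < True, so a sorted flag list means "all off, then all on".
    single = flags == sorted(flags)
    return {"first_outlier_snr": first, "single_threshold_on_grid": single}

def toy_predicate(snr):
    """Hypothetical stand-in predicate: outliers appear above SNR = 2."""
    return snr > 2.0

result = scan_for_threshold(toy_predicate, [0.5, 1.0, 2.0, 3.0, 4.0])
```

A grid scan can only suggest a threshold; a re-entrant pattern (on, off, on) would show up as `single_threshold_on_grid` being false, while a monotone pattern on a finite grid proves nothing about intermediate values.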
§ Discussion
§ Significance & Implications
Resolving the problem would turn the paper’s qualitative spectral-transition picture into a sharp phase diagram indexed by . It would also connect the geometry of the loss landscape to a single order parameter controlling when informative low-dimensional directions appear. See Ben Arous et al. (2025) for details.
§ Known Partial Results
Ben Arous et al. (2025, arXiv:2502.15655v3): Corollary 1.8 proves large-$\beta$ outlier emergence (for each fixed $(\lambda,G)$ in the source model), and the paper also provides fixed-point characterizations of the bulk and the outliers. The complementary small-$\beta$ no-outlier direction and a single sharp critical threshold remain unproved. Open as of arXiv v3 (January 22, 2026).
§ References
Gerard Ben Arous, Reza Gheissari, Jiaoyang Huang, Aukosh Jagannath (2025)
Annals of Statistics (to appear)
📍 Section 1.3.2 ("The effective spectrum at initialization and along the training trajectory"), Corollary 1.8 (large-$\beta$ outlier existence for fixed $(\lambda,G)$) and the immediately following paragraph (explicitly noting the missing complementary small-$\beta$ no-outlier result) in arXiv:2502.15655v3.
Source paper where this problem appears.