Partially Resolved

Exploit structure of the overlap matrix beyond trace/spectral gap

Sourced from the work of Yanjun Han, Jonathan Niles-Weed

§ Problem Statement

Setup

Let (X,F)(\mathcal X,\mathcal F) be a measurable space, let n2n\ge 2, and let P1,,PnP_1,\dots,P_n be probability measures on (X,F)(\mathcal X,\mathcal F). Define their average measure

Pˉ:=1ni=1nPi,\bar P:=\frac1n\sum_{i=1}^n P_i,

and assume PiPˉP_i\ll \bar P for every ii (so all Radon-Nikodym derivatives fi:=dPi/dPˉf_i:=dP_i/d\bar P are well-defined). Define the exchangeable permutation-mixture law Pn\mathbb P_n on (Xn,Fn)(\mathcal X^n,\mathcal F^{\otimes n}) by

Pn:=1n!πSnk=1nPπ(k),\mathbb P_n:=\frac1{n!}\sum_{\pi\in S_n}\bigotimes_{k=1}^n P_{\pi(k)},

where SnS_n is the symmetric group on {1,,n}\{1,\dots,n\}. Define its i.i.d. counterpart

Qn:=Pˉn.\mathbb Q_n:=\bar P^{\otimes n}.

Assume PnQn\mathbb P_n\ll \mathbb Q_n and define

χ2(PnQn):=(dPndQn1)2dQn.\chi^2(\mathbb P_n\|\mathbb Q_n):=\int\left(\frac{d\mathbb P_n}{d\mathbb Q_n}-1\right)^2\,d\mathbb Q_n.

Introduce the overlap matrix ARn×nA\in\mathbb R^{n\times n} by

Aij:=1nfifjdPˉ=1ndPidPjdPˉ,1i,jn.A_{ij}:=\frac1n\int f_i f_j\,d\bar P=\frac1n\int \frac{dP_i\,dP_j}{d\bar P},\qquad 1\le i,j\le n.

Then AA is symmetric, positive semidefinite, and has row/column sums equal to 11.

This setup follows Han & Niles-Weed (2024).

Unsolved Problem

Find structural assumptions on AA that are strictly finer than global spectral summaries (for example, tr(A)\operatorname{tr}(A) or the second-largest eigenvalue λ2(A)\lambda_2(A)) and that imply sharper finite-nn upper bounds on χ2(PnQn)\chi^2(\mathbb P_n\|\mathbb Q_n). In particular, characterize classes of matrices AA (equivalently, classes of families {Pi}i=1n\{P_i\}_{i=1}^n) for which one can prove explicit nonasymptotic rates

χ2(PnQn)Ψn(A),\chi^2(\mathbb P_n\|\mathbb Q_n)\le \Psi_n(A),

with Ψn(A)\Psi_n(A) depending on refined structure and improving provably over bounds that depend only on tr(A)\operatorname{tr}(A) and/or λ2(A)\lambda_2(A). Proposed directions (illustrative, unless directly established in cited sources) include entrywise heterogeneity, sparsity, block structure, and other higher-order invariants.

§ Discussion

Loading discussion…

§ Significance & Implications

In Han & Niles-Weed (2024), the current general bounds are broad but may be conservative for structured families. Subsequent follow-up work in September 2025 indicates at least partial progress on structure-sensitive refinements, while a complete characterization remains open.

§ Known Partial Results

  • Follow-up on permutation-mixture overlap-structure bounds (2025): The 2025 arXiv v3 paper develops a general (\chi^2)-control framework via symmetric-polynomial and permanent inequalities and shows model-agnostic bounds. A September 2025 follow-up (arXiv:2509.12584) provides at least partial progress toward structure-sensitive improvements, but the full matrix-structural characterization remains open.

§ References

[1]

Approximate independence of permutation mixtures

Yanjun Han, Jonathan Niles-Weed (2025)

Annals of Statistics (to appear)

📍 Section 6.1 (Discussion: Tightness of upper bounds), paragraph immediately after Lemma 6.2 ("for specific $P$, further properties of the matrix $A$...") on p. 26 (arXiv v3; Section 5.1 in earlier numbering).

Primary source paper where this problem is articulated.

[2]

Follow-up on permutation-mixture overlap-structure bounds

arXiv preprint

📍 Follow-up reference (arXiv:2509.12584), used for status qualification rather than a full closure claim.

Post-September 2025 follow-up reference added to document partial progress and avoid stale all-open interpretation.

§ Tags