Unsolved

Remove polylogarithmic dimension factors in high-dimensional Berry--Esseen bounds for $m$-dependent sums

Sourced from the work of Heejong Bong, Arun Kumar Kuchibhotla, Alessandro Rinaldo

§ Problem Statement

Setup

Let $q>2$, $m,n,d\in\mathbb N$, and let $X_1,\dots,X_n$ be random vectors in $\mathbb R^d$. Assume:

  1. The $X_i$ are mean-zero: $\mathbb E[X_i]=0$ for all $i$.

  2. $X_1,\dots,X_n$ are $m$-dependent, meaning that for every $k\in\{1,\dots,n\}$, the sigma-fields $\sigma(X_i:i\le k)$ and $\sigma(X_j:j\ge k+m+1)$ are independent.

  3. A uniform $q$-moment bound holds: for some finite constant $M_q$, $\max_{1\le i\le n}\max_{1\le j\le d}\mathbb E|X_{ij}|^q\le M_q^q$.

  4. For

$$S_n:=\frac{1}{\sqrt n}\sum_{i=1}^n X_i,\qquad \Sigma_n:=\operatorname{Cov}(S_n),$$

the covariance $\Sigma_n$ is nondegenerate in coordinates, e.g. $\min_{1\le j\le d}(\Sigma_n)_{jj}\ge \underline{\sigma}^2$ for some $\underline{\sigma}>0$.

Let $Z\sim N(0,\Sigma_n)$, and let $\mathcal H_d$ be the class of axis-aligned hyper-rectangles

$$\mathcal H_d:=\left\{\prod_{j=1}^d (a_j,b_j]:\ -\infty\le a_j\le b_j\le \infty\right\}.$$

Define the rectangle Kolmogorov distance

$$\Delta_{n,d}:=\sup_{A\in\mathcal H_d}\left|\mathbb P(S_n\in A)-\mathbb P(Z\in A)\right|.$$
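To make the setup concrete, here is a minimal simulation sketch in plain Python. It is an illustration under stated assumptions, not the construction used in the source paper. The toy model (moving averages of centered exponential innovations with independent coordinates) and the names `m_dependent_rows`, `sn_variance`, and `max_rectangle_gap` are hypothetical choices made for this example. The sketch only looks at the sub-class of rectangles $(-\infty,t]^d$, so up to Monte Carlo error of order $1/\sqrt{\text{reps}}$ it estimates a quantity that is at most the rectangle Kolmogorov distance; it does not compute the full supremum over all hyper-rectangles.

```python
import math
import random


def m_dependent_rows(n, d, m, rng):
    """Return n rows in R^d; row i averages innovations i..i+m,
    so rows whose indices differ by more than m share no innovation."""
    # Assumption: centered exponential innovations (mean 0, variance 1,
    # skewed, all moments finite), independent across coordinates.
    eps = [[rng.expovariate(1.0) - 1.0 for _ in range(d)] for _ in range(n + m)]
    scale = 1.0 / math.sqrt(m + 1)
    return [[scale * sum(eps[i + k][j] for k in range(m + 1)) for j in range(d)]
            for i in range(n)]


def sn_variance(n, m):
    """Exact variance of each coordinate of S_n for the toy model above."""
    # Cov(X_i, X_{i+h}) = (m + 1 - h) / (m + 1) for 1 <= h <= m, else 0.
    lagged = sum(max(n - h, 0) * (m + 1 - h) for h in range(1, m + 1))
    return 1.0 + 2.0 * lagged / (n * (m + 1))


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def max_rectangle_gap(n, d, m, reps, seed=0):
    """Monte Carlo estimate of sup_t |P(max_j S_nj <= t) - P(max_j Z_j <= t)|."""
    rng = random.Random(seed)
    sd = math.sqrt(sn_variance(n, m))
    maxima = []
    for _ in range(reps):
        rows = m_dependent_rows(n, d, m, rng)
        s_n = [sum(row[j] for row in rows) / math.sqrt(n) for j in range(d)]
        maxima.append(max(s_n))
    maxima.sort()
    gap = 0.0
    for i, t in enumerate(maxima):
        # Coordinates of Z are independent in this toy model (Sigma_n = v * I),
        # so the Gaussian probability of (-inf, t]^d is a d-th power.
        gauss = std_normal_cdf(t / sd) ** d
        # Compare with the empirical CDF just before and at the jump at t.
        gap = max(gap, abs((i + 1) / reps - gauss), abs(i / reps - gauss))
    return gap


if __name__ == "__main__":
    for d in (1, 5, 20):
        print(d, max_rectangle_gap(n=100, d=d, m=2, reps=300, seed=1))
```

In this toy model every coordinate of $S_n$ has variance at least 1, so the nondegeneracy assumption holds with $\underline{\sigma}=1$. The printed values are noisy estimates and should not be read as evidence for or against the conjectured rate.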

Unsolved Problem

Under the assumptions above, does there exist a constant $C$ depending only on fixed model parameters (for example only on $q$, $M_q$, and $\underline{\sigma}$), but independent of $d,n,m$, such that for all $d,n,m$ and all such $m$-dependent arrays,

$$\Delta_{n,d}\le C\,\frac{m^{(q-1)/(q-2)}}{\sqrt n},$$

that is, with no additional multiplicative $\operatorname{polylog}(d)$ factor?
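As a quick numerical aid, the following sketch evaluates the dimension-free rate $m^{(q-1)/(q-2)}/\sqrt n$ with the unknown constant $C$ omitted. The function name `conjectured_rate` is hypothetical, and the restriction to $m\ge 1$ is an assumption of this sketch so that the expression is informative.

```python
import math


def conjectured_rate(n, m, q):
    """Return m^{(q-1)/(q-2)} / sqrt(n); the constant C is left out."""
    if q <= 2:
        raise ValueError("q must exceed 2")
    if m < 1 or n < 1:
        raise ValueError("this sketch assumes m >= 1 and n >= 1")
    return m ** ((q - 1) / (q - 2)) / math.sqrt(n)


if __name__ == "__main__":
    # The exponent of m is 2 at q = 3, 1.5 at q = 4, and tends to 1 as q grows.
    for q in (2.5, 3, 4, 8, 100):
        print(q, (q - 1) / (q - 2), conjectured_rate(n=10_000, m=5, q=q))
```

The exponent $(q-1)/(q-2)$ grows without bound as $q\downarrow 2$, so for a fixed $n$ the right-hand side stays small only when $m$ is small relative to $n^{(q-2)/(2(q-1))}$.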


§ Significance & Implications

Bong, Kuchibhotla, and Rinaldo’s arXiv record is currently 2306.14299v3 (latest arXiv version; revised 2025-08-29), and the work is listed as accepted at Annals of Statistics (2025). Their bounds are sharp in $n$ and $m$ up to logarithmic factors in $d$; removing (or proving unavoidable) these dimension-log factors would clarify optimal high-dimensional Gaussian approximation rates under $m$-dependence.

§ Known Partial Results

  • Bong et al. (2025): This paper proves sharp high-dimensional bounds with only polylogarithmic dependence on $d$ and optimal scaling in $m$ and $n$, namely $m^{(q-1)/(q-2)}/\sqrt n$. In univariate settings, matching optimal rates are known (up to logs as stated in the abstract).

§ References

[1]

Dual Induction CLT for High-dimensional m-dependent Data

Heejong Bong, Arun Kumar Kuchibhotla, Alessandro Rinaldo (2025)

Annals of Statistics (accepted, 2025)

📍 Section 3 (Discussion), item 2, p. 14 (arXiv v3 manuscript).

Source paper where this problem appears; latest arXiv version is v3 (revised 2025-08-29).
