Unsolved

Monotonicity of Learning

Posed by Tom Viering et al. (2019)

§ Problem Statement

Setup

Let $(Z,\mathcal{Z})$ be a measurable space and let $D$ be an unknown probability distribution on $Z$. For each $n\in\mathbb{N}$, let $S_n=(z_1,\dots,z_n)\sim D^n$ be an i.i.d. sample. Fix a hypothesis class $H$ and a measurable loss $L:H\times Z\to\mathbb{R}$. Define the population (true) risk $L_D(h)=\mathbb{E}_{z\sim D}[L(h,z)]$.

An empirical risk minimization (ERM) learner is a deterministic rule $A$ such that for every sample $S_n$,

$$A(S_n)\in\arg\min_{h\in H}\;\frac{1}{n}\sum_{i=1}^n L(h,z_i),$$

with a fixed tie-breaking rule so that $A(S_n)$ is single-valued.

Define the expected population risk at sample size $n$ by

$$R_n(D):=\mathbb{E}_{S_n\sim D^n}\big[L_D(A(S_n))\big].$$

(Here the expectations for $n$ and $n+1$ are over fresh i.i.d. samples from $D^n$ and $D^{n+1}$, respectively.)

The learner $A$ is (locally) risk-monotone at $(D,n)$ if $R_{n+1}(D)\le R_n(D)$. Call $A$ $Z$-monotone if this inequality holds for all $n\in\mathbb{N}$ and all distributions $D$ on $Z$. Call $A$ weakly $Z$-monotone if there exists an integer $N$ (independent of $D$) such that $R_{n+1}(D)\le R_n(D)$ holds for all $n\ge N$ and all distributions $D$ on $Z$.
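As a sanity check on these definitions, $R_n(D)$ can be estimated by Monte Carlo. The sketch below is our own illustration (not one of the problem's setups, and all names and the Gaussian choice of $D$ are assumptions of the sketch): it uses the simplest ERM, the sample mean under squared loss over the class of constant predictors, where $R_n(D)=\mathrm{Var}(z)+\mathrm{Var}(z)/n$ is in fact monotone in $n$.

```python
import numpy as np

rng = np.random.default_rng(0)

def erm_mean(sample):
    # ERM for squared loss over constant predictors H = R is the sample mean.
    return sample.mean()

def population_risk(h, mu, var):
    # L_D(h) = E[(h - z)^2] = var + (h - mu)^2 when z has mean mu and variance var.
    return var + (h - mu) ** 2

def estimate_Rn(n, mu=0.3, var=0.04, trials=20000):
    # Monte Carlo estimate of R_n(D) = E_{S_n ~ D^n}[L_D(A(S_n))]
    # for an illustrative D = N(mu, var) (an assumption of this sketch).
    samples = rng.normal(mu, np.sqrt(var), size=(trials, n))
    h = samples.mean(axis=1)          # A(S_n) for each simulated sample
    return population_risk(h, mu, var).mean()

# Closed form for comparison: R_n(D) = var + var / n, which decreases in n.
```

For this toy learner the estimated curve $n\mapsto R_n(D)$ matches $\mathrm{var}+\mathrm{var}/n$ up to Monte Carlo noise; the open problem concerns ERM setups where no such monotone closed form is available.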

Unsolved Problem

For ERM learners, determine conditions on $(Z,H,L)$ and/or restrictions on $D$ (e.g., well-specification/realizability assumptions) that are sufficient and/or necessary for $Z$-monotonicity or weak $Z$-monotonicity.

In particular, resolve $Z$-monotonicity and weak $Z$-monotonicity (existence of such an $N$, and if it exists, the smallest valid $N$) for the following ERM setups on bounded domains:

  1. Normal variance estimation: $Z\subseteq[-1,1]$, $H=\{\mathcal{N}(0,\sigma^2):\sigma>0\}$, and $L$ is the negative log-likelihood under $\mathcal{N}(0,\sigma^2)$; ERM corresponds to maximum likelihood with $\hat\sigma^2=\frac{1}{n}\sum_{i=1}^n z_i^2$.

  2. Univariate linear regression: $Z=X\times Y$ with $X\subseteq[-1,1]$ and $Y\subseteq[0,1]$, $H=\{h_w(x)=wx:w\in\mathbb{R}\}$, and $L(h_w,(x,y))=(wx-y)^2$; ERM minimizes the empirical mean squared error over $w$.
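Setup 1 can be probed numerically, though only as a sketch, not a resolution. The block below computes the Gaussian MLE $\hat\sigma^2$, the exact population negative log-likelihood $L_D(\sigma)=\tfrac12\log(2\pi\sigma^2)+\mathbb{E}[z^2]/(2\sigma^2)$, and a Monte Carlo estimate of $R_n(D)$; the choice $D=\mathrm{Uniform}[-1,1]$ and all function names are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def nll_population(sigma2, second_moment):
    # Population risk of N(0, sigma2) under negative log-likelihood when
    # E[z^2] = second_moment:  0.5*log(2*pi*sigma2) + second_moment/(2*sigma2).
    return 0.5 * np.log(2 * np.pi * sigma2) + second_moment / (2 * sigma2)

def erm_sigma2(sample):
    # ERM (maximum likelihood) for the zero-mean Gaussian model.
    return np.mean(sample ** 2)

def estimate_Rn(n, trials=50000):
    # Monte Carlo estimate of R_n(D) for the illustrative D = Uniform[-1, 1],
    # where E[z^2] = 1/3. Caution: tiny values of sigma2_hat produce heavy
    # right tails, so estimates at small n are noisy.
    z = rng.uniform(-1.0, 1.0, size=(trials, n))
    s2 = (z ** 2).mean(axis=1)        # the MLE for each simulated sample
    return nll_population(s2, 1.0 / 3.0).mean()
```

Note that $L_D(\sigma)$ is minimized exactly at $\sigma^2=\mathbb{E}[z^2]$, so any fluctuation of $\hat\sigma^2$ around that value costs population risk; whether the resulting curve $n\mapsto R_n(D)$ must eventually be monotone is precisely what the problem asks.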


§ Significance & Implications

This problem asks for distribution-free (or assumption-qualified) guarantees that the expected population risk of ERM does not increase when the sample size grows by one. Such a guarantee is strictly finer than standard generalization or PAC-style statements because it constrains the entire learning curve $n\mapsto R_n(D)$, not just asymptotic consistency or rates. Pinning down when monotonicity holds (or must fail) clarifies whether non-monotone learning curves are an unavoidable consequence of ERM with finite samples, or an artifact of modeling choices (loss, hypothesis class, tie-breaking) or of missing distributional assumptions.

§ Known Partial Results

  • Viering et al. (2019): formalize expected risk monotonicity for learning algorithms, including the distribution-free notions of $Z$-monotonicity and eventual (weak) $Z$-monotonicity for ERM.

  • Viering et al. (2019): The same note provides examples showing that risk monotonicity can hold in some ERM settings and fail in others, even when monotonicity is evaluated in expectation over the random training sample.

  • Viering et al. (2019): The note relates risk monotonicity to PAC-learnability, emphasizing that standard learnability/generalization guarantees (which improve with $n$ in typical bounds) do not automatically imply that $R_{n+1}(D)\le R_n(D)$ for every $n$ and every $D$.

§ References

[1]

Open Problem: Monotonicity of Learning

Tom Viering, Alexander Mey, Marco Loog (2019)

Conference on Learning Theory (COLT), PMLR 99

📍 Open-problem note in COLT proceedings.

