Can Local Regularization Learn All Multiclass Problems?
Posed by Julian Asilis et al. (2024)
§ Problem Statement
Setup
Fix a domain $\mathcal{X}$, a label set $\mathcal{Y}$ (possibly infinite), and a hypothesis class $\mathcal{H} \subseteq \mathcal{Y}^{\mathcal{X}}$. For a distribution $D$ over $\mathcal{X} \times \mathcal{Y}$ and a predictor $h : \mathcal{X} \to \mathcal{Y}$, define the $0$-$1$ risk $L_D(h) = \Pr_{(x,y) \sim D}[h(x) \neq y]$. For a sample $S = ((x_1, y_1), \ldots, (x_n, y_n))$, define the empirical risk $L_S(h) = \frac{1}{n}\sum_{i=1}^{n} \mathbb{1}[h(x_i) \neq y_i]$ and the set of sample-consistent hypotheses $\mathcal{H}_S = \{h \in \mathcal{H} : L_S(h) = 0\}$. Call $D$ realizable w.r.t. $\mathcal{H}$ if there exists $h^\star \in \mathcal{H}$ with $L_D(h^\star) = 0$.
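These definitions are mechanical for a finite class. A minimal sketch, assuming hypotheses are represented as Python dicts mapping domain points to labels (an illustrative encoding, not from the note):

```python
def empirical_risk(h, S):
    """L_S(h): fraction of sample pairs (x, y) that h mislabels."""
    return sum(1 for x, y in S if h[x] != y) / len(S)

def consistent_set(H, S):
    """H_S: hypotheses in H with zero empirical risk on S."""
    return [h for h in H if all(h[x] == y for x, y in S)]

# Toy class over domain {0, 1, 2} with labels {'a', 'b'}.
H = [
    {0: 'a', 1: 'a', 2: 'a'},
    {0: 'a', 1: 'b', 2: 'b'},
    {0: 'b', 1: 'b', 2: 'a'},
]
S = [(0, 'a'), (1, 'b')]
print(empirical_risk(H[0], S))    # 0.5: H[0] mislabels x = 1
print(len(consistent_set(H, S)))  # 1: only the second hypothesis fits S
```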
A (possibly improper) learning algorithm $A$ maps each finite sample $S \in (\mathcal{X} \times \mathcal{Y})^*$ to a predictor $A(S) : \mathcal{X} \to \mathcal{Y}$. We say $A$ PAC-learns $\mathcal{H}$ in the realizable setting if there exists a sample complexity function $m : (0,1)^2 \to \mathbb{N}$ such that for every realizable $D$ and every $\epsilon, \delta \in (0,1)$, if $S \sim D^n$ with $n \geq m(\epsilon, \delta)$ then $\Pr_{S \sim D^n}\big[L_D(A(S)) \leq \epsilon\big] \geq 1 - \delta$.
Let $m_{\mathcal{H}}(\epsilon, \delta)$ denote the pointwise-optimal realizable PAC sample complexity over all learners for $\mathcal{H}$.
A local regularizer for $\mathcal{H}$ is a function $\psi : \mathcal{H} \times \mathcal{X} \to \mathbb{R}_{\geq 0}$. A learner $A$ is induced by $\psi$ if for every sample $S$ and every test point $x \in \mathcal{X}$, $A(S)(x) = h(x)$ for some $h \in \operatorname{argmin}_{h' \in \mathcal{H}_S} \psi(h', x)$.
(Thus $\psi$ may induce multiple learners via tie-breaking.) Say that $\psi$ learns $\mathcal{H}$ if every learner induced by $\psi$ PAC-learns $\mathcal{H}$.
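The induced-learner definition above can be sketched for a finite class; the dict encoding and the tie-breaking rule (Python's `min` returns the first minimizer) are illustrative assumptions, not part of the note:

```python
def induced_learner(H, psi):
    """Learner induced by a local regularizer psi(h, x) over a finite class H."""
    def A(S):
        # Sample-consistent hypotheses H_S.
        H_S = [h for h in H if all(h[x] == y for x, y in S)]
        def predict(x):
            # Pointwise rule: minimize psi(., x) over H_S, output that label.
            return min(H_S, key=lambda h: psi(h, x))[x]
        return predict
    return A

# Toy regularizer preferring a default label 'a' at the test point.
H = [{0: 'a', 1: 'a'}, {0: 'a', 1: 'b'}]
psi = lambda h, x: 0 if h[x] == 'a' else 1
predict = induced_learner(H, psi)([(0, 'a')])  # both hypotheses consistent
print(predict(1))  # 'a': psi breaks the tie toward the default label
```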
Unsolved Problem
Open Problem (Asilis et al., 2024). For multiclass classification, is it true that every realizable-PAC-learnable hypothesis class $\mathcal{H}$ can be learned by some local regularizer $\psi$? If yes, can one choose $\psi$ so that at least one induced learner has sample complexity matching $m_{\mathcal{H}}(\epsilon, \delta)$ up to constant factors (or, more permissively, up to factors polylogarithmic in $1/\epsilon$ and $1/\delta$)?
§ Discussion
§ Significance & Implications
Local regularization is an ERM-like template: it selects, for each test point $x$, a label among sample-consistent hypotheses by minimizing a pointwise regularizer $\psi(h, x)$. The question is whether this restricted optimization-based mechanism is universal for realizable multiclass PAC learning, despite known phenomena unique to multiclass settings (e.g., the need for improper prediction and the use of one-inclusion-graph-based constructions in existing characterizations). A positive answer would yield a simple, uniform recipe for building learners (and potentially near-optimal learners) from an appropriate choice of $\psi$, avoiding the explicit combinatorial machinery of known optimal learners. A negative answer would separate learnability from learnability-by-local-regularization, pinpointing when dependence on the unlabeled geometry of the sample (as in one-inclusion graph methods) is genuinely necessary and cannot be eliminated by any pointwise regularizer.
§ Known Partial Results
Asilis et al. (2024): The COLT 2024 note formalizes local regularization via regularizers $\psi : \mathcal{H} \times \mathcal{X} \to \mathbb{R}_{\geq 0}$ and induced learners that, for each test point $x$, output $h(x)$ for a sample-consistent $h$ minimizing $\psi(\cdot, x)$.
Asilis et al. (2024): The note highlights that, unlike binary classification, where ERM over $\mathcal{H}$ attains near-optimal sample complexity whenever the VC dimension is finite, multiclass learning can require qualitatively different (improper) learners.
Daniely & Shalev-Shwartz (2014): show there exist realizable, PAC-learnable multiclass classes for which any optimal generic learner must be improper; in particular, ERM (and other proper templates optimizing over $\mathcal{H}$) cannot be universally optimal.
Asilis et al. (2024): Existing multiclass learners/characterizations discussed in the note rely on orientations of one-inclusion graphs associated with $\mathcal{H}$, which can depend on the unlabeled sample inputs and may be infinite when the label set $\mathcal{Y}$ is infinite.
Asilis et al. (2024): study a strengthened variant, unsupervised local regularization, where the regularizer may also depend on the unlabeled datapoints in the training sample; the note states this variant can learn all multiclass problems with near-optimal sample complexity, leaving open whether the unlabeled dependence can be removed.
Asilis et al. (2024): The note proposes a candidate counterexample class with an infinite domain and infinite label set, defined from triples $(A, B, C)$ of finite subsets with prescribed sizes and pairwise intersections of a fixed size; each hypothesis outputs a default label off $A \cup B \cup C$ and is constant on each of $A$, $B$, and $C$, with values chosen from the corresponding pair of set-labels. The class is argued to be PAC learnable by a simple default-label rule, but the note conjectures that symmetry may prevent any (supervised) local regularizer from selecting the "natural" label without effectively using unlabeled sample geometry, leading to Open Problem 2 for this class.
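The strengthened (unsupervised) variant discussed in the results above can be sketched the same way for a finite class; the signature `psi(h, x, xs)` and the toy locality-style regularizer are illustrative assumptions only:

```python
def unsupervised_induced_learner(H, psi):
    """Induced learner whose regularizer also sees the unlabeled inputs xs."""
    def A(S):
        xs = [x for x, _ in S]  # unlabeled sample inputs, visible to psi
        H_S = [h for h in H if all(h[x] == y for x, y in S)]
        def predict(x):
            # psi may now depend on the unlabeled geometry (xs, x).
            return min(H_S, key=lambda h: psi(h, x, xs))[x]
        return predict
    return A

# Toy regularizer: prefer hypotheses that use few distinct labels on the
# sample inputs together with the test point.
H = [{0: 'a', 1: 'a'}, {0: 'a', 1: 'b'}]
psi = lambda h, x, xs: len({h[z] for z in xs + [x]})
predict = unsupervised_induced_learner(H, psi)([(0, 'a')])
print(predict(1))  # 'a': the constant hypothesis uses fewer labels
```

The open problem asks whether the extra argument `xs` can always be removed without losing learnability.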
§ References
Open Problem: Can Local Regularization Learn All Multiclass Problems?
Julian Asilis, Siddartha Devic, Shaddin Dughmi, Vatsal Sharan, Shang-Hua Teng (2024)
Conference on Learning Theory (COLT), PMLR 247
📍 Open-problem note in COLT proceedings.
Optimal learners for multiclass problems
Amit Daniely, Shai Shalev-Shwartz (2014)
Conference on Learning Theory (COLT), PMLR 35
📍 PMLR
A Characterization of Multiclass Learnability
Nataly Brukhim, Daniel Carmon, Irit Dinur, Shay Moran, Amir Yehudayoff (2022)
IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS 2022)
📍 Related work and partial progress context.