Regret Bounds for Noise-Free Kernel-Based Bandits
Posed by Sattar Vakili (2022)
§ Problem Statement
Setup
Let $\mathcal{X} \subset \mathbb{R}^d$ be a compact domain, let $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ be a known positive definite kernel with RKHS $\mathcal{H}_k$, and fix a known radius $B > 0$. An unknown deterministic function $f : \mathcal{X} \to \mathbb{R}$ satisfies $\|f\|_{\mathcal{H}_k} \le B$. At each round $t = 1, \dots, T$, an algorithm (possibly randomized) chooses $x_t \in \mathcal{X}$ measurably as a function of the past and then observes the exact value $y_t = f(x_t)$ (noise-free feedback). Let $x^\star \in \arg\max_{x \in \mathcal{X}} f(x)$ and define cumulative regret
$$R(T) = \sum_{t=1}^{T} \big( f(x^\star) - f(x_t) \big).$$
Write $\tilde{O}(\cdot)$ and $\tilde{\Theta}(\cdot)$ for rates that hide factors polylogarithmic in $T$.
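To make the interaction protocol concrete, here is a minimal Python sketch of one possible round-by-round strategy: GP interpolation with an upper-confidence rule whose width is $B\,\sigma_t(x)$, using the standard noise-free RKHS interpolation bound $|f(x) - \mu_t(x)| \le B\,\sigma_t(x)$. The test function, lengthscale, horizon, confidence constant, and the acquisition rule itself are illustrative assumptions, not the note's algorithm.

```python
import numpy as np

def matern52(x, y, lengthscale=0.2):
    """Matérn kernel with smoothness nu = 5/2 (closed form on 1-D inputs)."""
    r = np.abs(x[:, None] - y[None, :]) / lengthscale
    s = np.sqrt(5.0) * r
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)

def posterior(x_obs, y_obs, x_grid, jitter=1e-8):
    """Noise-free GP posterior mean and std on a grid (jitter for stability)."""
    K = matern52(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    Ks = matern52(x_grid, x_obs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = Ks @ alpha
    V = np.linalg.solve(L, Ks.T)
    var = np.clip(1.0 - np.sum(V**2, axis=0), 0.0, None)  # k(x, x) = 1 here
    return mean, np.sqrt(var)

# Interaction protocol: at each round pick a point, observe f(x_t) exactly.
rng = np.random.default_rng(0)
x_grid = np.linspace(0.0, 1.0, 500)
f = lambda x: np.sin(6 * x) + 0.5 * np.cos(13 * x)  # stand-in for unknown f
f_star = f(x_grid).max()

x_obs, y_obs, regret = [], [], 0.0
for t in range(50):
    if not x_obs:
        x_next = rng.uniform()
    else:
        mean, std = posterior(np.array(x_obs), np.array(y_obs), x_grid)
        # UCB-style rule with deterministic width B * sigma_t; B = 2 is an
        # arbitrary stand-in for the known RKHS norm bound.
        x_next = x_grid[np.argmax(mean + 2.0 * std)]
    x_obs.append(x_next)
    y_obs.append(f(x_next))       # noise-free feedback
    regret += f_star - f(x_next)  # instantaneous regret accumulates

print(f"cumulative regret after {len(x_obs)} rounds: {regret:.4f}")
```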
Unsolved Problem
Problem (Vakili, 2022). Determine the minimax optimal growth rate
$$R^\star(T) \;=\; \inf_{\text{algorithms}} \; \sup_{\|f\|_{\mathcal{H}_k} \le B} R(T)$$
as a function of $T$ in the noise-free kernel-based bandit setting, and give matching algorithmic upper bounds and information-theoretic lower bounds. In particular, when $k$ is a (stationary, isotropic) Matérn kernel on $[0,1]^d$ with smoothness parameter $\nu > 0$, determine the smallest exponent $\gamma = \gamma(\nu, d) \ge 0$ such that some algorithm achieves $R(T) = \tilde{O}(T^{\gamma})$, and prove a matching lower bound (up to polylog factors).
More specifically, prove or refute the COLT 2022 conjecture that, up to polylogarithmic factors,
$$R^\star(T) \;=\; \begin{cases} \tilde{\Theta}\big(T^{\frac{d-\nu}{d}}\big) & \text{if } \nu < d,\\ \mathrm{polylog}(T) & \text{if } \nu = d,\\ O(1) & \text{if } \nu > d. \end{cases}$$
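To see the size of the gap the note highlights, the following script compares the exponent implied by a noisy-style $\tilde{O}(\sqrt{T\gamma_T})$ analysis with the conjectured noise-free exponent for a few illustrative $(\nu, d)$ pairs; integer values of $\nu$ are chosen only so the arithmetic stays exact.

```python
from fractions import Fraction

def noisy_style_exponent(nu, d):
    """Exponent from O~(sqrt(T * gamma_T)) with gamma_T = O~(T^{d/(2nu+d)})
    for the Matérn-nu kernel (the tight noisy-setting analysis)."""
    return Fraction(nu + d, 2 * nu + d)

def conjectured_exponent(nu, d):
    """Conjectured noise-free exponent: (d - nu)/d for nu < d, else 0
    (polylog or constant regret)."""
    return max(Fraction(d - nu, d), Fraction(0))

for nu, d in [(1, 2), (1, 1), (2, 1), (3, 2)]:
    print(f"nu={nu}, d={d}: noisy-style T^{noisy_style_exponent(nu, d)}"
          f" vs conjectured T^{conjectured_exponent(nu, d)}")
```

For example, at $\nu = 1$, $d = 2$ the noisy-style analysis gives $T^{3/4}$ while the conjecture predicts $T^{1/2}$, and for $\nu \ge d$ the conjecture predicts at most polylogarithmic regret where the noisy-style exponent remains polynomial.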
§ Discussion
§ Significance & Implications
In the noisy RKHS/GP bandit model, regret rates are now characterized (up to log factors) via information-gain-type complexity terms, but Vakili (COLT 2022) highlights that transferring those analyses to exact (noise-free) observations leaves a substantial, kernel-dependent gap and may even fail to guarantee sublinear regret for some kernels. Pinning down the minimax noise-free rate for Matérn kernels would (i) identify the true exploration cost when observations are exact, (ii) clarify whether cumulative regret can be bounded by $\mathrm{polylog}(T)$ or $O(1)$ once the smoothness $\nu$ meets or exceeds the dimension $d$, and (iii) provide a target rate that future deterministic Bayesian-optimization-style algorithms and lower-bound proofs must match.
§ Known Partial Results
Vakili (2022): Vakili (COLT 2022) formulates the noise-free kernel bandit problem with $\|f\|_{\mathcal{H}_k} \le B$, compact $\mathcal{X} \subset \mathbb{R}^d$, exact observations $y_t = f(x_t)$, and cumulative regret $R(T) = \sum_{t=1}^{T} \big( f(x^\star) - f(x_t) \big)$.
Vakili (2022): The note contrasts this with the noisy setting, where regret bounds for GP-style methods are often expressed in terms of a maximal information gain quantity $\gamma_T$ (e.g., $\gamma_T = \tilde{O}\big(T^{\frac{d}{2\nu+d}}\big)$ for the Matérn-$\nu$ kernel), yielding rates of the form $\tilde{O}(\gamma_T \sqrt{T})$ or $\tilde{O}(\sqrt{T \gamma_T})$ depending on the analysis.
Vakili (2022): The note reports that such noisy-style bounds can be loose in the noise-free regime: for Matérn kernels they lead to a regret exponent of the form $\frac{\nu+d}{2\nu+d}$, which the note argues is not order-optimal when observations are exact.
Vakili (2022): The note discusses multiple noise-free upper bounds obtained via different strategies (including an explore-then-commit style approach) and emphasizes that, while these bounds can be sublinear, none are proved to match a minimax lower bound for the noise-free problem.
Vakili (2022): The note states a conjectured piecewise-optimal rate for Matérn kernels: $\tilde{\Theta}\big(T^{\frac{d-\nu}{d}}\big)$ for $\nu < d$, $\mathrm{polylog}(T)$ for $\nu = d$, and $O(1)$ for $\nu > d$, motivated heuristically by controlling sums of GP posterior standard deviations under near-uniform sampling (see the numerical sketch below).
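A quick numerical illustration of that heuristic, under the assumption that the posterior standard deviation after $t$ near-uniform Matérn-$\nu$ samples scales like $t^{-\nu/d}$: the partial sums grow polynomially for $\nu < d$, logarithmically for $\nu = d$, and stay bounded for $\nu > d$, matching the three conjectured regimes.

```python
import numpy as np

def sigma_sum(T, nu, d):
    """Heuristic proxy for cumulative regret: under near-uniform sampling the
    Matérn-nu posterior std after t points scales like t^{-nu/d}, so sum the
    first T terms of that sequence."""
    t = np.arange(1, T + 1)
    return np.sum(t ** (-nu / d))

for nu, d, regime in [(0.5, 1, "nu < d"), (1.0, 1, "nu = d"), (2.5, 1, "nu > d")]:
    for T in (10**3, 10**4, 10**5):
        print(f"{regime}: T={T:>6}, sum ~ {sigma_sum(T, nu, d):10.2f}")
```

Running this shows the $\nu < d$ sums growing like $T^{1-\nu/d}$, the $\nu = d$ sums growing like $\log T$, and the $\nu > d$ sums converging to a constant.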
§ References
Open Problem: Regret Bounds for Noise-Free Kernel-Based Bandits
Sattar Vakili (2022)
Conference on Learning Theory (COLT), PMLR 178
📍 Open-problem note in COLT proceedings.
Convergence rates of efficient global optimization algorithms
Adam D. Bull (2011)
Journal of Machine Learning Research
📍 Establishes (near-)optimal simple-regret rates for expected improvement (EI) under Matérn-type smoothness.