§ Problem Statement
Setup
Let be a stochastic -armed bandit instance. Pulling arm yields an independent sample from an unknown distribution on with finite mean . Assume a unique best arm .
In fixed-budget best-arm identification, an algorithm (possibly adaptive) allocates exactly pulls and then outputs a recommendation . The misidentification probability is
Define the (instance-dependent) asymptotic error exponent achieved by as
and the optimal exponent as
Unsolved Problem
Characterize in the fixed-budget setting (i.e., give an explicit instance-dependent formula together with matching achievability and converse bounds) for general bandit instances with a unique best arm, and design an algorithm that is asymptotically optimal in the sense (or matches under the strongest standard notion of asymptotic equivalence).
§ Discussion
§ Significance & Implications
A sharp characterization of would determine the exact large-deviations rate at which the best-arm misidentification probability can decay as the budget grows, for each instance . This would (i) turn fixed-budget best-arm identification into an instance-by-instance optimization target (what sampling proportions and decision rules are information-theoretically optimal), and (ii) yield definitive instance-dependent lower bounds for any algorithm. It would place the fixed-budget regime on comparable asymptotic footing to the fixed-confidence regime, where the optimal instance-dependent complexity is already characterized.
§ Known Partial Results
Qin (2022): ).
Qin (2022): In the fixed-confidence setting with a unique best arm, the optimal instance-dependent asymptotic complexity is characterized via information-theoretic (Chernoff-type) arguments and is explicitly formulated in modern treatments such as Garivier and Kaufmann (2016).
Qin (2022): In contrast, the COLT 2022 open-problem note (Qin, 2022) emphasizes that an explicit, general, instance-dependent characterization of the fixed-budget optimal error exponent is not yet established, and it discusses related conjectures and questions about what the correct exponent and asymptotically optimal algorithms should be in this dual setting.
§ References
Open Problem: Optimal Best Arm Identification with Fixed-Budget
Chao Qin (2022)
Conference on Learning Theory (COLT), PMLR 178
📍 Open-problem note in COLT proceedings.
Open Problem: Optimal Best Arm Identification with Fixed-Budget (PDF)
Chao Qin (2022)
Conference on Learning Theory (COLT), PMLR 178
📍 Proceedings PDF.