The Oracle Complexity of Convex Optimization with Limited Memory
Posed by Blake Woodworth et al. (2019)
§ Problem Statement
Setup
Fix dimension and parameters . Let be the class of convex functions on the Euclidean ball that are -Lipschitz in , i.e., for all (equivalently, every subgradient satisfies wherever it exists). A first-order oracle, queried at , returns a pair where is an arbitrary valid subgradient.
A deterministic first-order algorithm with memory budget bits and oracle queries is an interactive procedure that, between queries, retains only a state (initialized to ). At round , it chooses , receives with , updates , and after rounds outputs . The algorithm may perform arbitrary computation during a round, but only bits persist across rounds.
Define the minimax memory-bounded oracle complexity as the smallest for which there exists such an -bit deterministic algorithm guaranteeing
(If no such exists, set .)
Unsolved Problem
Problem 2019. Characterize , i.e., the achievable query-memory tradeoffs , up to constant factors (and at most polylogarithmic factors). In particular, determine whether one can achieve near-optimal first-order query complexity with subquadratic memory for some fixed , or whether achieving such inherently requires (up to logarithmic factors) bits of memory.
§ Discussion
§ Significance & Implications
For nonsmooth convex optimization with an -Lipschitz objective over a radius- Euclidean ball, the optimal first-order oracle query complexity (without memory constraints) scales as in the worst case, and is achieved by cutting-plane/center-of-mass type methods. However, known near-optimal-query methods maintain rich geometric information (e.g., an evolving polytope/ellipsoid defined by many past cuts), suggesting a quadratic-in- memory footprint. This problem asks whether that quadratic memory is an artifact of current implementations or an information-theoretic necessity, thereby isolating the fundamental tradeoff between two core resources for first-order optimization: number of oracle queries versus persistent memory.
§ Known Partial Results
Woodworth et al. (2019): Near-optimal query rates are achievable without memory limits: classical cutting-plane methods (e.g., center-of-mass) attain first-order queries for Lipschitz convex optimization over , and this dependence is minimax-optimal in the regime considered by the note.
Woodworth et al. (2019): Low-memory, simple iterative methods can be made memory-efficient but require many more queries: a discretized, memory-bounded gradient-descent-type scheme achieves -suboptimality using oracle queries with bits of persistent memory (note Appendix A, Theorem 1).
Woodworth et al. (2019): Any method that succeeds uniformly over all must store nontrivial information: for , at least bits of memory are necessary even to output an -suboptimal point in the worst case (note Appendix C, Theorem 5).
Woodworth et al. (2019): The note gives an explicit finite-precision/memory accounting showing that a center-of-mass-style method can retain the query bound while using bits of memory (note Appendix B).
Woodworth et al. (2019): The intermediate tradeoff region is open: it is not known whether one can simultaneously approach the -query regime and use near-linear (in ) memory, or whether improving query complexity beyond the regime forces (up to logarithmic factors) quadratic memory in .
§ References
Open Problem: The Oracle Complexity of Convex Optimization with Limited Memory
Blake Woodworth, Nathan Srebro (2019)
Conference on Learning Theory (COLT), PMLR 99
📍 Open-problem note in COLT proceedings.
Open Problem: The Oracle Complexity of Convex Optimization with Limited Memory (PDF)
Blake Woodworth, Nathan Srebro (2019)
Conference on Learning Theory (COLT), PMLR 99
📍 Proceedings PDF.