Fast and Optimal Online Portfolio Selection

§ Problem Statement

Setup

Online portfolio selection with $d\ge 2$ assets proceeds for rounds $t=1,\dots,T$ as follows. On round $t$ , the learner chooses a portfolio vector $w_t\in\Delta_d$ , where $\Delta_d:=\{w\in\mathbb{R}^d: w_i\ge 0,\ \sum_{i=1}^d w_i=1\}$ . Then the market reveals a price-relative vector $x_t\in\mathbb{R}^d_{>0}$ and the learner's wealth is multiplied by $\langle w_t,x_t\rangle$ . Equivalently, define the per-round log-loss

\ell_t(w):=-\log\langle w,x_t\rangle.

The (static) regret after $T$ rounds is

R_T:=\sum_{t=1}^T \ell_t(w_t)-\min_{w\in\Delta_d}\sum_{t=1}^T \ell_t(w).

Consider the follow-the-regularized-leader (FTRL) method with the log-barrier regularizer $\psi(w):=-\sum_{i=1}^d\log w_i$ , optimizing over the simplex interior $\{w\in\Delta_d: w_i>0\}$ :

w_{t+1}\in\arg\min_{w\in\Delta_d,\ w_i>0}\Big(\sum_{s=1}^t \ell_s(w)+\frac{1}{\eta_t}\,\psi(w)\Big),

where $\eta_t>0$ may be a constant or a schedule chosen by the algorithm.

Unsolved Problem

Prove or refute that there exists a computationally feasible instantiation of this FTRL-with-log-barrier approach (i.e., a concrete choice of $\eta_t$ and tie-breaking if needed) whose regret $R_T$ matches the minimax-optimal worst-case regret rate for online portfolio selection (up to constant factors and/or lower-order terms as a function of $d$ and $T$ ), without imposing any a priori bounded-gradient (or equivalent) assumption beyond the protocol condition $x_t\in\mathbb{R}^d_{>0}$ .

§ Discussion

Loading discussion…

§ Significance & Implications

The COLT 2020 note identifies a three-way gap among existing approaches: some have optimal regret guarantees but are computationally infeasible, others are efficient but lack optimal regret guarantees, and still others rely on bounded-gradient-type conditions that are not guaranteed by the protocol. Establishing (or disproving) minimax-optimal regret for a natural, computationally feasible FTRL algorithm based on the log barrier would directly close this gap for the standard online portfolio model and would require analysis techniques for FTRL that do not hinge on bounded gradients, with potential spillovers to other self-concordant-regularized online convex optimization problems.

§ Known Partial Results

Erven et al. (2020): The COLT 2020 open-problem note (Van Erven, Van der Hoeven, Kotlowski, Koolen; PMLR 125) states that state-of-the-art online portfolio methods each fail on at least one axis: computational feasibility, optimal regret guarantees, or avoidance of bounded-gradient assumptions.
Erven et al. (2020): The note proposes the log-barrier-regularized FTRL update as a natural approach and emphasizes that it is computationally feasible.
Erven et al. (2020): The central unresolved question posed by the note is whether this log-barrier FTRL approach attains the optimal (minimax) regret for online portfolio selection under the basic protocol (with $x_t\in\mathbb{R}^d_{>0}$ ) and without bounded-gradient assumptions.
Erven et al. (2020): The note suggests that resolving the question will likely require new proof techniques for analyzing FTRL.
Erven et al. (2020): The note points to technical connections to self-concordance and mentions prior use of self-concordant ideas in bandit convex optimization.

§ References

[1]

Open Problem: Fast and Optimal Online Portfolio Selection

Tim Van Erven, Dirk Van der Hoeven, Wojciech Kotłowski, Wouter M. Koolen (2020)

Conference on Learning Theory (COLT), PMLR 125

📍 Open-problem note in COLT proceedings.

Link ↗

[2]

Open Problem: Fast and Optimal Online Portfolio Selection (PDF)

Tim Van Erven, Dirk Van der Hoeven, Wojciech Kotłowski, Wouter M. Koolen (2020)

Conference on Learning Theory (COLT), PMLR 125

📍 Proceedings PDF.

Link ↗

§ Tags

colt-open-problem learning-theory online-learning portfolio-selection