Unsolved

Is There a First-Order Method that Only Converges to Local Minimax Optima?

Posed by Jiseok Chae et al. (2023)

§ Problem Statement

Setup

Let $f: \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}$ be twice continuously differentiable (not necessarily convex-concave), and consider the minimax problem

$$\min_{x \in \mathbb{R}^d}\; \max_{y \in \mathbb{R}^m} f(x,y).$$

Following the optimization order (min in $x$, then max in $y$), call $(x^*,y^*)$ a (possibly non-strict) local minimax point if there exists $r>0$ such that:

  1. $y^*$ is a local maximizer of $y \mapsto f(x^*,y)$ (i.e., $\exists\,\delta>0$ such that $f(x^*,y^*) \ge f(x^*,y)$ for all $\|y-y^*\| \le \delta$); and

  2. $x^*$ is a local minimizer of the localized upper envelope $\phi_r(x) := \max_{\|y-y^*\| \le r} f(x,y)$ (i.e., $\exists\,\epsilon>0$ such that $\phi_r(x^*) \le \phi_r(x)$ for all $\|x-x^*\| \le \epsilon$). (A strict local minimax point strengthens these inequalities to be strict for all nearby points other than $(x^*,y^*)$.)
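As an illustration (not part of the original note), the two conditions above can be checked numerically on a grid for a toy objective. Here we take $f(x,y) = x^2 - y^2$, for which $(0,0)$ is a local minimax point: $f(0,y) = -y^2$ is maximized at $y=0$, and the envelope $\phi_r(x) = x^2$ is minimized at $x=0$. The radii and grid resolution are arbitrary choices for this sketch.

```python
# Numerically check the two local-minimax conditions for a toy objective.
def f(x, y):
    return x**2 - y**2  # (0, 0) is a (strict) local minimax point

r = 0.5  # radius for the localized upper envelope phi_r
ys = [i * r / 100 for i in range(-100, 101)]    # grid over ||y - y*|| <= r
xs = [i * 0.5 / 100 for i in range(-100, 101)]  # grid over a neighborhood of x*

def phi(x):
    # localized upper envelope: max over y within radius r of y* = 0
    return max(f(x, y) for y in ys)

# Condition 1: y* = 0 locally maximizes y -> f(0, y)
cond1 = all(f(0.0, y) <= f(0.0, 0.0) + 1e-12 for y in ys)
# Condition 2: x* = 0 locally minimizes phi_r
cond2 = all(phi(0.0) <= phi(x) + 1e-12 for x in xs)

print(cond1, cond2)  # True True
```

A grid check like this is of course only a finite-resolution surrogate for the quantified definition; it is meant to make the order of the min and max in the two conditions concrete.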

Unsolved Problem

Does there exist a first-order method $\mathcal{A}$ for general $C^2$ objectives (an algorithm that, at each iteration, may use past iterates, its internal randomness, and the gradient queries $\nabla_x f(x_t,y_t)$ and $\nabla_y f(x_t,y_t)$, but no higher-order information) such that, for every $C^2$ function $f$ and every initialization distribution, with probability 1 every accumulation point (subsequence limit) of the iterates $(x_t,y_t)$ generated by $\mathcal{A}$ is a local minimax point (including non-strict ones)? Equivalently, can one design a first-order method whose limiting behavior provably avoids convergence to local maximin points and other stationary points that are not local minimax points under this order-aware notion?
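To see why the question is nontrivial, consider the simplest first-order method in this oracle model: simultaneous gradient descent-ascent. The following sketch (our illustration, not from the note; step size and initialization are arbitrary) runs it on the bilinear toy $f(x,y) = xy$, whose unique minimax point is $(0,0)$, and observes that the iterates spiral outward rather than converge.

```python
# Simultaneous gradient descent-ascent on f(x, y) = x * y.
# Even on this simple saddle the naive first-order dynamics diverge,
# illustrating why designing a method with the desired limit behavior is hard.
import math

eta = 0.1        # common step size for both players (assumed for this sketch)
x, y = 1.0, 0.0  # arbitrary initialization

start_norm = math.hypot(x, y)
for _ in range(500):
    gx, gy = y, x                        # grad_x f = y, grad_y f = x
    x, y = x - eta * gx, y + eta * gy    # simultaneous descent/ascent step
final_norm = math.hypot(x, y)

print(final_norm > start_norm)  # True: each step scales the norm by sqrt(1 + eta^2)
```

The divergence here is exact, not numerical: the update is a linear map that multiplies the distance to the origin by $\sqrt{1+\eta^2} > 1$ at every step.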


§ Significance & Implications

A positive answer would provide an order-aware analogue of the informal guarantee sought in nonconvex minimization (that gradient-based dynamics converge only to local optima) for nonconvex-nonconcave minimax optimization. The COLT 2023 note specifically motivates this by GAN training: if mode collapse and other instabilities are driven by convergence to solutions aligned with maximin rather than the intended minimax objective, then a method that provably rules out non-minimax attractors would clarify what first-order information can guarantee and could guide the design of more stable minimax training algorithms. It would also sharpen which definition(s) of local minimax optimality are operationally meaningful for analyzing first-order dynamics in nonconvex-nonconcave games.

§ Known Partial Results

  • Chae et al. (2023): The COLT 2023 open-problem note asserts that existing gradient-based methods for nonconvex-nonconcave minimax problems do not, either in theory or in practice, restrict all of their limit behavior to (local) minimax optima.

  • Chae et al. (2023): As recounted in the note, Jin et al. proposed the order-aware notion of local minimax optimality for nonconvex-nonconcave minimax problems and gave a partial positive result: two-timescale gradient descent-ascent converges to strict local minimax optima.

  • Chae et al. (2023): The same discussion highlights that convergence to general (non-strict) local minimax points has been left largely unexplored, despite the claim that non-strict local minimax points are prevalent.

  • Chae et al. (2023): The COLT 2023 note reports recent findings that a two-timescale extragradient method can reach some non-strict local minimax points, providing partial evidence toward the broader goal.

  • Chae et al. (2023): The note emphasizes that further progress likely depends on refining and agreeing on appropriate definition(s) of local minimax optimality in the nonconvex-nonconcave setting.
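The two-timescale mechanism behind the Jin et al. partial result can be sketched on a toy problem (our illustration under assumed step sizes, not the note's or the paper's experiment). For $f(x,y) = x^2 - y^2 + 4xy$, the point $(0,0)$ satisfies the second-order conditions for a strict local minimax point ($\nabla^2_{yy} f = -2 < 0$ and the Schur complement $2 - 4 \cdot (-2)^{-1} \cdot 4 = 10 > 0$), and running the ascent step on a faster timescale than the descent step drives the iterates there.

```python
# Two-timescale gradient descent-ascent on f(x, y) = x**2 - y**2 + 4*x*y,
# whose strict local minimax point is (0, 0). Step sizes are assumed for
# this sketch; the ascent step in y is 10x faster than the descent step in x.
def grad(x, y):
    return 2*x + 4*y, 4*x - 2*y   # (grad_x f, grad_y f)

eta_x, eta_y = 0.01, 0.1  # slow descent in x, fast ascent in y
x, y = 1.0, 1.0           # arbitrary initialization
for _ in range(2000):
    gx, gy = grad(x, y)
    x, y = x - eta_x * gx, y + eta_y * gy

print(abs(x) < 1e-6 and abs(y) < 1e-6)  # True: iterates converge to (0, 0)
```

The open problem asks for this kind of guarantee unconditionally: for every $C^2$ objective, including the non-strict local minimax points that this strict-case mechanism does not cover.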

§ References

[1]

Open Problem: Is There a First-Order Method that Only Converges to Local Minimax Optima?

Jiseok Chae, Kyuwon Kim, Donghwan Kim (2023)

Conference on Learning Theory (COLT), PMLR 195

📍 Open-problem note in COLT proceedings.

