Unsolved

Running time complexity of accelerated $\ell_1$-regularized PageRank

Posed by Kimon Fountoulakis et al. (2022)

§ Problem Statement

Setup

Let $G=(V,E)$ be a (possibly directed) graph with $n=|V|$ and let $P\in\mathbb{R}^{n\times n}$ be a row-stochastic random-walk matrix (so $P\mathbf{1}=\mathbf{1}$ and $P_{ij}\ge 0$). Fix a teleportation parameter $\alpha\in(0,1]$ and a preference distribution $s\in\mathbb{R}^n$ with $s\ge 0$ and $\mathbf{1}^\top s=1$. The Personalized PageRank (PPR) vector $p\in\mathbb{R}^n$ is the unique solution of

$$p=\alpha s+(1-\alpha)Pp,\qquad\text{equivalently }\ (I-(1-\alpha)P)\,p=\alpha s.$$
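As a quick sanity check, the fixed-point equation can be solved by simple fixed-point iteration. The sketch below uses a small hypothetical 4-node example (not from the note) and follows the convention $p=\alpha s+(1-\alpha)Pp$ stated above:

```python
import numpy as np

# Hypothetical 4-node example: P is row-stochastic (each row sums to 1).
P = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.5, 0.0, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
alpha = 0.15
s = np.array([1.0, 0.0, 0.0, 0.0])   # all teleportation mass on node 0

# Fixed-point iteration p <- alpha*s + (1-alpha)*P@p; the map contracts
# in the infinity-norm at rate 1-alpha, so it converges geometrically.
p = np.zeros(4)
for _ in range(200):
    p = alpha * s + (1.0 - alpha) * P @ p
```

Equivalently, one can solve the linear system directly, e.g. `np.linalg.solve(np.eye(4) - (1 - alpha) * P, alpha * s)`.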

For a sparsity/regularization parameter $\rho>0$, define the $\ell_1$-regularized PPR objective

$$F(x):=\tfrac12\left\|(I-(1-\alpha)P)x-\alpha s\right\|_2^2+\rho\|x\|_1,$$

and let $x_\rho\in\arg\min_{x\in\mathbb{R}^n} F(x)$. Consider computing an approximate minimizer $\hat x$ by first-order composite optimization methods, e.g. guaranteeing a prescribed accuracy such as $F(\hat x)-F(x_\rho)\le \varepsilon$ for a chosen $\varepsilon>0$.
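For concreteness, here is a minimal proximal gradient (ISTA) sketch for this composite objective, assuming a small dense instance: the gradient step acts on the smooth part $\tfrac12\|Mx-\alpha s\|_2^2$ with $M=I-(1-\alpha)P$, and the prox of $\rho\|\cdot\|_1$ is soft-thresholding. Function names are illustrative, not from the note:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_ppr_prox_grad(P, s, alpha, rho, iters=1000):
    """Proximal gradient (ISTA) on F(x) = 0.5*||Mx - alpha*s||^2 + rho*||x||_1,
    where M = I - (1 - alpha) * P.  Dense toy implementation."""
    n = P.shape[0]
    M = np.eye(n) - (1.0 - alpha) * P
    L = np.linalg.norm(M, 2) ** 2         # Lipschitz constant of the smooth gradient
    x = np.zeros(n)
    for _ in range(iters):
        grad = M.T @ (M @ x - alpha * s)  # gradient of 0.5*||Mx - alpha*s||^2
        x = soft_threshold(x - grad / L, rho / L)
    return x
```

Each iteration is dominated by matrix-vector products; the $n$-independent running-time bounds discussed below come from sparse implementations that only touch the (small) support of $x$, not from a dense sketch like this one.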

The COLT 2022 open-problem note reports that the best currently analyzed method for this task is (non-accelerated) proximal gradient, with total running time $\tilde{\mathcal{O}}((\alpha\rho)^{-1})$ (up to logarithmic factors) and, importantly, a bound that does not depend on $n$.

Unsolved Problem

Determine the worst-case total running-time complexity of an accelerated proximal gradient method (e.g. a Nesterov/FISTA-type scheme) applied to $F$ under a comparable accuracy criterion. In particular:

  1. Can one prove an $n$-independent worst-case bound of the form
$$\tilde{\mathcal{O}}\left((\sqrt{\alpha}\,\rho)^{-1}\right),$$
thereby achieving a $1/\sqrt{\alpha}$ improvement over $\tilde{\mathcal{O}}((\alpha\rho)^{-1})$?

  2. Failing that, can one at least prove that acceleration cannot be asymptotically worse in the worst case than $\tilde{\mathcal{O}}((\alpha\rho)^{-1})$, or else exhibit a family of instances for which accelerated proximal gradient is provably worse (in worst-case total running time) than the non-accelerated proximal gradient method?
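The accelerated variant in question is a FISTA-type momentum scheme; a toy sketch, under the same small dense-instance assumptions as above (this illustrates the iteration whose worst-case complexity is open, it is not the note's algorithm):

```python
import numpy as np

def l1_ppr_fista(P, s, alpha, rho, iters=1000):
    """FISTA-type accelerated proximal gradient on
    F(x) = 0.5*||Mx - alpha*s||^2 + rho*||x||_1, with M = I - (1 - alpha) * P."""
    n = P.shape[0]
    M = np.eye(n) - (1.0 - alpha) * P
    L = np.linalg.norm(M, 2) ** 2                   # step size 1/L
    x = np.zeros(n)
    y = x.copy()
    t = 1.0
    for _ in range(iters):
        grad = M.T @ (M @ y - alpha * s)
        z = y - grad / L
        x_new = np.sign(z) * np.maximum(np.abs(z) - rho / L, 0.0)  # soft-threshold
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)              # momentum extrapolation
        x, t = x_new, t_new
    return x
```

Each iteration is implementable at essentially the same per-step cost as non-accelerated proximal gradient; the open question is whether the momentum step yields a provable worst-case total-running-time improvement while preserving $n$-independence.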


§ Significance & Implications

The appeal of $\ell_1$-regularized PPR is that it targets sparse/thresholded solutions and admits algorithms whose proven total running time can be independent of the ambient graph size $n$, which is critical in large-scale network settings. The currently best analyzed complexity $\tilde{\mathcal{O}}((\alpha\rho)^{-1})$ scales poorly as $\alpha\to 0$, a regime emphasized as practically relevant in the COLT 2022 note. Establishing (or refuting) an accelerated $n$-independent guarantee around $\tilde{\mathcal{O}}((\sqrt{\alpha}\,\rho)^{-1})$ would clarify whether one can obtain a principled $1/\sqrt{\alpha}$ speedup for this widely used regularized PPR primitive without sacrificing the key property of size-independent worst-case bounds.

§ Known Partial Results

  • Fountoulakis et al. (2022): Prior work (as summarized in the COLT 2022 open-problem note) motivates $\ell_1$-regularization for PPR as a mechanism that automatically thresholds small PPR probabilities, and relates this to early termination, yielding approximate (sparse) PPR computations.

  • Fountoulakis et al. (2022): The COLT 2022 note states that the fastest currently analyzed method for computing $\ell_1$-regularized PPR is a proximal gradient method with total running time $\tilde{\mathcal{O}}((\alpha\rho)^{-1})$.

  • Fountoulakis et al. (2022): A key feature emphasized in the note is that this $\tilde{\mathcal{O}}((\alpha\rho)^{-1})$ guarantee does not depend on $n$ (the dimension/graph size), which is treated as a prerequisite for modern large-scale networks.

  • Fountoulakis et al. (2022): A natural idea is to apply acceleration (e.g. FISTA-type) to proximal gradient, with the goal of improving the dependence on $\alpha$ to $\tilde{\mathcal{O}}((\sqrt{\alpha}\,\rho)^{-1})$.

  • Fountoulakis et al. (2022): The note cautions that the existing proximal-gradient analysis in the cited prior work does not directly carry over to the accelerated variant.

  • Fountoulakis et al. (2022): The note reports empirical evidence that accelerated proximal gradient can reduce total running time in practice, but leaves open whether acceleration has a provable worst-case improvement, or could even be worse in the worst case.

§ References

[1]

Open Problem: Running time complexity of accelerated $\ell_1$-regularized PageRank

Kimon Fountoulakis, Shenghao Yang (2022)

Conference on Learning Theory (COLT), PMLR 178

📍 Open-problem note in COLT proceedings.

