Unsolved

Optimal Rates for Stochastic Decision-Theoretic Online Learning Under Differential Privacy

Posed by Bingshan Hu et al. (2024)

§ Problem Statement

Setup

Stochastic decision-theoretic online learning (full information) with privacy. Fix an integer $K\ge 2$ and horizon $T\ge 1$. On each round $t=1,\dots,T$, the learner outputs an action $I_t\in[K]:=\{1,\dots,K\}$ (possibly randomized). Then a loss vector $\ell_t=(\ell_{1,t},\dots,\ell_{K,t})\in[0,1]^K$ is realized and fully revealed to the learner. For each action $j\in[K]$, the losses $(\ell_{j,t})_{t=1}^T$ are i.i.d. draws from an unknown distribution $P_j$ on $[0,1]$, and the draws are independent across actions and rounds. Let $\mu_j:=\mathbb E[\ell_{j,1}]$, assume the optimal action $j^*:=\arg\min_{j\in[K]}\mu_j$ is unique, and define the gaps $\Delta_j:=\mu_j-\mu_{j^*}$ and $\Delta_{\min}:=\min_{j\ne j^*}\Delta_j>0$. The (expected) pseudo-regret is

$$\mathrm{Reg}_T:=\mathbb E\Big[\sum_{t=1}^T \ell_{I_t,t}\Big]-\min_{j\in[K]}\mathbb E\Big[\sum_{t=1}^T \ell_{j,t}\Big]=\mathbb E\Big[\sum_{t=1}^T (\mu_{I_t}-\mu_{j^*})\Big]=\sum_{j\ne j^*}\Delta_j\,\mathbb E[N_j(T)],$$

where $N_j(T):=|\{t\le T: I_t=j\}|$ and the expectation is over both the losses and the learner's randomness. An online algorithm $M$ is $\varepsilon$-differentially private in the prefix (online) sense if for every $t\le T$, every pair of loss prefixes $(\ell_1,\dots,\ell_t)$ and $(\ell_1',\dots,\ell_t')$ differing in at most one round, and every measurable set $D\subseteq[K]^t$ of action prefixes,

$$\Pr\big((I_1,\dots,I_t)\in D\mid \ell_1,\dots,\ell_t\big)\le e^{\varepsilon}\,\Pr\big((I_1,\dots,I_t)\in D\mid \ell_1',\dots,\ell_t'\big).$$

In the non-private case ($\varepsilon=\infty$), the optimal gap-dependent rate in this stochastic full-information model is $\mathrm{Reg}_T=\Theta\big((\log K)/\Delta_{\min}\big)$, up to constants.
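To make the model concrete, here is a minimal non-private simulation: Bernoulli losses with means $\mu_j$, a Follow-the-Leader learner (a natural baseline in the stochastic full-information setting), and the pseudo-regret computed via the gap decomposition $\sum_{j\ne j^*}\Delta_j\,\mathbb E[N_j(T)]$. The instance (values of $K$, $T$, and the means) is an illustrative choice, not part of the problem statement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy instance: K Bernoulli loss distributions with a unique best action.
K, T = 5, 20_000
mu = np.array([0.30, 0.40, 0.45, 0.50, 0.55])  # mu[0] is optimal; Delta_min = 0.10
losses = rng.binomial(1, mu, size=(T, K)).astype(float)  # full loss vectors, revealed each round

# Non-private Follow-the-Leader: play the action with smallest cumulative loss so far.
cum = np.zeros(K)
pulls = np.zeros(K, dtype=int)
for t in range(T):
    i = int(np.argmin(cum))  # current leader (ties broken by lowest index)
    pulls[i] += 1
    cum += losses[t]         # full information: every action's loss is observed

# Pseudo-regret via the gap decomposition: sum_j Delta_j * N_j(T).
delta = mu - mu.min()
pseudo_regret = float(delta @ pulls)
print(pseudo_regret)
```

With $\Delta_{\min}>0$ the leader stabilizes on $j^*$ quickly, so the realized pseudo-regret stays bounded as $T$ grows, matching the non-private $\Theta((\log K)/\Delta_{\min})$ rate quoted above.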

Unsolved Problem

Characterize, as a function of $(K,T,\Delta_{\min},\varepsilon)$, the optimal achievable gap-dependent pseudo-regret among all $\varepsilon$-DP algorithms: give upper and lower bounds, matching up to universal constants and at most logarithmic factors, on

$$\inf_{M\ \varepsilon\text{-DP}}\ \sup_{\text{instances with given }(K,T,\Delta_{\min})}\ \mathrm{Reg}_T(M).$$


§ Significance & Implications

This problem asks for the sharp privacy-utility tradeoff in one of the cleanest stochastic online learning settings in which full loss vectors are observed. Without privacy, the gap-dependent rate essentially does not grow with $T$ once $\Delta_{\min}>0$ (it grows only logarithmically in $K$), so any unavoidable privacy penalty can be isolated and quantified. A precise characterization would (i) identify the correct dependence on $\varepsilon$ and how it interacts with $\Delta_{\min}$ (e.g., whether there are distinct regimes $\Delta_{\min}\ll\varepsilon$ vs. $\Delta_{\min}\gg\varepsilon$), and (ii) determine whether extra factors sometimes seen in private analyses (such as $\log T$ terms from repeated private selection/leader updates) are information-theoretically necessary or merely artifacts of existing algorithmic templates. The answer would directly inform the design of instance-optimal private learners in full-information environments and provide a baseline for more complex private online decision problems.

§ Known Partial Results

  • Hu et al. (2024): Non-private baseline: in the stochastic full-information model described above, the optimal gap-dependent pseudo-regret rate is $O\big((\log K)/\Delta_{\min}\big)$, and this dependence on $(K,\Delta_{\min})$ is known to be unimprovable up to constants.

  • Hu et al. (2024): Privacy notion: the open problem uses an online (prefix) $\varepsilon$-DP guarantee that must hold for the distribution of the entire action prefix $(I_1,\dots,I_t)$ at every time $t$, under a single-round change in the observed loss sequence.

  • Hu et al. (2024): Existing private bounds (as summarized in the COLT 2024 open-problem entry): known $\varepsilon$-DP algorithms achieve worst-case (gap-independent) pseudo-regret bounds that add an explicit privacy-dependent term scaling on the order of $(\log K\,\log T)/\varepsilon$ (up to constants and other problem-dependent factors).

  • Hu et al. (2024): Existing private gap-dependent analyses (as summarized in the COLT 2024 open-problem entry): refined gap-dependent upper bounds are known for private leader/selection-style methods, but in the worst case these can still include privacy penalties comparable to $(\log K\,\log T)/\varepsilon$ even when $\Delta_{\min}>0$.

  • Hu et al. (2024): Existing lower bound (as summarized in the COLT 2024 open-problem entry): there is a gap-dependent lower bound of the form $\Omega\big((\log K)/\min\{\Delta_{\min},\varepsilon\}\big)$, suggesting that privacy can force regret to scale at least like $(\log K)/\varepsilon$ when $\varepsilon\ll\Delta_{\min}$.

  • Hu et al. (2024): Main gap: it remains open to determine the minimax-optimal (up to logarithmic factors) dependence of $\mathrm{Reg}_T$ on $(K,\Delta_{\min},\varepsilon)$ under prefix $\varepsilon$-DP, in particular whether $\log T$ dependence in the privacy term is inherent in this stochastic full-information setting.
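The "private leader/selection-style methods" referenced above typically rest on a noisy selection primitive. As a purely illustrative sketch (a standard report-noisy-min step, not any algorithm from the open-problem entry): since losses lie in $[0,1]$, changing one round's loss vector moves each cumulative loss by at most 1, so adding per-coordinate Laplace noise of scale $2/\varepsilon$ before taking the argmin makes a single selection $\varepsilon$-DP.

```python
import numpy as np

def noisy_min(cum_losses, eps, rng):
    """One eps-DP selection (report-noisy-min with Laplace noise).

    Per-coordinate sensitivity is 1 (losses lie in [0,1]), so Laplace
    noise of scale 2/eps makes this single selection eps-DP.  This is a
    standard building block only, not the entry's algorithm.
    """
    noise = rng.laplace(loc=0.0, scale=2.0 / eps, size=len(cum_losses))
    return int(np.argmin(cum_losses + noise))

# Toy usage: larger eps (weaker privacy) identifies the true leader more often.
rng = np.random.default_rng(1)
cum = np.array([10.0, 14.0, 15.0])  # action 0 is the current leader
hits_weak = sum(noisy_min(cum, eps=5.0, rng=rng) == 0 for _ in range(2000))
hits_strong = sum(noisy_min(cum, eps=0.1, rng=rng) == 0 for _ in range(2000))
print(hits_weak, hits_strong)
```

Re-running such a step every round composes naively to a total privacy cost of $T\varepsilon$, and the usual fixes (tree aggregation, sparse-vector-style leader updates) are exactly where the $\log T$ factors in the bounds above tend to enter — which is why the open problem asks whether that dependence is inherent.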

§ References

[1]

Open Problem: Optimal Rates for Stochastic Decision-Theoretic Online Learning Under Differential Privacy

Bingshan Hu, Nishant A. Mehta (2024)

Conference on Learning Theory (COLT), PMLR 247

📍 Open-problem note in COLT proceedings.

