Unsolved

Is Margin Sufficient for Non-Interactive Private Distributed Learning?

Posed by Amit Daniely and Vitaly Feldman (2019)

§ Problem Statement

Setup

Let $X$ be a domain and let $C$ be a class of Boolean functions $f:X\to\{-1,1\}$. For $d\ge 1$, let $B_d(1)=\{u\in\mathbb{R}^d:\|u\|_2\le 1\}$ denote the Euclidean unit ball. The margin complexity of $C$, denoted $MC(C)$, is the minimum $M\ge 1$ such that there exist a dimension $d$ and an embedding $\Psi:X\to B_d(1)$ with the property that for every $f\in C$ there exists a vector $w\in B_d(1)$ satisfying

$$\min_{x\in X} f(x)\,\langle w,\Psi(x)\rangle \ge \frac{1}{M}.$$
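As a concrete illustration (my example, not part of the original note): the class of dictator functions $f_j(x)=x_j$ over $X=\{-1,1\}^d$ has margin complexity at most $\sqrt{d}$, witnessed by the embedding $\Psi(x)=x/\sqrt{d}$ and the unit vectors $w=e_j$. A quick numerical check:

```python
import itertools
import math

d = 4
X = list(itertools.product([-1, 1], repeat=d))  # domain {-1,1}^d

def psi(x):
    # Embedding into the unit ball B_d(1): Psi(x) = x / sqrt(d)
    return [v / math.sqrt(d) for v in x]

margins = []
for j in range(d):
    # Dictator f_j(x) = x_j with unit-norm witness w = e_j,
    # so f_j(x) * <w, Psi(x)> = x_j * x_j / sqrt(d) = 1 / sqrt(d)
    m = min(x[j] * psi(x)[j] for x in X)
    margins.append(m)

# Every dictator achieves margin exactly 1/sqrt(d), hence MC(C) <= sqrt(d).
print(margins)
```

Here every witness achieves margin exactly $1/\sqrt{d}$, so $M=\sqrt{d}$ suffices for this class.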

(Non-interactive local differential privacy.) Let $Z=X\times\{-1,1\}$. An $\varepsilon$-local randomizer is a randomized map $R:Z\to W$ such that for all $z,z'\in Z$ and all measurable events $S\subseteq W$,

$$\Pr[R(z)\in S] \le e^{\varepsilon}\Pr[R(z')\in S].$$
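The canonical example of an $\varepsilon$-local randomizer is binary randomized response. The sketch below (my illustration, not from the note) checks that the worst-case likelihood ratio between any two inputs is exactly $e^{\varepsilon}$:

```python
import math
import random

def randomized_response(bit, eps):
    # eps-local randomizer for bit in {-1, 1}: report truthfully with
    # probability e^eps / (1 + e^eps), otherwise flip the bit.
    p_true = math.exp(eps) / (1.0 + math.exp(eps))
    return bit if random.random() < p_true else -bit

eps = 1.0
p = math.exp(eps) / (1.0 + math.exp(eps))
# For any output s and any pair of inputs z, z', the likelihood ratio
# Pr[R(z) = s] / Pr[R(z') = s] is at most p / (1 - p) = e^eps,
# which is precisely the eps-LDP constraint holding with equality.
worst_ratio = p / (1.0 - p)
print(worst_ratio, math.exp(eps))
```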

A non-interactive $\varepsilon$-LDP (distribution-independent) PAC learner for $C$ works as follows: given $n$ i.i.d. labeled examples $(x_i,y_i)$, where $x_i\sim D$ for an arbitrary distribution $D$ over $X$ and $y_i=f(x_i)$ for some unknown target $f\in C$, each example is accessed only once by applying a fixed (non-adaptive) $\varepsilon$-local randomizer to produce messages $w_i=R(x_i,y_i)$; the learner then outputs a hypothesis $h:X\to\{-1,1\}$.
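To make the protocol concrete, here is a toy non-interactive LDP learner (my sketch under simplifying assumptions, not the general construction the problem asks for): for dictator targets over $\{-1,1\}^d$, each user applies coordinate-wise randomized response to the vector $(y\,x_1,\dots,y\,x_d)$, splitting the budget as $\varepsilon/d$ per coordinate (basic composition gives overall $\varepsilon$-LDP), and the server debiases the messages and outputs the coordinate with the largest estimated correlation:

```python
import math
import random

random.seed(0)

d, eps, n = 4, 4.0, 2000
eps_per_coord = eps / d  # naive budget split; composition gives eps-LDP overall
p = math.exp(eps_per_coord) / (1.0 + math.exp(eps_per_coord))

def local_randomizer(x, y):
    # One privatized message per example: randomized response applied
    # independently to each coordinate of (y*x_1, ..., y*x_d).
    return [v if random.random() < p else -v for v in (y * xj for xj in x)]

# Unknown target: the dictator f(x) = x_target; D is uniform on {-1,1}^d.
target = 2
samples = [[random.choice([-1, 1]) for _ in range(d)] for _ in range(n)]
messages = [local_randomizer(x, x[target]) for x in samples]

# Server side: debias (E[message_j] = (2p-1) * E[y*x_j]) and estimate
# the label-coordinate correlations from the privatized messages alone.
scale = 2 * p - 1
corr = [sum(m[j] for m in messages) / (n * scale) for j in range(d)]
j_hat = max(range(d), key=lambda j: corr[j])  # hypothesis h(x) = x[j_hat]
print(j_hat, corr)
```

With these parameters the true coordinate has correlation $1$ while the others concentrate near $0$, so the server recovers the target with high probability; the open problem asks whether sample bounds of this polynomial flavor extend to every class of small margin complexity.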

Unsolved Problem

Problem 1. Does there exist a polynomial $p(\cdot)$ such that for every class $C$, every accuracy parameter $\alpha\in(0,1/2)$, and every distribution $D$ over $X$, there is a non-interactive $1$-LDP PAC learner that, using

$$n \le p\bigl(MC(C),1/\alpha\bigr)$$

examples, outputs (with probability at least $2/3$ over its randomness and the sample) a hypothesis $h$ satisfying $\Pr_{x\sim D}[h(x)\ne f(x)]\le \alpha$ for every target $f\in C$?


§ Significance & Implications

This problem asks whether a single geometric complexity parameter, $MC(C)$, suffices to guarantee efficient (polynomial-sample) distribution-independent PAC learning in the strict non-interactive $1$-LDP model, where each user sends only one privatized message and the server cannot adapt queries across users. A positive answer would give a broadly applicable, model-specific characterization of which Boolean classes admit one-round locally private learning with sample complexity controlled by $MC(C)$ and $1/\alpha$. A negative answer would demonstrate that polynomial margin complexity alone does not capture the information constraints of non-interactive local privacy, implying that either additional structure/complexity measures or interaction is inherently necessary for some classes even when $MC(C)$ is small.

§ Known Partial Results

  • Daniely and Feldman (2019): In their COLT 2019 open-problem note, Daniely and Feldman formulate the question of whether polynomial margin complexity implies efficient non-interactive locally differentially private PAC learning (distribution-independently) for Boolean concept classes.

  • Daniely and Feldman (2019): The open problem isolates margin complexity as the candidate controlling parameter and fixes the learning model to the strongest common non-interactivity requirement: one privatized message per example, with no adaptivity across examples.

§ References

[1]

Open Problem: Is Margin Sufficient for Non-Interactive Private Distributed Learning?

Amit Daniely, Vitaly Feldman (2019)

Conference on Learning Theory (COLT), PMLR 99

📍 Open-problem note in COLT proceedings.

