Unsolved

Beyond smooth finite-dimensional targets in unified semiparametric data fusion

Sourced from the work of Ellen Sandra Graham, Marco Carone, Andrea Rotnitzky

§ Problem Statement

Setup

Let $X\in\mathcal X$ denote an unobserved full-data random element with unknown law $P$ in a statistical model $\mathcal P_0$. There are $K\ge 2$ independent data sources. Source $k$ contains i.i.d. observations $W_{k,1},\dots,W_{k,n_k}$ from law $Q_k$, where each $W_k$ takes values in $\mathcal W_k$ and is generated from $X$ through a known observation operator $T_k$ (possibly many-to-one, corresponding to missingness, coarsening, or measurement error), so that $Q_k=T_k(P)$. The analyst observes only $\{W_{k,i}:1\le i\le n_k,\,1\le k\le K\}$, not $X$.

This setup follows Graham et al. (2024).
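The sampling scheme in this setup can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, none of which come from the source: the full data are a hypothetical pair X = (Z, Y) with a toy Gaussian law, each observation operator acts draw by draw as a deterministic many-to-one map, and the names `draw_full_data`, `t1`, `t2`, and `simulate_sources` are invented for the example.

```python
import random


def draw_full_data(rng):
    """Draw one full-data element X = (Z, Y) from a toy law P (an assumption)."""
    z = rng.gauss(0.0, 1.0)
    y = 2.0 * z + rng.gauss(0.0, 1.0)
    return (z, y)


def t1(x):
    """Hypothetical operator for source 1: only Z is observed, Y is missing."""
    z, _ = x
    return (z,)


def t2(x):
    """Hypothetical operator for source 2: Z plus the coarsened outcome 1{Y > 0}."""
    z, y = x
    return (z, int(y > 0))


def simulate_sources(sample_sizes, operators, seed=0):
    """Return one list of observations W_{k,1..n_k} per source k.

    Every observation is computed from a fresh, independent draw of X,
    and the full-data draw itself is discarded, so X is never observed.
    """
    rng = random.Random(seed)
    return [
        [op(draw_full_data(rng)) for _ in range(n)]
        for n, op in zip(sample_sizes, operators)
    ]


samples = simulate_sources([5, 3], [t1, t2])
for k, source in enumerate(samples, start=1):
    print(f"source {k}: n_{k} = {len(source)}, first observation = {source[0]}")
```

Each source sees a different coarsening of independent draws from the same underlying law, which is the feature the fusion model is meant to exploit.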

Assume the fusion model is defined by known alignment restrictions linking the source laws to a common target law $P$, for example conditional or marginal equalities expressible as

$$\mathcal A(P)=\{Q_1,\dots,Q_K\},$$

with $\mathcal A$ known. The observed-data model is thus

$$\mathcal Q=\{(Q_1,\dots,Q_K): \exists P\in\mathcal P_0 \text{ such that } \mathcal A(P)=\{Q_1,\dots,Q_K\}\}.$$
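As a purely illustrative instance of such alignment restrictions (an assumption made here for concreteness, not an example taken from the source), take $K=2$ and a hypothetical full-data pair $X=(Z,Y)$, with source 1 aligned with $P$ on the marginal law of $Z$ and source 2 aligned on the conditional law of $Y$ given $Z$:

$$Q_1(Z\in\cdot)=P(Z\in\cdot),\qquad Q_2(Y\in\cdot\mid Z)=P(Y\in\cdot\mid Z).$$

In this toy case $P$ factors into the $Z$-marginal and the conditional law of $Y$ given $Z$, so the joint law is recovered from the two sources provided the $Z$-marginal of $P$ is absolutely continuous with respect to the $Z$-marginal of $Q_2$.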

Let $\Psi:\mathcal P_0\to\Theta$ be the target functional, where $\Theta$ may be infinite-dimensional (for example a function space such as $L^2$ or $\ell^\infty$), and $\Psi$ may be nonregular (not pathwise differentiable at some or all $P$).
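Two standard textbook illustrations of nonregularity may help fix ideas. They are not taken from the source; $Y$ denotes a hypothetical real-valued component of $X$, and $p_Y$ its Lebesgue density under $P$, assumed to exist:

$$\Psi_1(P)=\max\{0,\,E_P[Y]\},\qquad \Psi_2(P)=p_Y(y_0).$$

The first is pathwise differentiable wherever $E_P[Y]\neq 0$ but only directionally differentiable at laws with $E_P[Y]=0$. The second, the density at a fixed point $y_0$, is not pathwise differentiable at any $P$ in a nonparametric model and cannot be estimated at the $\sqrt n$ rate.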

Unsolved Problem

Formulate a general multi-source semiparametric fusion theory beyond smooth finite-dimensional targets that clarifies, under explicit conditions on $\mathcal A$, $\mathcal P_0$, and $\Psi$, (i) identification (point or set) of $\Psi(P)$ from $(Q_1,\dots,Q_K)$, (ii) observed-data tangent/cone geometry and canonical gradients when they exist, and (iii) sharp efficiency bounds for regular components together with appropriate local asymptotic minimax lower bounds and rates when regular estimation fails.

This framing goes beyond the scope that the baseline paper states for itself.

§ Significance & Implications

The baseline paper, Graham et al. (2024), restricts its stated scope to smooth finite-dimensional parameters. Many practically important fusion targets (e.g., function-valued, boundary, or otherwise nonregular functionals) fall outside that class, so extending the framework could materially broaden its applicability. This is posed as a research direction rather than a conjecture stated by the authors; it appears to be open as of February 16, 2026, although very recent or parallel unpublished progress cannot be ruled out.

§ Known Partial Results

  • Graham et al. (2024): The paper gives influence-function and efficient-influence-function theory for smooth finite-dimensional pathwise differentiable parameters under generalized alignment structures.

§ References

[1]

Towards a Unified Theory for Semiparametric Data Fusion with Individual-Level Data

Ellen Sandra Graham, Marco Carone, Andrea Rotnitzky (2024)

Annals of Statistics (to appear)

📍 Abstract scope statement (smooth finite-dimensional parameter); used as motivation rather than as an explicit open-problem statement.

Baseline source motivating this extension; the exact problem wording here is a formalized extension.
