Unsolved

Robust Confounder Selection Under Imperfect Primary-Set Elicitation

Sourced from the work of F. Richard Guo, Qingyuan Zhao

§ Problem Statement

Setup

Fix two distinct observed variables XX (treatment) and YY (outcome). Let G~\tilde G be a causal directed acyclic graph on vertex set VUV\cup U, where VV are observed variables (including X,YX,Y) and UU are latent (unobserved) variables. For any observational distribution PP that is Markov and faithful to G~\tilde G and satisfies positivity, call a set SV{X,Y}S\subseteq V\setminus\{X,Y\} a valid covariate-adjustment set for the total causal effect of XX on YY if the adjustment formula

P(ydo(x))=sP(yx,s)P(s)P(y\mid do(x))=\sum_{s} P(y\mid x,s)P(s)

holds for all x,yx,y (with integral form for continuous SS) for every such PP. Let

A(X,Y;G~):={SV{X,Y}:S is a valid adjustment set}.\mathcal A(X,Y;\tilde G):=\{S\subseteq V\setminus\{X,Y\}: S \text{ is a valid adjustment set}\}.

Consider an interactive algorithm that can ask at most BB queries. At round tt, there is a target set PtP_t^\star (the exact primary set that would be provided in the noiseless/oracle procedure), but the algorithm receives a noisy response P^t\widehat P_t. Let Ht1H_{t-1} be the full interaction history before round tt. Two noise models of interest are: (i) mis-elicitation probability bound Pr(P^tPtHt1)ε\Pr(\widehat P_t\neq P_t^\star\mid H_{t-1})\le \varepsilon for all tt; (ii) bounded set-error model with metric d(A,B):=ABd(A,B):=|A\triangle B| and either deterministic bound d(P^t,Pt)δd(\widehat P_t,P_t^\star)\le \delta or high-probability bound Pr(d(P^t,Pt)>δHt1)ε\Pr(d(\widehat P_t,P_t^\star)>\delta\mid H_{t-1})\le \varepsilon.

Unsolved Problem

Characterize whether there exists an interactive procedure Π\Pi (possibly randomized), using at most BB noisy queries, that outputs either a set SV{X,Y}S\subseteq V\setminus\{X,Y\} or a failure symbol \bot, and satisfies a uniform correctness guarantee

Pr([Π outputs some SA(X,Y;G~)]    [A(X,Y;G~)])1η\Pr\Big(\big[\Pi \text{ outputs some } S\in\mathcal A(X,Y;\tilde G)\big]\iff\big[\mathcal A(X,Y;\tilde G)\neq\varnothing\big]\Big)\ge 1-\eta

for every causal graph G~\tilde G and every noise process obeying the chosen (ε,δ)(\varepsilon,\delta) constraints. In particular, determine necessary and sufficient conditions, and explicit quantitative bounds, relating (ε,δ,η,B)(\varepsilon,\delta,\eta,B) under which such robust soundness-and-completeness is achievable.

§ Discussion

Loading discussion…

§ Significance & Implications

The paper's proved guarantee is in an ideal-oracle setting with perfectly correct primary-set input at each step. A noisy-elicitation robustness theory is a natural downstream extension for practical deployment, but should be treated as extrapolative framing unless broader literature verification is completed. See Guo & Zhao (2023) for the oracle formulation.

§ Known Partial Results

  • Guo et al. (2023): Guo and Zhao establish soundness/completeness in the ideal-oracle setting where primary adjustment sets are correctly specified at each iteration. A full theory for bounded/noisy elicitation appears not established in this source and is framed here as an extension.

§ References

[1]

Confounder Selection via Iterative Graph Expansion

F. Richard Guo, Qingyuan Zhao (2023)

Annals of Statistics (to appear)

📍 Abstract (arXiv PDF): guarantee is stated for the case where the user correctly specifies the primary adjustment sets at each step.

Primary source for the oracle interactive method and its soundness/completeness guarantees; the noisy-elicitation robustness formulation here is a downstream proposed extension rather than a directly stated theorem/problem in the paper.

§ Tags