April 15, 2026
Even if \(\mathsf{P}\) is totally unknown (nonparametric), we can construct a “good” plug-in estimator of \(\psi\) by…
\[v^\top\nabla_\varepsilon L(O; \varepsilon)\Big|_{\varepsilon = 0} = \frac{1}{n}\sum_{i=1}^n \phi(\mathsf{P}_n)(O_i; \psi)\]
→ choose \(L\) to respect problem constraints (e.g. bounded outcomes)
The derivation of the efficient influence function is often regarded as somewhat of a “dark art.”
Hines et al. (2022)
Riesz Representation Theorem (statistics version).
Suppose \(\eta \in \mathcal{L}_2(\mathsf{P})\), and \(\psi \coloneqq \Psi(\eta) = \mathsf{E}[h(O; \eta)]\) is a bounded linear functional. Then, there exists a Riesz representer \(\alpha \in \mathcal{L}_2(\mathsf{P})\) such that
\[\Psi(\eta) = \mathsf{E}[\alpha(O)\eta(O)]\]
Think of \(\alpha\) like a “balancing weight”!
Theorem (Riesz EIF). The efficient influence function of \(\mathsf{E}[h(O; \eta)]\) is
\[\phi(\mathsf{P})(O) = \underbrace{\textcolor{teal}{h(O; \eta)} - \Psi(\eta)}_{\text{expected value EIF}} + \underbrace{\int \textcolor{red}{\alpha(O)} \textcolor{blue}{\phi_{\eta}(\mathsf{P})(O)} d\mathsf{P}}_{\text{"reweighted nuisance bias"}}\]
where \(\textcolor{blue}{\phi_\eta(\mathsf{P})(O)}\) denotes the efficient influence function of the nuisance parameter \(\eta\).
Generalizes previous work, like Hirshberg and Wager (2021), Chernozhukov et al. (2022), or Williams et al. (2025)
Example 1: Counterfactual mean \(\Psi(\eta) = \mathsf{E}[\mathsf{E}(Y \mid A = a, L)]\) where \(\eta(A, L) = \mathsf{E}(Y \mid A, L)\). Its EIF is
\[\underbrace{\textcolor{teal}{\mathsf{E}(Y \mid A = a, L)}}_{\substack{\text{evaluator}\\ h(A, L; \eta)}} - \Psi(\eta) + \underbrace{\textcolor{red}{\frac{\mathbb{1}(A = a)}{d\mathsf{P}(A = a \mid L)}}}_{\substack{\text{Riesz representer}\\ \alpha(A, L)}}\underbrace{\textcolor{blue}{(Y - \mathsf{E}(Y \mid A, L))}}_{\substack{\text{derivative of}\\\text{squared loss}}}\]
Integral cancels out because \(\phi_\eta(\mathsf{P})(O) = \frac{\delta_{A, L}}{d\mathsf{P}(A, L)}(Y - \mathsf{E}(Y \mid A, L))\)
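A minimal numerical sketch of how the EIF in Example 1 can be used for a one-step (plug-in plus correction) estimate. The data-generating process, the names `eta`, `propensity`, `one_step`, and the use of the true nuisance functions instead of fitted ones are all assumptions made for illustration; they are not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (an assumption for illustration only).
L = rng.normal(size=n)
propensity = 1.0 / (1.0 + np.exp(-0.5 * L))   # P(A = 1 | L)
A = rng.binomial(1, propensity)
Y = 1.0 + 2.0 * A + L + rng.normal(size=n)    # implies E[E(Y | A = 1, L)] = 3


def eta(a, l):
    """True outcome regression E(Y | A = a, L = l) under the assumed process."""
    return 1.0 + 2.0 * a + l


a = 1
evaluator = eta(a, L)            # h(A, L; eta) = eta(a, L)
alpha = (A == a) / propensity    # Riesz representer 1(A = a) / P(A = a | L)
residual = Y - eta(A, L)         # Y - E(Y | A, L)

plug_in = evaluator.mean()
one_step = plug_in + np.mean(alpha * residual)   # plug-in plus mean correction term
eif = evaluator - one_step + alpha * residual    # EIF values centred at the estimate
std_error = eif.std(ddof=1) / np.sqrt(n)

print(f"plug-in: {plug_in:.3f}, one-step: {one_step:.3f}, SE: {std_error:.4f}")
```

In practice \(\eta\) and \(\alpha\) would be estimated from data; the true functions are plugged in here only to keep the sketch short.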
Example 2: Counterfactual mean \(\Psi(\eta) = \mathsf{E}[\mathsf{E}(Y \mid A = A + \delta, L)]\) of a policy setting \(A = A + \delta\) where \(\eta(A, L) = \mathsf{E}(Y \mid A, L)\). Its EIF is
\[\underbrace{\textcolor{teal}{\mathsf{E}(Y \mid A = A + \delta, L)}}_{\substack{\text{evaluator}\\ h(A, L; \eta)}} - \Psi(\eta) + \underbrace{\textcolor{red}{\frac{d\mathsf{P}(A - \delta \mid L)}{d\mathsf{P}(A\mid L)}}}_{\substack{\text{Riesz representer}\\ \alpha(A, L)}}\underbrace{\textcolor{blue}{(Y - \mathsf{E}(Y \mid A, L))}}_{\substack{\text{derivative of}\\\text{squared loss}}}\]
Integral cancels out because \(\phi_\eta(\mathsf{P})(O) = \frac{\delta_{A, L}}{d\mathsf{P}(A, L)}(Y - \mathsf{E}(Y \mid A, L))\)
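The density-ratio representer in Example 2 can be checked numerically: reweighting \(\eta(A, L)\) by \(\alpha(A, L)\) should reproduce the mean of \(\eta(A + \delta, L)\). The sketch below assumes a Gaussian treatment model \(A \mid L \sim N(L, 1)\) and a hypothetical regression `eta`; neither comes from the talk.

```python
import numpy as np


def normal_pdf(x, mean):
    """Density of N(mean, 1) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2.0 * np.pi)


def eta(a, l):
    """Hypothetical outcome regression E(Y | A = a, L = l)."""
    return a ** 2 + l


rng = np.random.default_rng(1)
n = 200_000
delta = 0.5                      # size of the shift A -> A + delta

L = rng.normal(size=n)
A = L + rng.normal(size=n)       # assumed treatment model: A | L ~ N(L, 1)

# Riesz representer for the shift: p(A - delta | L) / p(A | L)
alpha = normal_pdf(A - delta, L) / normal_pdf(A, L)

reweighted = np.mean(alpha * eta(A, L))   # E[alpha(A, L) * eta(A, L)]
shifted = np.mean(eta(A + delta, L))      # E[eta(A + delta, L)]

print(f"reweighted: {reweighted:.3f}, shifted: {shifted:.3f}")
```

Under these assumptions both quantities estimate \(2 + \delta^2\), and the weights average to approximately one, which is the sense in which \(\alpha\) acts like a balancing weight.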
Example 3: Mean \(\tau\)-th quantile \(\Psi(\eta) = \mathsf{E}[Q^{\tau}(Y \mid A = a, L)]\) under treatment where \(\eta(A, L) = Q^{\tau}(Y \mid A, L)\). Its EIF is
\[\underbrace{\textcolor{teal}{Q^\tau(Y \mid A = a, L)}}_{\substack{\text{evaluator}\\ h(A, L; \eta)}} - \Psi(\eta) + \underbrace{\textcolor{red}{\frac{\mathbb{1}(A = a)}{d\mathsf{P}(A = a \mid L)}}}_{\substack{\text{Riesz representer}\\\alpha(A,L)}}\underbrace{\textcolor{blue}{\left(\frac{\tau - \mathbb{1}(Y \le Q^\tau(A, L))}{d\mathsf{P}(Q^\tau(A, L) \mid A, L)}\right)}}_{\substack{\text{reweighted derivative of}\\\text{"pinball loss"}}}\]
Integral cancels out because \(\phi_\eta(\mathsf{P})(O) = \frac{\delta_{A, L}}{d\mathsf{P}(A, L)}\left(\frac{\tau - \mathbb{1}(Y \le Q^\tau(A, L))}{d\mathsf{P}(Q^\tau(A, L) \mid A, L)}\right)\)
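For readers unfamiliar with the "pinball loss" named in Example 3, the sketch below checks numerically that minimizing the average pinball loss recovers the \(\tau\)-th quantile. The function name `pinball_loss`, the standard normal sample, and the grid search are illustrative assumptions, not part of the talk.

```python
import numpy as np


def pinball_loss(y, q, tau):
    """Average pinball (check) loss of candidate quantile q at level tau."""
    u = y - q
    return np.mean(np.maximum(tau * u, (tau - 1.0) * u))


rng = np.random.default_rng(2)
y = rng.normal(size=20_000)      # hypothetical outcome sample
tau = 0.75

# Grid search over candidate quantiles (spacing 0.01).
grid = np.linspace(-2.0, 2.0, 401)
losses = np.array([pinball_loss(y, q, tau) for q in grid])
q_hat = grid[np.argmin(losses)]

empirical = np.quantile(y, tau)  # sample quantile for comparison
print(f"pinball minimizer: {q_hat:.2f}, sample quantile: {empirical:.3f}")
```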
Consider a general time-ordered data structure
\[O = (L_1, A_1, \ldots, L_T, A_T, Y)\]
Denote the histories at time \(t\) as \(\bar{A}_t\) and \(\bar{L}_t\). For example, \(\bar{A}_t = (A_1, \ldots, A_t)\) and \(\bar{L}_t = (L_1, \ldots, L_t)\).
Theorem (Sequential Riesz EIF)
Consider the estimand \(\Psi(\eta_1) = \mathsf{E}_{\mathsf{P}}[h_1(A_{1}, L_{1}; \eta_1)]\), where \(\eta_t\) is a bounded linear functional defined sequentially such that, for \(t = 1, \ldots, T\), we have
\[\eta_{t}(\bar{A}_{t}, \bar{L}_{t}) = \mathsf{E}[h_{t+1}(\bar{A}_{t+1}, \bar{L}_{t+1}; \eta_{t+1}) \mid \bar{A}_t, \bar{L}_t]\]
with \(h_{T+1}(\bar{A}_{T+1}, \bar{L}_{T+1}; \eta_{T+1}) \coloneqq Y\). Let \(\alpha_t\) denote the Riesz representer for \(\eta_{t}\) in the functional \(\mathsf{E}[h_{t}(\bar{A}_{t}, \bar{L}_{t}; \eta_{t}) \mid \bar{A}_{t-1}, \bar{L}_{t-1}]\). Then, the EIF of the estimand \(\Psi(\eta_1)\) is
\[\underbrace{\textcolor{teal}{h_1(A_{1}, L_{1}; \eta_1)} - \Psi(\eta_1)}_{\text{Expected value EIF}} + \sum_{t=1}^{T}\underbrace{\textcolor{red}{\prod_{k=1}^t \alpha_k(\bar{A}_{k}, \bar{L}_{k})}}_{\substack{\text{Riesz}\\\text{representer}\\\text{reweighting}}}\underbrace{[\textcolor{teal}{h_{t+1}(\bar{A}_{t+1}, \bar{L}_{t+1}; \eta_{t+1})} - \textcolor{blue}{\eta_t(\bar{A}_{t}, \bar{L}_{t})}]}_{\text{Residuals of sequential regressions}}\]
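As a sanity check, consider the single time point case \(T = 1\) with the evaluator \(h_1(A_1, L_1; \eta_1) = \eta_1(a, L_1)\) (this choice of \(h_1\) is an assumption made here to connect with Example 1). Then \(\eta_1(A_1, L_1) = \mathsf{E}[Y \mid A_1, L_1]\), the sum has a single term, and the formula reduces to
\[\eta_1(a, L_1) - \Psi(\eta_1) + \alpha_1(A_1, L_1)\,[Y - \eta_1(A_1, L_1)], \qquad \alpha_1(A_1, L_1) = \frac{\mathbb{1}(A_1 = a)}{d\mathsf{P}(A_1 = a \mid L_1)}\]
which matches the EIF of the counterfactual mean in Example 1.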
Sequential TMLE
RieszCML package: Simulations
Theorem (Riesz Representation, general).
Suppose \(\eta \in \mathcal{H}\), a Hilbert space, and that \(\psi : \mathcal{H} \to \mathbb{R}\) is a bounded linear functional.
Then, there exists \(\alpha \in \mathcal{H}\) such that, for all \(\eta \in \mathcal{H}\),
\[\psi(\eta) = \langle \alpha, \eta \rangle\]
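A finite-dimensional sketch can make the theorem concrete. Take \(\mathcal{H} = \mathbb{R}^d\) with the weighted inner product \(\langle u, v \rangle = u^\top W v\) and the linear functional \(\psi(\eta) = c^\top \eta\); the representer is then \(\alpha = W^{-1} c\). The matrix `W`, the vector `c`, and the helper names are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 5

# Hypothetical symmetric positive definite weight matrix defining the inner product.
M = rng.normal(size=(d, d))
W = M @ M.T + d * np.eye(d)
c = rng.normal(size=d)           # coefficients of the linear functional


def inner(u, v):
    """Weighted inner product <u, v> = u^T W v."""
    return u @ W @ v


def psi(eta):
    """Bounded linear functional psi(eta) = c^T eta."""
    return c @ eta


# Riesz representer: solve W alpha = c so that <alpha, eta> = c^T eta.
alpha = np.linalg.solve(W, c)

# Compare psi(eta) with <alpha, eta> on random test vectors.
etas = rng.normal(size=(100, d))
max_gap = max(abs(psi(e) - inner(alpha, e)) for e in etas)
print(f"largest |psi(eta) - <alpha, eta>|: {max_gap:.2e}")
```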
EuroCIM 2026