Approximating Ground Truth Causal Estimands
In causal inference, we are often interested in the value of some causal estimand, such as the average treatment effect (ATE) or the average policy effect (APE). These estimands are defined in terms of counterfactuals $Y(a)$, the value of an outcome $Y$ under a given treatment regime. For example, the average treatment effect is denoted
\[\mathbb{E}\Big(Y(1) - Y(0)\Big)\]
where $Y(a)$ denotes the outcome $Y$ that would have occurred had the treatment been set to $a$. Similarly, we denote the average policy effect as
\[\mathbb{E}\Big(Y(a^*) - Y(a)\Big)\]
where $a$ is the natural value of treatment under no intervention and $a^*$ is the value of treatment under some policy.
CausalTables.jl provides functions that numerically approximate the values of several common estimands given a ground truth StructuralCausalModel object. This can be useful for evaluating the performance of causal inference methods on simulated data. Available estimands include:
- Counterfactual Means (cfmean)
- Counterfactual Differences (cfdiff)
- Average Treatment Effect (ate), including among the Treated (att) and Untreated (atu)
- Average Policy Effect (ape), also known as the causal effect of a Modified Treatment Policy.
In addition, one can simulate counterfactual outcomes directly using the draw_counterfactual function. Each of these is documented in detail in the following section. For low-level functions that can be used to approximate more complicated custom ground truth estimands in various settings, see Computing ground truth conditional distributions.
API
CausalTables.additive_mtp — Method
additive_mtp(δ)
Constructs a function that adds a constant (or constant vector) δ to the treatment variable(s) in a CausalTable object. This function is intended to be used as an argument to ape.
Arguments
- δ: The "additive shift" to be applied to the treatment variable of a CausalTable.
Returns
- A function that takes a CausalTable object as input and returns a column table of treatments that have been shifted by δ units.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Normal(L)),
Y ~ @.(Normal(A + 2 * L + 1))
)
scm = StructuralCausalModel(dgp, [:A], [:Y], [:L])
ape(scm, additive_mtp(0.5))
CausalTables.ape — Method
ape(scm::StructuralCausalModel, intervention::Function; samples = 10^6)
Approximate the average policy effect (APE) for a given structural causal model (SCM), along with its efficiency bound. This is also known as the causal effect of a modified treatment policy. Note that unless intervention is piecewise smooth invertible, the estimated statistical quantity may not have a causal interpretation; see Haneuse and Rotnitzky (2013). Mathematically, this is
\[E(Y(d(a)) - Y(a))\]
where $d(a)$ represents the intervention on the treatment variable(s) $A$, $Y(d(a))$ represents the counterfactual $Y$ under treatment $d(a)$, and $Y(a)$ represents the counterfactual outcome under the naturally observed value of treatment. This statistical quantity is approximated using Monte Carlo sampling.
Convenience functions for generating intervention functions include additive_mtp and multiplicative_mtp, which construct functions that respectively add a constant (or constant vector) to the treatment variable(s) or multiply the treatment variable(s) by a constant. One can also implement a custom intervention function; this function must take as input a CausalTable object and return a NamedTuple object with each key indexing a treatment variable that has been modified according to the intervention. See also cast_matrix_to_table_function, a convenience function for constructing interventions.
Arguments
- scm::StructuralCausalModel: The SCM from which data is to be simulated.
- intervention::Function: The intervention function to apply to the SCM.
- samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.
Returns
A named tuple containing:
- μ: The APE approximation.
- eff_bound: The variance of the difference between the natural and counterfactual responses, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.
Example
using CausalTables
using Distributions
dgp = CausalTables.@dgp(
L ~ Beta(2, 4),
A ~ @.(Normal(L)),
Y ~ @.(Normal(A + 2 * L + 1))
)
scm = CausalTables.StructuralCausalModel(dgp, [:A], [:Y], [:L])
ape(scm, additive_mtp(0.5))
ape(scm, multiplicative_mtp(2.0))
# example of a custom intervention function
custom_intervention = cast_matrix_to_table_function(x -> exp.(x))
ape(scm, custom_intervention)
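As a sanity check on the example above, the first two calls have closed-form targets under this data-generating process. The following is a worked derivation from the stated distributions, not output reproduced from the package. Because $Y$ has conditional mean $A + 2L + 1$, the counterfactual mean under an intervention $d$ is $E(d(A)) + 2E(L) + 1$, and because $A$ has conditional mean $L$ with $L \sim \text{Beta}(2, 4)$, we have $E(A) = E(L) = 2/(2+4) = 1/3$. Hence
\[E(Y(a + 0.5) - Y(a)) = 0.5, \qquad E(Y(2a) - Y(a)) = E(A) = \tfrac{1}{3}\]
so the μ values returned for additive_mtp(0.5) and multiplicative_mtp(2.0) should be close to 0.5 and 1/3 respectively, up to Monte Carlo error.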
CausalTables.ate — Method
ate(scm::StructuralCausalModel; samples = 10^6)
Approximate the average treatment effect (ATE) for a given structural causal model (SCM), along with its efficiency bound, for a univariate binary treatment. Mathematically, this is
\[E(Y(1) - Y(0))\]
where $Y(a)$ represents the counterfactual $Y$ had the treatment $A$ been set to $a$. This statistical quantity is approximated using Monte Carlo sampling.
Arguments
- scm::StructuralCausalModel: The SCM from which data is to be simulated.
- samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.
Returns
A named tuple containing:
- μ: The ATE approximation.
- eff_bound: The variance of the counterfactual response, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Bernoulli(L)),
Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y, [:L])
ate(scm)
CausalTables.att — Method
att(scm::StructuralCausalModel; samples = 10^6)
Approximate the average treatment effect among the treated (ATT) for a given structural causal model (SCM), along with its efficiency bound, for a univariate binary treatment. Mathematically, this is
\[E(Y(1) - Y(0) \mid A = 1)\]
where $Y(a)$ represents the counterfactual $Y$ had the treatment $A$ been set to $a$. This statistical quantity is approximated using Monte Carlo sampling.
Arguments
- scm::StructuralCausalModel: The SCM from which data is to be simulated.
- samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.
Returns
A named tuple containing:
- μ: The ATT approximation.
- eff_bound: The variance of the counterfactual response, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Bernoulli(L)),
Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y, [:L])
att(scm)
CausalTables.atu — Method
atu(scm::StructuralCausalModel; samples = 10^6)
Approximate the average treatment effect among the untreated (ATU) for a given structural causal model (SCM), along with its efficiency bound, for a univariate binary treatment. Mathematically, this is
\[E(Y(1) - Y(0) \mid A = 0)\]
where $Y(a)$ represents the counterfactual $Y$ had the treatment $A$ been set to $a$. This statistical quantity is approximated using Monte Carlo sampling.
Arguments
- scm::StructuralCausalModel: The SCM from which data is to be simulated.
- samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.
Returns
A named tuple containing:
- μ: The ATU approximation.
- eff_bound: The variance of the counterfactual response, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Bernoulli(L)),
Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y, [:L])
atu(scm)
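To make the difference between the ATE, ATT, and ATU concrete, here is a minimal Monte Carlo sketch written in plain Julia with Distributions.jl. It does not call CausalTables.jl and is not the package's implementation. It also assumes a hypothetical data-generating process that differs from the examples above, Y ~ Normal(A * L + L), chosen so that the treatment effect varies with L and the three estimands do not coincide. Counterfactual outcomes are drawn with independent noise, which is an assumption of this sketch.

```julia
using Distributions, Random, Statistics

Random.seed!(42)
n = 10^6

# Hypothetical DGP (an assumption of this sketch, not taken from the docs):
#   L ~ Beta(2, 4),  A ~ Bernoulli(L),  Y ~ Normal(A * L + L)
L = rand(Beta(2, 4), n)
A = rand(n) .< L                # Bernoulli(L) draws as a BitVector

# Counterfactual outcomes under A = 1 and A = 0, with independent unit-variance noise
Y1 = 2 .* L .+ randn(n)         # mean is 1 * L + L
Y0 = L .+ randn(n)              # mean is 0 * L + L

diff = Y1 .- Y0

ate_hat = mean(diff)            # targets E(Y(1) - Y(0))
att_hat = mean(diff[A])         # targets E(Y(1) - Y(0) | A = 1)
atu_hat = mean(diff[.!A])       # targets E(Y(1) - Y(0) | A = 0)

println((ate = ate_hat, att = att_hat, atu = atu_hat))
```

Under this assumed process the individual effect has mean L, so the targets are E(L) = 1/3 for the ATE, E(L²)/E(L) = 3/7 for the ATT, and E(L(1 - L))/E(1 - L) = 2/7 for the ATU. The treated have larger values of L on average, which is why the ATT exceeds the ATU.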
CausalTables.cast_matrix_to_table_function — Method
cast_matrix_to_table_function(func::Function)
Wraps a given function func that operates on a matrix and returns a new function that operates on a CausalTable object. The returned function converts the CausalTable's treatment variables to a matrix, applies func to this matrix, and then converts the result back to a column table with the same header as the original treatment variables.
Arguments
- func::Function: A function that takes a matrix as input and returns a matrix.
Returns
- A function that takes a CausalTable object as input and returns a column table.
Example
using CausalTables
custom_intervention = cast_matrix_to_table_function(x -> exp.(x))
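The table-to-matrix round trip described above can be sketched in plain Julia. The helper below, wrap_matrix_function, is a hypothetical stand-in and not the package's implementation: it assumes the treatment columns arrive as a NamedTuple of vectors rather than as a CausalTable, and it needs no packages.

```julia
# Hypothetical stand-in for the wrapping idea: take a matrix-valued function and
# return a function that acts on a NamedTuple of treatment columns.
function wrap_matrix_function(func)
    return function (treatment::NamedTuple)
        header = keys(treatment)                 # column names, e.g. (:A,)
        mat = hcat(values(treatment)...)         # n × k matrix of treatments
        out = func(mat)                          # apply the matrix-valued function
        cols = Tuple(collect(col) for col in eachcol(out))
        return NamedTuple{header}(cols)          # back to a column table, same header
    end
end

exp_intervention = wrap_matrix_function(x -> exp.(x))
result = exp_intervention((A = [0.0, 1.0, 2.0],))
println(result)
```

Keeping the header fixed is what lets the output be matched back to the treatment variables by name.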
CausalTables.cfdiff — Method
cfdiff(scm::StructuralCausalModel, intervention1::Function, intervention2::Function; samples = 10^6)
Approximate the difference between two counterfactual response means (that under intervention1 having been applied to the treatment, and that under intervention2) for a given structural causal model (SCM), along with its efficiency bound. Mathematically, this is
\[E(Y(d_1(a)) - Y(d_2(a)))\]
where $d_1$ and $d_2$ represent intervention1 and intervention2 being applied on the treatment variable(s) $A$. This statistical quantity is approximated using Monte Carlo sampling.
Arguments
- scm::StructuralCausalModel: The SCM from which data is to be simulated.
- intervention1::Function: The first intervention function to be contrasted.
- intervention2::Function: The second intervention function to be contrasted.
- samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.
Returns
A named tuple containing:
- μ: The mean difference in counterfactual outcomes.
- eff_bound: The variance of the difference in counterfactual responses, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Bernoulli(L)),
Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y, [:L])
cfdiff(scm, treat_all, treat_none)
CausalTables.cfmean — Method
cfmean(scm::StructuralCausalModel, intervention::Function; samples = 10^6)
Approximate the counterfactual mean of the response had intervention been applied to the treatment, along with its efficiency bound, for a given structural causal model (SCM). Mathematically, this estimand is
\[E(Y(d(a)))\]
where $d(a)$ represents an intervention on the treatment variable(s) $A$. This statistical quantity is approximated using Monte Carlo sampling.
Arguments
- scm::StructuralCausalModel: The SCM from which data is to be simulated.
- intervention::Function: The intervention function to apply to the SCM.
- samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.
Returns
A named tuple containing:
- μ: The mean of the counterfactual outcomes.
- eff_bound: The variance of the counterfactual response, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Bernoulli(L)),
Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y, [:L])
cfmean(scm, treat_all)
cfmean(scm, treat_none)
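For this example the targets can be worked out by hand, which gives a quick check on the approximation. This is a derivation from the stated distributions, not output reproduced from the package. Since $Y$ has conditional mean $A + L$ and $L \sim \text{Beta}(2, 4)$ has mean $2/(2+4) = 1/3$,
\[E(Y(1)) = 1 + E(L) = \tfrac{4}{3}, \qquad E(Y(0)) = E(L) = \tfrac{1}{3}\]
so the μ returned for treat_all should be close to 4/3 and the μ returned for treat_none close to 1/3, up to Monte Carlo error. Their difference, 1, is the quantity that cfdiff(scm, treat_all, treat_none) and ate(scm) target for the same model.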
CausalTables.draw_counterfactual — Method
draw_counterfactual(scm::StructuralCausalModel, parents::CausalTable, intervention::Function) -> Vector
Generate counterfactual responses based on a given structural causal model (SCM), a table of response parents, and an intervention function. That is, sample the responses that would have occurred had some intervention been applied to the treatment specified by the structural causal model.
Arguments
- scm::StructuralCausalModel: The structural causal model used to generate counterfactual outcomes.
- parents::CausalTable: A table containing the variables causally preceding the response variable.
- intervention::Function: A function that defines the intervention to be applied to the parent variables. Use cast_matrix_to_table_function to convert a function acting on a treatment vector or matrix to a function that acts on a CausalTable.
Returns
A vector of counterfactual responses.
CausalTables.intervene — Method
intervene(ct::CausalTable, intervention::Function)
Applies intervention to the treatment vector(s) within a CausalTable, and outputs a new CausalTable with the intervened treatment.
Arguments
- ct::CausalTable: The data on which treatment should be intervened.
- intervention::Function: A function that defines the intervention to be applied to the parent variables. Use cast_matrix_to_table_function to convert a function acting on a treatment vector or matrix to a function that acts on a CausalTable.
Returns
A CausalTable containing the same data as ct, but with the treatment variable(s) modified according to intervention.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Bernoulli(L)),
Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y, [:L])
ct = rand(scm, 100)
intervene(ct, treat_all)
CausalTables.multiplicative_mtp — Method
multiplicative_mtp(δ)
Constructs a function that scales the treatment variable(s) in a CausalTable object by a constant δ. This function is intended to be used as an argument to ape.
Arguments
- δ: The "multiplicative shift" to be applied to the treatment variable of a CausalTable.
Returns
- A function that takes a CausalTable object as input and returns a column table of treatments that have been scaled by a factor of δ.
Example
using CausalTables
using Distributions
dgp = CausalTables.@dgp(
L ~ Beta(2, 4),
A ~ @.(Normal(L)),
Y ~ @.(Normal(A + 2 * L + 1))
)
scm = CausalTables.StructuralCausalModel(dgp, [:A], [:Y], [:L])
ape(scm, multiplicative_mtp(2.0))
CausalTables.treat_all — Method
treat_all(ct::CausalTable)
Intervenes on a CausalTable object by setting all treatment variables to 1.
Arguments
- ct::CausalTable: A CausalTable object with a univariate binary treatment.
Returns
A NamedTuple object with the same header as the treatment matrix in ct, where each treatment variable is set to 1.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Bernoulli(L)),
Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y, [:L])
data = rand(scm, 100)
treat_all(data)
CausalTables.treat_none — Method
treat_none(ct::CausalTable)
Intervenes on a CausalTable object by setting all treatment variables to 0.
Arguments
- ct::CausalTable: A CausalTable object with a univariate binary treatment.
Returns
A NamedTuple object with the same header as the treatment matrix in ct, where each treatment variable is set to 0.
Example
using CausalTables
using Distributions
dgp = @dgp(
L ~ Beta(2, 4),
A ~ @.(Bernoulli(L)),
Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y, [:L])
data = rand(scm, 100)
treat_none(data)