Approximating Ground Truth Causal Estimands

In causal inference, we are often interested in the value of some causal estimand, such as the average treatment effect (ATE) or the average policy effect (APE). These estimands are defined in terms of counterfactuals $Y(a)$, the value of an outcome $Y$ under a given treatment regime. For example, the average treatment effect is denoted

\[\mathbb{E}\Big(Y(1) - Y(0)\Big)\]

where $Y(a)$ denotes the outcome $Y$ that would have occurred had the treatment been set to $a$. Similarly, we denote the average policy effect as

\[\mathbb{E}\Big(Y(a^*) - Y(a)\Big)\]

where $a$ is the natural value of treatment under no intervention and $a^*$ is the value of treatment under some policy.

CausalTables.jl provides functions that numerically approximate the values of several common estimands given a ground truth StructuralCausalModel object. This can be useful for evaluating the performance of causal inference methods on simulated data. Available estimands include:

Counterfactual Means (cfmean)
Counterfactual Differences (cfdiff)
Average Treatment Effect (ate), including among the Treated (att) and Untreated (atu)
Average Policy Effect (ape), also known as the causal effect of a Modified Treatment Policy.

In addition, one can simulate counterfactual outcomes directly using the draw_counterfactual function. Each of these is documented in detail in the following section. For low-level functions that can be used to approximate more complicated custom ground truth estimands in various settings, see Computing ground truth conditional distributions.
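
As a rough sketch of what such a numerical approximation involves, assume a plain Monte Carlo average over $n$ independent draws from the SCM, where $n$ plays the role of the samples keyword documented below. This form is an assumption for illustration rather than a statement of the package internals. Under it, the ATE would be approximated by

\[\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^{n}\Big(Y_i(1) - Y_i(0)\Big), \qquad \operatorname{sd}\big(\hat{\mu}_n\big) = \sqrt{\frac{\operatorname{Var}\big(Y(1) - Y(0)\big)}{n}}\]

so that increasing $n$ by a factor of 100 reduces the Monte Carlo error by roughly a factor of 10.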

API

CausalTables.additive_mtp — Method

additive_mtp(δ)

Constructs a function that adds a constant (or constant vector) δ to the treatment variable(s) in a CausalTable object. This function is intended to be used as an argument to ape.

Arguments

δ: The "additive shift" to be applied to the treatment variable of a CausalTable.

Returns

A function that takes a CausalTable object as input and returns a column table of treatments that have been shifted by δ units.

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Normal(L)),
    Y ~ @.(Normal(A + 2 * L + 1))
)
scm = StructuralCausalModel(dgp, :A, :Y)
ape(scm, additive_mtp(0.5))
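
For this particular data-generating process the target value can be worked out by hand, which gives a reference point for the approximation. The derivation below applies to this example only: because the conditional mean of $Y$ is $A + 2L + 1$, shifting $A$ by $\delta$ shifts the mean outcome by $\delta$.

\[\mathbb{E}\Big(Y(A + \delta) - Y(A)\Big) = \mathbb{E}\Big((A + \delta + 2L + 1) - (A + 2L + 1)\Big) = \delta = 0.5\]

The μ returned by the call above should therefore be close to 0.5, up to Monte Carlo error.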


CausalTables.ape — Method

ape(scm::StructuralCausalModel, intervention::Function; samples = 10^6)

Approximate the average policy effect (APE) for a given structural causal model (SCM), along with its efficiency bound. The APE is also known as the causal effect of a modified treatment policy. Note that unless intervention is piecewise smooth invertible, the estimated statistical quantity may not have a causal interpretation; see Haneuse and Rotnitzky (2013). Mathematically, this is

\[E(Y(d(a)) - Y(a))\]

where $d(a)$ represents the intervention on the treatment variable(s) $A$, $Y(d(a))$ represents the counterfactual $Y$ under treatment $d(a)$, and $Y(a)$ represents the counterfactual outcome under the naturally observed value of treatment. This statistical quantity is approximated using Monte Carlo sampling.

Convenience functions for generating intervention functions include additive_mtp and multiplicative_mtp, which construct functions that respectively add a constant (or constant vector) to the treatment variable(s) or multiply the treatment variable(s) by a constant. You can also implement your own intervention function; it must take a CausalTable object as input and return a NamedTuple in which each key names a treatment variable and each value holds that variable modified according to the intervention. See also cast_matrix_to_table_function, a convenience function for constructing interventions from functions that act on a matrix.

Arguments

scm::StructuralCausalModel: The SCM from which data is to be simulated.
intervention::Function: The intervention function to apply to the SCM.
samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.

Returns

A named tuple containing:

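μ: The APE approximation.
eff_bound: The efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.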

Example

using CausalTables, Distributions

# continuous treatment A whose mean depends on the confounder L
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Normal(L)),
    Y ~ @.(Normal(A + 2 * L + 1))
)
scm = StructuralCausalModel(dgp, :A, :Y)

# built-in modified treatment policies: shift A by 0.5, or double A
ape(scm, additive_mtp(0.5))
ape(scm, multiplicative_mtp(2.0))

# example of a custom intervention function
custom_intervention = cast_matrix_to_table_function(x -> exp.(x))
ape(scm, custom_intervention)
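
To see what the custom intervention in this example targets, the estimand can be partly worked out by hand. The derivation is specific to this DGP and relies on Normal(L) in Distributions.jl having unit standard deviation, so that $A \mid L \sim \text{Normal}(L, 1)$ and $E(e^{A} \mid L) = e^{L + 1/2}$. With $d(a) = e^{a}$ and the conditional mean of $Y$ equal to $A + 2L + 1$:

\[E\Big(Y(e^{A}) - Y(A)\Big) = E\big(e^{A}\big) - E(A) = e^{1/2}\,E\big(e^{L}\big) - \frac{1}{3}\]

where $E(A) = E(L) = 2 / (2 + 4) = 1/3$ for $L \sim \text{Beta}(2, 4)$. The remaining expectation $E(e^{L})$ has no simple closed form and is left unevaluated here.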


CausalTables.ate — Method

ate(scm::StructuralCausalModel; samples = 10^6)

Approximate the average treatment effect (ATE) for a given structural causal model (SCM), along with its efficiency bound, for a univariate binary treatment. Mathematically, this is

\[E(Y(1) - Y(0))\]

where $Y(a)$ represents the counterfactual $Y$ had the treatment $A$ been set to $a$. This statistical quantity is approximated using Monte Carlo sampling.

Arguments

scm::StructuralCausalModel: The SCM from which data is to be simulated.
samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.

Returns

A named tuple containing:

μ: The ATE approximation.
eff_bound: The variance of the counterfactual response, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)
ate(scm)
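
One way to sanity-check the approximation is to compare it with a value derived by hand. In the example DGP the conditional mean of Y is A + L, so the ATE is E((1 + L) - (0 + L)) = 1. The sketch below uses only the calls and return fields documented in this entry; the variable name analytic_ate is introduced here for illustration, and a smaller samples value is chosen only to keep the run short.

```julia
using CausalTables
using Distributions

dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)

# Hand-derived value for this DGP only: E[(1 + L) - (0 + L)] = 1
analytic_ate = 1.0

# Fewer samples than the default run faster but give a noisier approximation
result = ate(scm; samples = 10^5)

println("approximate ATE: ", result.μ, ", analytic ATE: ", analytic_ate)
println("reported efficiency bound: ", result.eff_bound)
```

The two printed ATE values should agree up to Monte Carlo error, which shrinks as samples grows.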


CausalTables.att — Method

att(scm::StructuralCausalModel; samples = 10^6)

Approximate the average treatment effect among the treated (ATT) for a given structural causal model (SCM), along with its efficiency bound, for a univariate binary treatment. Mathematically, this is

\[E(Y(1) - Y(0) \mid A = 1)\]

where $Y(a)$ represents the counterfactual $Y$ had the treatment $A$ been set to $a$. This statistical quantity is approximated using Monte Carlo sampling.

Arguments

scm::StructuralCausalModel: The SCM from which data is to be simulated.
samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.

Returns

A named tuple containing:

μ: The ATT approximation.
eff_bound: The variance of the counterfactual response, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)
att(scm)
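
In this example, treatment shifts the conditional mean of $Y$ by exactly 1 for every value of $L$, so conditioning on the treated does not change the target. This holds for this DGP only, not in general:

\[E\Big(Y(1) - Y(0) \mid A = 1\Big) = E\Big((1 + L) - (0 + L) \mid A = 1\Big) = 1\]

The μ returned by att(scm) should therefore be close to 1, the same value targeted by ate for this DGP.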


CausalTables.atu — Method

atu(scm::StructuralCausalModel; samples = 10^6)

Approximate the average treatment effect among the untreated (ATU) for a given structural causal model (SCM), along with its efficiency bound, for a univariate binary treatment. Mathematically, this is

\[E(Y(1) - Y(0) \mid A = 0)\]

where $Y(a)$ represents the counterfactual $Y$ had the treatment $A$ been set to $a$. This statistical quantity is approximated using Monte Carlo sampling.

Arguments

scm::StructuralCausalModel: The SCM from which data is to be simulated.
samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.

Returns

A named tuple containing:

μ: The ATU approximation.
eff_bound: The variance of the counterfactual response, which is equal to the efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)
atu(scm)
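
The ATE, ATT and ATU differ only when the treatment effect varies with variables that also drive treatment. The sketch below uses a hypothetical DGP, not taken from the package documentation, in which the effect of A on the mean of Y equals L. For L ~ Beta(2, 4), hand calculation gives ATE = E(L) = 1/3, ATT = E(L^2) / E(L) = 3/7 and ATU = E(L(1 - L)) / E(1 - L) = 2/7. The names dgp_het, scm_het and est_* are illustrative.

```julia
using CausalTables
using Distributions

# Hypothetical DGP: the effect of A on Y equals L, and L also drives treatment
dgp_het = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A * L + L))
)
scm_het = StructuralCausalModel(dgp_het, :A, :Y)

# Hand-derived targets for L ~ Beta(2, 4): ATE = 1/3, ATT = 3/7, ATU = 2/7
est_ate = ate(scm_het; samples = 10^5).μ
est_att = att(scm_het; samples = 10^5).μ
est_atu = atu(scm_het; samples = 10^5).μ

println("ATE ≈ ", est_ate, " (analytic 1/3)")
println("ATT ≈ ", est_att, " (analytic 3/7)")
println("ATU ≈ ", est_atu, " (analytic 2/7)")
```

Because units with larger L are both more likely to be treated and benefit more, the ATT exceeds the ATE, which in turn exceeds the ATU.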


CausalTables.cast_matrix_to_table_function — Method

cast_matrix_to_table_function(func::Function)

Wraps a given function func that operates on a matrix and returns a new function that operates on a CausalTable object. The returned function extracts the treatment matrix from the CausalTable, applies func to this matrix, and then converts the result back to a column table with the same header as the original treatment matrix.

Arguments

func::Function: A function that takes a matrix as input and returns a matrix.

Returns

A function that takes a CausalTable object as input and returns a column table.

Example

custom_intervention = cast_matrix_to_table_function(x -> exp.(x))
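
The one-line example above only constructs the wrapped function. The following sketch applies it to simulated data, reusing the continuous-treatment DGP from the ape example. It assumes, based on the description in this entry, that the returned column table is keyed by the treatment name (here A); the variable name shifted is illustrative.

```julia
using CausalTables
using Distributions

dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Normal(L)),
    Y ~ @.(Normal(A + 2 * L + 1))
)
scm = StructuralCausalModel(dgp, :A, :Y)
ct = rand(scm, 5)

# Wrap a matrix-level function so that it acts on the treatment column(s) of a CausalTable
custom_intervention = cast_matrix_to_table_function(x -> exp.(x))

# Assumed from the description above: the result is a column table keyed by treatment name
shifted = custom_intervention(ct)
println(shifted)
```

The same wrapped function can be passed wherever an intervention argument is expected, such as ape or intervene.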


CausalTables.cfdiff — Method

cfdiff(scm::StructuralCausalModel, intervention1::Function, intervention2::Function; samples = 10^6)

Approximate the difference between two counterfactual response means – that under intervention1 having been applied to the treatment, and that under intervention2 – for a given structural causal model (SCM), along with its efficiency bound. Mathematically, this is

\[E(Y(d_1(a)) - Y(d_2(a)))\]

where $d_1$ and $d_2$ represent intervention1 and intervention2 being applied on the treatment variable(s) $A$. This statistical quantity is approximated using Monte Carlo sampling.

Arguments

scm::StructuralCausalModel: The SCM from which data is to be simulated.
intervention1::Function: The first intervention function to be contrasted.
intervention2::Function: The second intervention function to be contrasted.
samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.

Returns

A named tuple containing:

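μ: The mean difference in counterfactual outcomes.
eff_bound: The efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.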

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)
cfdiff(scm, treat_all, treat_none)
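
By linearity of expectation, the target of cfdiff is the difference between two counterfactual means. With treat_all and treat_none as the two interventions it reduces to the ATE. For this example DGP only, where the conditional mean of $Y$ is $A + L$ and $E(L) = 1/3$:

\[E\Big(Y(1) - Y(0)\Big) = E\big(Y(1)\big) - E\big(Y(0)\big) = \Big(1 + \frac{1}{3}\Big) - \frac{1}{3} = 1\]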


CausalTables.cfmean — Method

cfmean(scm::StructuralCausalModel, intervention::Function; samples = 10^6)

Approximate the counterfactual mean of the response had intervention been applied to the treatment, along with its efficiency bound, for a given structural causal model (SCM). Mathematically, this estimand is

\[E(Y(d(a)))\]

where $d(a)$ represents an intervention on the treatment variable(s) $A$. This statistical quantity is approximated using Monte Carlo sampling.

Arguments

scm::StructuralCausalModel: The SCM from which data is to be simulated.
intervention::Function: The intervention function to apply to the SCM.
samples: The number of samples to draw from scm for Monte Carlo approximation (default is 10^6). This controls the precision of the approximation.

Returns

A named tuple containing:

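μ: The mean of the counterfactual outcomes.
eff_bound: The efficiency bound for IID data. If observations are correlated, this may not have a meaningful interpretation.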

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)
cfmean(scm, treat_all)
cfmean(scm, treat_none)
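
The two targets in this example can be computed by hand from the mean of a $\text{Beta}(\alpha, \beta)$ distribution, $\alpha / (\alpha + \beta)$. This is a derivation for this DGP only, where the conditional mean of $Y$ is $A + L$:

\[E\big(Y(1)\big) = 1 + E(L) = 1 + \frac{2}{2 + 4} = \frac{4}{3}, \qquad E\big(Y(0)\big) = E(L) = \frac{1}{3}\]

The μ values returned by the two calls above should be close to these numbers, up to Monte Carlo error.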


CausalTables.draw_counterfactual — Method

draw_counterfactual(scm::StructuralCausalModel, parents::CausalTable, intervention::Function) -> Vector

Generate counterfactual responses based on a given structural causal model (SCM), a table of response parents, and an intervention function. That is, sample the responses that would have occurred had some intervention been applied to the treatment specified by the structural causal model.

Arguments

scm::StructuralCausalModel: The structural causal model used to generate counterfactual outcomes.
parents::CausalTable: A table containing the variables causally preceding the response variable.
intervention::Function: A function that defines the intervention to be applied to the treatment variable(s) in parents. Use cast_matrix_to_table_function to convert a function acting on a treatment vector or matrix to a function that acts on a CausalTable.

Returns

A vector of counterfactual responses.
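
As an illustration only, consider the DGP used in the neighbouring examples, in which $Y \sim \text{Normal}(A + L)$ with unit standard deviation, together with the treat_all intervention. Assuming counterfactual responses are drawn from the conditional distribution of $Y$ given the intervened parents (an assumption based on the description above, not on the package internals), each element of the returned vector would be a draw

\[Y_i(1) \mid L_i \sim \text{Normal}(1 + L_i,\ 1)\]

regardless of the observed value of $A_i$.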


CausalTables.intervene — Method

intervene(ct::CausalTable, intervention::Function)

Applies intervention to the treatment vector(s) within a CausalTable, and outputs a new CausalTable with the intervened treatment.

Arguments

ct::CausalTable: The data whose treatment should be intervened on.
intervention::Function: A function that defines the intervention to be applied to the treatment variable(s). Use cast_matrix_to_table_function to convert a function acting on a treatment vector or matrix to a function that acts on a CausalTable.

Returns

A CausalTable containing the same data as ct, but with the treatment variable(s) modified according to intervention.

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)
ct = rand(scm, 100)
intervene(ct, treat_all)


CausalTables.multiplicative_mtp — Method

multiplicative_mtp(δ)

Constructs a function that scales the treatment variable(s) in a CausalTable object by a constant δ. This function is intended to be used as an argument to ape.

Arguments

δ: The "multiplicative shift" to be applied to the treatment variable of a CausalTable.

Returns

A function that takes a CausalTable object as input and returns a column table of treatments that have been scaled by a factor of δ.

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Normal(L)),
    Y ~ @.(Normal(A + 2 * L + 1))
)
scm = StructuralCausalModel(dgp, :A, :Y)
ape(scm, multiplicative_mtp(2.0))
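
For this particular DGP the target of the call above can be derived by hand. The derivation applies to this example only: the conditional mean of $Y$ is $A + 2L + 1$ and $\mathbb{E}(A) = \mathbb{E}(L) = 1/3$, so scaling $A$ by $\delta = 2$ gives

\[\mathbb{E}\Big(Y(\delta A) - Y(A)\Big) = (\delta - 1)\,\mathbb{E}(A) = (2 - 1) \cdot \frac{1}{3} = \frac{1}{3}\]

The returned μ should therefore be close to 1/3, up to Monte Carlo error.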


CausalTables.set_treatment_value — Method

set_treatment_value(ct::CausalTable, value::Float64)

Sets all treatments present in the data of a CausalTable to a specified value. This function is primarily used for interventions where the treatment value is set to a constant, such as in the case of binary treatments.

Arguments

ct::CausalTable: The causal table object containing treatment information and data.
value::Float64: (Currently unused) A float value intended to represent the treatment value to set.

Returns

NamedTuple: A named tuple mapping each treatment variable to a vector of ones.


CausalTables.treat_all — Method

treat_all(ct::CausalTable)

Intervenes on a CausalTable object by setting all treatment variables to 1.

Arguments

ct::CausalTable: A CausalTable object with a univariate binary treatment.

Returns

A NamedTuple object with the same header as the treatment matrix in ct, where each treatment variable is set to 1.

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)
data = rand(scm, 100)
treat_all(data)


CausalTables.treat_none — Method

treat_none(ct::CausalTable)

Intervenes on a CausalTable object by setting all treatment variables to 0.

Arguments

ct::CausalTable: A CausalTable object with a univariate binary treatment.

Returns

A NamedTuple object with the same header as the treatment matrix in ct, where each treatment variable is set to 0.

Example

using CausalTables, Distributions
dgp = @dgp(
    L ~ Beta(2, 4),
    A ~ @.(Bernoulli(L)),
    Y ~ @.(Normal(A + L))
)
scm = StructuralCausalModel(dgp, :A, :Y)
data = rand(scm, 100)
treat_none(data)
