Difference in Differences with pymc models

Note

This example is in-progress! Further elaboration and explanation will follow soon.

import arviz as az

import causalpy as cp
%load_ext autoreload
%autoreload 2
# %config InlineBackend.figure_format = 'svg'
seed = 42

Load data

df = cp.load_data("did")
df.head()
group t unit post_treatment y
0 0 0.0 0 False 0.897122
1 0 1.0 0 True 1.961214
2 1 0.0 1 False 1.233525
3 1 1.0 1 True 2.752794
4 0 0.0 2 False 1.149207

Run the analysis

Note

The random_seed keyword argument for the PyMC sampler is not neccessary. We use it here so that the results are reproducible.

result = cp.pymc_experiments.DifferenceInDifferences(
    df,
    formula="y ~ 1 + group*post_treatment",
    time_variable_name="t",
    group_variable_name="group",
    model=cp.pymc_models.LinearRegression(sample_kwargs={"random_seed": seed}),
)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta, sigma]
100.00% [8000/8000 00:01<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 1 seconds.
Sampling: [beta, sigma, y_hat]
Sampling: [y_hat]
Sampling: [y_hat]
Sampling: [y_hat]
/Users/benjamv/git/CausalPy/causalpy/pymc_experiments.py:366: FutureWarning: In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
  new_x.iloc[:, i] = 0
Sampling: [y_hat]
fig, ax = result.plot()
../_images/7f0117b9b1aeb6cf64043389111e5259262b91fb02d7204d61895a8c70915a62.png
result.summary()
===========================Difference in Differences============================
Formula: y ~ 1 + group*post_treatment

Results:
Causal impact = 0.50, $CI_{94\%}$[0.40, 0.60]
Model coefficients:
Intercept                     1.08, 94% HDI [1.03, 1.13]
post_treatment[T.True]        0.99, 94% HDI [0.92, 1.06]
group                         0.16, 94% HDI [0.09, 0.23]
group:post_treatment[T.True]  0.50, 94% HDI [0.40, 0.60]
sigma                         0.08, 94% HDI [0.07, 0.10]
ax = az.plot_posterior(result.causal_impact, ref_val=0, round_to=2)
ax.set(title="Posterior estimate of causal impact");
../_images/6b5f2b8901ad25a2be680a206763d7c040b64e5e8f6a4a32891a4902a16f1305.png