# Difference in Differences with `pymc` models

Note

This example is in progress! Further elaboration and explanation will follow soon.

```
import arviz as az
import causalpy as cp
```

```
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = 'retina'
seed = 42
```

## Load data

```
df = cp.load_data("did")
df.head()
```

|  | group | t | unit | post_treatment | y |
|---|---|---|---|---|---|
| 0 | 0 | 0.0 | 0 | False | 0.897122 |
| 1 | 0 | 1.0 | 0 | True | 1.961214 |
| 2 | 1 | 0.0 | 1 | False | 1.233525 |
| 3 | 1 | 1.0 | 1 | True | 2.752794 |
| 4 | 0 | 0.0 | 2 | False | 1.149207 |
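
As a quick sanity check, the classic 2×2 difference-in-differences point estimate can be computed directly from the four group-by-period means. This is a minimal sketch (not part of the original example) that uses only the `df` loaded above:

```
# Mean outcome in each group x period cell
cell_means = df.groupby(["group", "post_treatment"])["y"].mean()

# Classic 2x2 DiD: (treated post - treated pre) - (control post - control pre)
naive_did = (cell_means.loc[(1, True)] - cell_means.loc[(1, False)]) - (
    cell_means.loc[(0, True)] - cell_means.loc[(0, False)]
)
print(f"Naive DiD estimate: {naive_did:.2f}")
```

This should land close to the model's causal impact estimate below; the Bayesian model adds full uncertainty quantification on top.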

## Run the analysis

Note

The `random_seed` keyword argument for the PyMC sampler is not necessary. We use it here so that the results are reproducible.
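
For reference, the formula `y ~ 1 + group*post_treatment` used below expands to the regression

$$
y_i = \beta_0 + \beta_1\,\mathrm{group}_i + \beta_2\,\mathrm{post}_i + \beta_3\,(\mathrm{group}_i \times \mathrm{post}_i) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
$$

where the interaction coefficient $\beta_3$ (reported as `group:post_treatment[T.True]` in the summary) is the difference-in-differences estimate of the causal impact. The Normal likelihood here is our reading of the `LinearRegression` model, consistent with the `sigma` parameter reported below.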

```
result = cp.DifferenceInDifferences(
    df,
    formula="y ~ 1 + group*post_treatment",
    time_variable_name="t",
    group_variable_name="group",
    model=cp.pymc_models.LinearRegression(sample_kwargs={"random_seed": seed}),
)
```

```
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [beta, sigma]
```

```
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 1 seconds.
Sampling: [beta, sigma, y_hat]
Sampling: [y_hat]
Sampling: [y_hat]
Sampling: [y_hat]
Sampling: [y_hat]
```
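
The `Sampling: [beta, sigma, y_hat]` line appears to be a prior predictive pass, and the repeated `Sampling: [y_hat]` lines are posterior predictive passes, one for each set of predictions (treatment, control, and counterfactual) that CausalPy computes for the summary and plots.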

```
result.summary()
```

```
===========================Difference in Differences============================
Formula: y ~ 1 + group*post_treatment
Results:
Causal impact = 0.50, $CI_{94\%}$[0.4, 0.6]
Model coefficients:
Intercept 1.1, 94% HDI [1, 1.1]
post_treatment[T.True] 0.99, 94% HDI [0.92, 1.1]
group 0.16, 94% HDI [0.094, 0.23]
group:post_treatment[T.True] 0.5, 94% HDI [0.4, 0.6]
sigma 0.082, 94% HDI [0.066, 0.1]
```
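
Since `arviz` was imported at the top, the posterior behind these point summaries can be inspected directly. A minimal sketch, assuming the fitted `InferenceData` is exposed on the model as `result.model.idata` (the attribute layout may differ between CausalPy versions):

```
# Assumption: the PyMC InferenceData is stored at result.model.idata;
# check your CausalPy version if the attribute differs.
az.summary(result.model.idata, var_names=["beta", "sigma"])
```

Here `beta` is the coefficient vector and `sigma` the observation noise, matching the variables listed in the NUTS output above.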