`causalpy.pymc_models`

Defines generic PyMC ModelBuilder class and subclasses for

WeightedSumFitter model for Synthetic Control experiments
LinearRegression model

Models are intended to be used from inside an experiment class (see PyMC experiments). This is why the examples require some extra manipulation input data, often to ensure y has the correct shape.

class causalpy.pymc_models.InstrumentalVariableRegression

Custom PyMC model for instrumental linear regression

Example

>>> import causalpy as cp
>>> import numpy as np
>>> from causalpy.pymc_models import InstrumentalVariableRegression
>>> N = 10
>>> e1 = np.random.normal(0, 3, N)
>>> e2 = np.random.normal(0, 1, N)
>>> Z = np.random.uniform(0, 1, N)
>>> ## Ensure the endogeneity of the the treatment variable
>>> X = -1 + 4 * Z + e2 + 2 * e1
>>> y = 2 + 3 * X + 3 * e1
>>> t = X.reshape(10,1)
>>> y = y.reshape(10,1)
>>> Z = np.asarray([[1, Z[i]] for i in range(0,10)])
>>> X = np.asarray([[1, X[i]] for i in range(0,10)])
>>> COORDS = {'instruments': ['Intercept', 'Z'], 'covariates': ['Intercept', 'X']}
>>> sample_kwargs = {
...     "tune": 5,
...     "draws": 10,
...     "chains": 2,
...     "cores": 2,
...     "target_accept": 0.95,
...     "progressbar": False,
... }
>>> iv_reg = InstrumentalVariableRegression(sample_kwargs=sample_kwargs)
>>> iv_reg.fit(X, Z,y, t, COORDS, {
...                  "mus": [[-2,4], [0.5, 3]],
...                  "sigmas": [1, 1],
...                  "eta": 2,
...                  "lkj_sd": 2,
...              })
Inference data...

build_model(X, Z, y, t, coords, priors)

Specify model with treatment regression and focal regression data and priors

Parameters:

X – A pandas dataframe used to predict our outcome y
Z – A pandas dataframe used to predict our treatment variable t
y – An array of values representing our focal outcome y
t – An array of values representing the treatment t of which we’re interested in estimating the causal impact
coords – A dictionary with the coordinate names for our instruments and covariates
priors – An optional dictionary of priors for the mus and sigmas of both regressions priors = {"mus": [0, 0], "sigmas": [1, 1], "eta": 2, "lkj_sd": 2}

fit(X, Z, y, t, coords, priors): Draw samples from posterior, prior predictive, and posterior predictive distributions.

class causalpy.pymc_models.LinearRegression

Custom PyMC model for linear regression

Defines the PyMC model

\[ \begin{align}\begin{aligned}\beta &\sim \mathrm{Normal}(0, 50)\\\sigma &\sim \mathrm{HalfNormal}(1)\\\mu &= X * \beta\\y &\sim \mathrm{Normal}(\mu, \sigma)\end{aligned}\end{align} \]

Example

>>> import causalpy as cp
>>> import numpy as np
>>> from causalpy.pymc_models import LinearRegression
>>> rd = cp.load_data("rd")
>>> X = rd[["x", "treated"]]
>>> y = np.asarray(rd["y"]).reshape((rd["y"].shape[0],1))
>>> lr = LinearRegression(sample_kwargs={"progressbar": False})
>>> lr.fit(X, y, coords={
...                 'coeffs': ['x', 'treated'],
...                 'obs_indx': np.arange(rd.shape[0])
...                },
... )
Inference...

build_model(X, y, coords): Defines the PyMC model

class causalpy.pymc_models.ModelBuilder

This is a wrapper around pm.Model to give scikit-learn like API.

Public Methods

build_model: must be implemented by subclasses
fit: populates idata attribute
predict: returns predictions on new data
score: returns Bayesian \(R^2\)

Example

>>> import causalpy as cp
>>> import numpy as np
>>> import pymc as pm
>>> from causalpy.pymc_models import ModelBuilder
>>> class MyToyModel(ModelBuilder):
...     def build_model(self, X, y, coords):
...         with self:
...             X_ = pm.Data(name="X", value=X)
...             y_ = pm.Data(name="y", value=y)
...             beta = pm.Normal("beta", mu=0, sigma=1, shape=X_.shape[1])
...             sigma = pm.HalfNormal("sigma", sigma=1)
...             mu = pm.Deterministic("mu", pm.math.dot(X_, beta))
...             pm.Normal("y_hat", mu=mu, sigma=sigma, observed=y_)
>>> rng = np.random.default_rng(seed=42)
>>> X = rng.normal(loc=0, scale=1, size=(20, 2))
>>> y = rng.normal(loc=0, scale=1, size=(20,))
>>> model = MyToyModel(
...             sample_kwargs={
...                 "chains": 2,
...                 "draws": 2000,
...                 "progressbar": False,
...                 "random_seed": rng,
...             }
... )
>>> model.fit(X, y)
Inference...
>>> X_new = rng.normal(loc=0, scale=1, size=(20,2))
>>> model.predict(X_new)
Inference...
>>> model.score(X, y) 
r2        0.3
r2_std    0.0
dtype: float64

__init__(sample_kwargs=None)

Parameters:: sample_kwargs (Optional[Dict[str, Any]]) – A dictionary of kwargs that get unpacked and passed to the pymc.sample() function. Defaults to an empty dictionary.

build_model(X, y, coords)

Build the model, must be implemented by subclass.

Return type:: None

fit(X, y, coords=None)

Draw samples fromposterior, prior predictive, and posterior predictive distributions, placing them in the model’s idata attribute.

Return type:: None
Parameters:: coords (Dict[str, Any] | None)

predict(X): Predict data given input data X

Caution

Results in KeyError if model hasn’t been fit.

score(X, y)

Score the Bayesian \(R^2\) given inputs X and outputs y. :rtype: Series

Caution

The Bayesian \(R^2\) is not the same as the traditional coefficient of determination, https://en.wikipedia.org/wiki/Coefficient_of_determination.

Return type:: Series

class causalpy.pymc_models.WeightedSumFitter

Used for synthetic control experiments

Note

Generally, the .fit() method should be used rather than calling .build_model() directly.

Defines the PyMC model:

\[ \begin{align}\begin{aligned}\sigma &\sim \mathrm{HalfNormal}(1)\\\beta &\sim \mathrm{Dirichlet}(1,...,1)\\\mu &= X * \beta\\y &\sim \mathrm{Normal}(\mu, \sigma)\end{aligned}\end{align} \]

Example

>>> import causalpy as cp
>>> import numpy as np
>>> from causalpy.pymc_models import WeightedSumFitter
>>> sc = cp.load_data("sc")
>>> X = sc[['a', 'b', 'c', 'd', 'e', 'f', 'g']]
>>> y = np.asarray(sc['actual']).reshape((sc.shape[0], 1))
>>> wsf = WeightedSumFitter(sample_kwargs={"progressbar": False})
>>> wsf.fit(X,y)
Inference ...

build_model(X, y, coords): Defines the PyMC model

causalpy.pymc_models

Example

Example

Public Methods

Example

Example

`causalpy.pymc_models`