Comparing changes

base repository: amazon-science/causal-validation
base: v0.0.5
head repository: amazon-science/causal-validation
compare: main
  • 15 commits
  • 35 files changed
  • 5 contributors

Commits on Sep 6, 2024

  1. Docs (#7)

    * Add docs workflow
    
    * Add docs workflow
    
    * Add docs workflow
    
    * Update Index
    
    * Add build workflow
    
    ---------
    
    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 6, 2024

    Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    8c35c92
  2. Change gh-pages deploy (#8)

    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 6, 2024

    3e5b634
  3. Gh pages deploy (#9)

    * Change gh-pages deploy
    
    * Alter workflow
    
    ---------
    
    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 6, 2024

    175bb0f
  4. Build nbs (#10)

    * Build nbs
    
    * Build nbs
    
    * Build nbs
    
    * Build nbs
    
    ---------
    
    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 6, 2024

    5c58a9b
  5. Fix build (#11)

    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 6, 2024

    fc7503a

Commits on Sep 9, 2024

  1. Add installation (#12)

    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 9, 2024

    1cbb1e1
  2. Styling (#13)

    * Fix docs
    
    * Fix docs
    
    * Fix docs
    
    ---------
    
    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 9, 2024

    7507996
  3. Fix docs (#14)

    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 9, 2024

    167efb4

Commits on Sep 10, 2024

  1. Multiple Datasets Placebo Testing (#15)

    * This PR adds support for multiple datasets to be passed to the
    `PlaceboTesting` object.
    
    * Bump to v0.0.7
    
    * Fix ruff issues
    
    ---------
    
    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Sep 10, 2024

    2657075

Commits on Sep 13, 2024

  1. Minor change in README to fix guidance for developers (#18)

    semihakbayrak authored Sep 13, 2024

    cf8f19a

Commits on Sep 18, 2024

  1. Noise transform (#19)

    * Add noise transformation that apply perturbations on treatment
    
    * Formatting
    
    * Add docstring
    
    * Fix linting
    
    * Add tests to check perturbation impact and randomness over timepoints
    semihakbayrak authored Sep 18, 2024

    ae659c1
  2. bump version (#20)

    thomaspinder authored Sep 18, 2024

    8b44c71

Commits on Sep 24, 2024

  1. Rmspe test stat (#24)

    * RMSPE WIP
    
    * Rmspe test stat (#22)
    
    * Minor change in README to fix guidance for developers (#18)
    
    * Noise transform (#19)
    
    * Add noise transformation that apply perturbations on treatment
    
    * Formatting
    
    * Add docstring
    
    * Fix linting
    
    * Add tests to check perturbation impact and randomness over timepoints
    
    * bump version (#20)
    
    * Initial implementation of RMSPE
    
    * Add TestResultFrame parent class for test results
    
    * Add test for RMSPE
    
    * Add doc string
    
    * Fix linting
    
    * Update src/causal_validation/validation/rmspe.py
    
    Co-authored-by: Thomas Pinder <[email protected]>
    
    * Fix typo
    
    ---------
    
    Co-authored-by: Thomas Pinder <[email protected]>
    
    ---------
    
    Co-authored-by: Thomas Pinder <[email protected]>
    Co-authored-by: Semih Akbayrak <[email protected]>
    3 people authored Sep 24, 2024

    8cfdcfa

Commits on Oct 23, 2024

  1. Change noise field to default_factory (#26)

    Co-authored-by: Thomas Pinder <[email protected]>
    thomaspinder and Thomas Pinder authored Oct 23, 2024

    311dbd4

Commits on Oct 24, 2024

  1. Small doc fixes (#25)

    B-Deforce authored Oct 24, 2024

    7b4fb63
Showing with 1,954 additions and 464 deletions.
  1. +45 −0 .github/workflows/docs.yml
  2. +35 −0 .github/workflows/test_docs.yml
  3. +2 −1 .github/workflows/tests.yml
  4. +2 −1 .gitignore
  5. +21 −0 .pre-commit-config.yaml
  6. +11 −5 README.md
  7. +205 −0 docs/examples/azcausal.ipynb
  8. +338 −0 docs/examples/basic.ipynb
  9. +237 −0 docs/examples/placebo_test.ipynb
  10. +58 −0 docs/index.md
  11. +42 −0 docs/installation.md
  12. +19 −0 docs/javascripts/mathjax.js
  13. BIN docs/static/imgs/readme_fig.png
  14. +5 −0 docs/stylesheets/extra.css
  15. +0 −113 examples/azcausal.pct.py
  16. +0 −169 examples/basic.pct.py
  17. +0 −113 examples/placebo_test.pct.py
  18. +70 −0 mkdocs.yml
  19. +25 −4 pyproject.toml
  20. +1 −1 src/causal_validation/__about__.py
  21. +50 −3 src/causal_validation/data.py
  22. +40 −2 src/causal_validation/models.py
  23. +2 −1 src/causal_validation/transforms/__init__.py
  24. +6 −2 src/causal_validation/transforms/base.py
  25. +32 −0 src/causal_validation/transforms/noise.py
  26. +2 −0 src/causal_validation/types.py
  27. +68 −39 src/causal_validation/validation/placebo.py
  28. +133 −0 src/causal_validation/validation/rmspe.py
  29. +107 −0 src/causal_validation/validation/testing.py
  30. +0 −1 static/fig_creation.py
  31. +63 −0 tests/test_causal_validation/test_data.py
  32. +12 −4 tests/test_causal_validation/test_models.py
  33. +127 −0 tests/test_causal_validation/test_transforms/test_noise.py
  34. +27 −5 tests/test_causal_validation/test_validation/test_placebo.py
  35. +169 −0 tests/test_causal_validation/test_validation/test_rmspe.py
45 changes: 45 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,45 @@
name: Build Documentation

on:
  push:
    branches:
      - main
    tags:
      - "**"
  workflow_dispatch:

permissions:
  contents: write

jobs:
  build-docs:
    concurrency: ci-${{ github.ref }}
    name: Build docs
    runs-on: "ubuntu-latest"
    defaults:
      run:
        shell: bash -l {0}

    steps:
      # Grab the latest commit from the branch
      - name: Checkout the branch
        uses: actions/checkout@v3.5.2
        with:
          persist-credentials: false

      - name: Set up Python 3.11
        uses: actions/setup-python@v4
        with:
          python-version: 3.11

      - name: Install Hatch
        uses: pypa/hatch@install

      - name: Build and deploy the documentation
        run: hatch run docs:deploy

      - name: Deploy Page 🚀
        uses: JamesIves/github-pages-deploy-action@v4.4.1
        with:
          branch: gh-pages
          folder: site
35 changes: 35 additions & 0 deletions .github/workflows/test_docs.yml
@@ -0,0 +1,35 @@
name: Test Documentation

on:
  pull_request:
  workflow_dispatch:

permissions:
  contents: write

jobs:
  test-docs:
    concurrency: ci-${{ github.ref }}
    name: Test docs
    runs-on: "ubuntu-latest"
    defaults:
      run:
        shell: bash -l {0}

    steps:
      # Grab the latest commit from the branch
      - name: Checkout the branch
        uses: actions/checkout@v3.5.2
        with:
          persist-credentials: false

      - name: Set up Python 3.11
        uses: actions/setup-python@v4
        with:
          python-version: 3.11

      - name: Install Hatch
        uses: pypa/hatch@install

      - name: Build the documentation
        run: hatch run docs:build
3 changes: 2 additions & 1 deletion .github/workflows/tests.yml
@@ -13,13 +13,14 @@ jobs:
       matrix:
         # Select the Python versions to test against
         os: ["ubuntu-latest", "macos-latest"]
-        python-version: ["3.10", "3.11"]
+        python-version: ["3.10", "3.11", "3.12"]
       fail-fast: true
     steps:
       - name: Check out the code
         uses: actions/checkout@v3.5.2
         with:
           fetch-depth: 1
+
       - name: Set up Python ${{ matrix.python-version }}
         uses: actions/setup-python@v4
         with:
3 changes: 2 additions & 1 deletion .gitignore
@@ -147,4 +147,5 @@ scratch_nbs/
 .DS_store
 package.json
 package-lock.json
-node_modules/
+node_modules/
+docs/_examples
21 changes: 21 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,21 @@
repos:
  # python code formatting
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black
        args: ["--config", "pyproject.toml"]

  # python import sorting
  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0
    hooks:
      - id: isort
        args: ["--settings-path", "pyproject.toml"]

  # remove notebook cell output
  - repo: https://github.com/kynan/nbstripout
    rev: 0.7.1
    hooks:
      - id: nbstripout
        files: ".ipynb"
16 changes: 11 additions & 5 deletions README.md
@@ -1,6 +1,9 @@
 # Causal Validation

-This package provides functionality to define your own causal data generation process and then simulate data from the process. Within the package, there is functionality to include complex components to your process, such as periodic and temporal trends, and all of these operations are fully composable with one another.
+This package provides functionality to define your own causal data generation process
+and then simulate data from the process. Within the package, there is functionality to
+include complex components to your process, such as periodic and temporal trends, and
+all of these operations are fully composable with one another.

 A short example is given below
 ```python
@@ -38,9 +41,12 @@ plot(inflated_data)

 ## Examples

-To supplement the above example, we have two more detailed notebooks which exhaustively present and explain the functionalty in this package, along with how the generated data may be integrated with [AZCausal](https://github.com/amazon-science/azcausal).
-1. [Basic notebook](): We here show the full range of available functions for data generation
-2. [AZCausal notebook](): We here show how the generated data may be used within an AZCausal model.
+To supplement the above example, we have two more detailed notebooks which exhaustively
+present and explain the functionalty in this package, along with how the generated data
+may be integrated with [AZCausal](https://github.com/amazon-science/azcausal).
+1. [Data Synthesis](https://amazon-science.github.io/causal-validation/examples/basic/): We here show the full range of available functions for data generation.
+2. [Placebo testing](https://amazon-science.github.io/causal-validation/examples/placebo_test/): Validate your model(s) using placebo tests.
+3. [AZCausal notebook](https://amazon-science.github.io/causal-validation/examples/azcausal/): We here show how the generated data may be used within an AZCausal model.

 ## Installation

@@ -70,4 +76,4 @@ in your terminal.
 1. Follow steps 1-3 from `For Users`
 2. Create a hatch environment `hatch env create`
 3. Open a hatch shell `hatch shell`
-4. Validate your installation by running `hatch run tests:test`
+4. Validate your installation by running `hatch run dev:test`
205 changes: 205 additions & 0 deletions docs/examples/azcausal.ipynb
@@ -0,0 +1,205 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0",
"metadata": {},
"source": [
"# AZCausal Integration\n",
"\n",
"Amazon's [AZCausal](https://github.com/amazon-science/azcausal) library provides the\n",
"functionality to fit synthetic control and difference-in-difference models to your\n",
"data. Integrating the synthetic data generating process of `causal_validation` with\n",
"AZCausal is trivial, as we show in this notebook. To start, we'll simulate a toy\n",
"dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1",
"metadata": {},
"outputs": [],
"source": [
"from azcausal.estimators.panel.sdid import SDID\n",
"import scipy.stats as st\n",
"\n",
"from causal_validation import (\n",
" Config,\n",
" simulate,\n",
")\n",
"from causal_validation.effects import StaticEffect\n",
"from causal_validation.plotters import plot\n",
"from causal_validation.transforms import (\n",
" Periodic,\n",
" Trend,\n",
")\n",
"from causal_validation.transforms.parameter import UnitVaryingParameter"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2",
"metadata": {},
"outputs": [],
"source": [
"cfg = Config(\n",
" n_control_units=10,\n",
" n_pre_intervention_timepoints=60,\n",
" n_post_intervention_timepoints=30,\n",
" seed=123,\n",
")\n",
"\n",
"linear_trend = Trend(degree=1, coefficient=0.05)\n",
"data = linear_trend(simulate(cfg))\n",
"ax = plot(data)"
]
},
{
"cell_type": "markdown",
"id": "3",
"metadata": {
"title": "We'll now simulate a 5% lift in the treatment group's observations. This"
},
"source": [
"We'll now simulate a 5% lift in the treatment group's observations. This\n",
"will inflate the treated group's observations in the post-intervention window."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4",
"metadata": {},
"outputs": [],
"source": [
"TRUE_EFFECT = 0.05\n",
"effect = StaticEffect(effect=TRUE_EFFECT)\n",
"inflated_data = effect(data)\n",
"ax = plot(inflated_data)"
]
},
{
"cell_type": "markdown",
"id": "5",
"metadata": {},
"source": [
"## Fitting a model\n",
"\n",
"We now have a simple toy dataset to which we may apply a model. For this demonstration\n",
"we shall use the Synthetic Difference-in-Differences model implemented in AZCausal;\n",
"however, the approach shown here will work for any model implemented in AZCausal. To\n",
"achieve this, we must first coerce the data into a format that is digestible for\n",
"AZCausal. Through the `.to_azcausal()` method implemented here, this is\n",
"straightforward to achieve. Once we have an AZCausal-compatible dataset, the modelling\n",
"is very simple by virtue of the clean design of AZCausal."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6",
"metadata": {},
"outputs": [],
"source": [
"panel = inflated_data.to_azcausal()\n",
"model = SDID()\n",
"result = model.fit(panel)\n",
"print(f\"Delta: {TRUE_EFFECT - result.effect.percentage().value / 100}\")\n",
"print(result.summary(title=\"Synthetic Data Experiment\"))"
]
},
{
"cell_type": "markdown",
"id": "7",
"metadata": {
"title": "We see that SDID has done an excellent job of estimating the treatment"
},
"source": [
"We see that SDID has done an excellent job of estimating the treatment effect. However, given the simplicity of the data, this is not surprising. With the\n",
"functionality within this package, though, we can easily construct more complex datasets\n",
"in an effort to fully stress-test any new model and identify its limitations.\n",
"\n",
"To achieve this, we'll simulate 10 control units, 60 pre-intervention time points, and\n",
"30 post-intervention time points according to the following process: \n",
"\n",
"$$ \\begin{align}\n",
"\\mu_{n, t} & \\sim\\mathcal{N}(20, 0.5^2)\\\\\n",
"\\alpha_{n} & \\sim \\mathcal{N}(0, 1^2)\\\\\n",
"\\beta_{n} & \\sim \\mathcal{N}(0.05, 0.01^2)\\\\\n",
"\\nu_n & \\sim \\mathcal{N}(1, 1^2)\\\\\n",
"\\gamma_n & \\sim \\operatorname{Student-t}_{10}(1, 1^2)\\\\\n",
"\\mathbf{Y}_{n, t} & = \\mu_{n, t} + \\alpha_{n} + \\beta_{n}t + \\nu_n\\sin\\left(3\\times\n",
"2\\pi t + \\gamma_n\\right) + \\delta_{t, n} \\end{align} $$ \n",
"\n",
"where the true treatment effect\n",
"$\\delta_{t, n}$ is 5% when $n=1$ and $t\\geq 60$ and 0 otherwise. Meanwhile,\n",
"$\\mathbf{Y}$ is the matrix of observations, long in the number of time points and wide\n",
"in the number of units."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8",
"metadata": {},
"outputs": [],
"source": [
"cfg = Config(\n",
" n_control_units=10,\n",
" n_pre_intervention_timepoints=60,\n",
" n_post_intervention_timepoints=30,\n",
" global_mean=20,\n",
" global_scale=1,\n",
" seed=123,\n",
")\n",
"\n",
"intercept = UnitVaryingParameter(sampling_dist=st.norm(loc=0.0, scale=1))\n",
"coefficient = UnitVaryingParameter(sampling_dist=st.norm(loc=0.05, scale=0.01))\n",
"linear_trend = Trend(degree=1, coefficient=coefficient, intercept=intercept)\n",
"\n",
"amplitude = UnitVaryingParameter(sampling_dist=st.norm(loc=1.0, scale=2))\n",
"shift = UnitVaryingParameter(sampling_dist=st.t(df=10))\n",
"periodic = Periodic(amplitude=amplitude, shift=shift, frequency=3)\n",
"\n",
"data = effect(periodic(linear_trend(simulate(cfg))))\n",
"ax = plot(data)"
]
},
{
"cell_type": "markdown",
"id": "9",
"metadata": {},
"source": [
"As before, we may now go about estimating the treatment effect. However, this time we see that the delta between the estimated and true effect is much larger than\n",
"before."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "10",
"metadata": {},
"outputs": [],
"source": [
"panel = data.to_azcausal()\n",
"model = SDID()\n",
"result = model.fit(panel)\n",
"print(f\"Delta: {100*(TRUE_EFFECT - result.effect.percentage().value / 100): .2f}%\")\n",
"print(result.summary(title=\"Synthetic Data Experiment\"))"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "title,-all",
"main_language": "python",
"notebook_metadata_filter": "-all"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
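
The generative process written out in the notebook's markdown cell can also be sketched directly in NumPy, without `causal_validation`. This is a minimal sketch under stated assumptions, not the package's implementation: `simulate_panel` and its arguments are hypothetical names, the treated unit is placed in column 0, the 5% effect is applied as a lift proportional to the untreated outcome, and time is rescaled to [0, 1) inside the sine (with an integer `t`, `sin(3 * 2*pi*t + gamma)` would be constant). The distribution parameters follow the equations as written, which differ slightly from the values passed in the notebook's code cell.

```python
import numpy as np


def simulate_panel(n_control=10, n_pre=60, n_post=30, effect=0.05, seed=123):
    """Sample one (time x unit) panel from the process in the notebook text.

    Hypothetical helper for illustration only; not part of causal_validation.
    """
    rng = np.random.default_rng(seed)
    n_units = n_control + 1            # assumption: column 0 is the treated unit
    n_time = n_pre + n_post
    t = np.arange(n_time)[:, None]     # integer time index, shape (T, 1)
    u = t / n_time                     # assumption: rescaled time for the sine

    mu = rng.normal(20.0, 0.5, size=(n_time, n_units))  # mu_{n,t}
    alpha = rng.normal(0.0, 1.0, size=n_units)          # unit intercepts
    beta = rng.normal(0.05, 0.01, size=n_units)         # unit slopes
    nu = rng.normal(1.0, 1.0, size=n_units)             # amplitudes
    gamma = 1.0 + rng.standard_t(df=10, size=n_units)   # phase shifts

    # Untreated outcomes: noise + intercept + linear trend + periodic term.
    y = mu + alpha + beta * t + nu * np.sin(3 * 2 * np.pi * u + gamma)

    # Treatment effect: nonzero only for the treated unit after the intervention.
    delta = np.zeros_like(y)
    delta[n_pre:, 0] = effect * y[n_pre:, 0]
    return y + delta, delta


observed, delta = simulate_panel()
print(observed.shape)  # (90, 11)
```

Returning `delta` alongside the observations keeps the ground truth available, so an estimate from any model can be compared against it in the same way the notebook computes its `Delta`.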