
expdpy is a Python library for interactive, exploratory analysis of panel and cross-sectional data. It pairs a set of composable analytical functions, which return interactive Plotly figures and publication-quality Great Tables, with two ExPdPy web apps that bring the same workflow to the browser, no code required.
It is built on the modern Python data and econometrics stack:
- Plotly — interactive figures
- pyfixest — fixed-effects / clustered regressions
- Great Tables — publication-quality tables
- Streamlit and Shiny for Python — the two no-code ExPdPy apps
Features
Composable analytical functions
Descriptive, correlation and extreme-observation tables; histograms and category bar charts; time trends and quantile trends; by-group bar, violin and trend views; scatter plots with an optional LOESS smoother; and a missing-value heatmap across the panel. Each function takes a pandas DataFrame and returns an interactive Plotly figure or a Great Tables object you can drop straight into a notebook or report.
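As a rough illustration of the kind of table that sits behind a missing-value heatmap, the sketch below computes the share of missing observations per period and variable with plain pandas. It does not use expdpy; the toy `panel` frame and the `missing_share` name are assumptions made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: two countries observed in two years.
panel = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B"],
        "year": [2015, 2016, 2015, 2016],
        "gdp": [1.0, np.nan, 3.0, 4.0],
        "gini": [np.nan, np.nan, 0.3, 0.4],
    }
)

# Share of missing observations per year (rows) and variable (columns).
missing_share = (
    panel.drop(columns="country")
    .set_index("year")
    .isna()
    .groupby(level="year")
    .mean()
)
print(missing_share)
```

Each cell is a fraction between 0 and 1, which maps naturally onto a heatmap color scale.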
Panel-aware econometrics
OLS with multi-way fixed effects and clustered standard errors via pyfixest, publication-ready coefficient tables, and Frisch–Waugh–Lovell partial-regression plots that show a single coefficient net of all controls and fixed effects. Winsorize or truncate outliers with treat_outliers.
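The idea behind a Frisch–Waugh–Lovell partial-regression plot can be checked numerically. The sketch below uses only NumPy on simulated data, not expdpy or pyfixest; the variable names and the `ols` helper are assumptions for illustration. It shows that regressing the residualized outcome on the residualized regressor reproduces the coefficient from the full regression. Fixed effects would enter the same way, as additional dummy columns in `others`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
control = rng.normal(size=n)
x = 0.5 * control + rng.normal(size=n)          # regressor correlated with the control
y = 2.0 * x - 1.0 * control + rng.normal(size=n)


def ols(design, target):
    """Least-squares coefficients of target on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


const = np.ones(n)

# Full regression: y on an intercept, x and the control.
full = ols(np.column_stack([const, x, control]), y)

# FWL: residualize y and x on the other regressors,
# then regress the y residuals on the x residuals.
others = np.column_stack([const, control])
y_res = y - others @ ols(others, y)
x_res = x - others @ ols(others, x)
partial_slope = ols(x_res[:, None], y_res)[0]

# The two estimates of the coefficient on x coincide.
print(full[1], partial_slope)
```

A scatter of `y_res` against `x_res` is the partial-regression plot: its fitted slope is the coefficient on `x` net of the other regressors.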
Two no-code apps — Streamlit & Shiny
The same exploration workflow in two frontends: a sidebar sample pipeline (subset filters, outlier treatment), component selection and ordering, and live, point-and-click analysis. The Streamlit app organises the components into a multipage layout with native, sortable tables and deploys to Streamlit Community Cloud in one click; the Shiny app stacks every component in one scrolling view. See Streamlit vs Shiny for a side-by-side comparison.
Reproducibility & safety
Any in-app exploration exports to a runnable bundle (a Jupyter notebook, a .py script and the prepared data as parquet) that recreates every displayed result with expdpy calls. Analysis configurations can be saved, loaded and exchanged between the two apps. New variables can be defined live through a restricted-AST expression evaluator (never eval/exec) with panel-aware lag/lead functions that shift within each cross-sectional unit.
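To make the restricted-AST idea concrete, here is a minimal sketch of how such an evaluator could work. It is not expdpy's implementation: `safe_eval`, its parameters and the whitelist of operators are assumptions for illustration. The expression is parsed with the standard `ast` module, only whitelisted node types are evaluated, and `lag`/`lead` shift within each cross-sectional unit via a pandas groupby.

```python
import ast
import operator

import pandas as pd

# Whitelisted operators; any other syntax is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expr, df, unit_col, time_col):
    """Evaluate an arithmetic expression over df columns without eval/exec."""
    ordered = df.sort_values([unit_col, time_col])

    def shift_within_unit(series, periods):
        # Shift inside each unit so values never leak across units.
        return series.groupby(ordered[unit_col]).shift(periods)

    funcs = {
        "lag": lambda s, n=1: shift_within_unit(s, n),
        "lead": lambda s, n=1: shift_within_unit(s, -n),
    }

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in ordered.columns:
            return ordered[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in funcs
            and not node.keywords
        ):
            return funcs[node.func.id](*[walk(arg) for arg in node.args])
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    result = walk(ast.parse(expr, mode="eval"))
    if not isinstance(result, pd.Series):
        # A constant expression: broadcast it to every row.
        result = pd.Series(result, index=df.index)
    return result.reindex(df.index)  # restore the caller's row order


# Hypothetical toy panel for the usage example.
panel = pd.DataFrame(
    {
        "country": ["A", "A", "A", "B", "B", "B"],
        "year": [2015, 2016, 2017, 2015, 2016, 2017],
        "gdp": [1.0, 2.0, 4.0, 10.0, 20.0, 40.0],
    }
)
growth = safe_eval("gdp / lag(gdp) - 1", panel, "country", "year")
print(growth)
```

In this toy panel the first year of each country is missing, because the lag does not reach back into the previous country's rows. Attribute access, imports and calls to any function outside the whitelist raise `ValueError` instead of being executed.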
Bundled panel datasets
expdpy.data ships ready-to-explore panels: kuznets (the flagship N-shaped Kuznets-curve demo) and gapminder. kuznets also comes with a preset configuration that opens an app directly on the worked example. See the kuznets dataset page for the data dictionary.
Installation
The package is not on PyPI yet — install the latest version straight from GitHub:
# Core analytical functions:
pip install "git+https://github.com/cmg777/expdpy.git"
# ...with the interactive ExPdPy app (Streamlit):
pip install "expdpy[streamlit] @ git+https://github.com/cmg777/expdpy.git"
# ...with the interactive ExPdPy app (Shiny):
pip install "expdpy[app] @ git+https://github.com/cmg777/expdpy.git"
Using uv:
uv pip install "git+https://github.com/cmg777/expdpy.git"
uv pip install "expdpy[streamlit] @ git+https://github.com/cmg777/expdpy.git" # Streamlit
uv pip install "expdpy[app] @ git+https://github.com/cmg777/expdpy.git" # Shiny
Pin to a branch, tag, or commit for reproducible installs:
pip install "git+https://github.com/cmg777/expdpy.git@main"
# pip install "git+https://github.com/cmg777/expdpy.git@v0.1.0" # once a release is tagged
Coming soon (PyPI): once published, pip install expdpy / pip install "expdpy[streamlit]" / pip install "expdpy[app]" will work directly.
At a glance
The lead example throughout these docs is the bundled kuznets panel (80 countries × 2015–2025): a synthetic dataset, rich in control variables, whose regional inequality traces an N-shaped Kuznets curve in income — it rises, falls, then rises again at very high income.
import expdpy as ex
from expdpy.data import load_kuznets
df = load_kuznets()
# The N-shaped regional Kuznets curve: regional inequality vs (log) GDP per capita
ex.prepare_scatter_plot(
    df, x="log_gdp_pc", y="gini_regional", color="continent", size="population", loess=1
)
Launch the same data in the interactive app, pre-configured to open on the curve:
from expdpy.streamlit_app import ExPdPy # or: from expdpy.app import ExPdPy
from expdpy.data import load_kuznets, load_kuznets_data_def, get_config
ExPdPy(load_kuznets(), df_def=load_kuznets_data_def(), config_list=get_config("kuznets"))
Head to the Quickstart to see every function in action, the kuznets dataset page for the data dictionary, or the Streamlit / Shiny guides to launch the interactive apps.
Acknowledgement
expdpy began as a Python port of the excellent ExPanDaR R package by Joachim Gassen and the TRR 266 Accounting for Transparency project, and its foundations remain deeply inspired by that work. Over time, expdpy has grown to include more functionality, and it will keep evolving.
We are grateful to the ExPanDaR authors. Please cite the original work when using expdpy in research (see CITATION.cff).