prepare_ext_obs_table
prepare_ext_obs_table(df, n=5, cs_id=None, ts_id=None, var=None, *, digits=3)
Display the top and bottom n observations sorted by var.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| df | pd.DataFrame | Data frame. | required |
| n | int | Number of extreme observations on each side (2 * n <= len(df)). | 5 |
| cs_id | Sequence[str] \| str \| None | Cross-sectional identifier column(s). If both cs_id and ts_id are omitted, all variables are tabulated; otherwise only the identifiers and var. | None |
| ts_id | str \| None | Time-series identifier column. | None |
| var | str \| None | Variable to sort by. Defaults to the last numeric column that is not an identifier. | None |
| digits | int | Number of decimals for numeric cells. | 3 |
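The default for var and the constraint on n can be illustrated with a short pandas sketch. This is only an illustration of the rules stated in the table, not the library's code: the helpers `resolve_sort_var` and `check_n` are hypothetical names, and the error messages are assumptions.

```python
import pandas as pd


def resolve_sort_var(df, cs_id=None, ts_id=None, var=None):
    """Hypothetical helper: choose the sort variable as the table describes."""
    if var is not None:
        return var
    # Collect identifier columns into one flat list.
    if cs_id is None:
        ids = []
    elif isinstance(cs_id, str):
        ids = [cs_id]
    else:
        ids = list(cs_id)
    if ts_id is not None:
        ids.append(ts_id)
    # Last numeric column that is not an identifier.
    numeric = df.select_dtypes(include="number").columns
    candidates = [c for c in numeric if c not in ids]
    if not candidates:
        raise ValueError("no numeric non-identifier column to sort by")
    return candidates[-1]


def check_n(df, n):
    """Hypothetical helper: enforce 2 * n <= len(df)."""
    if 2 * n > len(df):
        raise ValueError(f"2 * n = {2 * n} exceeds the {len(df)} rows available")


demo = pd.DataFrame(
    {"country": ["A", "B", "C", "D"],
     "gini": [0.31, 0.42, 0.28, 0.39],
     "year": [2000, 2000, 2001, 2001]}
)
default_with_ids = resolve_sort_var(demo, cs_id="country", ts_id="year")
default_without_ids = resolve_sort_var(demo)
check_n(demo, 2)  # passes: 2 * 2 <= 4
```

Under these assumptions, declaring `year` as the time-series identifier removes it from the candidates, so the default sort variable becomes `gini`; without identifiers, the integer `year` column counts as the last numeric column.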
Returns
| Name | Type | Description |
|---|---|---|
| | ExtObsTableResult | df (the 2 * n extreme rows) and gt (a Great Tables object with a separator row between the top and bottom blocks). |
Examples
Basic: the five highest and lowest observations (sorted by the last numeric column), tabulating all variables:
import expdpy as ex
from expdpy.data import load_kuznets
df = load_kuznets()
ex.prepare_ext_obs_table(df, n=5).gt
Advanced: the ten highest and ten lowest observations of a chosen variable, showing only the panel identifiers and that variable:
ex.prepare_ext_obs_table(
df, n=10, cs_id=["country"], ts_id="year", var="gini_regional"
).gt
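To get a feel for which rows land in the df part of the result, the selection can be approximated with plain pandas. This is a rough sketch under stated assumptions, not the package's implementation: `extreme_rows` is a hypothetical helper, the top block is assumed to hold the largest values, and dropping rows with a missing sort variable is an assumption as well.

```python
import pandas as pd


def extreme_rows(df, var, n=5):
    """Hypothetical helper: return the n largest and n smallest rows by var."""
    # Assumption: rows with a missing sort variable are ignored.
    usable = df.dropna(subset=[var])
    if 2 * n > len(usable):
        raise ValueError("2 * n must not exceed the number of usable rows")
    # Sort once in descending order; a stable sort keeps ties in input order.
    ordered = usable.sort_values(var, ascending=False, kind="stable")
    # Top block first, bottom block second.
    return pd.concat([ordered.head(n), ordered.tail(n)])


demo = pd.DataFrame(
    {"id": list("abcdefgh"), "x": [3, 8, 1, 6, 2, 7, 5, 4]}
)
result = extreme_rows(demo, "x", n=2)
```

Because 2 * n never exceeds the number of usable rows, the two blocks cannot overlap, which is presumably why the function imposes that constraint on n.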