API Reference#
This page provides an auto-generated summary of esp_lab’s API. For more details and examples, refer to the relevant chapters in the main part of the documentation.
data_access#
This module provides utilities that support input/output. Its functions return dictionaries of filepaths keyed by initialization year, nested lists of files for particular start years and ensemble members, and dask arrays containing particular hindcast ensembles. The module also provides a preprocessing function that can help when using intake-esm alongside the other data_access functions.
Use#
Users wishing to utilize these tools may do so by importing various functions, for example:
from esp_lab.data_access import file_dict
Dependencies#
The user must have an activated conda environment which includes xarray and numpy; glob and functools are part of the Python standard library.
- esp_lab.data_access.file_dict(filetempl, filetype, mem, stmon)[source]
Returns a dictionary of filepaths keyed by initialization year for a given experiment, field, ensemble member, and initialization month.
- Parameters
filetempl (str) – file template
filetype (str) – file ending
mem (int) – ensemble member
stmon (int) – month
- Returns
filepaths (dict) – dictionary containing filepaths keyed by initialization year
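As a rough illustration of the kind of lookup file_dict returns, the sketch below builds a year-keyed dictionary with glob. The placeholder tokens YYYY, MM and EEE, the zero-padding widths, and the helper name file_dict_sketch are assumptions made for this example; they are not taken from the esp_lab source.

```python
import glob
import os
import re
import tempfile


def file_dict_sketch(filetempl, filetype, mem, stmon):
    """Hypothetical stand-in for file_dict; the YYYY/MM/EEE tokens are assumed."""
    # Fill in the start month and ensemble member, then append the file ending.
    filled = filetempl.replace("MM", f"{stmon:02d}").replace("EEE", f"{mem:03d}") + filetype
    # Glob over any four-digit year, then recover the year with a regex.
    glob_pattern = filled.replace("YYYY", "[0-9]" * 4)
    year_regex = re.compile(re.escape(filled).replace("YYYY", r"(\d{4})"))
    filepaths = {}
    for path in sorted(glob.glob(glob_pattern)):
        match = year_regex.fullmatch(path)
        if match:
            filepaths[int(match.group(1))] = path
    return filepaths


# Demo on throwaway empty files so the sketch runs anywhere.
demo_dir = tempfile.mkdtemp()
for name in ("run.1970-11.001.nc", "run.1971-11.001.nc",
             "run.1971-11.002.nc", "run.1970-02.001.nc"):
    open(os.path.join(demo_dir, name), "w").close()

demo_files = file_dict_sketch(os.path.join(demo_dir, "run.YYYY-MM.EEE."), "nc", mem=1, stmon=11)
print(sorted(demo_files))
```

Only the files for member 1 and start month 11 are picked up, one per initialization year.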
- esp_lab.data_access.get_monthly_data(filetemplate, filetype, ens, nlead, field, start_years, stmon, preproc, chunks={})[source]
Returns a dask array containing the requested hindcast ensemble.
- Parameters
filetemplate (str) – file template
filetype (str) – file ending
ens (int) – ensemble member
nlead (int) – number of months over which data is read; allows for a partial read of the data and controls the time dimension of returned dask array
field (str) – variable to be examined, e.g., ‘TREFHT’
start_years (list) – list of start years which are integers
stmon (str) – month
preproc (func) – preprocessing function
chunks (dict) – chunks for dask array, defaults to {}
- Returns
ds0 (dask array) – dask array containing requested hindcast ensemble
- esp_lab.data_access.nested_file_list_by_year(filetemplate, filetype, ens, start_years, stmon)[source]
Retrieves a nested list of files for the given start years and ensemble members.
- Parameters
filetemplate (str) – file template
filetype (str) – file ending
ens (int) – ensemble member
start_years (list) – list of start years which are integers
stmon (str) – month
- Returns
nested_files (list) – nested list of files
- esp_lab.data_access.preprocessor(ds0, nlead, field)[source]
This preprocessor is applied on an individual timeseries file basis. It will return a monthly mean CAM field with centered time coordinate. Edit this appropriately for your analysis to speed up processing.
- Parameters
ds0 (xarray) – timeseries xarray dataset that requires preprocessing
nlead (int) – number of months over which data is read; allows for a partial read of the data and controls the time dimension of returned dask array
field (str) – variable to be examined, e.g., ‘TREFHT’
- Returns
d0 (xarray) – xarray dataset of monthly mean CAM field with centered time coordinate
- esp_lab.data_access.time_set_midmonth(ds, time_name)[source]
Return copy of ds with values of ds[time_name] replaced with mid-month values (day=15) rather than end-month values.
- Parameters
ds (xarray) – xarray dataset which currently has end-month values that will be replaced with mid-month values
time_name (str) – name of time component, e.g., ‘time’
- Returns
ds (xarray) – xarray dataset with end-month values replaced with mid-month values
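The time shift itself can be illustrated without xarray. The sketch below uses plain datetime objects and assumes that an "end-month" stamp sits at or just after the end of the month it describes (for example 1971-01-01 00:00 for December 1970); that convention and the helper name to_midmonth are assumptions for this example, not a description of the esp_lab implementation.

```python
from datetime import datetime, timedelta


def to_midmonth(stamp):
    """Map an end-of-month stamp to day 15 of the month it closes."""
    # Stepping back one day moves a stamp such as 1971-01-01 00:00
    # (the end of December 1970) into the month it describes.
    inside = stamp - timedelta(days=1)
    return inside.replace(day=15, hour=0, minute=0, second=0, microsecond=0)


end_of_month = [datetime(1970, 12, 1), datetime(1971, 1, 1), datetime(1971, 2, 1)]
mid_month = [to_midmonth(t) for t in end_of_month]
print(mid_month)
```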
stats#
This module provides utilities to assist in statistics calculations related to SMYLE analysis. Functions provide tools to perform linear detrending along a particular axis, determine skill metrics based on model and observation DataArrays, and generate a distribution of skill scores using a smaller ensemble member size.
Use#
Users wishing to utilize these tools may do so by importing various functions, for example:
from esp_lab.stats import cor_ci_bootyears
Dependencies#
The user must have an activated conda environment which includes xarray, numpy, cftime, and xskillscore; sys is part of the Python standard library.
- esp_lab.stats.compute_resampskill_annual(mod_da, mod_time, obs_da, nleadavg=1, nleads=1, detrend=False, resamp=0, mean=True)[source]
Computes a suite of deterministic skill metrics given two DataArrays corresponding to model and observations, which must share the same lat/lon coordinates (if any). Assumes time coordinates are compatible (can be aligned). Both DataArrays should contain annual fields.
Unlike compute_skill_annual(), this version operates on a mod_da input that has already been resampled across the member dimension (M) such that it has an ‘iteration’ dimension. Returns the resampled skill score distribution (or the mean of the skill score distribution if mean==True).
- Parameters
mod_da (DataArray) – an annually-averaged (de-drifted) hindcast DataArray dimensioned (Y,L,M,…). Assumes ‘iteration’ dimension.
mod_time (DataArray) – a hindcast time DataArray dimensioned (Y,L). Assumes year values as int or float.
obs_da (DataArray) – an annually-averaged OBS DataArray dimensioned (time,…)
nleadavg (int (optional)) – sets temporal smoothing (e.g., nleadavg=3 to verify 3-year average fields).
nleads (int (optional)) – number of leads to include in skill computation (e.g., nleadavg=3,nleads=2 will return metrics for FY1-3, FY2-4)
resamp (int (optional)) – number of resamplings of individual-member timeseries for computing forecast variance; defaults to 0.
detrend (bool (optional)) – defaults to False; if set to True, skill scores will be computed after detrending
mean (bool (optional)) – set to False to return full resampled skill score distribution
- Returns
dsout (DataArray) – set of skill score metrics
- esp_lab.stats.compute_resampskill_seasonal(mod_da, mod_time, obs_da, climy0, climy1, nleadavg=1, nleads=1, detrend=False, resamp=0, mean=True, monthly=False)[source]
Computes a suite of deterministic skill metrics given two DataArrays corresponding to model and observations, which must share the same lat/lon coordinates (if any). Assumes time coordinates are compatible (can be aligned). Both DataArrays should contain either monthly or 3-month seasonal-average fields.
Unlike compute_skill_seasonal(), this version operates on a mod_da input that has already been resampled across the member dimension (M) such that it has an ‘iteration’ dimension. Returns the resampled skill score distribution (or the mean of the skill score distribution if mean==True).
- Parameters
mod_da (DataArray) – a monthly or seasonally-averaged (de-drifted) hindcast DataArray dimensioned (Y,L,M,…). Assumes ‘iteration’ dimension.
mod_time (DataArray) – a hindcast time DataArray dimensioned (Y,L). Assumes mod_time.dt.month & mod_time.dt.year exist.
obs_da (DataArray) – a monthly or seasonally-averaged OBS DataArray dimensioned (time,…)
climy0 (int) – start year of climatology for computing anomalies
climy1 (int) – end year of climatology for computing anomalies
nleadavg (int (optional)) – sets temporal smoothing (e.g., nleadavg=3 to verify 3-year average fields).
nleads (int (optional)) – number of leads to include in skill computation (e.g., nleadavg=3,nleads=2 will return metrics for FY1-3, FY2-4)
resamp (int (optional)) – number of resamplings of individual-member timeseries for computing forecast variance; defaults to 0.
detrend (bool (optional)) – defaults to False; if set to True, skill scores will be computed after detrending
mean (bool (optional)) – set to False to return full resampled skill score distribution
monthly (bool (optional)) – set to True if mod_da and obs_da are monthly means (skill will be computed for each lead month instead of each lead season)
- Returns
dsout (DataArray) – set of skill score metrics
- esp_lab.stats.compute_skill_annual(mod_da, mod_time, obs_da, nleadavg=1, nleads=1, resamp=0, detrend=False)[source]
Computes a suite of deterministic skill metrics given two DataArrays corresponding to model and observations, which must share the same lat/lon coordinates (if any). Assumes time coordinates are compatible (can be aligned). Both DataArrays should contain annual-average fields.
- Parameters
mod_da (DataArray) – an annually-averaged hindcast DataArray dimensioned (Y,L,M,…)
mod_time (DataArray) – a hindcast time DataArray dimensioned (Y,L). Assumes year values as int or float.
obs_da (DataArray) – an annually-averaged OBS DataArray dimensioned (time,…)
nleadavg (int (optional)) – permits additional temporal smoothing (e.g., nleadavg=3 to verify 3-year average hindcasts).
nleads (int (optional)) – number of leads to include in skill computation (e.g., nleadavg=3,nleads=2 will return metrics for: FY1-3, FY2-4)
resamp (int (optional)) – number of resamplings of individual-member timeseries for computing forecast variance; defaults to 0.
detrend (bool (optional)) – defaults to False; if set to True, skill scores will be computed after detrending
- Returns
dsout (DataArray) – set of skill score metrics
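The interaction of nleadavg and nleads can be sketched with NumPy. The example below is one reading of the parameter descriptions above (nleadavg=3, nleads=2 giving FY1-3 and FY2-4); the function name lead_window_means, the label format, and the use of a plain (Y, L) array are assumptions for illustration, and the actual esp_lab code may organize this differently.

```python
import numpy as np


def lead_window_means(field, nleadavg=1, nleads=1):
    """Average a (Y, L) array over nleads overlapping windows of nleadavg leads."""
    field = np.asarray(field, dtype=float)
    labels, windows = [], []
    for first in range(nleads):
        last = first + nleadavg  # exclusive upper bound of the window
        if last > field.shape[1]:
            raise ValueError("not enough leads for the requested windows")
        windows.append(field[:, first:last].mean(axis=1))
        labels.append(f"FY{first + 1}-{last}" if nleadavg > 1 else f"FY{first + 1}")
    return labels, np.stack(windows, axis=1)


# Two start years, four annual leads (values are made up).
demo = np.array([[1.0, 2.0, 3.0, 4.0],
                 [2.0, 4.0, 6.0, 8.0]])
labels, averaged = lead_window_means(demo, nleadavg=3, nleads=2)
print(labels)
print(averaged)
```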
- esp_lab.stats.compute_skill_seasonal(mod_da, mod_time, obs_da, climy0, climy1, nleadavg=1, nleads=1, resamp=0, detrend=False, monthly=False)[source]
Computes a suite of deterministic skill metrics given two DataArrays corresponding to model and observations, which must share the same lat/lon coordinates (if any). Assumes time coordinates are compatible (can be aligned). Both DataArrays should contain either monthly or 3-month seasonal-average fields.
- Parameters
mod_da (DataArray) – a monthly or seasonally-averaged (de-drifted) hindcast DataArray dimensioned (Y,L,M,…)
mod_time (DataArray) – a hindcast time DataArray dimensioned (Y,L). Assumes mod_time.dt.month & mod_time.dt.year exist.
obs_da (DataArray) – a monthly or seasonally-averaged OBS DataArray dimensioned (time,…)
climy0 (int) – start year of climatology for computing anomalies
climy1 (int) – end year of climatology for computing anomalies
nleadavg (int (optional)) – sets temporal smoothing (e.g., nleadavg=3 to verify 3-year average fields).
nleads (int (optional)) – number of leads to include in skill computation (e.g., nleadavg=3,nleads=2 will return metrics for FY1-3, FY2-4)
resamp (int (optional)) – number of resamplings of individual-member timeseries for computing forecast variance; defaults to 0.
detrend (bool (optional)) – defaults to False; if set to True, skill scores will be computed after detrending
monthly (bool (optional)) – set to True if mod_da and obs_da are monthly means (skill will be computed for each lead month instead of each lead season)
- Returns
dsout (DataArray) – set of skill score metrics
- esp_lab.stats.cor_ci_bootyears(ts1, ts2, seed=None, nboots=1000, conf=95)[source]
Determine confidence intervals for correlation scores.
- Parameters
ts1 (array)
ts2 (array)
seed (int (optional)) – seed for random number generation, default None
nboots (int (optional)) – number of bootstrap resamples; defaults to 1000
conf (float (optional)) – confidence value; defaults to 95
- Returns
minci (float) – lower bound of the confidence interval
maxci (float) – upper bound of the confidence interval
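A percentile bootstrap over years is one common way to obtain such an interval. The sketch below is written under that assumption and is not taken from the esp_lab source; the function name cor_ci_bootyears_sketch and the choice to resample paired years with replacement are illustrative.

```python
import numpy as np


def cor_ci_bootyears_sketch(ts1, ts2, seed=None, nboots=1000, conf=95):
    """Percentile bootstrap interval for the Pearson correlation of two series."""
    ts1 = np.asarray(ts1, dtype=float)
    ts2 = np.asarray(ts2, dtype=float)
    rng = np.random.default_rng(seed)
    # Draw nboots sets of year indices with replacement; the same indices are
    # used for both series so that the pairs stay together.
    idx = rng.integers(0, ts1.size, size=(nboots, ts1.size))
    a = ts1[idx] - ts1[idx].mean(axis=1, keepdims=True)
    b = ts2[idx] - ts2[idx].mean(axis=1, keepdims=True)
    r = (a * b).sum(axis=1) / np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    # Cut equal tails off the bootstrap distribution.
    tail = (100 - conf) / 2
    minci, maxci = np.percentile(r, [tail, 100 - tail])
    return float(minci), float(maxci)


rng = np.random.default_rng(0)
x = rng.standard_normal(40)
y = x + 0.1 * rng.standard_normal(40)  # strongly correlated by construction
lo, hi = cor_ci_bootyears_sketch(x, y, seed=1)
print(lo, hi)
```

Passing the same seed reproduces the same interval, which is useful when comparing results across runs.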
- esp_lab.stats.detrend_linear(dat, dim)[source]
Linearly detrends dat along the dimension dim.
- Parameters
dat (array) – data which is to be detrended
dim (str) – dimension along which linear detrending is performed
- Returns
dat (DataArray) – detrended DataArray
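For readers unfamiliar with the operation, a minimal NumPy sketch of linear detrending along one axis follows. It fits a least-squares line at every point and subtracts it, removing both slope and intercept; whether detrend_linear retains the mean is not stated here, so that choice, along with the function name and the integer axis argument, is an assumption of this example.

```python
import numpy as np


def detrend_linear_sketch(values, axis=-1):
    """Subtract a least-squares straight line along one axis of an array."""
    values = np.asarray(values, dtype=float)
    moved = np.moveaxis(values, axis, 0)   # put the detrend axis first
    n = moved.shape[0]
    x = np.arange(n)
    flat = moved.reshape(n, -1)            # one column per independent series
    slope, intercept = np.polyfit(x, flat, 1)
    fitted = x[:, None] * slope + intercept
    detrended = (flat - fitted).reshape(moved.shape)
    return np.moveaxis(detrended, 0, axis)


years = np.arange(10)
series = 0.5 * years + 3.0 + np.tile([0.1, -0.1], 5)  # trend plus a small wiggle
residual = detrend_linear_sketch(series)
print(residual.round(3))
```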
- esp_lab.stats.leadtime_skill_seas(mod_da, mod_time, obs_da, detrend=False)[source]
Computes a suite of deterministic skill metrics given two DataArrays corresponding to model and observations, which must share the same lat/lon coordinates (if any). Assumes time coordinates are compatible (can be aligned). Both DataArrays should represent 3-month seasonal averages (DJF, MAM, JJA, SON).
- Parameters
mod_da (DataArray) – a seasonally-averaged hindcast DataArray dimensioned (Y,L,M,…)
mod_time (DataArray) – a hindcast time DataArray dimensioned (Y,L). Assumes mod_time.dt.month exists.
obs_da (DataArray) – an OBS DataArray dimensioned (season,year,…)
detrend (bool (optional)) – defaults to False; if set to True, skill scores will be computed after detrending
- Returns
xr_dataset (DataArray) – set of skill score metrics
- esp_lab.stats.leadtime_skill_seas_resamp(mod_da, mod_time, obs_da, sampsize, N, detrend=False)[source]
Computes a suite of deterministic skill metrics given two DataArrays corresponding to model and observations, which must share the same lat/lon coordinates (if any). Assumes time coordinates are compatible (can be aligned). Both DataArrays should represent 3-month seasonal averages (DJF, MAM, JJA, SON).
Unlike leadtime_skill_seas(), this version resamples the mod_da member dimension (M) to generate a distribution of skill scores using a smaller ensemble size (N, where N<M). Returns the mean of the resampled skill score distribution.
- Parameters
mod_da (DataArray) – a seasonally-averaged hindcast DataArray dimensioned (Y,L,M,…)
mod_time (DataArray) – a hindcast time DataArray dimensioned (Y,L). Assumes mod_time.dt.month exists.
obs_da (DataArray) – an OBS DataArray dimensioned (season,year,…)
sampsize (int) – sample size
N (int) – maximum dimension for resampling
detrend (bool (optional)) – defaults to False; if set to True, skill scores will be computed after detrending
- Returns
dsout (xarray) – mean of resampled skill score metrics
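The idea of resampling the member dimension can be sketched in NumPy for a single lead time and a single metric. Everything specific here is an assumption for illustration: the function name resampled_correlation, the use of correlation as the metric, drawing members without replacement, and the parameter names sampsize (members per draw) and niter (number of draws), which do not necessarily correspond to sampsize and N in the signature above.

```python
import numpy as np


def resampled_correlation(mod, obs, sampsize, niter, seed=None):
    """Correlate sub-ensemble means with obs for niter random member subsets.

    mod has shape (Y, M): one value per start year and member at a single lead.
    obs has shape (Y,).
    """
    mod = np.asarray(mod, dtype=float)
    obs = np.asarray(obs, dtype=float)
    rng = np.random.default_rng(seed)
    n_members = mod.shape[1]
    scores = np.empty(niter)
    for i in range(niter):
        # Assumption: members are drawn without replacement.
        picked = rng.choice(n_members, size=sampsize, replace=False)
        sub_mean = mod[:, picked].mean(axis=1)
        scores[i] = np.corrcoef(sub_mean, obs)[0, 1]
    return scores


rng = np.random.default_rng(0)
signal = rng.standard_normal(30)                                # synthetic "observations"
members = signal[:, None] + 0.5 * rng.standard_normal((30, 10))  # 10 noisy members
scores = resampled_correlation(members, signal, sampsize=4, niter=50, seed=1)
print(scores.mean())
```

Averaging scores gives a single skill value for the reduced ensemble size, while the full array shows the spread across member subsets.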
- esp_lab.stats.remove_drift(da, da_time, y1, y2)[source]
Converts a raw DP DataArray into an anomaly DP DataArray with the leadtime-dependent climatology removed.
- Parameters
da (DP DataArray) – Raw DP DataArray with dimensions (Y,L,M,…)
da_time (DP DataArray) – Verification time of DP DataArray (Y,L)
y1 (int) – Start year of climatology
y2 (int) – End year of climatology
- Returns
da_anom (DP DataArray) – De-drifted DP DataArray
da_climo (DP DataArray) – Leadtime-dependent climatology
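A leadtime-dependent climatology can be sketched with NumPy as follows. The example assumes da is a plain (Y, L, M) array, da_time holds integer verification years, and the climatology at each lead is the mean over members and over the start years whose verification year falls within [y1, y2]; those details and the name remove_drift_sketch are assumptions, not a transcription of the esp_lab code.

```python
import numpy as np


def remove_drift_sketch(da, da_time, y1, y2):
    """Return (anomalies, climatology) with a separate climatology per lead."""
    da = np.asarray(da, dtype=float)          # shape (Y, L, M)
    da_time = np.asarray(da_time)             # shape (Y, L), verification years
    in_clim = (da_time >= y1) & (da_time <= y2)
    ens_mean = da.mean(axis=2)                # average over members first
    # Keep only start years that verify inside the climatology window.
    masked = np.where(in_clim, ens_mean, np.nan)
    climo = np.nanmean(masked, axis=0)        # shape (L,)
    anom = da - climo[None, :, None]
    return anom, climo


start_years = np.arange(1970, 1980)           # Y = 10
leads = np.arange(3)                          # L = 3
da_time = start_years[:, None] + leads[None, :]
# Made-up data: a drift of 10 per lead plus a year-to-year signal, 4 members.
raw = (10.0 * leads[None, :, None] + np.arange(10)[:, None, None]) * np.ones((1, 1, 4))
anom, climo = remove_drift_sketch(raw, da_time, 1972, 1978)
print(climo)
```

Because each lead draws its climatology from a different set of start years, the drift of 10 per lead is removed while the year-to-year signal is kept.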