Filter and sample WARM realizations using distributional, tail, and spectral criteria
Source:R/warm_filtering.R
filter_warm_pool.RdFilters an ensemble of annual WARM realizations against an observed annual reference series. Filtering is based on three families of criteria: distributional moments, tail mass behavior, and spectral similarity from the global wavelet spectrum.
The function computes pass or fail vectors for each filter family and builds a candidate pool. If the pool is smaller than the requested sample size, the function relaxes thresholds iteratively until enough candidates are found or the maximum number of relaxation iterations is reached.
Relaxation is adaptive. At each iteration, the currently most restrictive
active filter is relaxed, defined as the filter with the lowest pass rate
among the active filters. This is constrained by the relaxation order given
in relax_order, which sets which filters are eligible to be relaxed
and how wavelet relaxation is parameterized.
If the pool is still smaller than n_select after relaxation, a
deterministic fallback selects the n_select realizations with the
smallest absolute relative mean difference.
Usage
filter_warm_pool(
obs_series = NULL,
sim_series = NULL,
n_select = 5,
seed = NULL,
relax_order = c("wavelet", "sd", "tail_low", "tail_high", "mean"),
filter_bounds = list(),
wavelet_args = list(signif_level = 0.8, noise_type = "red", period_lower_limit = 2,
detrend = TRUE),
modwt_n_levels = NULL,
make_plots = FALSE,
parallel = FALSE,
n_cores = NULL,
cache_gws = FALSE,
verbose = FALSE
)Arguments
- obs_series
Numeric vector. Observed annual series used as the reference target.
- sim_series
Numeric matrix. Simulated annual realizations with years in rows and realizations in columns.
- n_select
Integer scalar. Number of realizations to return in the
selectedelement.- seed
Integer scalar or NULL. If provided, sets the random seed used for selecting alignment windows when
obs_seriesandsim_serieshave different lengths, and for sampling within the final candidate pool.- relax_order
Character vector. Relaxation priority ordering for the filter families. Must contain exactly:
"mean","sd","tail_low","tail_high","wavelet".- filter_bounds
Named list. Overrides for filtering thresholds and relaxation controls. Keys must be snake case and match those returned by
filter_warm_bounds_defaults().- wavelet_args
Named list. Parameters passed to
analyze_wavelet_spectrum()for observed and simulated series. Expected entries includesignif_level,noise_type,period_lower_limit, anddetrend.- modwt_n_levels
Integer or NULL. Number of MODWT levels used in WARM. This value is used only for diagnostics. If NULL, a value is estimated from the series length.
- make_plots
Logical scalar. If TRUE, compute diagnostic plots for the selected realizations and return them in
plots.- parallel
Logical scalar. If TRUE, allows spectral metrics to be computed in parallel inside
compute_spectral_metrics().- n_cores
Integer scalar or NULL. Number of worker processes to use for parallel execution. If NULL, an internal default is used.
- cache_gws
Logical scalar. If TRUE, cache simulated global wavelet spectra for diagnostics; this can increase memory use.
- verbose
Logical scalar. If TRUE, logs setup information, per iteration pass rates, and relaxation actions.
Value
Named list with the following elements:
- pool
Numeric matrix. Candidate pool of realizations that passed the final set of filters. Subset of columns from
sim_series.- selected
Numeric matrix. The
n_selectrealizations chosen from the pool. Subset of columns fromsim_series.- summary
Data frame. Pass counts and pass rates for each filter family, plus the selection mode used.
- diagnostics
List. Window metadata, relaxation log, spectral diagnostics, spectral metrics, and the final bounds used. When
cache_gws = TRUE, includesgws_cache.- plots
NULL or a named list of ggplot objects when
make_plots = TRUE.