Modules#

kenmerkendewaarden.data_retrieve#

Retrieve data from ddlpy and write to netcdf files including all metadata

kenmerkendewaarden.data_retrieve.read_measurements(dir_output: str, station: str, extremes: bool, return_xarray: bool = False, nap_correction: bool = False, drop_duplicates: bool = False)[source]#

Read the measurements netcdf as a dataframe.

Parameters#

dir_output (str): Path where the measurements are stored.
station (str): Station name, for instance “HOEKVHLD”.
extremes (bool): Whether to read measurements for waterlevel timeseries or extremes.
return_xarray (bool, optional): Whether to return the raw xarray.Dataset instead of a DataFrame. In that case, nap_correction and drop_duplicates are not supported. The default is False.
nap_correction (bool, optional): Whether to correct for NAP2005. The default is False.
drop_duplicates (bool, optional): Whether to drop duplicated timesteps. The default is False.

Returns#

df_meas (pd.DataFrame): DataFrame with the measurements or extremes timeseries.

kenmerkendewaarden.data_retrieve.read_measurements_amount(dir_output: str, extremes: bool)[source]#

Read the measurements amount csv into a dataframe.

Parameters#

dir_output (str): Path where the measurements are stored.
extremes (bool): Whether to read measurements amount for waterlevel timeseries or extremes.

Returns#

df_amount (pd.DataFrame): DataFrame with the amount of measurements per year.

kenmerkendewaarden.data_retrieve.retrieve_measurements(dir_output: str, station: str, extremes: bool, start_date: Timestamp, end_date: Timestamp, drop_if_constant: list | None = None)[source]#

Retrieve timeseries with measurements or extremes for a single station from the DDL with ddlpy.

Parameters#

dir_output (str): Path where the measurement netcdf file will be stored.
station (str): Station name, for instance “HOEKVHLD”.
extremes (bool): Whether to retrieve measurements for waterlevel timeseries or extremes.
start_date (pd.Timestamp, or anything understood by pd.Timestamp): Start date of the measurements to be retrieved.
end_date (pd.Timestamp, or anything understood by pd.Timestamp): End date of the measurements to be retrieved.
drop_if_constant (list, optional): A list of columns to drop if the row values are constant, to save disk space. The default is None.

Returns#

None
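
The retrieve and read functions above are meant to be used together. The following is a minimal sketch of that workflow, using only the signatures documented on this page. The function name `retrieve_and_read` and the example period are assumptions for illustration, and it is assumed that `dir_output` already exists and that the DDL is reachable.

```python
def retrieve_and_read(dir_output: str, station: str = "HOEKVHLD"):
    """Retrieve one year of waterlevel measurements, then read them back."""
    # Imported inside the function so this sketch can be defined even when
    # the package (and its ddlpy dependency) is not installed.
    from kenmerkendewaarden.data_retrieve import (
        read_measurements,
        retrieve_measurements,
    )

    # Writes a netcdf file to dir_output and returns None.
    retrieve_measurements(
        dir_output=dir_output,
        station=station,
        extremes=False,
        start_date="2020-01-01",  # assumed example period
        end_date="2021-01-01",
    )
    # Reads that netcdf file back as a DataFrame.
    return read_measurements(
        dir_output=dir_output, station=station, extremes=False
    )
```

Calling `retrieve_and_read("path/to/data")` would then return the DataFrame described under read_measurements.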

kenmerkendewaarden.data_retrieve.retrieve_measurements_amount(dir_output: str, station_list: list, extremes: bool, start_date: Timestamp, end_date: Timestamp)[source]#

Retrieve the amount of measurements or extremes for each station in the list from the DDL with ddlpy.

Parameters#

dir_output (str): Path where the measurements amount csv file will be stored.
station_list (list): List of station names, for instance [“HOEKVHLD”].
extremes (bool): Whether to retrieve the amount of measurements for waterlevel timeseries or extremes.
start_date (pd.Timestamp, or anything understood by pd.Timestamp): Start date of the period for which the amount of measurements is retrieved.
end_date (pd.Timestamp, or anything understood by pd.Timestamp): End date of the period for which the amount of measurements is retrieved.

Returns#

None

kenmerkendewaarden.data_analysis#

Data analysis like missings, duplicates, outliers and several other statistics

kenmerkendewaarden.data_analysis.derive_statistics(dir_output: str, station_list: list, extremes: bool)[source]#

Derive several statistics for the measurements of each station in the list.

Parameters#

dir_output (str): Path where the measurement netcdf files are stored.
station_list (list): List of station names to derive statistics for, for instance [“HOEKVHLD”].
extremes (bool): Whether to derive statistics from waterlevel timeseries or extremes.

Returns#

data_summary (pd.DataFrame): A dataframe with several statistics for each station from the provided list.

kenmerkendewaarden.data_analysis.plot_measurements(df_meas: DataFrame, df_ext: DataFrame | None = None)[source]#

Generate a timeseries figure for the measurement timeseries (and extremes) of this station.

Parameters#

df_meas (pd.DataFrame): Dataframe with the measurement timeseries for a particular station.
df_ext (pd.DataFrame, optional): Dataframe with the measurement extremes for a particular station.

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.

kenmerkendewaarden.data_analysis.plot_measurements_amount(df: DataFrame, relative: bool = False)[source]#

Read the measurements amount csv and generate a pcolormesh figure of all years and stations. The colors indicate the absolute or relative number of measurements per year.

Parameters#

df (pd.DataFrame): Dataframe with the amount of measurements for several years per station.
relative (bool, optional): Whether to scale the amount of measurements with the median of all measurement amounts for the same year. The default is False.

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.
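
To illustrate what relative=True means, the sketch below scales measurement counts with the median over all stations for the same year. It uses plain dictionaries instead of a DataFrame; the station names, the counts and the function `scale_with_yearly_median` are hypothetical, and the library may implement the scaling differently.

```python
from statistics import median

# Hypothetical measurement counts per station and year.
counts = {
    "STATION_A": {2019: 52560, 2020: 52704},
    "STATION_B": {2019: 26280, 2020: 52704},
    "STATION_C": {2019: 52560, 2020: 10000},
}


def scale_with_yearly_median(counts):
    """Divide each count by the median count of all stations in that year."""
    years = sorted({year for per_year in counts.values() for year in per_year})
    yearly_median = {
        year: median(c[year] for c in counts.values() if year in c)
        for year in years
    }
    return {
        station: {year: n / yearly_median[year] for year, n in per_year.items()}
        for station, per_year in counts.items()
    }


relative = scale_with_yearly_median(counts)
```

A station with half the median amount of measurements in a year gets a relative value of 0.5 for that year, which makes gaps stand out regardless of the sampling interval.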

kenmerkendewaarden.data_analysis.plot_stations(station_list: list, crs: int | None = None, add_labels: bool = False)[source]#

Plot the stations by subsetting a ddlpy catalog with the provided list of stations.

Parameters#

station_list (list): List of stations to plot the locations from.
crs (int, optional): Coordinate reference system, for instance 28992. The coordinates retrieved from the DDL will be converted to this EPSG. The default is None.
add_labels (bool, optional): Whether to add station code labels in the figure, useful for debugging. The default is False.

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.

kenmerkendewaarden.tidalindicators#

Computation of tidal indicators from waterlevel extremes or timeseries

kenmerkendewaarden.tidalindicators.calc_HWLWtidalindicators(df_ext: DataFrame, min_coverage: float | None = None)[source]#

Compute several tidal indicators from a dataset of tidal extremes.

Parameters#

df_ext (pd.DataFrame): Dataframe with extremes timeseries.
min_coverage (float, optional): The minimal required coverage (between 0 and 1) of the df_ext timeseries to consider the statistics to be valid. It is the ratio between the actual amount and the expected amount of high waters in the series. Note that the expected amount is not an exact estimate, so min_coverage=1 will probably result in nans even though all extremes are present. The default is None.

Returns#

dict_tidalindicators (dict): Dictionary with several tidal indicators like yearly/monthly means.
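
The note on min_coverage=1 can be made concrete with a small calculation. The sketch below assumes that the expected amount of high waters is the series duration divided by a mean tidal period of 12h25min; this is an assumption for illustration, and the library may estimate the expected amount differently. The function `high_water_coverage` is hypothetical.

```python
from datetime import timedelta

# Assumed mean tidal period, used only for this illustration.
TIDAL_PERIOD = timedelta(hours=12, minutes=25)


def high_water_coverage(n_high_waters: int, duration: timedelta) -> float:
    """Ratio between the actual and the expected amount of high waters."""
    # Dividing two timedeltas gives a float, generally not a whole number.
    expected = duration / TIDAL_PERIOD
    return n_high_waters / expected


# A 365-day series with 705 high waters.
coverage = high_water_coverage(705, timedelta(days=365))
is_valid = coverage >= 1.0
```

Because the expected amount is a fractional number just above 705 here, the coverage ends up slightly below 1, so a threshold of exactly 1 would reject this series. A threshold somewhat below 1 avoids that.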

kenmerkendewaarden.tidalindicators.calc_HWLWtidalrange(df_ext: DataFrame)[source]#

Compute the difference between a high water and the following low water. This tidal range is added as a column to the df_ext dataframe.

Parameters#

df_ext (pd.DataFrame): Dataframe with extremes timeseries.

Returns#

df_ext (pd.DataFrame): Input dataframe enriched with ‘tidalrange’ and ‘HWLWno’ columns.
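
As an illustration of the definition above, the following sketch computes the difference between each high water and the low water that directly follows it. It works on a plain list instead of a DataFrame; the data, the "HW"/"LW" labels and the function `tidal_ranges` are hypothetical and do not reflect the library's internal representation.

```python
# Hypothetical extremes in time order: (time label, waterlevel in m, kind).
extremes = [
    ("t0", 1.10, "HW"),
    ("t1", -0.70, "LW"),
    ("t2", 1.25, "HW"),
    ("t3", -0.45, "LW"),
]


def tidal_ranges(extremes):
    """Return the range of each high water and the following low water."""
    ranges = []
    for current, following in zip(extremes, extremes[1:]):
        if current[2] == "HW" and following[2] == "LW":
            # Rounded to centimeters to avoid floating point noise.
            ranges.append(round(current[1] - following[1], 2))
    return ranges


ranges = tidal_ranges(extremes)
```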

kenmerkendewaarden.tidalindicators.calc_wltidalindicators(df_meas: DataFrame, min_coverage: float | None = None)[source]#

Compute monthly and yearly means from a waterlevel timeseries.

Parameters#

df_meas (pd.DataFrame): Dataframe with waterlevel timeseries.
min_coverage (float, optional): The minimum fraction (from 0 to 1) of timeseries coverage to consider the statistics to be valid. The default is None.

Returns#

dict_tidalindicators (dict): Dictionary with several tidal indicators like yearly/monthly means.

kenmerkendewaarden.tidalindicators.plot_tidalindicators(dict_indicators: dict)[source]#

Plot tidal indicators.

Parameters#

dict_indicators (dict): Dictionary as returned from kw.calc_wltidalindicators() and/or kw.calc_HWLWtidalindicators().

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.

kenmerkendewaarden.tidalextremes#

Created on Mon Jun 9 20:15:54 2025

@author: veenstra

kenmerkendewaarden.tidalextremes.calc_highest_lowest_astronomical_tide(df_meas: DataFrame) → tuple[source]#

Compute the highest and lowest astronomical tide (HAT and LAT) from a measurement timeseries. This method derives the SA and SM components from 19 years of measurements (at once) and the other components from the most recent 4 years of measurements (per year, then vector averaged). The mean is overwritten with the slotgemiddelde, derived from the entire timeseries. The resulting component set is used to make a prediction of 19 years, per year. The min and max of the resulting prediction timeseries are the LAT and HAT values.

The slowly varying SA and SM can only be derived from long timeseries covering an entire nodal cycle. These components are sensitive to timeseries length, so it is important to supply a sufficiently long timeseries. The other components are varying more quickly and for those only the last four years are used to represent the tidal dynamics at the end of the period instead of the average over the last 19 years. This also goes for the average, which is overwritten by the slotgemiddelde corresponding to the end of the period. This results in LAT/HAT values that are representative for the end of the supplied period.

Several alternative methods were considered; details are available in Deltares-research/kenmerkendewaarden#73.

Parameters#

df_meas (pd.DataFrame): Dataframe with waterlevel timeseries.

Returns#

tuple: hat and lat values.
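
The "per year, then vector averaged" step mentioned above can be sketched as follows. Each yearly result for a tidal component is treated as a complex vector built from its amplitude and phase, and the vectors are averaged. This is a generic illustration with hypothetical values and a hypothetical function `vector_average`, not the code used by the library.

```python
import cmath
import math


def vector_average(components):
    """Average (amplitude, phase in degrees) pairs as complex vectors."""
    vectors = [cmath.rect(amp, math.radians(phase)) for amp, phase in components]
    mean = sum(vectors) / len(vectors)
    # Convert back to amplitude and a phase between 0 and 360 degrees.
    return abs(mean), math.degrees(cmath.phase(mean)) % 360


# Hypothetical yearly results for one component: (amplitude in m, phase in deg).
yearly_component = [(1.0, 350.0), (1.0, 10.0)]
amplitude, phase = vector_average(yearly_component)
```

Averaging the phases 350 and 10 degrees arithmetically would give 180 degrees, which is the opposite direction. The vector average gives a phase at (or numerically very close to) 0 degrees and a slightly reduced amplitude, which is why it is the appropriate way to combine yearly components.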

kenmerkendewaarden.slotgemiddelden#

Computation of slotgemiddelden of waterlevels and extremes

kenmerkendewaarden.slotgemiddelden.calc_slotgemiddelden(df_meas: DataFrame, df_ext: DataFrame | None = None, min_coverage: float | None = None, clip_physical_break: bool = False)[source]#

Compute slotgemiddelden from a measurement timeseries and optionally also from an extremes timeseries. A simple linear trend is used to avoid suggesting more accuracy than the data supports. However, when fitting a linear trend on a limited amount of data, the nodal cycle and wind effects will cause the model fit to be inaccurate. It is wise to use at least 30 years of data for a valid fit, which is more than 1.5 times the nodal cycle.

Parameters#

df_meas (pd.DataFrame): The timeseries of measured waterlevels.
df_ext (pd.DataFrame, optional): The timeseries of extremes (high and low waters). The default is None.
min_coverage (float, optional): Set yearly means to nans for years that do not have sufficient data coverage. The default is None.
clip_physical_break (bool, optional): Whether to exclude the part of the timeseries before physical breaks like estuary closures. The default is False.

Returns#

slotgemiddelden_dict (dict): Dictionary with yearly means and model fits, optionally also for extremes and corresponding tidal range.
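
The linear trend mentioned above can be illustrated with an ordinary least squares fit on yearly means. The sketch below is a generic example with synthetic data and a hypothetical function `fit_linear_trend`; it does not reproduce the library's model or its handling of the nodal cycle and coverage.

```python
def fit_linear_trend(years, means):
    """Ordinary least squares fit of means = intercept + slope * year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(means) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, means))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic yearly means (m): 30 years with a rise of 2 mm per year.
years = list(range(1990, 2020))
means = [0.05 + 0.002 * (year - 1990) for year in years]

slope, intercept = fit_linear_trend(years, means)
# Value of the fitted trend in the last year of the series.
fitted_last_year = intercept + slope * years[-1]
```

With real yearly means the fit is noisy, which is why the text above advises using at least 30 years of data.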

kenmerkendewaarden.slotgemiddelden.plot_slotgemiddelden(slotgemiddelden_dict: dict, slotgemiddelden_dict_all: dict | None = None)[source]#

Plot timeseries of yearly mean waterlevels and corresponding model fits.

Parameters#

slotgemiddelden_dict (dict): Output from kw.calc_slotgemiddelden containing timeseries of yearly mean waterlevels and corresponding model fits.
slotgemiddelden_dict_all (dict, optional): Optionally provide another dictionary with unfiltered mean waterlevels. Only used to plot the mean waterlevels (in grey). The default is None.

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.

kenmerkendewaarden.havengetallen#

Computation of havengetallen

kenmerkendewaarden.havengetallen.calc_HWLW_springneap(df_ext: DataFrame, min_coverage=None, moonculm_offset: int = 4)[source]#

kenmerkendewaarden.havengetallen.calc_havengetallen(df_ext: DataFrame, return_df_ext=False, min_coverage=None, moonculm_offset: int = 4)[source]#

Havengetallen consist of the median values of the extremes (high and low waters) and the median time delays of the extremes with respect to the moonculmination. In addition, the tidal difference for each cycle and the tidal period are computed. All these indicators are derived by dividing the extremes into hour-classes with respect to the moonculmination.

Parameters#

df_ext (pd.DataFrame): DataFrame with extremes (highs and lows, no aggers). The last 10 years of this timeseries are used to compute the havengetallen.
return_df_ext (bool, optional): Whether to return the enriched input dataframe. The default is False.
min_coverage (float, optional): The minimal required coverage (between 0 and 1) of the df_ext timeseries to consider the statistics to be valid. It is the ratio between the actual amount and the expected amount of high waters in the series. Note that the expected amount is not an exact estimate, so min_coverage=1 will probably result in nans even though all extremes are present. The default is None.
moonculm_offset (int, optional): Offset between moonculmination and extremes. Passed on to calc_HWLW_moonculm_combi. The default is 4, which corresponds to a 2-day offset and is applicable to the Dutch coast.

Returns#

df_havengetallen (pd.DataFrame): DataFrame with havengetallen for all hour-classes. 0 corresponds to spring, 6 corresponds to neap, and mean is the mean.
df_ext_culm (pd.DataFrame): An enriched copy of the input DataFrame including a ‘culm_hr’ column. Only returned if return_df_ext is True.
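
The idea of dividing extremes into hour-classes and taking the median per class can be sketched as follows. The example uses a plain list of (culm_hr, waterlevel) pairs with hypothetical values; the function `median_per_hour_class` is not part of the library, which works on DataFrames and also derives time delays.

```python
from collections import defaultdict
from statistics import median

# Hypothetical high waters: (culm_hr hour-class, waterlevel in m).
high_waters = [
    (0, 1.40), (0, 1.50), (0, 1.30),  # hour-class 0 (spring)
    (6, 0.90), (6, 1.00), (6, 0.80),  # hour-class 6 (neap)
]


def median_per_hour_class(values):
    """Group waterlevels by hour-class and return the median per class."""
    per_class = defaultdict(list)
    for culm_hr, level in values:
        per_class[culm_hr].append(level)
    return {hr: median(levels) for hr, levels in sorted(per_class.items())}


medians = median_per_hour_class(high_waters)
```

In this hypothetical data the median high water for hour-class 0 is higher than for hour-class 6, in line with 0 corresponding to spring and 6 to neap.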

kenmerkendewaarden.havengetallen.plot_HWLW_pertimeclass(df_ext: DataFrame, df_havengetallen: DataFrame)[source]#

Plot the extremes for each hour-class, including a median line.

Parameters#

df_ext (pd.DataFrame): DataFrame with measurement extremes, as provided by kw.calc_havengetallen().
df_havengetallen (pd.DataFrame): DataFrame with havengetallen for all hour-classes, as provided by kw.calc_havengetallen().

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.

kenmerkendewaarden.havengetallen.plot_aardappelgrafiek(df_havengetallen: DataFrame)[source]#

Plot the median values of each hour-class in an aardappelgrafiek.

Parameters#

df_havengetallen (pd.DataFrame): DataFrame with havengetallen for all hour-classes, as provided by kw.calc_havengetallen().

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.

kenmerkendewaarden.gemiddeldgetij#

Computation of gemiddelde getijkromme

kenmerkendewaarden.gemiddeldgetij.calc_gemiddeldgetij(df_meas: DataFrame, df_ext: DataFrame | None = None, min_coverage: float | None = None, freq: str = '60sec', nb: int = 0, nf: int = 0, scale_extremes: bool = False, scale_period: bool = False)[source]#

Generate an average tidal signal for average/spring/neap tide by doing a tidal analysis on a timeseries of measurements. The resulting tidal components (subsetted/adjusted) are then used to make a raw prediction for average/spring/neap tide. These raw predictions can optionally be scaled in height (with havengetallen) and in time (to a fixed period of 12h25min). nb backward and nf forward repeats are added before the timeseries are returned, resulting in nb+nf+1 tidal periods.

Parameters#

df_meas (pd.DataFrame): Timeseries of waterlevel measurements. The last 10 years of this timeseries are used to compute the getijkrommes.
df_ext (pd.DataFrame, optional): Timeseries of waterlevel extremes (1/2 only). The last 10 years of this timeseries are used to compute the getijkrommes. The default is None.
min_coverage (float, optional): The minimal required coverage of the df_ext timeseries. Passed on to calc_havengetallen(). The default is None.
freq (str, optional): Frequency of the prediction; a value of 60 seconds or lower is advisable for decent results. The default is “60sec”.
nb (int, optional): Amount of periods to repeat backward. The default is 0.
nf (int, optional): Amount of periods to repeat forward. The default is 0.
scale_extremes (bool, optional): Whether to scale extremes with havengetallen. The default is False.
scale_period (bool, optional): Whether to scale to 12h25min (for boi). The default is False.

Returns#

gemgetij_dict (dict): Dictionary with DataFrames with gemiddeld getij for mean, spring and neap tide.
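
How nb and nf lead to nb+nf+1 tidal periods can be sketched with a simple repetition of one period. The example below uses a list of (time offset, waterlevel) pairs and a fixed period of 12h25min; the curve values and the function `repeat_period` are hypothetical, and the library may treat the joins between periods differently.

```python
from datetime import timedelta

# Fixed tidal period, as used when scaling in time.
PERIOD = timedelta(hours=12, minutes=25)


def repeat_period(curve, nb=0, nf=0, period=PERIOD):
    """Repeat one tidal period nb times backward and nf times forward."""
    repeated = []
    # shift=0 is the original period, negative shifts are backward repeats.
    for shift in range(-nb, nf + 1):
        repeated.extend((t + shift * period, level) for t, level in curve)
    return repeated


# Hypothetical single-period curve: (time offset, waterlevel in m).
curve = [
    (timedelta(hours=hour), level)
    for hour, level in [(0, 1.0), (3, 0.0), (6, -1.0), (9, 0.0)]
]
result = repeat_period(curve, nb=1, nf=2)
```

With nb=1 and nf=2 the result contains 1+2+1=4 periods, starting one period before the original curve and ending two periods after it.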

kenmerkendewaarden.gemiddeldgetij.plot_gemiddeldgetij(gemgetij_dict: dict, gemgetij_dict_raw: dict | None = None, tick_hours: int | None = None)[source]#

Default plotting function for gemiddeldgetij dictionaries.

Parameters#

gemgetij_dict (dict): Dictionary as returned from kw.calc_gemiddeldgetij().
gemgetij_dict_raw (dict, optional): Dictionary as returned from kw.calc_gemiddeldgetij(), e.g. with uncorrected values. The default is None.
tick_hours (int, optional): Interval in hours for the x-axis ticks, for instance 12. If None, automatic but less nice tick values are used. The default is None.

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.

kenmerkendewaarden.overschrijding#

Computation of probabilities (overschrijdingsfrequenties) of extreme waterlevels

kenmerkendewaarden.overschrijding.calc_highest_extremes(df_ext: DataFrame, ascending: bool = False, num_extremes: int = 5)[source]#

Calculate the n highest or lowest extremes by sorting the input dataframe with extremes from high to low (ascending=False) or low to high (ascending=True) and returning the first n times and values.

Parameters#

df_ext (pd.DataFrame): The timeseries of extremes (high and low waters).
ascending (bool, optional): Whether to sort from high to low (ascending=False) or low to high (ascending=True). The default is False.
num_extremes (int, optional): The number of extremes to return. The default is 5.
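
The sort-and-select behaviour described above can be sketched with plain Python. The data and the function `select_extremes` are hypothetical; the library itself operates on a DataFrame.

```python
def select_extremes(extremes, ascending=False, num_extremes=5):
    """Sort (time, value) pairs by value and return the first num_extremes."""
    ordered = sorted(extremes, key=lambda item: item[1], reverse=not ascending)
    return ordered[:num_extremes]


# Hypothetical extremes: (time label, waterlevel in m).
extremes = [("t0", 1.10), ("t1", -0.70), ("t2", 2.35), ("t3", -1.20), ("t4", 1.80)]

highest = select_extremes(extremes, num_extremes=2)
lowest = select_extremes(extremes, ascending=True, num_extremes=2)
```

With ascending=False the two highest values are returned, with ascending=True the two lowest.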

kenmerkendewaarden.overschrijding.calc_overschrijding(df_ext: DataFrame, dist: dict = None, inverse: bool = False, clip_physical_break: bool = False, rule_type: str = None, rule_value: (Timestamp, float) = None, interp_freqs: list = None)[source]#

Compute exceedance/deceedance frequencies based on measured extreme waterlevels.

Parameters#

df_ext (pd.DataFrame): The timeseries of extremes (high and low waters).
dist (dict, optional): A pre-filled dictionary with a Hydra-NL and/or validation distribution. The default is None.
inverse (bool, optional): Whether to compute deceedance instead of exceedance frequencies. The default is False.
clip_physical_break (bool, optional): Whether to exclude the part of the timeseries before physical breaks like estuary closures. The default is False.
rule_type (str, optional): One of ‘break’, ‘linear’ or None; passed on to apply_trendanalysis(). The default is None.
rule_value ((pd.Timestamp, float), optional): Value corresponding to rule_type: pd.Timestamp (or anything understood by pd.Timestamp) in case of rule_type=‘break’, float in case of rule_type=‘linear’. The default is None.
interp_freqs (list, optional): The frequencies to interpolate to. Providing this will result in a “geinterpoleerd” key in the returned dictionary. The default is None.

Returns#

dist (dict): A dictionary with several distributions.

kenmerkendewaarden.overschrijding.plot_overschrijding(dist: dict)[source]#

Plot exceedance/deceedance frequencies (overschrijding/onderschrijding).

Parameters#

dist (dict): Dictionary as returned from kw.calc_overschrijding().

Returns#

fig (matplotlib.figure.Figure): Figure handle.
ax (matplotlib.axes._axes.Axes): Figure axis handle.
