dubfi.fluxes.readobs

Read and filter observations as defined in configuration.

Added in version 0.1.0: (initial release)

Changed in version 0.1.1.

Attributes

INVERSION_FLAGS

List of flags indicating filtering of observations for the inversion.

Classes

ReadObs

Object that parses configuration and reads data.

Functions

`get_flag`(key)	Get integer representation of inversion flag name, see `INVERSION_FLAGS`.
`coordinates_from_config`(config, **kwargs)	Gather coordinates from input data, applying filtering as defined in config.
`average_window`(da, window[, min_diff])	Average data array over rolling time window.
`data_from_config`(config[, coordinates_only, ...])	Provide filtered data as defined in configuration.

Module Contents

dubfi.fluxes.readobs.INVERSION_FLAGS = ['used', 'ignored: time range', 'ignored: season', 'ignored: time of day', 'ignored: wind...

List of flags indicating filtering of observations for the inversion.

dubfi.fluxes.readobs.get_flag(key)

Get integer representation of inversion flag name, see INVERSION_FLAGS.

Added in version 0.1.1.

Parameters: key (str)
Return type: int
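
The lookup can be pictured with a minimal sketch. It assumes that the integer representation is simply the position of the name in INVERSION_FLAGS, which this page does not confirm. The list below contains only the entries visible in the truncated listing above, and `get_flag_sketch` is a hypothetical stand-in, not the library function.

```python
# Only the entries visible in the truncated listing; the real list is longer.
FLAGS_VISIBLE = ["used", "ignored: time range", "ignored: season", "ignored: time of day"]


def get_flag_sketch(key: str) -> int:
    """Hypothetical stand-in: return the list position of a flag name.

    Assumption: the integer flag equals the index in the list.
    Raises ValueError for unknown names (behaviour of list.index).
    """
    return FLAGS_VISIBLE.index(key)


used = get_flag_sketch("used")
season = get_flag_sketch("ignored: season")
```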

dubfi.fluxes.readobs.coordinates_from_config(config, **kwargs)

Gather coordinates from input data, applying filtering as defined in config.

Parameters:

config (str | dict) – configuration or path to configuration file (YAML)
**kwargs (any) – passed on to data_from_config()

Returns:

coordinates – dictionary with entries:

ssh: array of station and sampling height identifiers
time: list of time arrays, aligned with ssh
lon: array of station longitudes (degrees), aligned with ssh
lat: array of station latitudes (degrees), aligned with ssh
height: array of station heights (meters), aligned with ssh
flux_cat: array of flux category names
ens_size: number of meteorological ensemble members (integer)

Return type:

dict
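
To illustrate what "aligned with ssh" means for the returned dictionary, the following sketch builds a dictionary with the documented keys and checks that the per-station entries have matching lengths. All station names, times, and numbers are hypothetical placeholders, and `check_alignment` is a helper invented for this illustration.

```python
def check_alignment(coords: dict) -> bool:
    """Return True if time, lon, lat and height have one entry per ssh entry."""
    n = len(coords["ssh"])
    return all(len(coords[key]) == n for key in ("time", "lon", "lat", "height"))


# Hypothetical example with two station/sampling-height combinations.
example = {
    "ssh": ["AAA_10m", "BBB_50m"],
    # One time array per ssh entry; the arrays may differ in length.
    "time": [["2020-01-01T00", "2020-01-01T01"], ["2020-01-01T00"]],
    "lon": [8.0, 11.5],      # degrees
    "lat": [50.0, 48.1],     # degrees
    "height": [10.0, 50.0],  # meters
    "flux_cat": ["cat_a", "cat_b"],
    "ens_size": 4,
}

aligned = check_alignment(example)
```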

dubfi.fluxes.readobs.average_window(da, window, min_diff=np.timedelta64(1, 'h'))

Average data array over rolling time window.

Parameters:

da (xr.DataArray) – data array that shall be averaged. This must have a dimension and coordinate “time” of dtype np.datetime64. This time coordinate must be sorted and the minimum distance between the coordinate values must be at least min_diff.
window (np.timedelta64) – time window for averaging. result[i] is the mean of da[j] for all j such that abs(da.time[i] - da.time[j]) < window.
min_diff (np.timedelta64, default=1h) – minimum distance between coordinate values in da.time. If min_diff is larger than the actual minimum distance, results will be wrong. If it is smaller than necessary, performance will be worse.

Returns:

avg_da – da averaged over rolling time window. All dimensions and coordinates will be the same as in da. Internal order of data in memory may differ from da.

Return type:

xr.DataArray
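
The window definition given for the `window` parameter can be checked against a naive reference computation. The sketch below is not the library implementation: it works on plain NumPy arrays instead of an xr.DataArray, ignores min_diff, and uses quadratic memory. `naive_average_window` and the sample data are hypothetical.

```python
import numpy as np


def naive_average_window(times, values, window):
    """Mean of values[j] over all j with abs(times[i] - times[j]) < window."""
    times = np.asarray(times, dtype="datetime64[ns]")
    values = np.asarray(values, dtype=float)
    # Pairwise absolute time differences, shape (n, n).
    diff = np.abs(times[:, None] - times[None, :])
    # Strict inequality, as in the parameter description.
    mask = diff < window
    # Sum of the selected values divided by the number of selected values.
    return (mask * values[None, :]).sum(axis=1) / mask.sum(axis=1)


times = np.array(
    ["2020-01-01T00", "2020-01-01T01", "2020-01-01T02", "2020-01-01T05"],
    dtype="datetime64[h]",
)
values = np.array([1.0, 2.0, 3.0, 10.0])
result = naive_average_window(times, values, np.timedelta64(2, "h"))
# result is [1.5, 2.0, 2.5, 10.0]: the last sample has no neighbour within 2 h.
```

Because the comparison is strict, a sample exactly `window` away from `da.time[i]` does not contribute to `result[i]`.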

class dubfi.fluxes.readobs.ReadObs(config, suffix_rx)

Object that parses configuration and reads data.

Added in version 0.1.1.

Provide filtered data as defined in configuration.

Parameters:

config (str | dict) – configuration or path to configuration file (YAML)
suffix_rx (str, default=r"_det.nc$") – regular expression for the input file suffix. Use this to select the deterministic run without (_det.nc) or with (_det_letkf.nc) far-field correction, or the ensemble data (_ens.nc).
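
How such a suffix expression separates the three kinds of input files can be sketched with the standard `re` module. The file names are hypothetical, and the sketch assumes that the pattern is applied with `re.search`, which this page does not state.

```python
import re

# Hypothetical file names; the real naming scheme is not shown on this page.
files = ["AAA_10m_det.nc", "AAA_10m_det_letkf.nc", "AAA_10m_ens.nc"]


def select(pattern, names):
    """Return the sorted names that match the suffix pattern (assumes re.search)."""
    rx = re.compile(pattern)
    return sorted(name for name in names if rx.search(name))


det = select(r"_det\.nc$", files)          # deterministic run, no far-field correction
letkf = select(r"_det_letkf\.nc$", files)  # deterministic run with far-field correction
ens = select(r"_ens\.nc$", files)          # ensemble data

# An unescaped dot matches any character, so this pattern is looser than it looks.
loose = bool(re.search(r"_det.nc$", "AAA_10m_detXnc"))
```

Escaping the dot (`\.`) keeps the pattern from matching unintended names, which is the form shown in the signature of data_from_config.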

get_data(coordinates_only=False, return_flags=False)

Provide filtered data as defined in configuration.

Parameters:

coordinates_only (bool, default=False) – if true, return only the coordinates and drop all other data
return_flags (bool, default=False) – additionally return a time series of flags indicating, for each data point, whether it was used in the inversion and, if not, why it was excluded. In this case, results will not be filtered.

Yields:

xr.Dataset – datasets for each matching station and sampling height. Files are sorted alphabetically. Data are filtered unless return_flags is true.
xr.DataArray – only if return_flags: flag for each observation data point

Return type:

Generator[xarray.Dataset | tuple[xarray.Dataset, xarray.DataArray], None, None]
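
Since the generator yields either single datasets or (dataset, flags) pairs depending on return_flags, the consuming loop has to match. The sketch below shows both patterns with a hypothetical stand-in generator that yields plain Python objects instead of xarray objects. The station names and flag values are invented, and the mapping from integers to names assumes list positions in INVERSION_FLAGS.

```python
from collections import Counter

# Only the entries visible in the truncated INVERSION_FLAGS listing.
FLAG_NAMES = ["used", "ignored: time range", "ignored: season", "ignored: time of day"]


def fake_get_data(return_flags=False):
    """Hypothetical stand-in for get_data(): dicts and lists, not xarray objects."""
    stations = {"BBB_50m": [1, 0, 0], "AAA_10m": [0, 0, 3]}
    for name in sorted(stations):  # alphabetical order, as documented
        dataset = {"ssh": name}
        if return_flags:
            yield dataset, stations[name]  # pair: data plus one flag per point
        else:
            yield dataset


# Without flags: each item is a single dataset.
names = [ds["ssh"] for ds in fake_get_data()]

# With flags: each item is a pair that must be unpacked.
summary = Counter()
for ds, flags in fake_get_data(return_flags=True):
    summary.update(FLAG_NAMES[f] for f in flags)
```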

filter_ds(ds, coordinates_only=False, return_flags=False)

Filter data in dataset, see get_data().

Parameters:

ds (xarray.Dataset)
coordinates_only (bool)
return_flags (bool)

Return type:

xarray.Dataset | tuple[xarray.Dataset, xarray.DataArray] | None

dubfi.fluxes.readobs.data_from_config(config, coordinates_only=False, suffix_rx='_det\\.nc$', return_flags=False)

Provide filtered data as defined in configuration.

Parameters:

config (str | dict) – configuration or path to configuration file (YAML)
coordinates_only (bool, default=False) – if true, return only the coordinates and drop all other data
suffix_rx (str, default=r"_det.nc$") – regular expression for the input file suffix. Use this to select the deterministic run without (_det.nc) or with (_det_letkf.nc) far-field correction, or the ensemble data (_ens.nc).
return_flags (bool, default=False) – additionally return a time series of flags indicating, for each data point, whether it was used in the inversion and, if not, why it was excluded

Yields:

xr.Dataset – filtered datasets for each matching station and sampling height. Files are sorted alphabetically.
xr.DataArray – only if return_flags: flag for each observation data point

Return type:

Generator[xarray.Dataset | tuple[xarray.Dataset, xarray.DataArray], None, None]

Changed in version 0.1.1.