dubfi.fluxes.readobs

Read and filter observations as defined in configuration.

Added in version 0.1.0: (initial release)

Changed in version 0.1.1.

Attributes

INVERSION_FLAGS

List of flags indicating filtering of observations for the inversion.

Classes

ReadObs

Object that parses configuration and reads data.

Functions

get_flag(key)

Get integer representation of inversion flag name, see INVERSION_FLAGS.

coordinates_from_config(config, **kwargs)

Gather coordinates from input data, applying filtering as defined in config.

average_window(da, window[, min_diff])

Average data array over rolling time window.

data_from_config(config[, coordinates_only, ...])

Provide filtered data as defined in configuration.

Module Contents

dubfi.fluxes.readobs.INVERSION_FLAGS = ['used', 'ignored: time range', 'ignored: season', 'ignored: time of day', 'ignored: wind...

List of flags indicating filtering of observations for the inversion.

dubfi.fluxes.readobs.get_flag(key)

Get integer representation of inversion flag name, see INVERSION_FLAGS.

Added in version 0.1.1.

Parameters:

key (str)

Return type:

int
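As a minimal sketch, assuming that the integer representation is simply the position of the flag name in INVERSION_FLAGS (this reference does not confirm that), the lookup can be pictured as follows. FLAGS_SKETCH and get_flag_sketch are hypothetical stand-ins, and only the four flag names that appear in full in the INVERSION_FLAGS listing are included.

```python
# Hypothetical stand-in for the first entries of INVERSION_FLAGS;
# the full list is truncated in this reference.
FLAGS_SKETCH = [
    "used",
    "ignored: time range",
    "ignored: season",
    "ignored: time of day",
]


def get_flag_sketch(key: str) -> int:
    """Return the position of ``key`` in FLAGS_SKETCH (assumed mapping)."""
    try:
        return FLAGS_SKETCH.index(key)
    except ValueError:
        raise KeyError(f"unknown inversion flag: {key!r}") from None


used = get_flag_sketch("used")
season = get_flag_sketch("ignored: season")
```

Under this assumption, "used" maps to 0, so any non-zero flag marks an observation that was excluded from the inversion.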

dubfi.fluxes.readobs.coordinates_from_config(config, **kwargs)

Gather coordinates from input data, applying filtering as defined in config.

Parameters:
  • config (str | dict) – configuration, or path to a configuration file (YAML)

  • **kwargs (any) – passed on to data_from_config()

Returns:

coordinates – dictionary with entries:

  • ssh: array of station and sampling height identifiers

  • time: list of time arrays, aligned with ssh

  • lon: array of station longitudes (degrees), aligned with ssh

  • lat: array of station latitudes (degrees), aligned with ssh

  • height: array of station heights (meters), aligned with ssh

  • flux_cat: array of flux category names

  • ens_size: number of meteorological ensemble members (integer)

Return type:

dict
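Because several entries are documented as aligned with ssh, a caller may want to verify that before indexing them together. The following is a minimal sketch of such a check; check_alignment is a hypothetical helper, not part of dubfi, and all values in the example dictionary are placeholders chosen for illustration.

```python
def check_alignment(coordinates):
    """Raise ValueError if a per-station entry differs in length from ``ssh``."""
    n = len(coordinates["ssh"])
    for key in ("time", "lon", "lat", "height"):
        if len(coordinates[key]) != n:
            raise ValueError(
                f"{key!r} has {len(coordinates[key])} entries, expected {n}"
            )
    return n


# Hypothetical placeholder values, for illustration only.
example = {
    "ssh": ["STA_010", "STA_050"],
    "time": [["2021-01-01T12"], ["2021-01-01T12", "2021-01-01T13"]],
    "lon": [8.0, 8.0],
    "lat": [50.0, 50.0],
    "height": [10.0, 50.0],
    "flux_cat": ["cat_a"],
    "ens_size": 1,
}
n_series = check_alignment(example)
```

Note that time is a list of arrays, one per ssh entry, so the individual time arrays may differ in length from each other.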

dubfi.fluxes.readobs.average_window(da, window, min_diff=np.timedelta64(1, 'h'))

Average data array over rolling time window.

Parameters:
  • da (xr.DataArray) – data array that shall be averaged. This must have a dimension and coordinate “time” of dtype np.datetime64. This time coordinate must be sorted and the minimum distance between the coordinate values must be at least min_diff.

  • window (np.timedelta64) – time window for averaging. result[i] is the mean of da[j] for all j such that abs(da.time[i] - da.time[j]) < window.

  • min_diff (np.timedelta64, default=1h) – lower bound on the distance between consecutive coordinate values in da.time. If the provided value is larger than the actual minimum distance, results will be wrong. If it is smaller than necessary, performance will be worse.

Returns:

avg_da – da averaged over rolling time window. All dimensions and coordinates will be the same as in da. Internal order of data in memory may differ from da.

Return type:

xr.DataArray
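The definition of the window (result[i] is the mean of da[j] for all j with abs(da.time[i] - da.time[j]) < window) can be written out as a brute-force reference in NumPy. This is a sketch for one-dimensional data only and is not the library's implementation; average_window_reference is a hypothetical name. It compares all pairs of time steps and therefore does not need min_diff, which the documented function presumably uses to limit the search range.

```python
import numpy as np


def average_window_reference(values, times, window):
    """Brute-force rolling mean: out[i] averages values[j] with |t_i - t_j| < window."""
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype="datetime64[ns]")
    # Pairwise absolute time differences, shape (n, n).
    diff = np.abs(times[:, None] - times[None, :])
    # Strict inequality, as in the parameter description of ``window``.
    mask = diff < window
    # Row i of the mask selects the time steps that contribute to out[i].
    return (mask * values[None, :]).sum(axis=1) / mask.sum(axis=1)


# Four hourly samples, averaged with a two-hour window.
times = np.datetime64("2021-01-01T00") + np.arange(4) * np.timedelta64(1, "h")
values = [1.0, 2.0, 3.0, 4.0]
result = average_window_reference(values, times, np.timedelta64(2, "h"))
# result is [1.5, 2.0, 3.0, 3.5]
```

With hourly data and a two-hour window, each interior point is averaged with its direct neighbours only, because the inequality is strict and samples exactly two hours away are excluded.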

class dubfi.fluxes.readobs.ReadObs(config, suffix_rx)

Object that parses configuration and reads data.

Added in version 0.1.1.

Provide filtered data as defined in configuration.

Parameters:
  • config (str | dict) – configuration, or path to a configuration file (YAML)

  • suffix_rx (str, default=r"_det.nc$") – regular expression for the input file suffix. Use this to select the deterministic run without (_det.nc) or with (_det_letkf.nc) far-field correction, or the ensemble data (_ens.nc).
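The effect of the three suffix patterns can be checked with Python's re module. This is a sketch under the assumption that the pattern is searched against the file name; how dubfi applies suffix_rx internally is not stated here. The file names and the select helper are hypothetical, only the suffixes come from the parameter description.

```python
import re

# Hypothetical file names; only the suffixes are taken from the documentation.
FILES = ["site_a_det.nc", "site_a_det_letkf.nc", "site_a_ens.nc"]


def select(files, suffix_rx):
    """Return the file names whose end matches the regular expression."""
    pattern = re.compile(suffix_rx)
    return [name for name in files if pattern.search(name)]


deterministic = select(FILES, r"_det\.nc$")      # without far-field correction
with_letkf = select(FILES, r"_det_letkf\.nc$")   # with far-field correction
ensemble = select(FILES, r"_ens\.nc$")           # ensemble data
```

Each pattern selects exactly one of the three files. Escaping the dot (\.) keeps the pattern from matching an arbitrary character in that position.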

get_data(coordinates_only=False, return_flags=False)

Provide filtered data as defined in configuration.

Parameters:
  • coordinates_only (bool, default=False) – if true, return only the coordinates and drop all other data

  • return_flags (bool, default=False) – additionally return a time series of flags stating, for each data point, whether it was used in the inversion and, if not, why it was ignored. In this case, results are not filtered.

Yields:
  • xr.Dataset – datasets for each matching station and sampling height. Files are sorted alphabetically. Data are filtered unless return_flags is true.

  • xr.DataArray – only if return_flags: flag for each observation data point

Return type:

Generator[xarray.Dataset | tuple[xarray.Dataset, xarray.DataArray], None, None]
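Since the generator yields either a dataset or a (dataset, flags) tuple depending on return_flags, consuming code that should work in both modes can normalize each item first. The helper below is a hypothetical sketch, not part of dubfi, and uses plain strings as stand-ins for xarray.Dataset and xarray.DataArray so that it runs without input files.

```python
def split_item(item):
    """Return (dataset, flags); flags is None when only a dataset was yielded."""
    if isinstance(item, tuple):
        dataset, flags = item
        return dataset, flags
    return item, None


# Stand-in objects replace xarray.Dataset / xarray.DataArray here.
plain = split_item("dataset")
flagged = split_item(("dataset", "flags"))
```

In real use, each item produced by get_data() would be passed through such a helper inside the loop over the generator.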

filter_ds(ds, coordinates_only=False, return_flags=False)

Filter data in dataset, see get_data().

Parameters:
  • ds (xarray.Dataset)

  • coordinates_only (bool)

  • return_flags (bool)

Return type:

xarray.Dataset | tuple[xarray.Dataset, xarray.DataArray] | None

dubfi.fluxes.readobs.data_from_config(config, coordinates_only=False, suffix_rx='_det\\.nc$', return_flags=False)

Provide filtered data as defined in configuration.

Parameters:
  • config (str | dict) – configuration, or path to a configuration file (YAML)

  • coordinates_only (bool, default=False) – if true, return only the coordinates and drop all other data

  • suffix_rx (str, default=r"_det\.nc$") – regular expression for the input file suffix. Use this to select the deterministic run without (_det.nc) or with (_det_letkf.nc) far-field correction, or the ensemble data (_ens.nc).

  • return_flags (bool, default=False) – additionally return a time series of flags stating, for each data point, whether it was used in the inversion and, if not, why it was ignored

Yields:
  • xr.Dataset – filtered datasets for each matching station and sampling height. Files are sorted alphabetically.

  • xr.DataArray – only if return_flags: flag for each observation data point

Return type:

Generator[xarray.Dataset | tuple[xarray.Dataset, xarray.DataArray], None, None]

Changed in version 0.1.1.