dubfi.fluxes.dataprovider_mpi_worker

Flux inversion data interface for MPI worker process.

Changed in version 0.1.2: (renamed module)

Added in version 0.1.0: (initial release)

Classes

MpiDistMecReaderWorker

Read configuration and observation data for flux inversion.

Functions

_count_same_site_obs(ssh_lst, ssh_idcs, time)

Count observations at same time and station.

Module Contents

class dubfi.fluxes.dataprovider_mpi_worker.MpiDistMecReaderWorker

Bases: dubfi.fluxes.dataprovider.InsituDataProvider

Read configuration and observation data for flux inversion.

Trivial initialization function: only declares attributes.

classmethod fromconfig()

Construct instance based on configuration file and data in files.

property config: dict

Inversion configuration.

Return type:

dict

property coords: dict

Data coordinates, see dubfi.fluxes.readobs.coordinates_from_config().

Return type:

dict

read_config(cfg_path)

Read configuration from file.

Parameters:

cfg_path (str) – path to configuration (YAML) file

Return type:

None

read_data()

Read data from files.

Note

Data are read and interpreted without checking the units.

get_Y()

Get Y vector (observation minus model prior).

Return type:

dubfi.linalg.mpi_worker.MpiVectorWorker

get_H()

Get H parametrized vector (observation operator).

Return type:

dubfi.linalg.mpi_worker.MpiLinParamVectorWorker

get_R()

Get R parametrized operator (error covariance matrix).

Return type:

dubfi.linalg.mpi_worker.MpiDensePostRWorker

dubfi.fluxes.dataprovider_mpi_worker._count_same_site_obs(ssh_lst, ssh_idcs, time)

Count observations at same time and station.

Count the number of observations at the same station and time, irrespective of the sampling height. Return the counts as an array aligned with the observations.

This is a helper function for MpiDistMecReaderWorker.read_data().

Scientific reasoning: Multiple observations at the same station and time have strongly correlated model uncertainties. The inversion will assume that the model-data mismatches at different sampling heights should agree up to the baseline uncertainty. This will in general underestimate the representativity error. Furthermore, it will give more weight to stations with more sampling heights. Both problems are mitigated by increasing the baseline uncertainty when multiple sampling heights are in use.

Parameters:
  • ssh_lst (list[str])

  • ssh_idcs (numpy.ndarray)

  • time (numpy.ndarray)

Return type:

numpy.ndarray
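
Example

The counting described above can be illustrated with a minimal sketch. This is not the library's implementation: the function name count_same_site_obs_sketch and its station_ids argument (one station label per observation) are hypothetical, because the encoding of ssh_lst and ssh_idcs is not documented here. The sketch assumes that the station of each observation has already been extracted, so that observations differing only in sampling height share the same label.

```python
import numpy as np


def count_same_site_obs_sketch(station_ids, time):
    """Count observations sharing the same station and time (illustrative only).

    station_ids: hypothetical input, one station label per observation.
    time: one timestamp per observation, same length as station_ids.
    Returns an integer array aligned with the observations.
    """
    station_ids = np.asarray(station_ids)
    time = np.asarray(time)
    if station_ids.shape != time.shape:
        raise ValueError("station_ids and time must have the same shape")
    if station_ids.size == 0:
        return np.zeros(0, dtype=int)

    # Encode stations and times as integer codes.
    _, station_code = np.unique(station_ids, return_inverse=True)
    _, time_code = np.unique(time, return_inverse=True)

    # Combine both codes into one unique key per (station, time) pair.
    key = station_code * (time_code.max() + 1) + time_code

    # Count how often each key occurs and map the counts back
    # onto the original observation order.
    _, inverse, counts = np.unique(key, return_inverse=True, return_counts=True)
    return counts[inverse]


# Two observations at station "A" and two at station "B" share time 0;
# one further observation at station "A" is taken at time 1.
print(count_same_site_obs_sketch(["A", "A", "B", "A", "B"], [0, 0, 0, 1, 0]))
# [2 2 2 1 2]
```

An observation with a count greater than one shares its station and time with at least one other sampling height; these are the cases in which the baseline uncertainty is increased.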