Software Structure

This program provides an abstraction layer for linear algebra operations and an inversion system working with that abstraction layer.

Linear Algebra

The modules dubfi.linalg.types and dubfi.linalg.generic provide abstract types for vectors, linear operators, and parametrized vectors or linear operators. The inversion operates on this abstraction layer, i.e., it can use any implementation of the linear algebra.
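The idea of coding against an abstraction layer can be illustrated with a minimal sketch. All class and function names below (Vector, Operator, DenseVector, DenseOperator, quadratic_form) are hypothetical and chosen for this example only; they are not the actual interface of dubfi.linalg.types or dubfi.linalg.generic.

```python
from abc import ABC, abstractmethod

import numpy as np


class Vector(ABC):
    """Hypothetical abstract vector (not the real dubfi.linalg.types API)."""

    @abstractmethod
    def dot(self, other: "Vector") -> float:
        """Return the inner product with another vector."""


class Operator(ABC):
    """Hypothetical abstract linear operator."""

    @abstractmethod
    def apply(self, vec: Vector) -> Vector:
        """Return the result of applying the operator to a vector."""


class DenseVector(Vector):
    """One possible implementation, backed by a NumPy array."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def dot(self, other):
        return float(self.data @ other.data)


class DenseOperator(Operator):
    """One possible implementation, backed by a dense NumPy matrix."""

    def __init__(self, matrix):
        self.matrix = np.asarray(matrix, dtype=float)

    def apply(self, vec):
        return DenseVector(self.matrix @ vec.data)


def quadratic_form(op: Operator, vec: Vector) -> float:
    """Compute v^T A v using only the abstract interface."""
    return vec.dot(op.apply(vec))
```

Because quadratic_form only calls dot and apply, it would work unchanged with any other implementation of the two abstract classes, for example a distributed one.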

Different implementations may be required when creating linear algebra objects from data. In particular, the MPI-parallelized linear algebra uses special tools (in dubfi.fluxes.dataprovider_mpi_worker and dubfi.fluxes.dataprovider_mpi) to read data directly in the worker processes.

The implementations of the linear algebra interface are currently incomplete and focus on those parts required for the inversion.

Generic Inversion

The equations solved by the inversion are described in detail in the scientific documentation. The inversion mainly consists of optimizing a cost function. The cost function, its gradient, and its Hessian matrix are implemented in dubfi.inversion.inversion using the abstract linear algebra interface.
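As an illustration only, the following sketch shows a cost function with its gradient and Hessian for a generic Gaussian (quadratic) inversion problem, written with plain NumPy arrays. The quadratic form and all names (cost, gradient, hessian, newton_step, x0, B_inv, H, y, R_inv) are assumptions made for this example; the cost function actually implemented in dubfi.inversion.inversion is defined in the scientific documentation and may contain further terms.

```python
import numpy as np


def cost(x, x0, B_inv, H, y, R_inv):
    """Assumed cost: 0.5*(x-x0)^T B^-1 (x-x0) + 0.5*(y-Hx)^T R^-1 (y-Hx)."""
    dx = x - x0
    r = y - H @ x
    return 0.5 * dx @ B_inv @ dx + 0.5 * r @ R_inv @ r


def gradient(x, x0, B_inv, H, y, R_inv):
    """Gradient of the assumed cost with respect to x."""
    return B_inv @ (x - x0) - H.T @ R_inv @ (y - H @ x)


def hessian(B_inv, H, R_inv):
    """Hessian of the assumed cost (constant, since the cost is quadratic)."""
    return B_inv + H.T @ R_inv @ H


def newton_step(x, x0, B_inv, H, y, R_inv):
    """One Newton step; for a quadratic cost this reaches the minimum."""
    g = gradient(x, x0, B_inv, H, y, R_inv)
    return x - np.linalg.solve(hessian(B_inv, H, R_inv), g)
```

For a cost function that is not quadratic, the Hessian depends on x and a single Newton step is no longer sufficient; an iterative optimizer would then be needed.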

Application to real data

Applying the inversion to real data requires:

  1. the a priori model-observation mismatch as dubfi.linalg.types.AbstractVector

  2. the observation operator as dubfi.linalg.generic.ParametrizedVector

  3. the model-observation error covariance matrix as dubfi.linalg.generic.ParametrizedOperator

In dubfi.fluxes, different modules create these objects for different linear algebra implementations. dubfi.fluxes.core provides a common interface for creating the linear algebra objects, running the inversion on these objects, and saving the result.

MPI worker processes

The MPI linear algebra implementation uses multiple processors with distributed memory. To implement the generic linear algebra interface described above, the vector space is split into overlapping chunks that are assigned to separate processes.

The separate treatment of chunks is based on the fundamental assumption that all linear operators involved in the inversion are approximately diagonal and can be represented by overlapping block matrices along the diagonal. Each chunk of the vector space corresponds to one of these blocks. Instead of exchanging data between the worker processes as required when working with generic matrices, the worker processes operate independently and only send data to the main (or parent) process.
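The following sketch illustrates this idea for a matrix-vector product, using a serial loop in place of worker processes. It assumes a banded matrix (all elements farther than `overlap` from the diagonal are zero) and a simple equal-size chunking; the function names make_chunks and chunked_matvec are hypothetical, and the way DUBFI actually chooses and distributes chunks may differ.

```python
import numpy as np


def make_chunks(n, n_chunks, overlap):
    """Split range(n) into chunks; return (core, extended) slices per chunk.

    The core slices partition range(n). Each extended slice adds up to
    `overlap` indices on both sides, so neighbouring chunks overlap.
    """
    edges = [int(e) for e in np.linspace(0, n, n_chunks + 1)]
    chunks = []
    for start, stop in zip(edges[:-1], edges[1:]):
        ext = slice(max(0, start - overlap), min(n, stop + overlap))
        chunks.append((slice(start, stop), ext))
    return chunks


def chunked_matvec(A, x, chunks):
    """Compute A @ x chunk by chunk, as independent workers could do."""
    y = np.empty_like(x)
    for core, ext in chunks:
        # Diagonal block and vector chunk that a single worker would hold.
        block = A[ext, ext]
        local = block @ x[ext]
        # Keep only the rows whose full band lies inside the block.
        offset = core.start - ext.start
        y[core] = local[offset:offset + (core.stop - core.start)]
    return y
```

If the matrix is exactly banded with bandwidth `overlap`, the chunked result equals the full product `A @ x`. If small elements exist outside the band, they are ignored, which corresponds to the approximation described above.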

The assumption of approximately diagonal matrices is based on the localization used in the inversion, which suppresses long-range correlations. Matrix elements far off the diagonal represent correlations between observations that should not be correlated. This approximation is therefore physically motivated rather than mathematically derived.

The worker processes receive instructions from the parent process. Details can be found in dubfi.linalg.mpi_worker_main.