Linear Algebra Implementations
==============================

DUBFI provides a linear algebra interface, see :py:mod:`dubfi.linalg`.

Dense matrices
--------------

This implementation uses only numpy arrays. It focuses on simplicity and can achieve acceptable performance for very small datasets, but it quickly becomes inefficient and requires large amounts of memory for larger datasets. It serves as a reference implementation against which the other implementations can be compared: it aims to be simple, readable, and reliable rather than fast.

:term:`CSC` sparse matrices
---------------------------

This implementation uses the compressed sparse column representation of matrices. If the matrices are sparse, this reduces memory requirements compared to dense matrices. Its performance is limited, but it remains numerically exact.

:term:`MPI` with dense matrices
-------------------------------

This default implementation is optimized for solving the inversion problem using many parallel worker processes. It offers good performance but requires careful configuration (entry :code:`segment_buffer`) to make sure that the results are correct.

The :term:`MPI` parallelization is based on a central approximation: matrix elements that are approximately zero in the localization matrix (used for constructing the R matrix) are treated as negligible in all square matrices in observation space. This is physically motivated by the assumption that there are no long-range correlations in space or time. Based on this approximation, the observation space is split into chunks that are processed independently of each other. The linear algebra implementation using :term:`MPI` parallelization is therefore not numerically exact, but it scales to large observation datasets while preserving good performance.

The linear algebra framework can usually be used independently of the chosen implementation. However, care must be taken when allowing the use of the :term:`MPI` parallelization, which requires worker processes. The user must define how many worker processes are created (environment variable :code:`MPI_WORKERS`) and ensure that the worker processes end when the main process ends (call :py:func:`dubfi.quit_clean` in the main process). A minimal sketch of such a main program is shown below.
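
The following sketch illustrates one way to wire this up. Only :code:`MPI_WORKERS` and :py:func:`dubfi.quit_clean` are the entry points described above; the inversion call itself is a hypothetical placeholder, and the assumption that the environment variable must be set before importing DUBFI may not match your setup.

.. code-block:: python

    import os

    # Request the number of MPI worker processes.
    # Assumption: the variable must be set before DUBFI is imported/initialized.
    os.environ.setdefault("MPI_WORKERS", "4")

    import dubfi


    def main():
        try:
            # ... set up and solve the inversion problem here (hypothetical placeholder) ...
            pass
        finally:
            # Ensure the worker processes terminate together with the main process.
            dubfi.quit_clean()


    if __name__ == "__main__":
        main()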