Linear Algebra Implementations

DUBFI provides a linear algebra interface with several interchangeable implementations; see dubfi.linalg.

Dense matrices

This implementation uses only numpy arrays. It focuses on simplicity and performs acceptably for very small datasets. For larger datasets it quickly becomes inefficient and requires huge amounts of memory.
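
To give a rough sense of how quickly memory grows, the following sketch estimates the size of a single dense square matrix in observation space. It assumes 64-bit floats and one matrix of shape (n_obs, n_obs); the function name dense_matrix_bytes is hypothetical and not part of DUBFI.

```python
import numpy as np


def dense_matrix_bytes(n_obs, dtype=np.float64):
    """Return the memory in bytes of one dense n_obs x n_obs matrix."""
    return n_obs * n_obs * np.dtype(dtype).itemsize


# Memory grows quadratically with the number of observations.
for n_obs in (1_000, 10_000, 100_000):
    gib = dense_matrix_bytes(n_obs) / 2**30
    print(f"{n_obs:>7} observations: {gib:8.3f} GiB per matrix")
```

Under these assumptions, one matrix for 100 000 observations already needs roughly 74.5 GiB, before counting any temporary copies made during the computation.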

This implementation serves as a reference against which other implementations can be compared. It aims to be simple, readable and reliable rather than fast.

CSC sparse matrices

This implementation uses the compressed sparse column (CSC) representation of sparse matrices. Compared to dense matrices, this reduces memory requirements when most matrix elements are zero. This implementation has limited performance but remains numerically exact.
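
As an illustration of the storage format only, the sketch below builds the three arrays of a compressed-column representation with plain numpy and uses them for a matrix-vector product. The helper names dense_to_csc and csc_matvec are hypothetical; this is not how DUBFI implements its sparse backend, which the text above does not describe in detail.

```python
import numpy as np


def dense_to_csc(a):
    """Return (data, indices, indptr) describing the 2-D array a in CSC form."""
    data, indices, indptr = [], [], [0]
    for col in a.T:  # a.T iterates over the columns of a
        rows = np.flatnonzero(col)  # row indices of nonzero entries
        indices.extend(rows)
        data.extend(col[rows])
        indptr.append(len(indices))  # end of this column in data/indices
    return (
        np.array(data, dtype=a.dtype),
        np.array(indices, dtype=np.int32),
        np.array(indptr, dtype=np.int32),
    )


def csc_matvec(data, indices, indptr, x, n_rows):
    """Compute y = A @ x for a matrix A stored in CSC form."""
    y = np.zeros(n_rows, dtype=np.result_type(data, x))
    for j in range(len(indptr) - 1):
        lo, hi = indptr[j], indptr[j + 1]
        # Row indices within one column are unique, so += is safe here.
        y[indices[lo:hi]] += data[lo:hi] * x[j]
    return y


a = np.array([[4.0, 0.0, 0.0],
              [0.0, 0.0, 5.0],
              [7.0, 0.0, 6.0]])
data, indices, indptr = dense_to_csc(a)
x = np.array([1.0, 2.0, 3.0])
y = csc_matvec(data, indices, indptr, x, n_rows=a.shape[0])
```

Only the nonzero values (data), their row indices (indices) and one offset per column (indptr) are stored, so the memory cost scales with the number of nonzero elements instead of the full matrix size. scipy.sparse.csc_matrix exposes arrays with the same three names.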

MPI with dense matrices

This is the default implementation. It is designed for solving the inversion problem with many parallel worker processes and offers good performance, but it requires careful configuration (entry segment_buffer) to ensure that the results are correct.

The MPI parallelization is based on a central approximation: matrix elements that are approximately zero in the localization matrix (used for constructing the R matrix) are negligible in all square matrices in observation space. This is physically motivated by the assumption that there are no long-range correlations in space or time. Based on this approximation, the observation space is split into chunks that are processed independently of each other. The linear algebra implementation using MPI parallelization is thus not numerically exact, but it scales to large observation datasets while maintaining good performance.
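
The idea behind this approximation can be sketched with plain numpy. The example below is a toy model and not DUBFI's algorithm: it assumes two groups of observations that are far apart in time, an exponential correlation model, a hard cutoff as localization, and a hypothetical helper chunked_solve. It does not model MPI or the segment_buffer setting.

```python
import numpy as np


def chunked_solve(matrix, rhs, chunks):
    """Solve matrix @ x = rhs chunk by chunk, ignoring cross-chunk elements."""
    x = np.empty_like(rhs)
    for idx in chunks:
        block = matrix[np.ix_(idx, idx)]  # square sub-matrix of this chunk
        x[idx] = np.linalg.solve(block, rhs[idx])
    return x


rng = np.random.default_rng(0)

# Two groups of five observations each, separated by a long time gap.
t = np.concatenate([np.linspace(0.0, 1.0, 5), np.linspace(100.0, 101.0, 5)])
dist = np.abs(t[:, None] - t[None, :])

corr = np.exp(-dist)                    # assumed correlation model
loc = np.where(dist < 10.0, 1.0, 0.0)   # assumed localization: hard cutoff
noise = 0.1 * np.eye(len(t))            # keeps the matrices well conditioned

rhs = rng.normal(size=len(t))
chunks = [np.arange(0, 5), np.arange(5, 10)]

x_exact = np.linalg.solve(corr + noise, rhs)               # no localization
x_chunked = chunked_solve(corr * loc + noise, rhs, chunks)  # independent chunks

max_diff = np.max(np.abs(x_exact - x_chunked))
```

In this toy setup the neglected cross-chunk elements are of order exp(-99), so both solutions agree to within floating-point precision. When chunks are not well separated relative to the correlation length, the neglected elements are no longer negligible, which is why the configuration needs care.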

The linear algebra framework can usually be used without regard to the underlying implementation. However, care must be taken when the MPI parallelization is enabled, because it requires worker processes. The user must define how many worker processes are created (environment variable MPI_WORKERS) and ensure that the worker processes end when the main process ends (call dubfi.quit_clean() in the main process).
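
One way to make sure the cleanup always happens is a try/finally wrapper, sketched below. The wrapper run_with_workers and its parameters are hypothetical; only the environment variable MPI_WORKERS and the function dubfi.quit_clean() come from the text above. The cleanup function is passed in as an argument so that the sketch runs without DUBFI installed.

```python
import os


def run_with_workers(task, quit_clean, n_workers=4):
    """Run task() with MPI_WORKERS set, then always call quit_clean()."""
    # setdefault keeps a value that was already exported in the shell.
    os.environ.setdefault("MPI_WORKERS", str(n_workers))
    try:
        return task()
    finally:
        # Runs on normal return and when task() raises an exception.
        quit_clean()
```

In a real script, quit_clean would be dubfi.quit_clean and task would be the function that runs the inversion. Whether MPI_WORKERS is read when dubfi is imported or later is not stated here, so exporting the variable in the shell before starting Python is the safer choice.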