dubfi.inversion.inversion

Optimization problem for Bayesian inversion.

Changed in version 0.1.3: added InvertorOptimizerPositive

Added in version 0.1.0: initial release

Classes

InversionResult

InversionResult produced by Invertor.

Invertor

Abstract general inversion base class.

InvertorKalman

Kalman smoother.

InvertorOptimizer

Numerical solver for flux inversion problem.

InvertorOptimizerPositive

Numerical solver for flux inversion problem asserting positive scaling factors.

Module Contents

class dubfi.inversion.inversion.InversionResult

Bases: namedtuple('InversionResult', ['s_post', 'b_post', 'sensitivity', 'averaging_kernel', 'mdm_post_eigenbasis', 'mdm_post_stdev_eigenbasis'], defaults=[None, None, None, None], module=__name__)

InversionResult produced by Invertor.

s_post

posterior scaling factors

Type:

np.ndarray

b_post

error covariance of s_post

Type:

np.ndarray

sensitivity

sensitivity of posterior to individual observations: derivative of s_post w.r.t. observations, array of shape (flux_cat, obs)

Type:

np.ndarray | None

averaging_kernel

averaging kernel: derivative of posterior scaling factors w.r.t. true scaling factors, array of shape (flux_cat, flux_cat)

Type:

np.ndarray | None

mdm_post_eigenbasis

model-observation mismatch in eigenbasis of R (or similar dataset), one-dimensional array aligned with mdm_post_stdev_eigenbasis

Type:

np.ndarray | None

mdm_post_stdev_eigenbasis

sqrt of eigenvalues of R (or similar dataset), one-dimensional array aligned with mdm_post_eigenbasis

Type:

np.ndarray | None

Changed in version 0.1.3.

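As a quick illustration of the field layout above, the following sketch rebuilds an equivalent named tuple and shows that only the first two fields are required (the four diagnostic fields default to None). This is a stand-in mirroring the definition, not the dubfi class itself.

```python
from collections import namedtuple

import numpy as np

# Stand-in mirroring the InversionResult definition above;
# the real class lives in dubfi.inversion.inversion.
InversionResult = namedtuple(
    "InversionResult",
    ["s_post", "b_post", "sensitivity", "averaging_kernel",
     "mdm_post_eigenbasis", "mdm_post_stdev_eigenbasis"],
    defaults=[None, None, None, None],
)

# Only s_post and b_post are required; diagnostics stay None unless
# requested (e.g. via the output options of InvertorOptimizer.solve()).
result = InversionResult(s_post=np.array([1.1, 0.9]), b_post=np.eye(2) * 0.01)
```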

class dubfi.inversion.inversion.Invertor(y, prior, h, b)

Bases: abc.ABC

Abstract general inversion base class.

Parameters:
  • y (AbstractVector) – vector of observations (usually with the model far-field or background subtracted)

  • prior (np.ndarray) – a priori estimate for scaling factors

  • h (ParametrizedVector) – model equivalents depending on scaling factors

  • b (np.ndarray) – error covariance matrix of prior scaling factors

property prior: numpy.ndarray

Prior vector of scaling factors.

Return type:

numpy.ndarray

property b: numpy.ndarray

Prior error covariance matrix of scaling factors.

Return type:

numpy.ndarray

abstractmethod solve()

Solve inversion problem.

Returns:

named tuple, see InversionResult

Return type:

InversionResult

class dubfi.inversion.inversion.InvertorKalman(y, prior, h, b, r)

Bases: Invertor

Kalman smoother.

Analytically solve inversion problem under the assumption that uncertainties are independent of optimization parameters (fluxes).

Parameters:
  • y (AbstractVector) – vector of observations (usually with the model far-field or background subtracted)

  • prior (np.ndarray) – a priori estimate for scaling factors

  • h (ParametrizedVector) – model equivalents depending on scaling factors

  • b (np.ndarray) – error covariance matrix of prior scaling factors

  • r – error covariance matrix of the model-observation comparison (treated as constant)

solve()

Solve inversion problem.

Returns:

named tuple, see InversionResult

Return type:

InversionResult
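Under the constant-uncertainty assumption, the posterior that InvertorKalman computes is the textbook optimal-estimation update. The sketch below implements that standard formula with plain numpy arrays for a linear observation operator; it illustrates the mathematics, not the dubfi API (whose inputs are AbstractVector/ParametrizedVector objects).

```python
import numpy as np

def kalman_update(y, s0, H, B, R):
    """Analytic posterior for a linear model y ~ H @ s with Gaussian errors.

    Standard Kalman/optimal-estimation update; illustrative only.
    """
    # Kalman gain K = B H^T (H B H^T + R)^{-1}
    S = H @ B @ H.T + R
    K = B @ H.T @ np.linalg.inv(S)
    s_post = s0 + K @ (y - H @ s0)   # posterior scaling factors
    b_post = B - K @ H @ B           # posterior error covariance
    return s_post, b_post

# Two scaling factors, three noise-free observations
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 2))
s_true = np.array([1.2, 0.8])
y = H @ s_true
s_post, b_post = kalman_update(y, s0=np.ones(2), H=H,
                               B=np.eye(2), R=0.01 * np.eye(3))
```

With small observation errors relative to the prior, the posterior is drawn close to the true scaling factors.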

class dubfi.inversion.inversion.InvertorOptimizer(y, prior, h, b, r, norm_prefactor=0.0)

Bases: Invertor

Numerical solver for flux inversion problem.

Solve inversion problem by numerically minimizing the cost function, assuming that uncertainties may depend on the optimization parameters.

Solve inverse problem using full ensemble-estimated model uncertainty.

Parameters:
  • y (AbstractVector) – vector of observations (usually with the model far-field or background subtracted)

  • prior (np.ndarray) – a priori estimate for scaling factors

  • h (ParametrizedVector) – model equivalents depending on scaling factors

  • b (np.ndarray) – error covariance matrix of prior scaling factors

  • r (ParametrizedOperator) – error covariance matrix of model-observation comparison, depends on scaling factors

  • norm_prefactor (float, default=0.0) – prefactor of normalization term in cost function

property b: numpy.ndarray

Prior error covariance matrix of scaling factors.

Return type:

numpy.ndarray

property norm_prefactor: float

Normalization prefactor in the cost function.

Return type:

float

_check_before_solving()

Overwrite in inheriting classes to assert properties before solving.

Return type:

None

solve(*, method='trust-exact', hess='exact', tol=1e-06, output={}, **kwargs)

Solve the inverse problem: determine posterior scaling factors and uncertainties.

Parameters:
  • method (str, default="trust-exact") – solver method, passed to scipy.optimize.minimize(), see https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html

  • hess ({"exact", "2-point", "3-point", "cs"}) – method to choose for the Hessian matrix in the solver. “exact” means that the Hessian is computed analytically in each iteration. Other strings are passed to scipy.optimize.minimize() as approximation strategy. “cs” is not compatible with the MPI implementation.

  • tol (float, default=1e-6) – solver tolerance, passed to scipy.optimize.minimize()

  • output (dict[str, bool]) – define what shall be included in the output. Allowed keys: “sensitivity”, “mdm_eigenbasis_statistics”

  • **kwargs – solver-specific options, passed to scipy.optimize.minimize() as “options”

Returns:

named tuple, see InversionResult

Return type:

InversionResult
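Internally, solve() drives scipy.optimize.minimize with analytic derivatives. The following minimal sketch shows the equivalent scipy call on a stand-in quadratic cost: passing a Hessian callable corresponds to hess="exact" in the dubfi API, while the string strategies ("2-point", "3-point", "cs") would be forwarded to scipy instead.

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in quadratic cost J(s) = 1/2 (s - s*)^T A (s - s*); the real cost
# function is the one documented under cost() below.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
s_star = np.array([1.1, 0.9])

cost = lambda s: 0.5 * (s - s_star) @ A @ (s - s_star)
grad = lambda s: A @ (s - s_star)
hess = lambda s: A  # analytic Hessian, analogous to hess="exact"

res = minimize(cost, x0=np.ones(2), jac=grad, hess=hess,
               method="trust-exact", tol=1e-6)
```

The trust-exact method requires both the gradient and the Hessian, which is why the solver benefits from the analytic cost_grad() and cost_hess() documented below.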

kalman(return_chi2=False, return_sensitivity=False)

Kalman smoother: Compute posterior under assumption that R is constant.

Parameters:
  • return_chi2 (bool, default=False) – Compute chi^2 to estimate how assumed uncertainties agree with the considered data. This has almost no influence on the performance.

  • return_sensitivity (bool, default=False) – Compute and return the sensitivity of the results to each observation.

Returns:

s_post: np.ndarray

posterior

b_post: np.ndarray

posterior B matrix

chi2: float (only if return_chi2)

chi^2, deviation from assumed probability distributions

chi2_obs: float (only if return_chi2)

observation contribution to chi^2 (positive)

chi2_fit: float (only if return_chi2)

fit parameter contribution to chi^2 (negative)

ddof: int (only if return_chi2)

degrees of freedom in the observations. Ideally, one would expect that chi2 is approximately equal to ddof.

sensitivity: np.ndarray (only if return_sensitivity)

gradient of posterior scaling factors with respect to observations

Return type:

dict with entries

cost(s)

Cost function for solver.

\[J(s) = \tfrac{\alpha}{2} \log\{\det[R(s)]\} + \tfrac{1}{2} [y - H(s)]^\top R(s)^{-1} [y - H(s)] + J_\text{prior}(s)\]
  • \(J\) = cost function

  • \(J_\text{prior}(s) = -\log(P(s))\) = prior cost function, in default case \(J_\text{prior}(s) = \tfrac{1}{2} (s - s_0)^\top B^{-1} (s - s_0)\)

  • \(\alpha\) = norm_prefactor is a tuning parameter.

  • \(y\) = vector of observations.

  • \(H\) = observation operator.

  • \(s\) = vector of scaling factors.

  • \(s_0\) = a priori vector of scaling factors.

  • \(R(s)\) = error covariance matrix for the model-observation comparison. This matrix must be real-symmetric and positive definite.

  • \(B\) = a priori error covariance matrix of the scaling factors. This matrix must be real-symmetric and positive definite.

The cost function is defined such that the posterior probability for scaling factors \(s\) given observations \(y\) is \(P(s|y) \propto \exp[-J(s)]\), where \(J(s)\) implicitly depends on \(y\).

Parameters:

s (numpy.ndarray)

Return type:

numpy.float64
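The cost function above can be sketched directly in numpy. In this illustration, h(s) and r(s) are hypothetical callables standing in for the ParametrizedVector/ParametrizedOperator arguments; the real method uses the objects passed to the constructor.

```python
import numpy as np

def cost(s, y, h, r, s0, B_inv, alpha=0.0):
    """Evaluate J(s) as defined above (illustrative re-implementation)."""
    R = r(s)
    resid = y - h(s)
    # normalization term: alpha/2 * log det R(s), via slogdet for stability;
    # R must be real-symmetric and positive definite
    _, logdet = np.linalg.slogdet(R)
    j_norm = 0.5 * alpha * logdet
    # observation term: 1/2 (y - H(s))^T R^{-1} (y - H(s))
    j_obs = 0.5 * resid @ np.linalg.solve(R, resid)
    # default prior term: 1/2 (s - s0)^T B^{-1} (s - s0)
    j_prior = 0.5 * (s - s0) @ B_inv @ (s - s0)
    return j_norm + j_obs + j_prior

# Linear model with constant R: at s = (1, 2) the residual vanishes,
# leaving only the prior term
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
val = cost(np.array([1.0, 2.0]), y,
           h=lambda s: H @ s, r=lambda s: np.eye(3),
           s0=np.ones(2), B_inv=np.eye(2))
```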

cost_prior(s)

Prior contribution to cost function.

\[J_\text{prior}(s) = \tfrac{1}{2} (s - s_0)^\top B^{-1} (s - s_0)\]

This is the first function called in cost().

Parameters:

s (numpy.ndarray)

Return type:

numpy.float64

cost_grad(s)

Gradient (Jacobian) of cost().

In the notation defined in cost():

\[\begin{split}\begin{aligned} \nabla_s J(s) = & \tfrac{\alpha}{2} \mathop{\mathrm{Tr}}_y \big(\tilde{R}_s^{-1} \nabla_s \tilde{R}_s\big) \\ &+ [H(s)-y]^\top \tilde{R}_s^{-1} \nabla_s H(s) \\ &- \tfrac12 [H(s)-y]^\top \tilde{R}_s^{-1} (\nabla_s \tilde{R}_s) \tilde{R}_s^{-1} [H(s)-y] \\ &+ \nabla_s J_\text{prior}(s) \end{aligned}\end{split}\]
Parameters:

s (numpy.ndarray)

Return type:

numpy.ndarray

cost_prior_grad(s)

Prior contribution to gradient of cost function.

\[\nabla_s J_\text{prior}(s) = (s - s_0)^\top B^{-1}\]

This is the first function called in cost_grad().

Parameters:

s (numpy.ndarray)

Return type:

numpy.ndarray

cost_hess(s, return_sensitivity=False)

Hessian of cost().

In the notation defined in cost() and using \(\partial_k=\tfrac{\partial}{\partial s_k}\):

\[\begin{split}\begin{aligned} \partial_k \partial_l J(s) = & \tfrac{\alpha}{2} \mathop{\mathrm{Tr}} \big( \tilde{R}_s^{-1} \partial_k\partial_l \tilde{R}_s\big) \\ &- \tfrac{\alpha}{2} \mathop{\mathrm{Tr}} \big[\tilde{R}_s^{-1}(\partial_k \tilde{R}_s)\tilde{R}_s^{-1}(\partial_l \tilde{R}_s)\big] \\ &+ [\partial_k H(s)]^\top \tilde{R}_s^{-1} \partial_l H(s) \\ &+ [H(s)-y]^\top \tilde{R}_s^{-1} \partial_k\partial_l H(s) \\ &- [H(s)-y]^\top \tilde{R}_s^{-1} (\partial_k \tilde{R}) \tilde{R}_s^{-1} \partial_l H(s) + (k\leftrightarrow l) \\ &- \tfrac12 [H(s)-y]^\top \tilde{R}_s^{-1} (\partial_k\partial_l \tilde{R}_s) \tilde{R}_s^{-1} [H(s)-y] \\ &+ \tfrac12 [H(s)-y]^\top \tilde{R}_s^{-1} (\partial_k \tilde{R}_s) \tilde{R}_s^{-1} (\partial_l \tilde{R}_s) \tilde{R}_s^{-1} [H(s)-y] + (k\leftrightarrow l) \\ &+ \partial_k \partial_l J_\text{prior}(s) \end{aligned}\end{split}\]

Optionally returned quantities are the posterior error covariance matrix of the scaling factors \(B^\text{post}\) (matrix inverse of the computed Hessian), the sensitivity defined as \(S=\nabla_y s^\text{post}\), and the averaging kernel \(A=(\nabla_s H)\cdot(\nabla_y s^\text{post})\) that approximates \(\frac{\partial s^\text{post}}{\partial s^\text{true}}\). When computing these quantities, it is assumed that the given \(s\) minimizes the cost function, i.e., \(\nabla_s J(s) = 0\).

Knowing that \(J(s)\) can be approximated by a quadratic form around its minimum and denoting \(s^\text{post}\equiv \mathop{\mathrm{arg\,min}}_s J(s)\), we can evaluate \(\nabla_y s^\text{post}\) using \(\nabla_y \nabla_s J|_{s^\text{post}} = -(\nabla_y s^\text{post})\cdot[\nabla_s \nabla_s J]_{s^\text{post}}\) (i.e., \(\partial_{y_i} \partial_{s_k} J|_{s^\text{post}} = - \sum_l (\partial_{y_i} s^\text{post}_l) [\partial_{s_l} \partial_{s_k} J]_{s^\text{post}} = - \sum_l (\partial_{y_i} s^\text{post}_l) (B^\text{post})^{-1}_{lk}\)). This leads to:

\[\begin{split}\begin{aligned} S_{ki} &= -\sum_l B^\text{post}_{kl} \frac{\partial^2}{\partial s_l \partial y_i} J \\ &= \sum_l B^\text{post}_{kl} e_i^\top \tilde{R}_s^{-1} \left\{ \partial_l H(s) + (\partial_l \tilde{R}_s) \tilde{R}_s^{-1} [y - H(s)] \right\}, \\ A_{kn} &= \sum_i S_{ki} \partial_n H(s)_i \\ &= \sum_l B^\text{post}_{kl} [\partial_n H(s)]^\top \tilde{R}_s^{-1} \left\{ \partial_l H(s) + (\partial_l \tilde{R}_s) \tilde{R}_s^{-1} [y - H(s)] \right\}. \end{aligned}\end{split}\]

with \(e_i\) denoting a unit vector in observation space.

Parameters:
  • s (numpy.ndarray)

  • return_sensitivity (bool)

Return type:

numpy.ndarray | tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]

cost_prior_hess(s)

Prior contribution to Hessian of cost function.

\[\partial_k \partial_l J_\text{prior}(s) = (B^{-1})_{lk}\]

This is the first function called in cost_hess().

Parameters:

s (numpy.ndarray)

Return type:

numpy.ndarray

check_derivatives(states=10, **kwargs)

Check whether the explicitly defined derivatives of cost() are correct.

Parameters:
  • states (int | array, default=10) – int: number of random states that shall be generated to check derivatives. array: array of scaling factors for which derivatives shall be checked. Each row of the two-dimensional array represents one vector of scaling factors (one state).

  • **kwargs – passed to dubfi.linalg.generic.check_derivatives()

Returns:

errors – number of errors

Return type:

int
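The same finite-difference idea can be reproduced with scipy.optimize.check_grad, comparing an analytic gradient against numerical differences at a number of random states. The quadratic cost below is a stand-in, not the dubfi cost function, and the error counting is an assumption about how such a check can be scored.

```python
import numpy as np
from scipy.optimize import check_grad

# Stand-in prior cost with its analytic gradient
B_inv = np.array([[2.0, 0.3], [0.3, 1.0]])
s0 = np.ones(2)

cost = lambda s: 0.5 * (s - s0) @ B_inv @ (s - s0)
grad = lambda s: B_inv @ (s - s0)

# Ten random states, analogous to states=10; count states where the
# finite-difference error exceeds a tolerance
rng = np.random.default_rng(42)
errors = sum(
    check_grad(cost, grad, rng.normal(size=2)) > 1e-5
    for _ in range(10)
)
```

A correct analytic gradient should yield zero errors here.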

class dubfi.inversion.inversion.InvertorOptimizerPositive(y, prior, h, b, r, norm_prefactor=0.0, min_s=1e-12, reg_cutoff=0.001, reg_factor=1e-09)

Bases: InvertorOptimizer

Numerical solver for flux inversion problem asserting positive scaling factors.

Solve inversion problem by numerically minimizing the cost function, assuming that uncertainties may depend on the optimization parameters. The scaling factors are forced to be positive by the a priori uncertainty distribution.

Solve inverse problem using full ensemble-estimated model uncertainty.

Parameters:
  • y (AbstractVector) – vector of observations (usually with the model far-field or background subtracted)

  • prior (np.ndarray) – a priori estimate for scaling factors

  • h (ParametrizedVector) – model equivalents depending on scaling factors

  • b (np.ndarray) – error covariance matrix of prior scaling factors

  • r (ParametrizedOperator) – error covariance matrix of model-observation comparison, depends on scaling factors

  • norm_prefactor (float, default=0.0) – prefactor of normalization term in cost function.

  • min_s (float, default=1e-12) – minimum allowed scaling factor

  • reg_cutoff (float, default=1e-3) – upper scaling factor cutoff for regularization term (\(s_\text{cutoff}\))

  • reg_factor (float, default=1e-9) – regularization prefactor (\(\rho\))

_sanitize_s(s)

Make sure all elements of s are positive; s is modified in place.

Parameters:

s (numpy.ndarray)

Return type:

None

_check_before_solving()

Assert that prior and initial scaling factors are positive.

Return type:

None

cost_prior(s)

Prior contribution to cost function.

\[ \begin{align}\begin{aligned}J_\text{prior}(s) &= \tfrac{1}{2} (s - s_0)^\top B^{-1} (s - s_0) + r(s) ,\\r(s) &= \rho^2 \sum_k \left( \frac{1}{2\min(s_k, s_\text{cutoff})^2} + \frac{\min(s_k, s_\text{cutoff})}{s_\text{cutoff}^3} \right)\end{aligned}\end{align} \]

where each entry of \(s\) is ensured to be positive. The regularization \(r(s)\) is chosen such that it suppresses the probability for values of \(s_k\) close to zero but leaves probabilities above \(s_\text{cutoff}\) unchanged. The transition at \(s_k=s_\text{cutoff}\) has a continuous first derivative.

Parameters:

s (numpy.ndarray)

Return type:

numpy.float64
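The smoothness claim above can be checked numerically. The sketch below re-implements \(r(s)\) and the gradient component from cost_prior_grad() for illustration (it is not the dubfi method itself) and verifies that the one-sided gradients agree at the cutoff, and that \(r(s)\) is constant above it.

```python
import numpy as np

def reg(s, rho=1e-9, s_cutoff=1e-3):
    """Regularization r(s) from cost_prior() above (illustrative)."""
    m = np.minimum(s, s_cutoff)
    return rho**2 * np.sum(0.5 / m**2 + m / s_cutoff**3)

def reg_grad_k(s_k, rho=1e-9, s_cutoff=1e-3):
    """One component of the gradient from cost_prior_grad()."""
    if s_k > s_cutoff:
        return 0.0
    return rho**2 / s_cutoff**3 - rho**2 / s_k**3

# The transition at s_k = s_cutoff has a continuous first derivative:
# the one-sided gradients agree up to floating-point resolution.
eps = 1e-9
below = reg_grad_k(1e-3 - eps)
above = reg_grad_k(1e-3 + eps)
```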

cost_prior_grad(s)

Prior contribution to gradient of cost function.

\[\begin{split}\partial_k r(s) = \begin{cases} 0 & \text{ if } s_k > s_\text{cutoff} \\ \frac{\rho^2}{s_\text{cutoff}^3} - \frac{\rho^2}{s_k^3} & \text{ otherwise} \end{cases}\end{split}\]

where each entry of \(s\) is ensured to be positive.

Parameters:

s (numpy.ndarray)

Return type:

numpy.ndarray

cost_prior_hess(s)

Prior contribution to Hessian of cost function.

\[\begin{split}\partial_k \partial_l r(s) = \begin{cases} 3 \frac{\rho^2}{s_k^4} & \text{ if } k=l \text{ and } s_k < s_\text{cutoff} \\ 0 & \text{ otherwise} \end{cases}\end{split}\]

where each entry of \(s\) is ensured to be positive.

Parameters:

s (numpy.ndarray)

Return type:

numpy.ndarray