dubfi.inversion.inversion¶
Optimization problem for Bayesian inversion.
Changed in version 0.1.3: add InvertorOptimizerPositive
Added in version 0.1.0: initial release
Classes¶
- InversionResult: InversionResult produced by Invertor.
- Invertor: Abstract general inversion base class.
- InvertorKalman: Kalman smoother.
- InvertorOptimizer: Numerical solver for flux inversion problem.
- InvertorOptimizerPositive: Numerical solver for flux inversion problem asserting positive scaling factors.
Module Contents¶
- class dubfi.inversion.inversion.InversionResult¶
Bases: namedtuple('InversionResult', ['s_post', 'b_post', 'sensitivity', 'averaging_kernel', 'mdm_post_eigenbasis', 'mdm_post_stdev_eigenbasis'], defaults=[None, None, None, None], module=__name__)

InversionResult produced by Invertor.

- s_post¶
posterior scaling factors
- Type:
np.ndarray
- b_post¶
error covariance of s_post
- Type:
np.ndarray
- sensitivity¶
sensitivity of posterior to individual observations: derivative of s_post w.r.t. observations, array of shape (flux_cat, obs)
- Type:
np.ndarray | None
- averaging_kernel¶
averaging kernel: derivative of posterior scaling factors w.r.t. true scaling factors, array of shape (flux_cat, flux_cat)
- Type:
np.ndarray | None
- mdm_post_eigenbasis¶
model-observation mismatch in eigenbasis of R (or similar dataset), one-dimensional array aligned with mdm_post_stdev_eigenbasis
- Type:
np.ndarray | None
- mdm_post_stdev_eigenbasis¶
sqrt of eigenvalues of R (or similar dataset), one-dimensional array aligned with mdm_post_eigenbasis
- Type:
np.ndarray | None
Changed in version 0.1.3.
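Since InversionResult is a plain namedtuple, the trailing four fields default to None. A minimal sketch, recreating the tuple type exactly as defined above rather than importing it from dubfi:

```python
from collections import namedtuple
import numpy as np

# Recreated here for illustration; in practice import InversionResult
# from dubfi.inversion.inversion.
InversionResult = namedtuple(
    "InversionResult",
    ["s_post", "b_post", "sensitivity", "averaging_kernel",
     "mdm_post_eigenbasis", "mdm_post_stdev_eigenbasis"],
    defaults=[None, None, None, None],
)

result = InversionResult(s_post=np.array([1.1, 0.9]),
                         b_post=np.eye(2) * 0.01)
print(result.s_post)       # posterior scaling factors
print(result.sensitivity)  # None unless requested in the solver output
```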
- class dubfi.inversion.inversion.Invertor(y, prior, h, b)¶
Bases: abc.ABC

Abstract general inversion base class.
- Parameters:
prior (numpy.ndarray)
b (numpy.ndarray)
- property prior: numpy.ndarray¶
Prior vector of scaling factors.
- Return type:
numpy.ndarray
- property b: numpy.ndarray¶
Prior error covariance matrix of scaling factors.
- Return type:
numpy.ndarray
- abstractmethod solve()¶
Solve inversion problem.
- Returns:
named tuple, see InversionResult
- Return type:
InversionResult
- class dubfi.inversion.inversion.InvertorKalman(y, prior, h, b, r)¶
Bases: Invertor

Kalman smoother.
Analytically solve inversion problem under the assumption that uncertainties are independent of optimization parameters (fluxes).
- Parameters:
prior (numpy.ndarray)
b (numpy.ndarray)
- solve()¶
Solve inversion problem.
- Returns:
named tuple, see InversionResult
- Return type:
InversionResult
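The assumed analytical update can be illustrated for a linear observation operator with Jacobian matrix H_jac and constant R. The function name and signature below are purely illustrative, not the actual dubfi API:

```python
import numpy as np

# Sketch of a Kalman-smoother update under the stated assumptions:
# constant R and a linear observation operator y_model = H_jac @ s.
def kalman_update(y, s_prior, H_jac, B, R):
    """Return posterior scaling factors and their error covariance."""
    # Innovation covariance: H B H^T + R
    S = H_jac @ B @ H_jac.T + R
    # Kalman gain: B H^T S^{-1}
    K = B @ H_jac.T @ np.linalg.inv(S)
    s_post = s_prior + K @ (y - H_jac @ s_prior)
    b_post = (np.eye(len(s_prior)) - K @ H_jac) @ B
    return s_post, b_post
```

With scalar unit covariances and an observation of 2, the posterior lands halfway between prior and observation, with halved variance.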
- class dubfi.inversion.inversion.InvertorOptimizer(y, prior, h, b, r, norm_prefactor=0.0)¶
Bases: Invertor

Numerical solver for flux inversion problem.
Solve inversion problem by numerically minimizing the cost function, assuming that uncertainties may depend on the optimization parameters.
Solve inverse problem using full ensemble-estimated model uncertainty.
- Parameters:
y (AbstractVector) – vector of observations (usually minus model far-field or background)
prior (np.ndarray) – a priori estimate for scaling factors
h (ParametrizedVector) – model equivalents depending on scaling factors
b (np.ndarray) – error covariance matrix of prior scaling factors
r (ParametrizedOperator) – error covariance matrix of model-observation comparison, depends on scaling factors
norm_prefactor (float, default=0.0) – prefactor of normalization term in cost function.
- property b: numpy.ndarray¶
Prior error covariance matrix of scaling factors.
- Return type:
numpy.ndarray
- property norm_prefactor: float¶
Normalization prefactor in the cost function.
- Return type:
float
- _check_before_solving()¶
Overwrite in inheriting classes to assert properties before solving.
- Return type:
None
- solve(*, method='trust-exact', hess='exact', tol=1e-06, output={}, **kwargs)¶
Solve the inverse problem: determine posterior scaling factors and uncertainties.
- Parameters:
method (str, default="trust-exact") – solver method, passed to scipy.optimize.minimize(), see https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
hess ({"exact", "2-point", "3-point", "cs"}) – method used for the Hessian matrix in the solver. "exact" means that the Hessian is computed analytically in each iteration. Other strings are passed to scipy.optimize.minimize() as approximation strategy. "cs" is not compatible with the MPI implementation.
tol (float, default=1e-6) – solver tolerance, passed to scipy.optimize.minimize()
output (dict[str, bool]) – define what shall be included in the output. Allowed keys: "sensitivity", "mdm_eigenbasis_statistics"
**kwargs – solver-specific options, passed to scipy.optimize.minimize() as "options"
- Returns:
named tuple, see InversionResult
- Return type:
InversionResult
- kalman(return_chi2=False, return_sensitivity=False)¶
Kalman smoother: Compute posterior under assumption that R is constant.
- Parameters:
return_chi2 (bool, default=False) – Compute chi^2 to estimate how assumed uncertainties agree with the considered data. This has almost no influence on the performance.
return_sensitivity (bool, default=False) – Compute and return the sensitivity of the results to each observation.
- Returns:
- s_post: np.ndarray
posterior
- b_post: np.ndarray
posterior B matrix
- chi2: float (only if return_chi2)
chi^2, deviation from assumed probability distributions
- chi2_obs: float (only if return_chi2)
observation contribution to chi^2 (positive)
- chi2_fit: float (only if return_chi2)
fit parameter contribution to chi^2 (negative)
- ddof: int (only if return_chi2)
degrees of freedom in the observations. Ideally, one would expect that chi2 is approximately equal to ddof.
- sensitivity: np.ndarray (only if return_sensitivity)
gradient of posterior scaling factors with respect to observations
- Return type:
dict with entries
- cost(s)¶
Cost function for solver.
\[J(s) = \tfrac{\alpha}{2} \log\{\det[R(s)]\} + \tfrac{1}{2} [y - H(s)]^\top R(s)^{-1} [y - H(s)] + J_\text{prior}(s)\]
\(J\) = cost function
\(J_\text{prior}(s) = -\log(P(s))\) = prior cost function, in default case \(J_\text{prior}(s) = \tfrac{1}{2} (s - s_0)^\top B^{-1} (s - s_0)\)
\(\alpha\) = norm_prefactor is a tuning parameter.
\(y\) = vector of observations.
\(H\) = observation operator.
\(s\) = vector of scaling factors.
\(s_0\) = a priori vector of scaling factors.
\(R(s)\) = error covariance matrix for the model-observation comparison. This matrix must be real-symmetric and positive definite.
\(B\) = a priori error covariance matrix of the scaling factors. This matrix must be real-symmetric and positive definite.
The cost function is defined such that the probability for scaling factors \(s\) given observations \(y\) is \(P(s|y) \propto \exp[-J(s)]\), where \(J(s)\) implicitly depends on \(y\).
- Parameters:
s (numpy.ndarray)
- Return type:
numpy.float64
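For the special case of a constant R and a linear observation operator \(H(s) = H_\text{jac}\,s\) with the default Gaussian prior, the cost function above reduces to a few lines of NumPy. This standalone sketch is for illustration only; in dubfi, h and r are parametrized objects that depend on s:

```python
import numpy as np

# Illustrative cost J(s) for constant R and linear H(s) = H_jac @ s:
# J = (alpha/2) log det R + (1/2) (y - Hs)^T R^{-1} (y - Hs) + J_prior
def cost(s, y, s0, H_jac, B, R, alpha=0.0):
    resid = y - H_jac @ s
    _sign, logdet = np.linalg.slogdet(R)          # log det R, numerically stable
    j_obs = 0.5 * resid @ np.linalg.solve(R, resid)
    j_prior = 0.5 * (s - s0) @ np.linalg.solve(B, s - s0)
    return 0.5 * alpha * logdet + j_obs + j_prior
```

At the prior with a perfect fit and R = I, only the (here vanishing) normalization term remains.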
- cost_prior(s)¶
Prior contribution to cost function.
\[J_\text{prior}(s) = \tfrac{1}{2} (s - s_0)^\top B^{-1} (s - s_0)\]
This is the first function called in cost().
- Parameters:
s (numpy.ndarray)
- Return type:
numpy.float64
- cost_grad(s)¶
Gradient (Jacobian) of cost().
In the notation defined in cost():
\[\begin{split}\begin{aligned} \nabla_s J(s) = & \tfrac{\alpha}{2} \mathop{\mathrm{Tr}}_y \big(\tilde{R}_s^{-1} \nabla_s \tilde{R}_s\big) \\ &+ [H(s)-y]^\top \tilde{R}_s^{-1} \nabla_s H(s) \\ &- \tfrac12 [H(s)-y]^\top \tilde{R}_s^{-1} (\nabla_s \tilde{R}_s) \tilde{R}_s^{-1} [H(s)-y] \\ &+ \nabla_s J_\text{prior}(s) \end{aligned}\end{split}\]
- Parameters:
s (numpy.ndarray)
- Return type:
numpy.ndarray
- cost_prior_grad(s)¶
Prior contribution to gradient of cost function.
\[\nabla_s J_\text{prior}(s) = (s - s_0)^\top B^{-1}\]
This is the first function called in cost_grad().
- Parameters:
s (numpy.ndarray)
- Return type:
numpy.ndarray
- cost_hess(s, return_sensitivity=False)¶
Hessian of cost().
In the notation defined in cost() and using \(\partial_k=\tfrac{\partial}{\partial s_k}\):
\[\begin{split}\begin{aligned} \partial_k \partial_l J(s) = & \tfrac{\alpha}{2} \mathop{\mathrm{Tr}} \big( \tilde{R}_s^{-1} \partial_k\partial_l \tilde{R}_s\big) \\ &- \tfrac{\alpha}{2} \mathop{\mathrm{Tr}} \big[\tilde{R}_s^{-1}(\partial_k \tilde{R}_s)\tilde{R}_s^{-1}(\partial_l \tilde{R}_s)\big] \\ &+ [\partial_k H(s)]^\top \tilde{R}_s^{-1} \partial_l H(s) \\ &+ [H(s)-y]^\top \tilde{R}_s^{-1} \partial_k\partial_l H(s) \\ &- [H(s)-y]^\top \tilde{R}_s^{-1} (\partial_k \tilde{R}) \tilde{R}_s^{-1} \partial_l H(s) + (k\leftrightarrow l) \\ &- \tfrac12 [H(s)-y]^\top \tilde{R}_s^{-1} (\partial_k\partial_l \tilde{R}_s) \tilde{R}_s^{-1} [H(s)-y] \\ &+ \tfrac12 [H(s)-y]^\top \tilde{R}_s^{-1} (\partial_k \tilde{R}_s) \tilde{R}_s^{-1} (\partial_l \tilde{R}_s) \tilde{R}_s^{-1} [H(s)-y] + (k\leftrightarrow l) \\ &+ \partial_k \partial_l J_\text{prior}(s) \end{aligned}\end{split}\]
Optionally returned quantities are the posterior error covariance matrix of the scaling factors \(B^\text{post}\) (matrix inverse of the computed Hessian), the sensitivity defined as \(S=\nabla_y s^\text{post}\), and the averaging kernel \(A=(\nabla_s H)\cdot(\nabla_y s^\text{post})\) that approximates \(\frac{\partial s^\text{post}}{\partial s^\text{true}}\). When computing these quantities, it is assumed that the given \(s\) minimizes the cost function, i.e., \(\nabla_s J(s) = 0\).
Knowing that \(J(s)\) can be approximated by a quadratic form around its minimum and denoting \(s^\text{post}\equiv \mathop{\mathrm{arg\,min}}_s J(s)\), we can evaluate \(\nabla_y s^\text{post}\) using \(\nabla_y \nabla_s J|_{s^\text{post}} = -(\nabla_y s^\text{post})\cdot[\nabla_s \nabla_s J]_{s^\text{post}}\) (i.e., \(\partial_{y_i} \partial_{s_k} J|_{s^\text{post}} = - \sum_l (\partial_{y_i} s^\text{post}_l) [\partial_{s_l} \partial_{s_k} J]_{s^\text{post}} = - \sum_l (\partial_{y_i} s^\text{post}_l) (B^\text{post})^{-1}_{lk}\)). This leads to:
\[\begin{split}\begin{aligned} S_{ki} &= -\sum_l B^\text{post}_{kl} \frac{\partial^2}{\partial s_l \partial y_i} J \\ &= \sum_l B^\text{post}_{kl} e_i^\top \tilde{R}_s^{-1} \left\{ \partial_l H(s) + (\partial_l \tilde{R}_s) \tilde{R}_s^{-1} [y - H(s)] \right\}, \\ A_{kn} &= \sum_i S_{ki} \partial_n H(s)_i \\ &= \sum_l B^\text{post}_{kl} [\partial_n H(s)]^\top \tilde{R}_s^{-1} \left\{ \partial_l H(s) + (\partial_l \tilde{R}_s) \tilde{R}_s^{-1} [y - H(s)] \right\}. \end{aligned}\end{split}\]with \(e_i\) denoting a unit vector in observation space.
- Parameters:
s (numpy.ndarray)
return_sensitivity (bool)
- Return type:
numpy.ndarray | tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]
- cost_prior_hess(s)¶
Prior contribution to Hessian of cost function.
\[\partial_k \partial_l J_\text{prior}(s) = (B^{-1})_{lk}\]
This is the first function called in cost_hess().
- Parameters:
s (numpy.ndarray)
- Return type:
numpy.ndarray
- check_derivatives(states=10, **kwargs)¶
Check whether the explicitly defined derivatives of cost() are correct.
- Parameters:
states (int | array, default=10) – int: number of random states that shall be generated to check derivatives. array: array of scaling factors for which derivatives shall be checked. Each row of the two-dimensional array represents one vector of scaling factors (one state).
**kwargs – passed to dubfi.linalg.generic.check_derivatives()
- Returns:
errors – number of errors
- Return type:
int
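The idea behind check_derivatives() can be sketched with a generic central-difference gradient check. The actual implementation lives in dubfi.linalg.generic.check_derivatives(); this helper is purely illustrative:

```python
import numpy as np

# Compare an analytic gradient against a central finite difference,
# in the spirit of check_derivatives().
def fd_grad_check(f, grad, s, eps=1e-6, tol=1e-4):
    g_exact = grad(s)
    g_fd = np.empty_like(s, dtype=float)
    for k in range(len(s)):
        ds = np.zeros_like(s, dtype=float)
        ds[k] = eps
        g_fd[k] = (f(s + ds) - f(s - ds)) / (2 * eps)
    return np.max(np.abs(g_fd - g_exact)) < tol

# Example: quadratic cost with its known gradient passes the check.
f = lambda s: 0.5 * s @ s
grad = lambda s: s
print(fd_grad_check(f, grad, np.array([1.0, -2.0, 3.0])))  # True
```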
- class dubfi.inversion.inversion.InvertorOptimizerPositive(y, prior, h, b, r, norm_prefactor=0.0, min_s=1e-12, reg_cutoff=0.001, reg_factor=1e-09)¶
Bases: InvertorOptimizer

Numerical solver for flux inversion problem asserting positive scaling factors.
Solve inversion problem by numerically minimizing the cost function, assuming that uncertainties may depend on the optimization parameters. The scaling factors are forced to be positive by the a priori uncertainty distribution.
Solve inverse problem using full ensemble-estimated model uncertainty.
- Parameters:
y (AbstractVector) – vector of observations (usually minus model far-field or background)
prior (np.ndarray) – a priori estimate for scaling factors
h (ParametrizedVector) – model equivalents depending on scaling factors
b (np.ndarray) – error covariance matrix of prior scaling factors
r (ParametrizedOperator) – error covariance matrix of model-observation comparison, depends on scaling factors
norm_prefactor (float, default=0.0) – prefactor of normalization term in cost function.
min_s (float, default=1e-12) – minimum allowed scaling factor
reg_cutoff (float, default=1e-3) – upper scaling factor cutoff for regularization term (\(s_\text{cutoff}\))
reg_factor (float, default=1e-9) – regularization prefactor (\(\rho\))
- _sanitize_s(s)¶
Make sure all elements of s are positive, modify s in-place.
- Parameters:
s (numpy.ndarray)
- Return type:
None
- _check_before_solving()¶
Assert that prior and initial scaling factors are positive.
- Return type:
None
- cost_prior(s)¶
Prior contribution to cost function.
\[\begin{aligned}J_\text{prior}(s) &= \tfrac{1}{2} (s - s_0)^\top B^{-1} (s - s_0) + r(s) ,\\ r(s) &= \rho^2 \sum_k \left( \frac{1}{2\min(s_k, s_\text{cutoff})^2} + \frac{\min(s_k, s_\text{cutoff})}{s_\text{cutoff}^3} \right)\end{aligned}\]
where each entry of \(s\) is ensured to be positive. The regularization \(r(s)\) is chosen such that it suppresses the probability for values of \(s_k\) close to zero but leaves probabilities above \(s_\text{cutoff}\) unchanged. The transition at \(s_k=s_\text{cutoff}\) has a continuous first derivative.
- Parameters:
s (numpy.ndarray)
- Return type:
numpy.float64
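The regularization term \(r(s)\) from the equation above translates directly to NumPy; rho and s_cutoff correspond to the reg_factor and reg_cutoff constructor arguments. This is an illustrative sketch, not the dubfi implementation:

```python
import numpy as np

# r(s) = rho^2 * sum_k ( 1 / (2 min(s_k, s_cutoff)^2)
#                        + min(s_k, s_cutoff) / s_cutoff^3 )
# Diverges as any s_k -> 0+, constant for s_k above s_cutoff.
def regularization(s, rho=1e-9, s_cutoff=1e-3):
    sm = np.minimum(s, s_cutoff)
    return rho**2 * np.sum(0.5 / sm**2 + sm / s_cutoff**3)
```

Above the cutoff the term is a constant offset (here 1.5 rho^2 / s_cutoff^2 per entry), so it does not shift the minimum for well-behaved scaling factors.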
- cost_prior_grad(s)¶
Prior contribution to gradient of cost function.
\[\begin{split}\partial_k r(s) = \begin{cases} 0 & \text{ if } s_k > s_\text{cutoff} \\ \frac{\rho^2}{s_\text{cutoff}^3} - \frac{\rho^2}{s_k^3} & \text{ otherwise} \end{cases}\end{split}\]
where each entry of \(s\) is ensured to be positive.
- Parameters:
s (numpy.ndarray)
- Return type:
numpy.ndarray
- cost_prior_hess(s)¶
Prior contribution to Hessian of cost function.
\[\begin{split}\partial_k \partial_l r(s) = \begin{cases} 3 \frac{\rho^2}{s_k^4} & \text{ if } k=l \text{ and } s_k < s_\text{cutoff} \\ 0 & \text{ otherwise} \end{cases}\end{split}\]
where each entry of \(s\) is ensured to be positive.
- Parameters:
s (numpy.ndarray)
- Return type:
numpy.ndarray