Comparison of Gaussian process software

This is a comparison of statistical analysis software that allows inference with Gaussian processes, often using approximations.

This article is written from the point of view of Bayesian statistics, which may use terminology different from that commonly used in kriging. The next section clarifies the mathematical/computational meaning of the information provided in the table, independently of contextual terminology.

Description of columns

This section details the meaning of the columns in the table below.

Solvers

These columns are about the algorithms used to solve the linear system defined by the prior covariance matrix, i.e., the matrix built by evaluating the kernel.

  • Exact: whether generic exact algorithms are implemented. These algorithms are usually appropriate only for up to a few thousand datapoints (a minimal sketch of this approach follows this list).
  • Specialized: whether specialized exact algorithms for specific classes of problems are implemented. Supported specialized algorithms may be indicated as:
    • Kronecker: algorithms for separable kernels on grid data.[1]
    • Toeplitz: algorithms for stationary kernels on uniformly spaced data.[2]
    • Semisep.: algorithms for semiseparable covariance matrices.[3]
    • Sparse: algorithms optimized for sparse covariance matrices.
    • Block: algorithms optimized for block diagonal covariance matrices.
    • Markov: algorithms for kernels which represent (or can be formulated as) a Markov process.[4]
  • Approximate: whether generic or specialized approximate algorithms are implemented. Supported approximate algorithms may be indicated as:
    • Sparse: algorithms based on choosing a set of "inducing points" in input space,[5] or, more generally, imposing a sparse structure on the inverse of the covariance matrix.
    • Hierarchical: algorithms which approximate the covariance matrix with a hierarchical matrix.[6]
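
As a concrete illustration of the "Exact" column, here is a minimal sketch of exact Gaussian process regression solved with a Cholesky decomposition (plain numpy/scipy, not taken from any package in the table; the kernel and function names are illustrative). The O(n³) cost of the factorization is what limits exact solvers to a few thousand datapoints.

```python
# Minimal exact GP regression: solve the linear system defined by the
# prior covariance matrix with a Cholesky decomposition (O(n^3)).
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1D inputs."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x, y, x_star, noise_var=1e-2):
    """Posterior mean and covariance of the process at x_star."""
    K = rbf_kernel(x, x) + noise_var * np.eye(len(x))  # prior covariance
    K_s = rbf_kernel(x, x_star)
    K_ss = rbf_kernel(x_star, x_star)
    L = cho_factor(K, lower=True)          # the O(n^3) step
    mean = K_s.T @ cho_solve(L, y)         # K_s^T K^{-1} y
    cov = K_ss - K_s.T @ cho_solve(L, K_s)
    return mean, cov
```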

Input

These columns are about the points on which the Gaussian process is evaluated, i.e., the x if the process is f(x).

  • ND: whether multidimensional input is supported. If it is, multidimensional output is always possible by adding a dimension to the input, even without direct support (see the sketch after this list).
  • Non-real: whether arbitrary non-real input is supported (for example, text or complex numbers).
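
To make the remark in the "ND" item concrete, here is a sketch (plain numpy, illustrative names) of how two outputs are recast as a single output over an augmented input, so that any package with multidimensional input support can fit them jointly:

```python
# Recasting a two-output process f_1(x), f_2(x) as a single-output
# process g(x, i) over the augmented input (x, output index i).
import numpy as np

x = np.linspace(0.0, 1.0, 10)                 # original 1D inputs
y1 = np.sin(2 * np.pi * x)                    # data for output 1
y2 = np.cos(2 * np.pi * x)                    # data for output 2

X_aug = np.vstack([
    np.column_stack([x, np.zeros_like(x)]),   # rows (x, 0) -> output 1
    np.column_stack([x, np.ones_like(x)]),    # rows (x, 1) -> output 2
])
y_aug = np.concatenate([y1, y2])
# X_aug has shape (20, 2) and y_aug shape (20,): an ordinary
# single-output regression problem with 2D input.
```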

Output

These columns are about the values yielded by the process, and how they are connected to the data used in the fit.

  • Likelihood: whether arbitrary non-Gaussian likelihoods are supported.
  • Errors: whether arbitrary non-uniform correlated errors on datapoints are supported for the Gaussian likelihood. Errors can always be handled manually by adding a kernel component; this column is about the possibility of manipulating them separately (the sketch after this list illustrates the levels of support). Partial error support may be indicated as:
    • iid: the datapoints must be independent and identically distributed.
    • Uncorrelated: the datapoints must be independent, but can have different distributions.
    • Stationary: the datapoints can be correlated, but the covariance matrix must be a Toeplitz matrix, in particular this implies that the variances must be uniform.
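
The levels of error support correspond to different shapes of the noise covariance added to the prior covariance matrix. A schematic sketch (plain numpy, illustrative names):

```python
# Error structures for the Gaussian likelihood: in every case the fit
# uses K + Sigma_noise in place of the prior covariance matrix K.
import numpy as np

n = 5
sigma2 = 0.1

Sigma_iid = sigma2 * np.eye(n)            # "iid": one shared variance

variances = np.array([0.1, 0.2, 0.1, 0.3, 0.2])
Sigma_uncorr = np.diag(variances)         # "Uncorrelated": per-point variances

lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Sigma_stat = sigma2 * 0.5 ** lags         # "Stationary": Toeplitz covariance

# Full support ("Correlated"/"Yes") allows any symmetric positive
# semidefinite matrix as Sigma_noise.
```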

Hyperparameters

These columns are about finding the values of variables which enter the definition of the specific problem but cannot be inferred by the Gaussian process fit, for example the parameters in the formula of the kernel.

  • Prior: whether specifying arbitrary hyperpriors on the hyperparameters is supported.
  • Posterior: whether estimating the posterior is supported beyond point estimation, possibly in conjunction with other software.

If both the "Prior" and "Posterior" cells contain "Manually", the software provides an interface for computing the marginal likelihood and its gradient with respect to the hyperparameters, which can be fed into an optimization/sampling algorithm, e.g., gradient descent or Markov chain Monte Carlo, as in the sketch below.
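
The following sketch of this "Manually" workflow (plain numpy/scipy; the function name and the single hyperparameter are illustrative) computes the log marginal likelihood of a GP with an RBF kernel, together with its gradient with respect to the log lengthscale, and feeds both into scipy.optimize.minimize:

```python
# "Manually" workflow: compute the GP log marginal likelihood and its
# gradient w.r.t. a hyperparameter, then hand both to an optimizer.
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize

def neg_log_ml(log_ell, x, y, noise_var=1e-2):
    """Negative log marginal likelihood and gradient w.r.t. log lengthscale."""
    ell = np.exp(log_ell[0])
    d2 = (x[:, None] - x[None, :]) ** 2
    K_rbf = np.exp(-0.5 * d2 / ell**2)
    K = K_rbf + noise_var * np.eye(len(x))
    L = cho_factor(K, lower=True)
    alpha = cho_solve(L, y)                      # K^{-1} y
    # log ML = -1/2 y^T K^{-1} y - 1/2 log|K| - n/2 log(2*pi)
    log_ml = (-0.5 * y @ alpha
              - np.log(np.diag(L[0])).sum()
              - 0.5 * len(x) * np.log(2 * np.pi))
    # d(log ML)/d(theta) = 1/2 tr((alpha alpha^T - K^{-1}) dK/dtheta)
    dK = K_rbf * d2 / ell**2                     # dK/d(log ell)
    K_inv = cho_solve(L, np.eye(len(x)))
    grad = 0.5 * np.trace((np.outer(alpha, alpha) - K_inv) @ dK)
    return -log_ml, -np.array([grad])

x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + 0.1 * np.random.default_rng(0).normal(size=20)
result = minimize(neg_log_ml, x0=[0.0], args=(x, y), jac=True)
print("fitted lengthscale:", np.exp(result.x[0]))
```

The same pair of outputs (value and gradient) could equally be passed to an MCMC sampler to obtain a posterior over the hyperparameter rather than a point estimate.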

Linear transformations

These columns are about the possibility of fitting datapoints simultaneously to a process and to linear transformations of it.

  • Deriv.: whether it is possible to take an arbitrary number of derivatives, up to the maximum allowed by the smoothness of the kernel, for any differentiable kernel. Partial support may be indicated by a maximum order of differentiation or by implementation for only some kernels. Integrals can be obtained indirectly from derivatives.
  • Finite: whether finite arbitrary linear transformations are allowed on the specified datapoints.
  • Sum: whether it is possible to sum various kernels and access separately the processes corresponding to each addend (see the sketch after this list). It is a particular case of a finite linear transformation, but it is listed separately because it is a common feature.
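
For the "Sum" column, the component posteriors follow from standard Gaussian conditioning: if f = f1 + f2 with independent priors with covariance matrices K1 and K2, the posterior mean of f1 alone is K1 (K1 + K2 + Sigma_noise)^-1 y. A minimal sketch (plain numpy, illustrative names):

```python
# Posterior mean of one addend f_1 given data y on f_1 + f_2, assuming
# independent priors with covariance matrices K1 and K2.
import numpy as np

def component_posterior_mean(K1, K2, noise_var, y):
    K_total = K1 + K2 + noise_var * np.eye(len(y))   # covariance of y
    return K1 @ np.linalg.solve(K_total, y)          # K1 K_total^{-1} y
```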

Comparison table

In the table, "Exact", "Specialized", and "Approximate" are the Solvers columns; "ND" and "Non-real" the Input columns; "Likelihood" and "Errors" the Output columns; "Prior" and "Posterior" the Hyperparameters columns; and "Deriv.", "Finite", and "Sum" the Linear transformations columns.

| Name | License | Language | Exact | Specialized | Approximate | ND | Non-real | Likelihood | Errors | Prior | Posterior | Deriv. | Finite | Sum |
|------|---------|----------|-------|-------------|-------------|----|----------|------------|--------|-------|-----------|--------|--------|-----|
| PyMC | Apache | Python | Yes | Kronecker | Sparse | ND | No | Any | Correlated | Yes | Yes | No | Yes | Yes |
| Stan | BSD, GPL | custom | Yes | No | No | ND | No | Any | Correlated | Yes | Yes | No | Yes | Yes |
| scikit-learn | BSD | Python | Yes | No | No | ND | Yes | Bernoulli | Uncorrelated | Manually | Manually | No | No | No |
| fbm[7] | Free | C | Yes | No | No | ND | No | Bernoulli, Poisson | Uncorrelated, Stationary | Many | Yes | No | No | Yes |
| GPML[8][7] | BSD | MATLAB | Yes | No | Sparse | ND | No | Many | i.i.d. | Manually | Manually | No | No | No |
| GPstuff[7] | GNU GPL | MATLAB, R | Yes | Markov | Sparse | ND | No | Many | Correlated | Many | Yes | First (RBF) | No | Yes |
| GPy[9] | BSD | Python | Yes | No | Sparse | ND | No | Many | Uncorrelated | Yes | Yes | No | No | No |
| GPflow[9] | Apache | Python | Yes | No | Sparse | ND | No | Many | Uncorrelated | Yes | Yes | No | No | No |
| GPyTorch[10] | MIT | Python | Yes | Toeplitz, Kronecker | Sparse | ND | No | Many | Uncorrelated | Yes | Yes | First (RBF) | Manually | Manually |
| GPvecchia[11] | GNU GPL | R | Yes | No | Sparse, Hierarchical | ND | No | Exponential family | Uncorrelated | No | No | No | No | No |
| pyGPs[12] | BSD | Python | Yes | No | Sparse | ND | Graphs, Manually | Bernoulli | i.i.d. | Manually | Manually | No | No | No |
| gptk[13] | BSD | R | Yes | Block? | Sparse | ND | No | Gaussian | No | Manually | Manually | No | No | No |
| celerite[3] | MIT | Python, Julia, C++ | No | Semisep.[a] | No | 1D | No | Gaussian | Uncorrelated | Manually | Manually | No | No | No |
| george[6] | MIT | Python, C++ | Yes | No | Hierarchical | ND | No | Gaussian | Uncorrelated | Manually | Manually | No | No | Manually |
| neural-tangents[14][b] | Apache | Python | Yes | Block, Kronecker | No | ND | No | Gaussian | No | No | No | No | No | No |
| DiceKriging[15] | GNU GPL | R | Yes | No | No | ND | No? | Gaussian | Uncorrelated | SCAD (RBF) | MAP | No | No | No |
| OpenTURNS[16] | GNU LGPL | Python, C++ | Yes | No | No | ND | No | Gaussian | Uncorrelated | Manually (no grad.) | MAP | No | No | No |
| UQLab[17] | Proprietary | MATLAB | Yes | No | No | ND | No | Gaussian | Correlated | No | MAP | No | No | No |
| ooDACE[18] | Proprietary | MATLAB | Yes | No | No | ND | No | Gaussian | Correlated | No | MAP | No | No | No |
| DACE | Proprietary | MATLAB | Yes | No | No | ND | No | Gaussian | No | No | MAP | No | No | No |
| GpGp | MIT | R | No | No | Sparse | ND | No | Gaussian | i.i.d. | Manually | Manually | No | No | No |
| SuperGauss | GNU GPL | R, C++ | No | Toeplitz[c] | No | 1D | No | Gaussian | No | Manually | Manually | No | No | No |
| STK | GNU GPL | MATLAB | Yes | No | No | ND | No | Gaussian | Uncorrelated | Manually | Manually | No | No | Manually |
| GSTools | GNU LGPL | Python | Yes | No | No | ND | No | Gaussian | Yes | Yes | Yes | Yes | No | No |
| PyKrige | BSD | Python | Yes | No | No | 2D, 3D | No | Gaussian | i.i.d. | No | No | No | No | No |
| GPR | Apache | C++ | Yes | No | Sparse | ND | No | Gaussian | i.i.d. | Some, Manually | Manually | First | No | No |
| celerite2 | MIT | Python | No | Semisep.[a] | No | 1D | No | Gaussian | Uncorrelated | Manually[d] | Manually | No | No | Yes |
| SMT[19][20] | BSD | Python | Yes | No | Sparse, PODI[e], other | ND | No | Gaussian | i.i.d. | Some | Some | First | No | No |
| GPJax | Apache | Python | Yes | No | Sparse | ND | Graphs | Bernoulli | No | Yes | Yes | No | No | No |
| Stheno | MIT | Python | Yes | Low rank | Sparse | ND | No | Gaussian | i.i.d. | Manually | Manually | Approximate | No | Yes |
| Egobox-gp[22] | Apache | Rust | Yes | No | Sparse | ND | No | Gaussian | i.i.d. | No | MAP | First | No | No |

Notes

  a. celerite implements only a specific subalgebra of kernels which can be solved in O(n).[3]
  b. neural-tangents is a specialized package for infinitely wide neural networks.
  c. SuperGauss implements a superfast Toeplitz solver with O(n log² n) computational complexity.
  d. celerite2 has a PyMC3 interface.
  e. PODI (Proper Orthogonal Decomposition + Interpolation) is an approximation for high-dimensional multioutput regressions. The regression is carried out on a lower-dimensional representation of the outcomes, with the subspace chosen by a PCA of the outcome (dependent variable) data. Each principal component is modeled with an a priori independent Gaussian process (a rough sketch follows).[21]
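
A rough sketch of the PODI decomposition step (plain numpy, illustrative names; the per-component GP fits themselves are elided):

```python
# PODI decomposition: PCA of the high-dimensional outcomes, then one
# independent GP per retained principal component.
import numpy as np

def podi_decompose(Y, n_components):
    """Y has shape (n_samples, n_outputs); returns per-component targets."""
    Y_mean = Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Y - Y_mean, full_matrices=False)
    basis = Vt[:n_components]            # principal directions in output space
    scores = (Y - Y_mean) @ basis.T      # shape (n_samples, n_components)
    # Each column of `scores` is regressed on the inputs with its own GP;
    # predictions are mapped back as Y_mean + scores_pred @ basis.
    return scores, basis, Y_mean
```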
