FiniteGP and AbstractGP

Intended Audience

This page is best read once you have a sense of what this package is trying to achieve. Therefore, we recommend reading (or at least skimming) some of the examples before reading these docs.

Introduction

AbstractGPs provides the abstract type AbstractGP, and the concrete type FiniteGP. An AbstractGP, f, should be thought of as a distribution over functions. This means that the output of rand(f) would be a real-valued function. It's not usually possible to implement this though, so we don't.

A FiniteGP fx = f(x) represents the distribution of the function's values at the finite collection of points specified in x. fx is a multivariate Normal distribution, so rand(fx) produces a Vector of Reals.
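For example (the choice of kernel here is arbitrary, and AbstractGPs is assumed to be loaded):

f = GP(SqExponentialKernel())  # an AbstractGP: a distribution over functions
x = randn(5)                   # five input locations
fx = f(x)                      # a FiniteGP: a 5-dimensional multivariate Normal
y = rand(fx)                   # a Vector{Float64} of length 5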

A FiniteGP is the interesting object computationally, so if you create a new subtype MyNewGP of AbstractGP, and wish to make it interact well with the rest of the GP ecosystem, the methods that you must implement are not those directly involving MyNewGP, but rather those involving

FiniteGP{<:MyNewGP}

We provide two ways to do this. The first is to implement methods directly on FiniteGP{<:MyNewGP} – this is detailed in the FiniteGP APIs. The second is to implement some methods directly involving MyNewGP, and utilise default FiniteGP methods implemented in terms of these – this is detailed in the Internal AbstractGPs API. For example, the first approach involves implementing methods like AbstractGPs.mean(fx::FiniteGP{<:MyNewGP}), while the second involves AbstractGPs.mean(f::MyNewGP, x::AbstractVector).
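As a minimal sketch of the difference (MyNewGP is a hypothetical type, and the zero-mean method bodies are placeholders; in practice you would implement only one of the two approaches):

import Statistics
using AbstractGPs

struct MyNewGP <: AbstractGPs.AbstractGP end

# First approach (FiniteGP APIs): methods on FiniteGP{<:MyNewGP}.
Statistics.mean(fx::AbstractGPs.FiniteGP{<:MyNewGP}) = zeros(length(fx.x))

# Second approach (Internal AbstractGPs API): methods on MyNewGP itself;
# default FiniteGP methods are then derived from these.
Statistics.mean(f::MyNewGP, x::AbstractVector) = zeros(length(x))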

The second interface is generally easier to implement, but isn't always the best choice. See Which API should I implement? for further discussion.

FiniteGP APIs

Let f be an AbstractGP, x an AbstractVector representing a collection of inputs, and Σ a positive-definite matrix of size (length(x), length(x)). A FiniteGP represents the multivariate Gaussian induced by "indexing" into f at each point in x, and adding independent zero-mean noise with covariance matrix Σ:

fx = f(x, Σ)

# The code below is equivalent to the above, and is just for reference.
# When writing code, prefer the above syntax.
fx = AbstractGPs.FiniteGP(f, x, Σ)
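As the covariance examples later on this page show, Σ may equivalently be supplied as a Real (isotropic noise) or an AbstractVector (independent, per-point noise):

fx = f(x, 0.1)                   # isotropic noise: Σ = 0.1 * I
fx = f(x, fill(0.1, length(x)))  # per-point variances: Σ = Diagonal(fill(0.1, length(x)))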

The FiniteGP has two API levels. The Primary Public API should be supported by all FiniteGPs, while the Secondary Public API will only be supported by a subset. Prefer the primary API whenever possible.

Primary Public API

These are user-facing methods. You can expect them to be implemented whenever you encounter a FiniteGP. If you are building something on top of AbstractGPs, try to implement it in terms of these functions.

Required Methods

Base.rand – Function
rand(rng::AbstractRNG, f::FiniteGP, N::Int=1)

Obtain N independent samples from f using rng. Single-sample methods produce a length(f) Vector. Multi-sample methods produce a length(f) × N Matrix.

julia> f = GP(Matern32Kernel());

julia> x = randn(11);

julia> rand(f(x)) isa Vector{Float64}
true

julia> rand(MersenneTwister(123456), f(x)) isa Vector{Float64}
true

julia> rand(f(x), 3) isa Matrix{Float64}
true

julia> rand(MersenneTwister(123456), f(x), 3) isa Matrix{Float64}
true

Random.rand! – Function
rand!(rng::AbstractRNG, f::FiniteGP, y::AbstractVecOrMat{<:Real})

Obtain sample(s) from f using rng and write them to y.

If y is a matrix, then each column corresponds to an independent sample.

julia> f = GP(Matern32Kernel());

julia> x = randn(11);

julia> y = similar(x);

julia> rand!(f(x), y);

julia> rand!(MersenneTwister(123456), f(x), y);

julia> ys = similar(x, length(x), 3);

julia> rand!(f(x), ys);

julia> rand!(MersenneTwister(123456), f(x), ys);

AbstractGPs.marginals – Function
marginals(f::FiniteGP)

Compute a vector of Normal distributions representing the marginals of f efficiently. In particular, the off-diagonal elements of cov(f) are never computed.

julia> f = GP(Matern32Kernel());

julia> x = randn(11);

julia> fs = marginals(f(x));

julia> mean.(fs) == mean(f(x))
true

julia> std.(fs) == sqrt.(diag(cov(f(x))))
true

Distributions.logpdf – Method
logpdf(f::FiniteGP, y::AbstractVecOrMat{<:Real})

The logpdf of y under f if y isa AbstractVector. The logpdf of each column of y if y isa Matrix.

julia> f = GP(Matern32Kernel());

julia> x = randn(11);

julia> y = rand(f(x));

julia> logpdf(f(x), y) isa Real
true

julia> Y = rand(f(x), 3);

julia> logpdf(f(x), Y) isa AbstractVector{<:Real}
true

AbstractGPs.posterior – Method
posterior(fx::FiniteGP, y::AbstractVector{<:Real})

Construct the posterior distribution over fx.f given observations y at fx.x made under noise fx.Σy. The result is another AbstractGP object. See chapter 2 of Rasmussen and Williams, "Gaussian Processes for Machine Learning" (2006), for a recap of exact inference in GPs. This posterior process has mean function

m_posterior(x) = m(x) + k(x, fx.x) * inv(cov(fx)) * (y - mean(fx))

and kernel

k_posterior(x, z) = k(x, z) - k(x, fx.x) * inv(cov(fx)) * k(fx.x, z)

where m and k are the mean function and kernel of fx.f, respectively.

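A minimal usage sketch (the kernel and noise level are arbitrary choices):

f = GP(Matern32Kernel())
x = randn(11)
y = rand(f(x, 0.1))                # noisy observations at x
f_post = posterior(f(x, 0.1), y)   # a new AbstractGP
mean(f_post(x))                    # posterior mean at the training inputs
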
Statistics.mean – Method
mean(fx::FiniteGP)

Compute the mean vector of fx.

julia> f = GP(Matern52Kernel());

julia> x = randn(11);

julia> mean(f(x)) == zeros(11)
true

Statistics.var – Method
var(f::FiniteGP)

Compute only the diagonal elements of cov(f).

Examples

julia> fx = GP(Matern52Kernel())(randn(10), 0.1);

julia> var(fx) == diag(cov(fx))
true

Optional Methods

Default implementations are provided for these, but you may wish to specialise for performance.

StatsBase.mean_and_var – Method
mean_and_var(f::FiniteGP)

Compute both mean(f) and the diagonal elements of cov(f).

Sometimes more efficient than computing them separately, particularly for posteriors.

Examples

julia> fx = GP(SqExponentialKernel())(range(-3.0, 3.0; length=10), 0.1);

julia> mean_and_var(fx) == (mean(fx), var(fx))
true

Secondary Public API

Observe that the Primary Public API does not include a function to compute the covariance matrix of a FiniteGP. While the covariance matrix of any multivariate Gaussian is well-defined, it is not always a good idea to compute it. Fortunately, you are often not interested in the covariance matrix per se, but rather in the quantities you would use it to compute (logpdf, rand, posterior). This parallels the well-known observation that you rarely need the inverse of a matrix itself, only the inverse multiplied by something, so it is considered good practice to avoid ever explicitly computing a matrix inverse and the numerical issues that come with it. This matters in practice: TemporalGPs.jl, for example, implements the entire Primary Public API in time linear in the dimension of the FiniteGP, precisely because it never needs to evaluate the covariance matrix.
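To make the matrix-inverse analogy concrete, here is a generic linear-algebra sketch (unrelated to any particular GP type):

using LinearAlgebra

A = [4.0 1.0; 1.0 3.0]  # a positive-definite matrix
b = [1.0, 2.0]

x_explicit = inv(A) * b                  # forms the inverse: wasteful, less stable
x_solve    = cholesky(Symmetric(A)) \ b  # solves the system via a factorisation

x_explicit ≈ x_solve  # true, but the second route is the numerically sound one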

However, for many (probably most) GPs no such acceleration is possible, and nothing is lost by explicitly evaluating the covariance matrix. We therefore place cov in a Secondary Public API: it is available a large proportion of the time, but should be avoided if at all possible.

Required Methods

Statistics.cov – Method
cov(f::FiniteGP)

Compute the covariance matrix of f.

Noise-free observations

julia> f = GP(Matern52Kernel());

julia> x = randn(11);

julia> cov(f(x)) == kernelmatrix(Matern52Kernel(), x)
true

Isotropic observation noise

julia> cov(f(x, 0.1)) == kernelmatrix(Matern52Kernel(), x) + 0.1 * I
true

Independent anisotropic observation noise

julia> s = rand(11);

julia> cov(f(x, s)) == kernelmatrix(Matern52Kernel(), x) + Diagonal(s)
true

Correlated observation noise

julia> A = randn(11, 11); S = A'A;

julia> cov(f(x, S)) == kernelmatrix(Matern52Kernel(), x) + S
true

Optional Methods

Default implementations are provided for these, but you may wish to specialise for performance.

StatsBase.mean_and_cov – Method
mean_and_cov(f::FiniteGP)

Equivalent to (mean(f), cov(f)), but sometimes more efficient to compute them jointly than separately.

julia> fx = GP(SqExponentialKernel())(range(-3.0, 3.0; length=10), 0.1);

julia> mean_and_cov(fx) == (mean(fx), cov(fx))
true

Internal AbstractGPs API

This functionality is not intended to be used directly by users, or by those building functionality on top of this package – they should interact with the Primary Public API instead. If you believe you really do need to interact with this level of the API, please open an issue to discuss, as you may have a use-case that was missed during the design of this API.

As discussed at the top of this page, instances of subtypes of AbstractGP represent Gaussian processes – collections of jointly-Gaussian random variables, which may be infinite-dimensional.

Implementing the following API for your own AbstractGP subtype automatically provides implementations of both the Primary and Secondary Public APIs above.

Existing implementations of this interface include

  1. GP
  2. PosteriorGP
  3. ApproxPosteriorGP
  4. WrappedGP
  5. CompositeGP
  6. GaussianProcessProbabilisticProgramme

Required Methods

Statistics.mean – Method
mean(f::AbstractGP, x::AbstractVector)

Compute the mean vector of the multivariate Normal f(x).

Statistics.cov – Method
cov(f::AbstractGP, x::AbstractVector, y::AbstractVector)

Compute the length(x) by length(y) cross-covariance matrix between f(x) and f(y).

Statistics.var – Method
var(f::AbstractGP, x::AbstractVector)

Compute only the diagonal elements of cov(f(x)).

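Putting the three required methods together, here is a minimal sketch for a hypothetical white-noise process (the type is made up for illustration):

import Statistics
using AbstractGPs

# Hypothetical process: zero mean, k(x, x′) = σ² if x == x′, and 0 otherwise.
struct WhiteNoiseGP <: AbstractGPs.AbstractGP
    σ²::Float64
end

Statistics.mean(f::WhiteNoiseGP, x::AbstractVector) = zeros(length(x))
Statistics.cov(f::WhiteNoiseGP, x::AbstractVector, y::AbstractVector) =
    [xi == yi ? f.σ² : 0.0 for xi in x, yi in y]
Statistics.var(f::WhiteNoiseGP, x::AbstractVector) = fill(f.σ², length(x))

# The default implementations now supply the public APIs, e.g.
# fx = WhiteNoiseGP(0.5)(randn(3), 0.1); rand(fx); logpdf(fx, zeros(3))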

Optional Methods

Default implementations are provided for these, but you may wish to specialise for performance.

Statistics.cov – Method
cov(f::AbstractGP, x::AbstractVector)

Compute the length(x) by length(x) covariance matrix of the multivariate Normal f(x).

StatsBase.mean_and_cov – Method
mean_and_cov(f::AbstractGP, x::AbstractVector)

Compute both mean(f(x)) and cov(f(x)). Sometimes more efficient than computing them separately, particularly for posteriors.

StatsBase.mean_and_var – Method
mean_and_var(f::AbstractGP, x::AbstractVector)

Compute both mean(f(x)) and the diagonal elements of cov(f(x)). Sometimes more efficient than computing them separately, particularly for posteriors.


Note that, while we could provide a default implementation of var(f, x) as diag(cov(f, x)), this fallback is generally so inefficient that we prefer to throw an error when var is not implemented rather than ever silently hit it.

Which API should I implement?

To answer this question, you need to know whether or not the default implementations of the FiniteGP APIs work for your use case. We are aware of a couple of reasons why they might not (see below) – possibly there are others.

If you are unsure, please open an issue to discuss.

You want to avoid computing the covariance matrix

We've already discussed this a bit on this page. The default implementations of the FiniteGP APIs rely on computing the covariance matrix. If your AbstractGP subtype needs to avoid computing the covariance matrix for performance reasons, then do not implement the Internal AbstractGPs API. Do implement the Primary Public API. Do not implement the Secondary Public API.

TemporalGPs.jl is one package that does this – see its LTISDE implementation for an example. The same is true of the BayesianLinearRegressor type.

You don't want to use the default implementations

Perhaps you just don't like the default implementations because you don't want to make use of Cholesky factorisations. We don't yet have an example of this in Julia; GPyTorch, however, avoids the Cholesky factorisation in favour of iterative solvers.

In this situation, implement both the Internal AbstractGPs API and the FiniteGP APIs.

In this situation you will benefit less from code reuse inside AbstractGPs, but will continue to benefit from the ability of others to use your code, and to take advantage of any existing functionality that requires types adhering to the AbstractGPs API.

Mean functions

We define an API for prior mean functions with the abstract type MeanFunction and the mean_vector function.

AbstractGPs.MeanFunction – Type
abstract type MeanFunction end

MeanFunction introduces an API for treating the prior mean function appropriately. At the abstract level, every MeanFunction is a function. However, we generally want to evaluate it on a collection of inputs. To this end, we provide the mean_vector(::MeanFunction, ::AbstractVector) function, which is equivalent to map but opens up possibilities for optimization (for ZeroMean and ConstMean, for example).

AbstractGPs.mean_vector – Function
mean_vector(m::MeanFunction, x::AbstractVector)::AbstractVector{<:Real}

mean_vector is the function to call to apply a MeanFunction to a collection of inputs.


We provide standard mean functions like ZeroMean and ConstMean, as well as CustomMean, which simply wraps a function.
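For instance (a sketch of typical usage, assuming AbstractGPs is loaded; these types are not exported, so we qualify them):

x = randn(4)

m = AbstractGPs.ConstMean(2.0)
AbstractGPs.mean_vector(m, x)             # [2.0, 2.0, 2.0, 2.0], without four separate calls

c = AbstractGPs.CustomMean(sin)
AbstractGPs.mean_vector(c, x) == sin.(x)  # true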

AbstractGPs.CustomMean – Type
CustomMean{Tf} <: MeanFunction

A wrapper around whatever unary function you fancy. Must be able to be mapped over an AbstractVector of inputs.

Warning

CustomMean is generally sufficient for testing purposes, but care should be taken if attempting to differentiate through mean_vector with a CustomMean when using Zygote.jl. In particular, mean_vector(m::CustomMean, x) is implemented as map(m.f, x), which when x is a ColVecs or RowVecs will not differentiate correctly.

In such cases, you should implement mean_vector directly for your custom mean. For example, if f(x) = sum(x), you might implement mean_vector as

mean_vector(::CustomMean{typeof(f)}, x::ColVecs) = vec(sum(x.X; dims=1))
mean_vector(::CustomMean{typeof(f)}, x::RowVecs) = vec(sum(x.X; dims=2))

which avoids ever applying map to a ColVecs or RowVecs.


Testing Utilities

AbstractGPs provides several consistency tests in the AbstractGPs.TestUtils module. These tests ensure that, for example, the size and type of everything produced by an implementation of the API is correct and consistent with the other methods. They do not ensure correctness in any absolute sense, though (e.g. that logpdf indeed computes what you wanted it to compute). Consequently, these tests should be seen as a set of necessary conditions for your code to be correct. They are not, however, sufficient.

You should only need to run one of the following test suites.

  1. If you implement the Primary Public API, run test_finitegp_primary_public_interface.
  2. If you implement both the Primary Public API and the Secondary Public API, then run test_finitegp_primary_and_secondary_public_interface.
  3. If you implement the Internal AbstractGPs API, run test_internal_abstractgps_interface.

Also see Which API should I implement? for more information about the most appropriate API to implement.
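For example, testing the built-in GP type, which implements the Internal AbstractGPs API, might look like the following sketch (seed and input sizes are arbitrary):

using AbstractGPs, Random

rng = MersenneTwister(123456)
f = GP(SqExponentialKernel())
AbstractGPs.TestUtils.test_internal_abstractgps_interface(rng, f, randn(rng, 5), randn(rng, 4))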

AbstractGPs.TestUtils.test_finitegp_primary_public_interface – Function
test_finitegp_primary_public_interface(
    rng::AbstractRNG, fx::FiniteGP; atol::Real=1e-12
)

Basic consistency tests for the Primary Public FiniteGP API. You should run these tests if you only implement the Primary Public API – see the API section of the docs for details about this API.

These are consistency checks, not correctness tests in the absolute sense. For example, these tests ensure that samples generated by rand are of the correct size, but do not check that they come from the intended distribution.

AbstractGPs.TestUtils.test_finitegp_primary_and_secondary_public_interface – Function
test_finitegp_primary_and_secondary_public_interface(
    rng::AbstractRNG, fx::FiniteGP; atol::Real=1e-12
)

Basic consistency tests for both the Primary and Secondary Public FiniteGP APIs. Runs test_finitegp_primary_public_interface as part of these tests. You should run these tests if you implement both the primary and secondary public APIs – see the API section of the docs for more information about these APIs.

These are consistency checks, not correctness tests in the absolute sense. For example, these tests ensure that samples generated by rand are of the correct size, but do not check that they come from the intended distribution.

AbstractGPs.TestUtils.test_internal_abstractgps_interface – Function
test_internal_abstractgps_interface(
    rng::AbstractRNG,
    f::AbstractGP,
    x::AbstractVector,
    z::AbstractVector;
    atol=1e-12,
    σ²::Real=1e-1,
    jitter::Real=AbstractGPs.default_σ²,
)

Basic consistency tests for the Internal AbstractGPs API. Runs test_finitegp_primary_and_secondary_public_interface as part of these tests. Run these tests if you implement the Internal AbstractGPs API – see the API section of the docs for more information about this API.

These are consistency checks, not correctness tests in the absolute sense. For example, these tests ensure that samples generated by rand are of the correct size, but do not check that they come from the intended distribution.

kwargs

  • atol: the minimum eigenvalue of cov(f, x) must be greater than this.
  • σ²: the observation noise under which things are observed.
  • jitter: constant added to diagonal of cov(f, x) and cov(f, z) before inversion.