API

InducingPoints.CoverTreeType
CoverTree(ϵ::Real, lloyds::Bool=true, voronoi::Bool=true, metric::SemiMetric=Euclidean())

The CoverTree algorithm [1], recursively builds a tree for which the node will optimally cover the given dataset according to the metric distance.

Arguments:

  • ϵ::Real: Spatial resolution. Higher ϵ will result in less points.
  • lloyds::Bool: Use the centroid of the ball created around the sampled datapoint instead of the datapoint itself if no other inducing point is close
  • voronoi::Bool: Reattributes samples to each node at the end of the proecssing of each layer
  • metric::SemiMetric: Distance metric used to determine distance between points.

[1] Alexander Terenin, David R. Burt, Artem Artemev, Seth Flaxman, Mark van der Wilk, Carl Edward Rasmussen, Hong Ge: Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees: https://arxiv.org/abs/2210.07893

source
InducingPoints.GreedyType
Greedy(m::Int, s::Int)
  • m is the desired number of inducing points
  • s is the minibatch size on which to select a new inducing point

Greedy approach first proposed by Titsias[1]. Algorithm loops over minibatches of data and select the best ELBO improvement. Requires passing outputs y, the kernel and the noise as keyword arguments to inducingpoints.

[1] Titsias, M. Variational Learning of Inducing Variables in Sparse Gaussian Processes. Aistats 5, 567–574 (2009).

source
InducingPoints.GreedyVarSelectionType
GreedyVarSelection(m::Int)

Greedy variance minimization approach to inducing point selection. Originally proposed by [1], and revisited more recently by [2]. m is the desired number of inducing points.

[1] - L. Foster, A. Waagen, N. Aijaz, M. Hurley, A. Luis, J. Rinsky, C. Satyavolu, M. J. Way, P. Gazis, and A. Srivastava. Stable and Efficient Gaussian Process Calculations. Journal of Machine Learning Research, 2009. [2] - D. R. Burt, C. E. Rasmussen, and M. van der Wilk. Convergence of Sparse Variational Inference in Gaussian Processes Regression. Journal of Machine Learning Research, 2020.

source
InducingPoints.KmeansAlgType
KmeansAlg(m::Int, metric::SemiMetric=SqEuclidean(); nMarkov = 10, tol = 1e-3)

k-Means [1] initialization on the data X taking m inducing points. The seeding is computed via [2], nMarkov gives the number of MCMC steps for the seeding.

Arguments

  • k::Int : Number of inducing points
  • metric::SemiMetric : Metric used to compute the distances for the k-means algorithm

Keyword Arguments

  • nMarkov::Int : Number of random steps for the seeding
  • tol::Real : Tolerance for the kmeans algorithm

[1] Arthur, D. & Vassilvitskii, S. k-means++: The advantages of careful seeding. in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms 1027–1035 (Society for Industrial and Applied Mathematics, 2007). [2] Bachem, O., Lucic, M., Hassani, S. H. & Krause, A. Fast and Provably Good Seedings for k-Means. Advances in Neural Information Processing Systems 29 55–63 (2016) doi:10.1109/tmtt.2005.863818.

source
InducingPoints.OnlineIPSelectionType
OIPS(ρ_accept=0.8; η=0.95, kmax=Inf, kmin=10, ρ_remove=Inf)
OIPS(kmax, η=0.98, kmin=10)

Online Inducing Points Selection. Method from [1]. Requires passing the kernel as keyword argument to inducingpoints.

[1] Galy-Fajou, T. & Opper, M Adaptive Inducing Points Selection for Gaussian Processes. arXiv:2107.10066v1 (2021).

source
InducingPoints.StdDPPType
StdDPP()

Standard DPP (Determinantal Point Process) sampling. The size of the returned Z is not fixed (but is not allowed to be empty unlike in a classical DPP). The kernel is passed as a keyword argument to inducingpoints.

source
InducingPoints.StreamKmeansType
StreamKmeans(m_target::Int)

Online clustering algorithm [1] to select inducing points in a streaming setting. Reference : [1] Liberty, E., Sriharsha, R. & Sviridenko, M. An Algorithm for Online K-Means Clustering. arXiv:1412.5721 (2015).

source
InducingPoints.UniGridType
UniGrid(m::Int)

where m is the number of points on each dimension. Adaptive uniform grid based on [1]. The resulting inducing points are stored in the memory-efficient custom type UniformGrid.

[1] Moreno-Muñoz, P., Artés-Rodríguez, A. & Álvarez, M. A. Continual Multi-task Gaussian Processes. (2019).

source
InducingPoints.UniformGridType
UniformGrid{T,Titer} <: AbstractVector{T}

A memory-efficient custom object representing a wrapper around a Iterators.ProductIterator. Supports broadcasting and other relevant array methods, and avoids explicitly computing all points on the grid.

source
InducingPoints.WebscaleType
Webscale(m::Int)

Online k-means algorithm based on [1].

[1] Sculley, D. Web-scale k-means clustering. in Proceedings of the 19th international conference on World wide web - WWW ’10 1177 (ACM Press, 2010). doi:10.1145/1772690.1772862.

source
InducingPoints.kDPPType
kDPP(m::Int)

k-DPP (Determinantal Point Process) will return a subset of X of size m, according to DPP probability. The kernel is passed as a keyword argument to inducingpoints.

source
InducingPoints.inducingpointsFunction
 inducingpoints([rng::AbstractRNG], alg::AIPSA, X::AbstractVector; [kwargs...])
 inducingpoints([rng::AbstractRNG], alg::AIPSA, X::AbstractMatrix; obsdim=1, [kwargs...])

Select inducing points according to the algorithm alg. For some algorithms, additional keyword arguments are required.

source
InducingPoints.inducingpointsMethod
 inducingpoints([rng::AbstractRNG], alg::Greedy, X::AbstractVector; 
    y::AbstractVector, kernel::Kernel, noise::Real)
 inducingpoints([rng::AbstractRNG], alg::Greedy, X::AbstractMatrix; 
    obsdim=1, y::AbstractVector, kernel::Kernel, noise::Real)

Select inducing points according using the Greedy algorithm. Requires as additional keyword arguments the outputs y, the kernel and the noise.

source
InducingPoints.inducingpointsMethod
inducingpoints(
    rng::AbstractRNG, alg::GreedyVarSelection, x::AbstractVector; kernel::Kernel
)

See GreedyVarSelection for more info. rng isn't actually used here.

source
InducingPoints.inducingpointsMethod
 inducingpoints([rng::AbstractRNG], alg::OIPS, X::AbstractVector; kernel::Kernel)
 inducingpoints([rng::AbstractRNG], alg::OIPS, X::AbstractMatrix; obsdim=1, kernel::Kernel)

Select inducing points according using Online Inducing Points Selection. Requires as additional keyword argument the kernel.

source
InducingPoints.inducingpointsMethod
 inducingpoints([rng::AbstractRNG], alg::RandomSubset, X::AbstractVector; [weights::Vector=nothing])
 inducingpoints([rng::AbstractRNG], alg::RandomSubset, X::AbstractMatrix; obsdim=1, [weights::Vector=nothing])

Select inducing points by taking a Random Subset. Optionally accepts a weight vector for the inputs X.

source
InducingPoints.inducingpointsMethod
 inducingpoints([rng::AbstractRNG], alg::SeqDPP, X::AbstractVector; kernel::Kernel)
 inducingpoints([rng::AbstractRNG], alg::SeqDPP, X::AbstractMatrix; obsdim=1, kernel::Kernel)

Select inducing points according using Sequential Determinantal Point Processes. Requires passing a kernel::Kernel as a keyword argument.

source
InducingPoints.partial_pivoted_choleskyMethod
partial_pivoted_cholesky(C::AbstractMatrix{<:Real}, M::Int, tol::Real)

Computes the partial pivoted Cholesky factorisation of C. Runs for at most M steps, but will stop if the threshold provided by tol is met.

This implementation is directly translated from Algorithm 1 in [1].

[1] - L. Foster, A. Waagen, N. Aijaz, M. Hurley, A. Luis, J. Rinsky, C. Satyavolu, M. J. Way, P. Gazis, and A. Srivastava. Stable and Efficient Gaussian Process Calculations. Journal of Machine Learning Research, 2009.

source
InducingPoints.updateZFunction
updateZ([rng::AbstractRNG], Z::AbstractVector, alg::OnIPSA, X::AbstractVector; kwargs...)

Return new vector of inducing points Z with data X and algorithm alg without changing the original one

source
InducingPoints.updateZ!Function
updateZ!([rng::AbstractRNG], Z::AbstractVector, alg::OnIPSA, X::AbstractVector; [kwargs...])

Update inducing points Z with data X and algorithm alg. Requires additional keyword arguments for some algorithms. Also see InducingPoints.

source
InducingPoints.updateZMethod
updateZ!([rng::AbstractRNG], Z::AbstractVector, alg::OIPS, X::AbstractVector; kernel::Kernel)

Update inducing points Z with data X and the OnlineIPSelection algorithm. Requires the kernel as an additional keyword argument.

source