API
InducingPoints.CoverTree — Type

    CoverTree(ϵ::Real, lloyds::Bool=true, voronoi::Bool=true, metric::SemiMetric=Euclidean())

The cover tree algorithm [1] recursively builds a tree whose nodes optimally cover the given dataset according to the metric distance.
Arguments:
ϵ::Real: Spatial resolution. A higher ϵ results in fewer points.
lloyds::Bool: Use the centroid of the ball created around the sampled datapoint instead of the datapoint itself if no other inducing point is close.
voronoi::Bool: Reattribute samples to each node at the end of the processing of each layer.
metric::SemiMetric: Distance metric used to determine the distance between points.
[1] Alexander Terenin, David R. Burt, Artem Artemev, Seth Flaxman, Mark van der Wilk, Carl Edward Rasmussen, Hong Ge: Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees: https://arxiv.org/abs/2210.07893
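A minimal usage sketch (the dataset X is a random placeholder; CoverTree needs no kernel, and the default Euclidean metric is used):

    using InducingPoints

    X = [rand(2) for _ in 1:200]   # toy dataset of 200 points in 2-D
    alg = CoverTree(0.2)           # spatial resolution ϵ = 0.2, defaults otherwise
    Z = inducingpoints(alg, X)     # inducing points covering X at resolution ϵ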
InducingPoints.Greedy — Type

    Greedy(m::Int, s::Int)

m is the desired number of inducing points.
s is the minibatch size on which to select a new inducing point.
Greedy approach first proposed by Titsias [1]. The algorithm loops over minibatches of data and selects the point giving the best ELBO improvement. Requires passing the outputs y, the kernel, and the noise as keyword arguments to inducingpoints.
[1] Titsias, M. Variational Learning of Inducing Variables in Sparse Gaussian Processes. Aistats 5, 567–574 (2009).
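A minimal sketch, assuming a kernel from KernelFunctions.jl; X, y, and the noise value are placeholders:

    using InducingPoints
    using KernelFunctions

    X = [rand(2) for _ in 1:500]   # inputs
    y = rand(500)                  # outputs, required by Greedy
    alg = Greedy(20, 100)          # 20 inducing points, minibatches of size 100
    Z = inducingpoints(alg, X; y=y, kernel=SqExponentialKernel(), noise=0.1)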
InducingPoints.GreedyVarSelection — Type

    GreedyVarSelection(m::Int)

Greedy variance minimization approach to inducing point selection, originally proposed by [1] and revisited more recently by [2]. m is the desired number of inducing points.
[1] L. Foster, A. Waagen, N. Aijaz, M. Hurley, A. Luis, J. Rinsky, C. Satyavolu, M. J. Way, P. Gazis, and A. Srivastava. Stable and Efficient Gaussian Process Calculations. Journal of Machine Learning Research, 2009.
[2] D. R. Burt, C. E. Rasmussen, and M. van der Wilk. Convergence of Sparse Variational Inference in Gaussian Processes Regression. Journal of Machine Learning Research, 2020.
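A minimal sketch, assuming a kernel from KernelFunctions.jl and placeholder data:

    using InducingPoints
    using KernelFunctions

    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(GreedyVarSelection(20), X; kernel=SqExponentialKernel())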
InducingPoints.KmeansAlg — Type

    KmeansAlg(m::Int, metric::SemiMetric=SqEuclidean(); nMarkov=10, tol=1e-3)

k-Means [1] initialization on the data X taking m inducing points. The seeding is computed via [2]; nMarkov gives the number of MCMC steps for the seeding.
Arguments
m::Int: Number of inducing points
metric::SemiMetric: Metric used to compute the distances for the k-means algorithm
Keyword Arguments
nMarkov::Int: Number of random steps for the seeding
tol::Real: Tolerance for the kmeans algorithm
[1] Arthur, D. & Vassilvitskii, S. k-means++: The advantages of careful seeding. in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms 1027–1035 (Society for Industrial and Applied Mathematics, 2007).
[2] Bachem, O., Lucic, M., Hassani, S. H. & Krause, A. Fast and Provably Good Seedings for k-Means. Advances in Neural Information Processing Systems 29, 55–63 (2016).
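A minimal sketch with placeholder data and the default SqEuclidean metric:

    using InducingPoints

    X = [rand(2) for _ in 1:500]
    alg = KmeansAlg(20; nMarkov=10, tol=1e-3)
    Z = inducingpoints(alg, X)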
InducingPoints.OnlineIPSelection — Type

    OIPS(ρ_accept=0.8; η=0.95, kmax=Inf, kmin=10, ρ_remove=Inf)
    OIPS(kmax, η=0.98, kmin=10)

Online Inducing Points Selection. Method from [1]. Requires passing the kernel as a keyword argument to inducingpoints.
[1] Galy-Fajou, T. & Opper, M. Adaptive Inducing Points Selection for Gaussian Processes. arXiv:2107.10066v1 (2021).
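A minimal sketch of the online workflow, assuming a kernel from KernelFunctions.jl and placeholder data batches:

    using InducingPoints
    using KernelFunctions

    k = SqExponentialKernel()
    X = [rand(2) for _ in 1:500]
    alg = OIPS(0.8)                        # acceptance threshold ρ_accept = 0.8
    Z = inducingpoints(alg, X; kernel=k)   # initial selection
    X_new = [rand(2) for _ in 1:100]
    updateZ!(Z, alg, X_new; kernel=k)      # online update with a new batch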
InducingPoints.RandomSubset — Type

    RandomSubset(m::Int)

Arguments
m::Int: Number of inducing points
Uniform sampling of a subset of m points from the data.
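A minimal sketch with placeholder data; the optional weights keyword follows the method signature listed further below:

    using InducingPoints

    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(RandomSubset(20), X)              # uniform subset
    w = rand(500)                                        # optional sampling weights
    Zw = inducingpoints(RandomSubset(20), X; weights=w)  # weighted subset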
InducingPoints.SeqDPP — Type

    SeqDPP()

Sequential sampling via Determinantal Point Processes. Requires passing a kernel::Kernel as a keyword argument to inducingpoints.
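A minimal sketch, assuming a kernel from KernelFunctions.jl; the second step assumes SeqDPP follows the online updateZ! interface described below:

    using InducingPoints
    using KernelFunctions

    k = SqExponentialKernel()
    X = [rand(2) for _ in 1:300]
    Z = inducingpoints(SeqDPP(), X; kernel=k)   # initial DPP sample
    X_new = [rand(2) for _ in 1:100]
    updateZ!(Z, SeqDPP(), X_new; kernel=k)      # extend sequentially with new data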
InducingPoints.StdDPP — Type

    StdDPP()

Standard DPP (Determinantal Point Process) sampling. The size of the returned Z is not fixed (but, unlike in a classical DPP, it is not allowed to be empty). The kernel is passed as a keyword argument to inducingpoints.
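A minimal sketch, assuming a kernel from KernelFunctions.jl and placeholder data:

    using InducingPoints
    using KernelFunctions

    X = [rand(2) for _ in 1:300]
    Z = inducingpoints(StdDPP(), X; kernel=SqExponentialKernel())
    length(Z)   # not fixed in advance, but guaranteed non-empty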
InducingPoints.StreamKmeans — Type

    StreamKmeans(m_target::Int)

Online clustering algorithm [1] to select inducing points in a streaming setting.

[1] Liberty, E., Sriharsha, R. & Sviridenko, M. An Algorithm for Online K-Means Clustering. arXiv:1412.5721 (2015).
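A minimal sketch of the streaming workflow, with placeholder data batches and assuming the update goes through updateZ!:

    using InducingPoints

    alg = StreamKmeans(20)          # target number of centers
    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(alg, X)      # initialize from the first batch
    X_new = [rand(2) for _ in 1:200]
    updateZ!(Z, alg, X_new)         # update as new data streams in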
InducingPoints.UniGrid — Type

    UniGrid(m::Int)

where m is the number of points along each dimension. Adaptive uniform grid based on [1]. The resulting inducing points are stored in the memory-efficient custom type UniformGrid.
[1] Moreno-Muñoz, P., Artés-Rodríguez, A. & Álvarez, M. A. Continual Multi-task Gaussian Processes. (2019).
InducingPoints.UniformGrid — Type

    UniformGrid{T,Titer} <: AbstractVector{T}

A memory-efficient custom object wrapping an Iterators.ProductIterator. It supports broadcasting and other relevant array methods, and avoids explicitly computing all points on the grid.
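A minimal sketch with placeholder 2-D data, illustrating the lazy UniformGrid return type:

    using InducingPoints

    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(UniGrid(5), X)   # 5 points per dimension over the data range
    length(Z)                           # 25 for 2-D data; Z is a lazy UniformGrid
    collect(Z)                          # materialize the grid points if needed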
InducingPoints.Webscale — Type

    Webscale(m::Int)

Online k-means algorithm based on [1].
[1] Sculley, D. Web-scale k-means clustering. in Proceedings of the 19th international conference on World wide web - WWW ’10 1177 (ACM Press, 2010). doi:10.1145/1772690.1772862.
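A minimal sketch with placeholder data batches, assuming the online update goes through updateZ!:

    using InducingPoints

    alg = Webscale(20)
    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(alg, X)        # initial centers
    X_new = [rand(2) for _ in 1:200]
    updateZ!(Z, alg, X_new)           # mini-batch style online update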
InducingPoints.kDPP — Type

    kDPP(m::Int)

k-DPP (Determinantal Point Process) will return a subset of X of size m, according to the DPP probability. The kernel is passed as a keyword argument to inducingpoints.
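A minimal sketch, assuming a kernel from KernelFunctions.jl and placeholder data:

    using InducingPoints
    using KernelFunctions

    X = [rand(2) for _ in 1:300]
    Z = inducingpoints(kDPP(20), X; kernel=SqExponentialKernel())  # exactly 20 points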
InducingPoints.find_nearest_center — Function

Find the closest center to X among Z; return the index and the distance.
InducingPoints.inducingpoints — Function

    inducingpoints([rng::AbstractRNG], alg::AIPSA, X::AbstractVector; [kwargs...])
    inducingpoints([rng::AbstractRNG], alg::AIPSA, X::AbstractMatrix; obsdim=1, [kwargs...])

Select inducing points according to the algorithm alg. For some algorithms, additional keyword arguments are required.
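A minimal sketch of both input forms, using KmeansAlg as an arbitrary example algorithm and placeholder data:

    using InducingPoints

    X_vec = [rand(2) for _ in 1:500]                    # vector of data points
    Z1 = inducingpoints(KmeansAlg(20), X_vec)
    X_mat = rand(500, 2)                                # matrix with observations in rows
    Z2 = inducingpoints(KmeansAlg(20), X_mat; obsdim=1)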
InducingPoints.inducingpoints — Method

    inducingpoints([rng::AbstractRNG], alg::Greedy, X::AbstractVector;
        y::AbstractVector, kernel::Kernel, noise::Real)
    inducingpoints([rng::AbstractRNG], alg::Greedy, X::AbstractMatrix;
        obsdim=1, y::AbstractVector, kernel::Kernel, noise::Real)

Select inducing points using the Greedy algorithm. Requires the outputs y, the kernel, and the noise as additional keyword arguments.
InducingPoints.inducingpoints — Method

    inducingpoints(rng::AbstractRNG, alg::GreedyVarSelection, x::AbstractVector; kernel::Kernel)

See GreedyVarSelection for more info. rng isn't actually used here.
InducingPoints.inducingpoints — Method

    inducingpoints([rng::AbstractRNG], alg::OIPS, X::AbstractVector; kernel::Kernel)
    inducingpoints([rng::AbstractRNG], alg::OIPS, X::AbstractMatrix; obsdim=1, kernel::Kernel)

Select inducing points using Online Inducing Points Selection. Requires the kernel as an additional keyword argument.
InducingPoints.inducingpoints — Method

    inducingpoints([rng::AbstractRNG], alg::RandomSubset, X::AbstractVector; [weights::Vector=nothing])
    inducingpoints([rng::AbstractRNG], alg::RandomSubset, X::AbstractMatrix; obsdim=1, [weights::Vector=nothing])

Select inducing points by taking a random subset. Optionally accepts a weight vector for the inputs X.
InducingPoints.inducingpoints — Method

    inducingpoints([rng::AbstractRNG], alg::SeqDPP, X::AbstractVector; kernel::Kernel)
    inducingpoints([rng::AbstractRNG], alg::SeqDPP, X::AbstractMatrix; obsdim=1, kernel::Kernel)

Select inducing points using Sequential Determinantal Point Processes. Requires passing a kernel::Kernel as a keyword argument.
InducingPoints.kmeans_seeding — Method

Fast and efficient seeding for k-means, based on "Fast and Provably Good Seedings for k-Means".
InducingPoints.partial_pivoted_cholesky — Method

    partial_pivoted_cholesky(C::AbstractMatrix{<:Real}, M::Int, tol::Real)

Computes the partial pivoted Cholesky factorisation of C. Runs for at most M steps, but stops early if the threshold provided by tol is met.
This implementation is directly translated from Algorithm 1 in [1].
[1] L. Foster, A. Waagen, N. Aijaz, M. Hurley, A. Luis, J. Rinsky, C. Satyavolu, M. J. Way, P. Gazis, and A. Srivastava. Stable and Efficient Gaussian Process Calculations. Journal of Machine Learning Research, 2009.
InducingPoints.updateZ — Function

    updateZ([rng::AbstractRNG], Z::AbstractVector, alg::OnIPSA, X::AbstractVector; kwargs...)

Return a new vector of inducing points, built from Z and the data X with algorithm alg, without changing the original Z.
InducingPoints.updateZ! — Function

    updateZ!([rng::AbstractRNG], Z::AbstractVector, alg::OnIPSA, X::AbstractVector; [kwargs...])

Update the inducing points Z in place with data X and algorithm alg. Requires additional keyword arguments for some algorithms. See also inducingpoints.
InducingPoints.updateZ! — Method

    updateZ!([rng::AbstractRNG], Z::AbstractVector, alg::OIPS, X::AbstractVector; kernel::Kernel)

Update inducing points Z with data X and the OnlineIPSelection algorithm. Requires the kernel as an additional keyword argument.
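A minimal sketch contrasting updateZ and updateZ!, assuming a kernel from KernelFunctions.jl and placeholder data batches:

    using InducingPoints
    using KernelFunctions

    k = SqExponentialKernel()
    alg = OIPS(0.8)
    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(alg, X; kernel=k)

    X_new = [rand(2) for _ in 1:100]
    Z_updated = updateZ(Z, alg, X_new; kernel=k)   # returns a new vector, Z unchanged
    updateZ!(Z, alg, X_new; kernel=k)              # updates Z in place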