API
InducingPoints.CoverTree — Type

    CoverTree(ϵ::Real, lloyds::Bool=true, voronoi::Bool=true, metric::SemiMetric=Euclidean())

The cover tree algorithm [1] recursively builds a tree whose nodes optimally cover the given dataset according to the metric distance.
Arguments:
ϵ::Real: Spatial resolution. A higher ϵ results in fewer points.
lloyds::Bool: Use the centroid of the ball created around the sampled datapoint instead of the datapoint itself if no other inducing point is close.
voronoi::Bool: Reattribute samples to each node at the end of the processing of each layer.
metric::SemiMetric: Distance metric used to determine the distance between points.
[1] Alexander Terenin, David R. Burt, Artem Artemev, Seth Flaxman, Mark van der Wilk, Carl Edward Rasmussen, Hong Ge: Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees: https://arxiv.org/abs/2210.07893
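A minimal usage sketch (the dataset X is a random placeholder; CoverTree needs no kernel, and the default Euclidean metric is used):

    using InducingPoints

    X = [rand(2) for _ in 1:200]   # toy dataset of 200 points in 2-D
    alg = CoverTree(0.2)           # spatial resolution ϵ = 0.2, defaults otherwise
    Z = inducingpoints(alg, X)     # inducing points covering X at resolution ϵ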
InducingPoints.Greedy — Type

    Greedy(m::Int, s::Int)

m is the desired number of inducing points.
s is the minibatch size on which to select a new inducing point.
Greedy approach first proposed by Titsias [1]. The algorithm loops over minibatches of data and selects the point giving the best ELBO improvement. Requires passing the outputs y, the kernel, and the noise as keyword arguments to inducingpoints.
[1] Titsias, M. Variational Learning of Inducing Variables in Sparse Gaussian Processes. Aistats 5, 567–574 (2009).
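A minimal sketch, assuming a kernel from KernelFunctions.jl; X, y, and the noise value are placeholders:

    using InducingPoints
    using KernelFunctions

    X = [rand(2) for _ in 1:500]   # inputs
    y = rand(500)                  # outputs, required by Greedy
    alg = Greedy(20, 100)          # 20 inducing points, minibatches of size 100
    Z = inducingpoints(alg, X; y=y, kernel=SqExponentialKernel(), noise=0.1)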
InducingPoints.GreedyVarSelection — Type

    GreedyVarSelection(m::Int)

Greedy variance minimization approach to inducing point selection, originally proposed by [1] and revisited more recently by [2]. m is the desired number of inducing points.
[1] L. Foster, A. Waagen, N. Aijaz, M. Hurley, A. Luis, J. Rinsky, C. Satyavolu, M. J. Way, P. Gazis, and A. Srivastava. Stable and Efficient Gaussian Process Calculations. Journal of Machine Learning Research, 2009.
[2] D. R. Burt, C. E. Rasmussen, and M. van der Wilk. Convergence of Sparse Variational Inference in Gaussian Processes Regression. Journal of Machine Learning Research, 2020.
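A minimal sketch, assuming a kernel from KernelFunctions.jl and placeholder data:

    using InducingPoints
    using KernelFunctions

    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(GreedyVarSelection(20), X; kernel=SqExponentialKernel())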
InducingPoints.KmeansAlg — Type

    KmeansAlg(m::Int, metric::SemiMetric=SqEuclidean(); nMarkov=10, tol=1e-3)

k-Means [1] initialization on the data X taking m inducing points. The seeding is computed via [2]; nMarkov gives the number of MCMC steps for the seeding.
Arguments
m::Int: Number of inducing points
metric::SemiMetric: Metric used to compute the distances for the k-means algorithm
Keyword Arguments
nMarkov::Int: Number of random steps for the seeding
tol::Real: Tolerance for the kmeans algorithm
[1] Arthur, D. & Vassilvitskii, S. k-means++: The advantages of careful seeding. in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms 1027–1035 (Society for Industrial and Applied Mathematics, 2007).
[2] Bachem, O., Lucic, M., Hassani, S. H. & Krause, A. Fast and Provably Good Seedings for k-Means. Advances in Neural Information Processing Systems 29, 55–63 (2016).
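A minimal sketch with placeholder data and the default SqEuclidean metric:

    using InducingPoints

    X = [rand(2) for _ in 1:500]
    alg = KmeansAlg(20; nMarkov=10, tol=1e-3)
    Z = inducingpoints(alg, X)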
InducingPoints.OnlineIPSelection — Type

    OIPS(ρ_accept=0.8; η=0.95, kmax=Inf, kmin=10, ρ_remove=Inf)
    OIPS(kmax, η=0.98, kmin=10)

Online Inducing Points Selection. Method from [1]. Requires passing the kernel as a keyword argument to inducingpoints.
[1] Galy-Fajou, T. & Opper, M. Adaptive Inducing Points Selection for Gaussian Processes. arXiv:2107.10066v1 (2021).
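A minimal sketch of the online workflow, assuming a kernel from KernelFunctions.jl and placeholder data batches:

    using InducingPoints
    using KernelFunctions

    k = SqExponentialKernel()
    X = [rand(2) for _ in 1:500]
    alg = OIPS(0.8)                        # acceptance threshold ρ_accept = 0.8
    Z = inducingpoints(alg, X; kernel=k)   # initial selection
    X_new = [rand(2) for _ in 1:100]
    updateZ!(Z, alg, X_new; kernel=k)      # online update with a new batch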
InducingPoints.RandomSubset — Type

    RandomSubset(m::Int)

Arguments
m::Int: Number of inducing points
Uniform sampling of a subset of m points from the data.
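A minimal sketch with placeholder data; the optional weights keyword follows the method signature listed further below:

    using InducingPoints

    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(RandomSubset(20), X)              # uniform subset
    w = rand(500)                                        # optional sampling weights
    Zw = inducingpoints(RandomSubset(20), X; weights=w)  # weighted subset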
InducingPoints.SeqDPP — Type

    SeqDPP()

Sequential sampling via Determinantal Point Processes. Requires passing a kernel::Kernel as a keyword argument to inducingpoints.
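A minimal sketch, assuming a kernel from KernelFunctions.jl; the second step assumes SeqDPP follows the online updateZ! interface described below:

    using InducingPoints
    using KernelFunctions

    k = SqExponentialKernel()
    X = [rand(2) for _ in 1:300]
    Z = inducingpoints(SeqDPP(), X; kernel=k)   # initial DPP sample
    X_new = [rand(2) for _ in 1:100]
    updateZ!(Z, SeqDPP(), X_new; kernel=k)      # extend sequentially with new data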
InducingPoints.StdDPP — Type

    StdDPP()

Standard DPP (Determinantal Point Process) sampling. The size of the returned Z is not fixed (but, unlike in a classical DPP, it is not allowed to be empty). The kernel is passed as a keyword argument to inducingpoints.
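A minimal sketch, assuming a kernel from KernelFunctions.jl and placeholder data:

    using InducingPoints
    using KernelFunctions

    X = [rand(2) for _ in 1:300]
    Z = inducingpoints(StdDPP(), X; kernel=SqExponentialKernel())
    length(Z)   # not fixed in advance, but guaranteed non-empty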
InducingPoints.StreamKmeans — Type

    StreamKmeans(m_target::Int)

Online clustering algorithm [1] to select inducing points in a streaming setting.

[1] Liberty, E., Sriharsha, R. & Sviridenko, M. An Algorithm for Online K-Means Clustering. arXiv:1412.5721 (2015).
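A minimal sketch of the streaming workflow, with placeholder data batches and assuming the update goes through updateZ!:

    using InducingPoints

    alg = StreamKmeans(20)          # target number of centers
    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(alg, X)      # initialize from the first batch
    X_new = [rand(2) for _ in 1:200]
    updateZ!(Z, alg, X_new)         # update as new data streams in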
InducingPoints.UniGrid — Type

    UniGrid(m::Int)

where m is the number of points along each dimension. Adaptive uniform grid based on [1]. The resulting inducing points are stored in the memory-efficient custom type UniformGrid.
[1] Moreno-Muñoz, P., Artés-Rodríguez, A. & Álvarez, M. A. Continual Multi-task Gaussian Processes. (2019).
InducingPoints.UniformGrid — Type

    UniformGrid{T,Titer} <: AbstractVector{T}

A memory-efficient custom object wrapping an Iterators.ProductIterator. It supports broadcasting and other relevant array methods, and avoids explicitly computing all points on the grid.
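A minimal sketch with placeholder 2-D data, illustrating the lazy UniformGrid return type:

    using InducingPoints

    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(UniGrid(5), X)   # 5 points per dimension over the data range
    length(Z)                           # 25 for 2-D data; Z is a lazy UniformGrid
    collect(Z)                          # materialize the grid points if needed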
InducingPoints.Webscale — Type

    Webscale(m::Int)

Online k-means algorithm based on [1].
[1] Sculley, D. Web-scale k-means clustering. in Proceedings of the 19th international conference on World wide web - WWW ’10 1177 (ACM Press, 2010). doi:10.1145/1772690.1772862.
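A minimal sketch with placeholder data batches, assuming the online update goes through updateZ!:

    using InducingPoints

    alg = Webscale(20)
    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(alg, X)        # initial centers
    X_new = [rand(2) for _ in 1:200]
    updateZ!(Z, alg, X_new)           # mini-batch style online update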
InducingPoints.kDPP — Type

    kDPP(m::Int)

k-DPP (Determinantal Point Process) will return a subset of X of size m, according to the DPP probability. The kernel is passed as a keyword argument to inducingpoints.
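A minimal sketch, assuming a kernel from KernelFunctions.jl and placeholder data:

    using InducingPoints
    using KernelFunctions

    X = [rand(2) for _ in 1:300]
    Z = inducingpoints(kDPP(20), X; kernel=SqExponentialKernel())  # exactly 20 points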
InducingPoints.find_nearest_center — Function

Find the closest center to X among Z; return the index and the distance.
InducingPoints.inducingpoints — Function

    inducingpoints([rng::AbstractRNG], alg::AIPSA, X::AbstractVector; [kwargs...])
    inducingpoints([rng::AbstractRNG], alg::AIPSA, X::AbstractMatrix; obsdim=1, [kwargs...])

Select inducing points according to the algorithm alg. For some algorithms, additional keyword arguments are required.
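A minimal sketch of both input forms, using KmeansAlg as an arbitrary example algorithm and placeholder data:

    using InducingPoints

    X_vec = [rand(2) for _ in 1:500]                    # vector of data points
    Z1 = inducingpoints(KmeansAlg(20), X_vec)
    X_mat = rand(500, 2)                                # matrix with observations in rows
    Z2 = inducingpoints(KmeansAlg(20), X_mat; obsdim=1)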
InducingPoints.inducingpoints — Method

    inducingpoints([rng::AbstractRNG], alg::Greedy, X::AbstractVector;
        y::AbstractVector, kernel::Kernel, noise::Real)
    inducingpoints([rng::AbstractRNG], alg::Greedy, X::AbstractMatrix;
        obsdim=1, y::AbstractVector, kernel::Kernel, noise::Real)

Select inducing points using the Greedy algorithm. Requires the outputs y, the kernel, and the noise as additional keyword arguments.
InducingPoints.inducingpoints — Method

    inducingpoints(rng::AbstractRNG, alg::GreedyVarSelection, x::AbstractVector; kernel::Kernel)

See GreedyVarSelection for more info. rng isn't actually used here.
InducingPoints.inducingpoints — Method

    inducingpoints([rng::AbstractRNG], alg::OIPS, X::AbstractVector; kernel::Kernel)
    inducingpoints([rng::AbstractRNG], alg::OIPS, X::AbstractMatrix; obsdim=1, kernel::Kernel)

Select inducing points using Online Inducing Points Selection. Requires the kernel as an additional keyword argument.
InducingPoints.inducingpoints — Method

    inducingpoints([rng::AbstractRNG], alg::RandomSubset, X::AbstractVector; [weights::Vector=nothing])
    inducingpoints([rng::AbstractRNG], alg::RandomSubset, X::AbstractMatrix; obsdim=1, [weights::Vector=nothing])

Select inducing points by taking a random subset. Optionally accepts a weight vector for the inputs X.
InducingPoints.inducingpoints — Method

    inducingpoints([rng::AbstractRNG], alg::SeqDPP, X::AbstractVector; kernel::Kernel)
    inducingpoints([rng::AbstractRNG], alg::SeqDPP, X::AbstractMatrix; obsdim=1, kernel::Kernel)

Select inducing points using Sequential Determinantal Point Processes. Requires passing a kernel::Kernel as a keyword argument.
InducingPoints.kmeans_seeding — Method

Fast and efficient seeding for k-means, based on "Fast and Provably Good Seedings for k-Means".
InducingPoints.partial_pivoted_cholesky — Method

    partial_pivoted_cholesky(C::AbstractMatrix{<:Real}, M::Int, tol::Real)

Computes the partial pivoted Cholesky factorisation of C. Runs for at most M steps, but stops early if the threshold provided by tol is met.
This implementation is directly translated from Algorithm 1 in [1].
[1] L. Foster, A. Waagen, N. Aijaz, M. Hurley, A. Luis, J. Rinsky, C. Satyavolu, M. J. Way, P. Gazis, and A. Srivastava. Stable and Efficient Gaussian Process Calculations. Journal of Machine Learning Research, 2009.
InducingPoints.updateZ — Function

    updateZ([rng::AbstractRNG], Z::AbstractVector, alg::OnIPSA, X::AbstractVector; kwargs...)

Return a new vector of inducing points, built from Z and the data X with algorithm alg, without changing the original Z.
InducingPoints.updateZ! — Function

    updateZ!([rng::AbstractRNG], Z::AbstractVector, alg::OnIPSA, X::AbstractVector; [kwargs...])

Update the inducing points Z in place with data X and algorithm alg. Requires additional keyword arguments for some algorithms. See also inducingpoints.
InducingPoints.updateZ! — Method

    updateZ!([rng::AbstractRNG], Z::AbstractVector, alg::OIPS, X::AbstractVector; kernel::Kernel)

Update inducing points Z with data X and the OnlineIPSelection algorithm. Requires the kernel as an additional keyword argument.
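A minimal sketch contrasting updateZ and updateZ!, assuming a kernel from KernelFunctions.jl and placeholder data batches:

    using InducingPoints
    using KernelFunctions

    k = SqExponentialKernel()
    alg = OIPS(0.8)
    X = [rand(2) for _ in 1:500]
    Z = inducingpoints(alg, X; kernel=k)

    X_new = [rand(2) for _ in 1:100]
    Z_updated = updateZ(Z, alg, X_new; kernel=k)   # returns a new vector, Z unchanged
    updateZ!(Z, alg, X_new; kernel=k)              # updates Z in place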