API
InducingPoints.CoverTree
— TypeCoverTree(ϵ::Real, lloyds::Bool=true, voronoi::Bool=true, metric::SemiMetric=Euclidean())
The CoverTree algorithm [1], recursively builds a tree for which the node will optimally cover the given dataset according to the metric
distance.
Arguments:
ϵ::Real
: Spatial resolution. Higherϵ
will result in less points.lloyds::Bool
: Use the centroid of the ball created around the sampled datapoint instead of the datapoint itself if no other inducing point is closevoronoi::Bool
: Reattributes samples to each node at the end of the proecssing of each layermetric::SemiMetric
: Distance metric used to determine distance between points.
[1] Alexander Terenin, David R. Burt, Artem Artemev, Seth Flaxman, Mark van der Wilk, Carl Edward Rasmussen, Hong Ge: Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees: https://arxiv.org/abs/2210.07893
InducingPoints.Greedy
— TypeGreedy(m::Int, s::Int)
m
is the desired number of inducing pointss
is the minibatch size on which to select a new inducing point
Greedy approach first proposed by Titsias[1]. Algorithm loops over minibatches of data and select the best ELBO improvement. Requires passing outputs y
, the kernel
and the noise
as keyword arguments to inducingpoints
.
[1] Titsias, M. Variational Learning of Inducing Variables in Sparse Gaussian Processes. Aistats 5, 567–574 (2009).
InducingPoints.GreedyVarSelection
— TypeGreedyVarSelection(m::Int)
Greedy variance minimization approach to inducing point selection. Originally proposed by [1], and revisited more recently by [2]. m
is the desired number of inducing points.
[1] - L. Foster, A. Waagen, N. Aijaz, M. Hurley, A. Luis, J. Rinsky, C. Satyavolu, M. J. Way, P. Gazis, and A. Srivastava. Stable and Efficient Gaussian Process Calculations. Journal of Machine Learning Research, 2009. [2] - D. R. Burt, C. E. Rasmussen, and M. van der Wilk. Convergence of Sparse Variational Inference in Gaussian Processes Regression. Journal of Machine Learning Research, 2020.
InducingPoints.KmeansAlg
— TypeKmeansAlg(m::Int, metric::SemiMetric=SqEuclidean(); nMarkov = 10, tol = 1e-3)
k-Means [1] initialization on the data X
taking m
inducing points. The seeding is computed via [2], nMarkov
gives the number of MCMC steps for the seeding.
Arguments
k::Int
: Number of inducing pointsmetric::SemiMetric
: Metric used to compute the distances for the k-means algorithm
Keyword Arguments
nMarkov::Int
: Number of random steps for the seedingtol::Real
: Tolerance for thekmeans
algorithm
[1] Arthur, D. & Vassilvitskii, S. k-means++: The advantages of careful seeding. in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms 1027–1035 (Society for Industrial and Applied Mathematics, 2007). [2] Bachem, O., Lucic, M., Hassani, S. H. & Krause, A. Fast and Provably Good Seedings for k-Means. Advances in Neural Information Processing Systems 29 55–63 (2016) doi:10.1109/tmtt.2005.863818.
InducingPoints.OnlineIPSelection
— TypeOIPS(ρ_accept=0.8; η=0.95, kmax=Inf, kmin=10, ρ_remove=Inf)
OIPS(kmax, η=0.98, kmin=10)
Online Inducing Points Selection. Method from [1]. Requires passing the kernel
as keyword argument to inducingpoints
.
[1] Galy-Fajou, T. & Opper, M Adaptive Inducing Points Selection for Gaussian Processes. arXiv:2107.10066v1 (2021).
InducingPoints.RandomSubset
— TypeRandomSubset(m::Int)
Arguments
m::Int
: Number of inducing points
Uniform sampling of a subset of m
points of the data.
InducingPoints.SeqDPP
— TypeSeqDPP()
Sequential sampling via Determinantal Point Processes. Requires passing a kernel::Kernel
as a keyword argument to inducingpoints
.
InducingPoints.StdDPP
— TypeStdDPP()
Standard DPP (Determinantal Point Process) sampling. The size of the returned Z
is not fixed (but is not allowed to be empty unlike in a classical DPP). The kernel is passed as a keyword argument to inducingpoints
.
InducingPoints.StreamKmeans
— TypeStreamKmeans(m_target::Int)
Online clustering algorithm [1] to select inducing points in a streaming setting. Reference : [1] Liberty, E., Sriharsha, R. & Sviridenko, M. An Algorithm for Online K-Means Clustering. arXiv:1412.5721 (2015).
InducingPoints.UniGrid
— TypeUniGrid(m::Int)
where m
is the number of points on each dimension. Adaptive uniform grid based on [1]. The resulting inducing points are stored in the memory-efficient custom type UniformGrid
.
[1] Moreno-Muñoz, P., Artés-Rodríguez, A. & Álvarez, M. A. Continual Multi-task Gaussian Processes. (2019).
InducingPoints.UniformGrid
— TypeUniformGrid{T,Titer} <: AbstractVector{T}
A memory-efficient custom object representing a wrapper around a Iterators.ProductIterator
. Supports broadcasting and other relevant array methods, and avoids explicitly computing all points on the grid.
InducingPoints.Webscale
— TypeWebscale(m::Int)
Online k-means algorithm based on [1].
[1] Sculley, D. Web-scale k-means clustering. in Proceedings of the 19th international conference on World wide web - WWW ’10 1177 (ACM Press, 2010). doi:10.1145/1772690.1772862.
InducingPoints.kDPP
— TypekDPP(m::Int)
k-DPP (Determinantal Point Process) will return a subset of X
of size m
, according to DPP probability. The kernel is passed as a keyword argument to inducingpoints
.
InducingPoints.find_nearest_center
— FunctionFind the closest center to X among Z, return the index and the distance
InducingPoints.inducingpoints
— Function inducingpoints([rng::AbstractRNG], alg::AIPSA, X::AbstractVector; [kwargs...])
inducingpoints([rng::AbstractRNG], alg::AIPSA, X::AbstractMatrix; obsdim=1, [kwargs...])
Select inducing points according to the algorithm alg
. For some algorithms, additional keyword arguments are required.
InducingPoints.inducingpoints
— Method inducingpoints([rng::AbstractRNG], alg::Greedy, X::AbstractVector;
y::AbstractVector, kernel::Kernel, noise::Real)
inducingpoints([rng::AbstractRNG], alg::Greedy, X::AbstractMatrix;
obsdim=1, y::AbstractVector, kernel::Kernel, noise::Real)
Select inducing points according using the Greedy algorithm. Requires as additional keyword arguments the outputs y
, the kernel
and the noise
.
InducingPoints.inducingpoints
— Methodinducingpoints(
rng::AbstractRNG, alg::GreedyVarSelection, x::AbstractVector; kernel::Kernel
)
See GreedyVarSelection
for more info. rng
isn't actually used here.
InducingPoints.inducingpoints
— Method inducingpoints([rng::AbstractRNG], alg::OIPS, X::AbstractVector; kernel::Kernel)
inducingpoints([rng::AbstractRNG], alg::OIPS, X::AbstractMatrix; obsdim=1, kernel::Kernel)
Select inducing points according using Online Inducing Points Selection. Requires as additional keyword argument the kernel
.
InducingPoints.inducingpoints
— Method inducingpoints([rng::AbstractRNG], alg::RandomSubset, X::AbstractVector; [weights::Vector=nothing])
inducingpoints([rng::AbstractRNG], alg::RandomSubset, X::AbstractMatrix; obsdim=1, [weights::Vector=nothing])
Select inducing points by taking a Random Subset. Optionally accepts a weight vector for the inputs X
.
InducingPoints.inducingpoints
— Method inducingpoints([rng::AbstractRNG], alg::SeqDPP, X::AbstractVector; kernel::Kernel)
inducingpoints([rng::AbstractRNG], alg::SeqDPP, X::AbstractMatrix; obsdim=1, kernel::Kernel)
Select inducing points according using Sequential Determinantal Point Processes. Requires passing a kernel::Kernel
as a keyword argument.
InducingPoints.kmeans_seeding
— MethodFast and efficient seeding for KMeans based on Fast and Provably Good Seeding for k-Means
InducingPoints.partial_pivoted_cholesky
— Methodpartial_pivoted_cholesky(C::AbstractMatrix{<:Real}, M::Int, tol::Real)
Computes the partial pivoted Cholesky factorisation of C
. Runs for at most M
steps, but will stop if the threshold provided by tol
is met.
This implementation is directly translated from Algorithm 1 in [1].
[1] - L. Foster, A. Waagen, N. Aijaz, M. Hurley, A. Luis, J. Rinsky, C. Satyavolu, M. J. Way, P. Gazis, and A. Srivastava. Stable and Efficient Gaussian Process Calculations. Journal of Machine Learning Research, 2009.
InducingPoints.updateZ
— FunctionupdateZ([rng::AbstractRNG], Z::AbstractVector, alg::OnIPSA, X::AbstractVector; kwargs...)
Return new vector of inducing points Z
with data X
and algorithm alg
without changing the original one
InducingPoints.updateZ!
— FunctionupdateZ!([rng::AbstractRNG], Z::AbstractVector, alg::OnIPSA, X::AbstractVector; [kwargs...])
Update inducing points Z
with data X
and algorithm alg
. Requires additional keyword arguments for some algorithms. Also see InducingPoints
.
InducingPoints.updateZ
— MethodupdateZ!([rng::AbstractRNG], Z::AbstractVector, alg::OIPS, X::AbstractVector; kernel::Kernel)
Update inducing points Z
with data X
and the OnlineIPSelection algorithm. Requires the kernel
as an additional keyword argument.