Intro
InducingPoints.jl aims to provide an easy way to select inducing point locations for sparse Gaussian processes, in both online and offline settings. Inducing points are used most prominently in sparse GP regression (see e.g. ApproximateGPs.jl).
Quickstart
InducingPoints.jl provides the following list of algorithms. For details on the specific usage see the algorithms section.
All algorithms inherit from AbstractInducingPointsSelection (or AIPSA), and an algorithm instance can be passed to the different APIs.
Offline Inducing Points Selection
These algorithms are designed to compute inducing points for a data set that is likely to remain unchanged. If the data set changes, the algorithms have to be rerun from scratch.
alg = KmeansAlg(10)
Z = inducingpoints(alg, X; kwargs...)
The Offline options are:
- KmeansAlg: Use the k-means algorithm to select centroids minimizing the squared distance to the data set. The seeding is done via k-means++. Note that the inducing points are not going to be a subset of the data.
- kDPP: Sample from a k-Determinantal Point Process to select k points. Z will be a subset of X.
- StdDPP: Sample from a standard Determinantal Point Process. The number of inducing points is not fixed here. Z will be a subset of X.
- RandomSubset: Sample k points uniformly at random from the data set.
- Greedy: Select a subset of X which maximizes the ELBO (in a stochastic way).
- CoverTree: Build a tree to select the optimal nodes covering the data.
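To make the "subset of X" distinction in the list above concrete, here is a minimal sketch in plain Julia. It does not call InducingPoints.jl: the helper names random_subset_sketch, kmeanspp_seeding_sketch and kmeans_sketch are hypothetical, the data is assumed to be stored one point per column, and the Lloyd iterations are only an illustration of the general k-means idea, not the package's implementation.

```julia
using Random

# Hypothetical helper (not part of InducingPoints.jl):
# pick k distinct columns of X uniformly at random, so Z is a subset of X.
function random_subset_sketch(rng::AbstractRNG, X::AbstractMatrix, k::Integer)
    idx = randperm(rng, size(X, 2))[1:k]
    return X[:, idx]
end

# Hypothetical helper: k-means++ style seeding. Each new seed is drawn with
# probability proportional to its squared distance to the nearest seed so far.
function kmeanspp_seeding_sketch(rng::AbstractRNG, X::AbstractMatrix, k::Integer)
    n = size(X, 2)
    chosen = [rand(rng, 1:n)]
    d2 = [sum(abs2, X[:, j] - X[:, chosen[1]]) for j in 1:n]
    while length(chosen) < k
        r = rand(rng) * sum(d2)
        acc = 0.0
        next = argmax(d2)          # fallback guards against rounding issues
        for j in 1:n
            acc += d2[j]
            if acc >= r && d2[j] > 0
                next = j
                break
            end
        end
        push!(chosen, next)
        for j in 1:n
            d2[j] = min(d2[j], sum(abs2, X[:, j] - X[:, next]))
        end
    end
    return X[:, chosen]
end

# Hypothetical helper: a few Lloyd iterations starting from the seeds.
# The centroids are cluster means, so in general they are not data points.
function kmeans_sketch(rng::AbstractRNG, X::AbstractMatrix, k::Integer; iters::Integer=10)
    Z = kmeanspp_seeding_sketch(rng, X, k)
    n = size(X, 2)
    for _ in 1:iters
        assign = [argmin([sum(abs2, X[:, j] - Z[:, c]) for c in 1:k]) for j in 1:n]
        for c in 1:k
            members = findall(==(c), assign)
            if !isempty(members)
                Z[:, c] = vec(sum(X[:, members]; dims=2)) / length(members)
            end
        end
    end
    return Z
end

rng = MersenneTwister(42)
X = randn(rng, 2, 200)                       # 200 points in 2 dimensions
Z_subset = random_subset_sketch(rng, X, 10)  # columns taken from X
Z_kmeans = kmeans_sketch(rng, X, 10)         # centroids, generally not in X
```

Both results are 2×10 matrices. Every column of Z_subset is also a column of X, while the columns of Z_kmeans are averages of nearby data points.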
Online Inducing Points Selection
Online selection algorithms compute an initial set similarly to the offline methods via inducingpoints. For successive changes of the data sets, InducingPoints.jl allows for efficient updating via updateZ!.
alg = OIPS()
Z = inducingpoints(alg, x_1; kwargs...)
for x in eachbatch(X)
    updateZ!(Z, alg, x; kwargs...)
end
The Online options are:
- OnlineIPSelection: A method based on the distance between inducing points and data.
- UniGrid: A regularly spaced grid whose edges are adapted to the data. Uses the memory-efficient custom type UniformGrid.
- SeqDPP: Sequential Determinantal Point Processes. Subsets are regularly sampled from the new data batches, conditioned on the existing inducing points.
- StreamKmeans: An online version of k-means.
- Webscale: Another online version of k-means.
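The general shape of a distance-based online update can be sketched in plain Julia as follows. This is an illustration only: update_sketch! and the Euclidean radius threshold are assumptions made for the example, and they are not the rule used by OnlineIPSelection or by updateZ!.

```julia
# Hypothetical helper (not the package's updateZ!): append every point of the
# batch that lies farther than `radius` from all current inducing points.
function update_sketch!(Z::Vector{Vector{Float64}}, batch, radius::Real)
    for x in batch
        # keep x only if no existing inducing point is within `radius`
        if all(z -> sqrt(sum(abs2, x - z)) > radius, Z)
            push!(Z, Vector{Float64}(x))
        end
    end
    return Z
end

Z = [[0.0, 0.0]]                      # initial inducing point set
batches = [
    [[0.1, 0.0], [2.0, 0.0]],         # first batch of new data
    [[2.1, 0.1], [0.0, 3.0]],         # second batch of new data
]
for batch in batches
    update_sketch!(Z, batch, 1.0)     # Z grows in place as batches arrive
end
```

Points close to an existing inducing point are skipped, so Z grows only where new batches reach regions that are not yet covered. Here Z ends with three points: the initial one, [2.0, 0.0] and [0.0, 3.0].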
Index
- InducingPoints.CoverTree
- InducingPoints.Greedy
- InducingPoints.GreedyVarSelection
- InducingPoints.KmeansAlg
- InducingPoints.OnlineIPSelection
- InducingPoints.RandomSubset
- InducingPoints.SeqDPP
- InducingPoints.StdDPP
- InducingPoints.StreamKmeans
- InducingPoints.UniGrid
- InducingPoints.UniformGrid
- InducingPoints.Webscale
- InducingPoints.kDPP
- InducingPoints.find_nearest_center
- InducingPoints.inducingpoints
- InducingPoints.kmeans_seeding
- InducingPoints.partial_pivoted_cholesky
- InducingPoints.updateZ
- InducingPoints.updateZ!