Kernel Functions
Base Kernels
These are the basic kernels without any transformation of the data. They are the building blocks of KernelFunctions.
Constant Kernels
KernelFunctions.ZeroKernel — Type
ZeroKernel()
Zero kernel.
Definition
For inputs $x, x'$, the zero kernel is defined as
\[k(x, x') = 0.\]
The output type depends on $x$ and $x'$.
See also: ConstantKernel
KernelFunctions.ConstantKernel — Type
ConstantKernel(; c::Real=1.0)
Kernel of constant value c.
Definition
For inputs $x, x'$, the kernel of constant value $c \geq 0$ is defined as
\[k(x, x') = c.\]
See also: ZeroKernel
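Both constant kernels are callable like any other kernel; a quick check against the definitions above:
julia> ZeroKernel()(0.3, 0.5)
0.0
julia> ConstantKernel(; c=2.0)(0.3, 0.5)
2.0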
KernelFunctions.WhiteKernel — Type
WhiteKernel()
White noise kernel.
Definition
For inputs $x, x'$, the white noise kernel is defined as
\[k(x, x') = \delta(x, x').\]
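For example, the kernel is 1 on identical inputs and 0 otherwise:
julia> WhiteKernel()(1.0, 1.0)
1.0
julia> WhiteKernel()(1.0, 2.0)
0.0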
KernelFunctions.EyeKernel — Type
EyeKernel()
Alias of WhiteKernel.
Cosine Kernel
KernelFunctions.CosineKernel — Type
CosineKernel(; metric=Euclidean())
Cosine kernel with respect to the metric.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the cosine kernel is defined as
\[k(x, x') = \cos(\pi d(x, x')).\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
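For example, at distance $d = 1$ the kernel attains $\cos(\pi) = -1$:
julia> CosineKernel()(0.0, 1.0) ≈ -1
true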
Exponential Kernels
KernelFunctions.ExponentialKernel — Type
ExponentialKernel(; metric=Euclidean())
Exponential kernel with respect to the metric.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the exponential kernel is defined as
\[k(x, x') = \exp\big(- d(x, x')\big).\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
See also: GammaExponentialKernel
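A lengthscale can be added by rescaling the inputs; this sketch assumes KernelFunctions' with_lengthscale helper, where with_lengthscale(k, l) is equivalent to k ∘ ScaleTransform(1 / l):
julia> k = with_lengthscale(ExponentialKernel(), 2.0);
julia> k(0.0, 2.0) ≈ exp(-1)
true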
KernelFunctions.GibbsKernel — Type
GibbsKernel(; lengthscale)
Gibbs kernel with lengthscale function lengthscale.
The Gibbs kernel is a non-stationary generalisation of the squared exponential kernel. The lengthscale parameter $l$ becomes a function of position $l(x)$.
Definition
For inputs $x, x'$, the Gibbs kernel with lengthscale function $l(\cdot)$ is defined as
\[k(x, x'; l) = \sqrt{\frac{2 l(x) l(x')}{l(x)^2 + l(x')^2}} \exp\left(-\frac{(x - x')^2}{l(x)^2 + l(x')^2}\right).\]
For a constant function $l \equiv c$, one recovers the SqExponentialKernel with lengthscale c.
References
Mark N. Gibbs. "Bayesian Gaussian Processes for Regression and Classification." PhD thesis, 1997
Christopher J. Paciorek and Mark J. Schervish. "Nonstationary Covariance Functions for Gaussian Process Regression". NeurIPS, 2003
Sami Remes, Markus Heinonen, Samuel Kaski. "Non-Stationary Spectral Kernels". arXiv:1705.08736, 2017
Sami Remes, Markus Heinonen, Samuel Kaski. "Neural Non-Stationary Spectral Kernel". arXiv:1811.10978, 2018
KernelFunctions.LaplacianKernel — Type
LaplacianKernel()
Alias of ExponentialKernel.
KernelFunctions.SqExponentialKernel — Type
SqExponentialKernel(; metric=Euclidean())
Squared exponential kernel with respect to the metric.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the squared exponential kernel is defined as
\[k(x, x') = \exp\bigg(- \frac{d(x, x')^2}{2}\bigg).\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
See also: GammaExponentialKernel
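For example, at Euclidean distance $d = 2$:
julia> SqExponentialKernel()(0.0, 2.0) ≈ exp(-2)
true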
KernelFunctions.SEKernel — Type
SEKernel()
Alias of SqExponentialKernel.
KernelFunctions.GaussianKernel — Type
GaussianKernel()
Alias of SqExponentialKernel.
KernelFunctions.RBFKernel — Type
RBFKernel()
Alias of SqExponentialKernel.
KernelFunctions.GammaExponentialKernel — Type
GammaExponentialKernel(; γ::Real=1.0, metric=Euclidean())
γ-exponential kernel with respect to the metric and with parameter γ.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the γ-exponential kernel[RW] with parameter $\gamma \in (0, 2]$ is defined as
\[k(x, x'; \gamma) = \exp\big(- d(x, x')^{\gamma}\big).\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
See also: ExponentialKernel, SqExponentialKernel
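For $\gamma = 1$, the γ-exponential kernel coincides with the exponential kernel:
julia> GammaExponentialKernel(; γ=1.0)(0.0, 2.0) ≈ ExponentialKernel()(0.0, 2.0)
true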
Exponentiated Kernel
KernelFunctions.ExponentiatedKernel — Type
ExponentiatedKernel()
Exponentiated kernel.
Definition
For inputs $x, x' \in \mathbb{R}^d$, the exponentiated kernel is defined as
\[k(x, x') = \exp(x^\top x').\]
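For example, for scalar inputs with $x^\top x' = 2$:
julia> ExponentiatedKernel()(1.0, 2.0) ≈ exp(2)
true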
Fractional Brownian Motion Kernel
KernelFunctions.FBMKernel — Type
FBMKernel(; h::Real=0.5)
Fractional Brownian motion kernel with Hurst index h.
Definition
For inputs $x, x' \in \mathbb{R}^d$, the fractional Brownian motion kernel with Hurst index $h \in [0,1]$ is defined as
\[k(x, x'; h) = \frac{\|x\|_2^{2h} + \|x'\|_2^{2h} - \|x - x'\|_2^{2h}}{2}.\]
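For $h = 1/2$ and nonnegative scalar inputs the kernel reduces to $\min(x, x')$, the covariance of standard Brownian motion:
julia> FBMKernel(; h=0.5)([1.0], [3.0]) ≈ 1.0
true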
Gabor Kernel
KernelFunctions.gaborkernel — Function
gaborkernel(;
    sqexponential_transform=IdentityTransform(), cosine_transform=IdentityTransform()
)
Construct a Gabor kernel with transformations sqexponential_transform and cosine_transform of the inputs of the underlying squared exponential and cosine kernel, respectively.
Definition
For inputs $x, x' \in \mathbb{R}^d$, the Gabor kernel with transformations $f$ and $g$ of the inputs to the squared exponential and cosine kernel, respectively, is defined as
\[k(x, x'; f, g) = \exp\bigg(- \frac{\| f(x) - f(x')\|_2^2}{2}\bigg) \cos\big(\pi \|g(x) - g(x')\|_2 \big).\]
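With the default identity transformations, the Gabor kernel is the product of a squared exponential and a cosine kernel, which can be verified directly:
julia> x, y = [1.0], [2.0];
julia> gaborkernel()(x, y) ≈ SqExponentialKernel()(x, y) * CosineKernel()(x, y)
true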
Matérn Kernels
KernelFunctions.MaternKernel — Type
MaternKernel(; ν::Real=1.5, metric=Euclidean())
Matérn kernel of order ν with respect to the metric.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the Matérn kernel of order $\nu > 0$ is defined as
\[k(x,x';\nu) = \frac{2^{1-\nu}}{\Gamma(\nu)}\big(\sqrt{2\nu} d(x, x')\big)^{\nu} K_\nu\big(\sqrt{2\nu} d(x, x')\big),\]
where $\Gamma$ is the Gamma function and $K_{\nu}$ is the modified Bessel function of the second kind of order $\nu$. By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
A Gaussian process with a Matérn kernel is $\lceil \nu \rceil - 1$-times differentiable in the mean-square sense.
Differentiation with respect to the order ν is not currently supported.
See also: Matern12Kernel, Matern32Kernel, Matern52Kernel
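For half-integer orders, the Matérn kernel agrees with the closed forms of its aliases, e.g.:
julia> MaternKernel(; ν=0.5)(0.0, 1.0) ≈ Matern12Kernel()(0.0, 1.0)
true
julia> MaternKernel(; ν=1.5)(0.0, 1.0) ≈ Matern32Kernel()(0.0, 1.0)
true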
KernelFunctions.Matern12Kernel — Type
Matern12Kernel()
Alias of ExponentialKernel.
KernelFunctions.Matern32Kernel — Type
Matern32Kernel(; metric=Euclidean())
Matérn kernel of order $3/2$ with respect to the metric.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the Matérn kernel of order $3/2$ is given by
\[k(x, x') = \big(1 + \sqrt{3} d(x, x') \big) \exp\big(- \sqrt{3} d(x, x') \big).\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
See also: MaternKernel
KernelFunctions.Matern52Kernel — Type
Matern52Kernel(; metric=Euclidean())
Matérn kernel of order $5/2$ with respect to the metric.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the Matérn kernel of order $5/2$ is given by
\[k(x, x') = \bigg(1 + \sqrt{5} d(x, x') + \frac{5}{3} d(x, x')^2\bigg) \exp\big(- \sqrt{5} d(x, x') \big).\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
See also: MaternKernel
Neural Network Kernel
KernelFunctions.NeuralNetworkKernel — Type
NeuralNetworkKernel()
Kernel of a Gaussian process obtained as the limit of a Bayesian neural network with a single hidden layer as the number of units goes to infinity.
Definition
Consider the single-layer Bayesian neural network $f \colon \mathbb{R}^d \to \mathbb{R}$ with $h$ hidden units defined by
\[f(x; b, v, u) = b + \sqrt{\frac{\pi}{2}} \sum_{i=1}^{h} v_i \mathrm{erf}\big(u_i^\top x\big),\]
where $\mathrm{erf}$ is the error function, and with prior distributions
\[\begin{aligned} b &\sim \mathcal{N}(0, \sigma_b^2),\\ v &\sim \mathcal{N}(0, \sigma_v^2 \mathrm{I}_{h}/h),\\ u_i &\sim \mathcal{N}(0, \mathrm{I}_{d}/2) \qquad (i = 1,\ldots,h). \end{aligned}\]
As $h \to \infty$, the neural network converges to the Gaussian process
\[g(\cdot) \sim \mathcal{GP}\big(0, \sigma_b^2 + \sigma_v^2 k(\cdot, \cdot)\big),\]
where the neural network kernel $k$ is given by
\[k(x, x') = \arcsin\left(\frac{x^\top x'}{\sqrt{\big(1 + \|x\|^2_2\big) \big(1 + \|x'\|_2^2\big)}}\right)\]
for inputs $x, x' \in \mathbb{R}^d$.[CW]
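For instance, orthogonal inputs have zero covariance under this kernel since $x^\top x' = 0$:
julia> isapprox(NeuralNetworkKernel()([1.0, 0.0], [0.0, 1.0]), 0; atol=1e-12)
true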
Periodic Kernel
KernelFunctions.PeriodicKernel — Type
PeriodicKernel(; r::AbstractVector=ones(Float64, 1))
Periodic kernel with parameter r.
Definition
For inputs $x, x' \in \mathbb{R}^d$, the periodic kernel with parameter $r_i > 0$ is defined[DM] as
\[k(x, x'; r) = \exp\bigg(- \frac{1}{2} \sum_{i=1}^d \bigg(\frac{\sin\big(\pi(x_i - x'_i)\big)}{r_i}\bigg)^2\bigg).\]
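With $r_i = 1$, the kernel has period 1 in each coordinate, so shifting an input by an integer leaves the covariance unchanged:
julia> k = PeriodicKernel(; r=[1.0]);
julia> k([0.2], [1.2]) ≈ k([0.2], [0.2]) ≈ 1.0
true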
KernelFunctions.PeriodicKernel — Method
PeriodicKernel([T=Float64, dims::Int=1])
Create a PeriodicKernel with parameter r=ones(T, dims).
Piecewise Polynomial Kernel
KernelFunctions.PiecewisePolynomialKernel — Type
PiecewisePolynomialKernel(; dim::Int, degree::Int=0, metric=Euclidean())
PiecewisePolynomialKernel{degree}(; dim::Int, metric=Euclidean())
Piecewise polynomial kernel of degree degree for inputs of dimension dim with support in the unit ball with respect to the metric.
Definition
For inputs $x, x'$ of dimension $m$ and metric $d(\cdot, \cdot)$, the piecewise polynomial kernel of degree $v \in \{0,1,2,3\}$ is defined as
\[k(x, x'; v) = \max(1 - d(x, x'), 0)^{\alpha(v,m)} f_{v,m}(d(x, x')),\]
where $\alpha(v, m) = \lfloor \frac{m}{2}\rfloor + 2v + 1$ and $f_{v,m}$ are polynomials of degree $v$ given by
\[\begin{aligned} f_{0,m}(r) &= 1, \\ f_{1,m}(r) &= 1 + (j + 1) r, \\ f_{2,m}(r) &= 1 + (j + 2) r + \big((j^2 + 4j + 3) / 3\big) r^2, \\ f_{3,m}(r) &= 1 + (j + 3) r + \big((6 j^2 + 36j + 45) / 15\big) r^2 + \big((j^3 + 9 j^2 + 23j + 15) / 15\big) r^3, \end{aligned}\]
where $j = \lfloor \frac{m}{2}\rfloor + v + 1$. By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
The kernel is $2v$ times continuously differentiable and the corresponding Gaussian process is hence $v$ times mean-square differentiable.
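The compact support is easy to verify: the kernel vanishes whenever $d(x, x') \geq 1$. A quick check with degree 0 and two-dimensional inputs:
julia> k = PiecewisePolynomialKernel(; dim=2, degree=0);
julia> k([0.0, 0.0], [0.5, 0.0]) > 0
true
julia> k([0.0, 0.0], [2.0, 0.0]) == 0
true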
Polynomial Kernels
KernelFunctions.LinearKernel — Type
LinearKernel(; c::Real=0.0)
Linear kernel with constant offset c.
Definition
For inputs $x, x' \in \mathbb{R}^d$, the linear kernel with constant offset $c \geq 0$ is defined as
\[k(x, x'; c) = x^\top x' + c.\]
See also: PolynomialKernel
KernelFunctions.PolynomialKernel — Type
PolynomialKernel(; degree::Int=2, c::Real=0.0)
Polynomial kernel of degree degree with constant offset c.
Definition
For inputs $x, x' \in \mathbb{R}^d$, the polynomial kernel of degree $\nu \in \mathbb{N}$ with constant offset $c \geq 0$ is defined as
\[k(x, x'; c, \nu) = (x^\top x' + c)^\nu.\]
See also: LinearKernel
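Both polynomial-type kernels evaluate to simple dot-product expressions; for example, with $x^\top x' = 11$ and $c = 1$:
julia> x, y = [1.0, 2.0], [3.0, 4.0];
julia> LinearKernel(; c=1.0)(x, y) == 12
true
julia> PolynomialKernel(; degree=2, c=1.0)(x, y) == 144
true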
Rational Kernels
KernelFunctions.RationalKernel — Type
RationalKernel(; α::Real=2.0, metric=Euclidean())
Rational kernel with shape parameter α and given metric.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the rational kernel with shape parameter $\alpha > 0$ is defined as
\[k(x, x'; \alpha) = \bigg(1 + \frac{d(x, x')}{\alpha}\bigg)^{-\alpha}.\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
The ExponentialKernel is recovered in the limit as $\alpha \to \infty$.
See also: GammaRationalKernel
KernelFunctions.RationalQuadraticKernel — Type
RationalQuadraticKernel(; α::Real=2.0, metric=Euclidean())
Rational-quadratic kernel with respect to the metric and with shape parameter α.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the rational-quadratic kernel with shape parameter $\alpha > 0$ is defined as
\[k(x, x'; \alpha) = \bigg(1 + \frac{d(x, x')^2}{2\alpha}\bigg)^{-\alpha}.\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
The SqExponentialKernel is recovered in the limit as $\alpha \to \infty$.
See also: GammaRationalKernel
KernelFunctions.GammaRationalKernel — Type
GammaRationalKernel(; α::Real=2.0, γ::Real=1.0, metric=Euclidean())
γ-rational kernel with respect to the metric with shape parameters α and γ.
Definition
For inputs $x, x'$ and metric $d(\cdot, \cdot)$, the γ-rational kernel with shape parameters $\alpha > 0$ and $\gamma \in (0, 2]$ is defined as
\[k(x, x'; \alpha, \gamma) = \bigg(1 + \frac{d(x, x')^{\gamma}}{\alpha}\bigg)^{-\alpha}.\]
By default, $d$ is the Euclidean metric $d(x, x') = \|x - x'\|_2$.
The GammaExponentialKernel is recovered in the limit as $\alpha \to \infty$.
See also: RationalKernel, RationalQuadraticKernel
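The stated limits can be checked numerically: for large $\alpha$, the rational kernels approach their exponential counterparts:
julia> isapprox(RationalKernel(; α=1e6)(0.0, 1.0), ExponentialKernel()(0.0, 1.0); rtol=1e-5)
true
julia> isapprox(RationalQuadraticKernel(; α=1e6)(0.0, 1.0), SqExponentialKernel()(0.0, 1.0); rtol=1e-5)
true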
Spectral Mixture Kernels
KernelFunctions.spectral_mixture_kernel — Function
spectral_mixture_kernel(
h::Kernel=SqExponentialKernel(),
αs::AbstractVector{<:Real},
γs::AbstractMatrix{<:Real},
ωs::AbstractMatrix{<:Real},
)
Generalised Spectral Mixture kernel function. This family of functions is dense in the family of stationary real-valued kernels with respect to pointwise convergence.[1]
\[ κ(x, y) = αs' (h(-(γs' * t)^2) .* cos(π * ωs' * t)), \quad t = x - y,\]
where αs are the weights of dimension (A,), γs is the covariance matrix of dimension (D, A), and ωs is the matrix of mean vectors of dimension (D, A). Here, D is the input dimension and A is the number of spectral components.
h is the kernel, which defaults to SqExponentialKernel if not specified.
If you want to make sure that the constructor is type-stable, you should provide StaticArrays arguments: αs as a StaticVector, and γs and ωs as StaticMatrix.
References:
[1] Yves-Laurent Kom Samo and Stephen J. Roberts. "Generalized Spectral Kernels".
[2] Andrew Gordon Wilson and Ryan Prescott Adams. "Gaussian Process Kernels for Pattern Discovery and Extrapolation". ICML, 2013.
[3] Andrew Gordon Wilson. "Covariance Kernels for Fast Automatic Pattern Discovery and Extrapolation with Gaussian Processes". PhD thesis, January 2014. http://www.cs.cmu.edu/~andrewgw/andrewgwthesis.pdf
[4] http://www.cs.cmu.edu/~andrewgw/pattern/
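A minimal construction sketch with $D = 2$ input dimensions and $A = 3$ spectral components (the parameter values are arbitrary placeholders):
julia> αs = rand(3); γs = rand(2, 3); ωs = rand(2, 3);
julia> k = spectral_mixture_kernel(SqExponentialKernel(), αs, γs, ωs);
julia> k(rand(2), rand(2)) isa Real
true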
KernelFunctions.spectral_mixture_product_kernel — Function
spectral_mixture_product_kernel(
h::Kernel=SqExponentialKernel(),
αs::AbstractMatrix{<:Real},
γs::AbstractMatrix{<:Real},
ωs::AbstractMatrix{<:Real},
)
Spectral Mixture Product kernel. With enough components A, the SMP kernel can model any product kernel to arbitrary precision, and is flexible even with a small number of components.[1]
\[ κ(x, y) = Πᵢ₌₁ᴰ Σ(αsᵢᵀ .* (h(-(γsᵢᵀ * tᵢ)²) .* cos(ωsᵢᵀ * tᵢ))), \quad tᵢ = xᵢ - yᵢ,\]
where αs are the weights of dimension (D, A), γs is the covariance matrix of dimension (D, A), and ωs is the matrix of mean vectors of dimension (D, A). Here, D is the input dimension and A is the number of spectral components.
h is the kernel, which defaults to SqExponentialKernel if not specified.
References:
[1] Andrew Gordon Wilson, Elad Gilboa, Arye Nehorai, and John P. Cunningham. "GPatt: Fast Multidimensional Pattern Extrapolation with Gaussian Processes". arXiv:1310.5288, 2013.
Wiener Kernel
KernelFunctions.WienerKernel — Type
WienerKernel(; i::Int=0)
WienerKernel{i}()
The i-times integrated Wiener process kernel function.
Definition
For inputs $x, x' \in \mathbb{R}^d$, the $i$-times integrated Wiener process kernel with $i \in \{-1, 0, 1, 2, 3\}$ is defined[SDH] as
\[k_i(x, x') = \begin{cases} \delta(x, x') & \text{if } i=-1,\\ \min\big(\|x\|_2, \|x'\|_2\big) & \text{if } i=0,\\ a_{i1}^{-1} \min\big(\|x\|_2, \|x'\|_2\big)^{2i + 1} + a_{i2}^{-1} \|x - x'\|_2 r_i\big(\|x\|_2, \|x'\|_2\big) \min\big(\|x\|_2, \|x'\|_2\big)^{i + 1} & \text{otherwise}, \end{cases}\]
where the coefficients $a$ are given by
\[a = \begin{bmatrix} 3 & 2 \\ 20 & 12 \\ 252 & 720 \end{bmatrix}\]
and the functions $r_i$ are defined as
\[\begin{aligned} r_1(t, t') &= 1,\\ r_2(t, t') &= t + t' - \frac{\min(t, t')}{2},\\ r_3(t, t') &= 5 \max(t, t')^2 + 2 tt' + 3 \min(t, t')^2. \end{aligned}\]
The WhiteKernel is recovered for $i = -1$.
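For $i = 0$ one obtains the covariance of the (non-integrated) Wiener process, $\min(\|x\|_2, \|x'\|_2)$:
julia> WienerKernel()([1.0], [3.0]) ≈ 1.0
true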
Composite Kernels
The modular design of KernelFunctions uses base kernels as building blocks for more complex kernels. There are a variety of composite kernels implemented, including those which transform the inputs to a wrapped kernel to implement length scales, scale the variance of a kernel, and sum or multiply collections of kernels together.
KernelFunctions.TransformedKernel — Type
TransformedKernel(k::Kernel, t::Transform)
Kernel derived from k for which inputs are transformed via a Transform t.
The preferred way to create kernels with input transformations is to use the composition operator ∘ or its alias compose instead of TransformedKernel directly, since this allows optimized implementations for specific kernels and transformations.
See also: ∘
Base.:∘ — Method
kernel ∘ transform
∘(kernel, transform)
compose(kernel, transform)
Compose a kernel with a transformation transform of its inputs.
The prefix forms support chains of multiple transformations: ∘(kernel, transform1, transform2) = kernel ∘ transform1 ∘ transform2.
Definition
For inputs $x, x'$, the transformed kernel $\widetilde{k}$ derived from kernel $k$ by input transformation $t$ is defined as
\[\widetilde{k}(x, x'; k, t) = k\big(t(x), t(x')\big).\]
Examples
julia> (SqExponentialKernel() ∘ ScaleTransform(0.5))(0, 2) == exp(-0.5)
true
julia> ∘(ExponentialKernel(), ScaleTransform(2), ScaleTransform(0.5))(1, 2) == exp(-1)
true
See also: TransformedKernel
KernelFunctions.ScaledKernel — Type
ScaledKernel(k::Kernel, σ²::Real=1.0)
Scaled kernel derived from k by multiplication with variance σ².
Definition
For inputs $x, x'$, the scaled kernel $\widetilde{k}$ derived from kernel $k$ by multiplication with variance $\sigma^2 > 0$ is defined as
\[\widetilde{k}(x, x'; k, \sigma^2) = \sigma^2 k(x, x').\]
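A scaled kernel is typically constructed by multiplying a kernel with a scalar variance (assuming the overloaded * that KernelFunctions provides), which is equivalent to calling the constructor:
julia> k = 2.0 * SqExponentialKernel();
julia> k(0.0, 1.0) ≈ 2 * SqExponentialKernel()(0.0, 1.0)
true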
KernelFunctions.KernelSum — Type
KernelSum <: Kernel
Create a sum of kernels. One can also use the operator +.
There are various ways in which you can create a KernelSum:
The simplest way to specify a KernelSum is to use the overloaded + operator. This is equivalent to creating a KernelSum by specifying the kernels as arguments to the constructor.
julia> k1 = SqExponentialKernel(); k2 = LinearKernel(); X = rand(5);
julia> (k = k1 + k2) == KernelSum(k1, k2)
true
julia> kernelmatrix(k1 + k2, X) == kernelmatrix(k1, X) .+ kernelmatrix(k2, X)
true
julia> kernelmatrix(k, X) == kernelmatrix(k1 + k2, X)
true
You can also specify a KernelSum by providing a Tuple or a Vector of the kernels to be summed. We suggest using a Tuple when you have fewer components and a Vector when dealing with a large number of components.
julia> KernelSum((k1, k2)) == k1 + k2
true
julia> KernelSum([k1, k2]) == KernelSum((k1, k2)) == k1 + k2
true
KernelFunctions.KernelProduct — Type
KernelProduct <: Kernel
Create a product of kernels. One can also use the overloaded operator *.
There are various ways in which you can create a KernelProduct:
The simplest way to specify a KernelProduct is to use the overloaded * operator. This is equivalent to creating a KernelProduct by specifying the kernels as arguments to the constructor.
julia> k1 = SqExponentialKernel(); k2 = LinearKernel(); X = rand(5);
julia> (k = k1 * k2) == KernelProduct(k1, k2)
true
julia> kernelmatrix(k1 * k2, X) == kernelmatrix(k1, X) .* kernelmatrix(k2, X)
true
julia> kernelmatrix(k, X) == kernelmatrix(k1 * k2, X)
true
You can also specify a KernelProduct by providing a Tuple or a Vector of the kernels to be multiplied. We suggest using a Tuple when you have fewer components and a Vector when dealing with a large number of components.
julia> KernelProduct((k1, k2)) == k1 * k2
true
julia> KernelProduct([k1, k2]) == KernelProduct((k1, k2)) == k1 * k2
true
KernelFunctions.KernelTensorProduct — Type
KernelTensorProduct
Tensor product of kernels.
Definition
For inputs $x = (x_1, \ldots, x_n)$ and $x' = (x'_1, \ldots, x'_n)$, the tensor product of kernels $k_1, \ldots, k_n$ is defined as
\[k(x, x'; k_1, \ldots, k_n) = \Big(\bigotimes_{i=1}^n k_i\Big)(x, x') = \prod_{i=1}^n k_i(x_i, x'_i).\]
Construction
The simplest way to specify a KernelTensorProduct is to use the overloaded tensor operator or its alias ⊗ (can be typed by \otimes<tab>).
julia> k1 = SqExponentialKernel(); k2 = LinearKernel(); X = rand(5, 2);
julia> kernelmatrix(k1 ⊗ k2, RowVecs(X)) == kernelmatrix(k1, X[:, 1]) .* kernelmatrix(k2, X[:, 2])
true
You can also specify a KernelTensorProduct by providing kernels as individual arguments or as an iterable data structure such as a Tuple or a Vector. Using a tuple or individual arguments guarantees that the KernelTensorProduct is concretely typed but might lead to large compilation times if the number of kernels is large.
julia> KernelTensorProduct(k1, k2) == k1 ⊗ k2
true
julia> KernelTensorProduct((k1, k2)) == k1 ⊗ k2
true
julia> KernelTensorProduct([k1, k2]) == k1 ⊗ k2
true
KernelFunctions.NormalizedKernel — Type
NormalizedKernel(k::Kernel)
A normalized kernel derived from k.
Definition
For inputs $x, x'$, the normalized kernel $\widetilde{k}$ derived from kernel $k$ is defined as
\[\widetilde{k}(x, x'; k) = \frac{k(x, x')}{\sqrt{k(x, x) k(x', x')}}.\]
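By construction, a normalized kernel evaluates to 1 on identical inputs:
julia> k = NormalizedKernel(LinearKernel(; c=1.0));
julia> k([1.0, 2.0], [1.0, 2.0]) ≈ 1
true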
Multi-output Kernels
KernelFunctions implements multi-output kernels as scalar kernels on an extended input domain. For more details, read the section on inputs for multi-output GPs.
For a function $f$ with $m$ outputs, each input is augmented with an output index $p$, so that the kernel evaluated on pairs $(x, p)$ and $(x', p')$ computes the covariance between the output components $y_p$ and $y_{p'}$.
KernelFunctions.MOKernel — Type
MOKernel
Abstract type for kernels with multiple outputs.
KernelFunctions.IndependentMOKernel — Type
IndependentMOKernel(k::Kernel)
Kernel for multiple independent outputs with kernel k each.
Definition
For inputs $x, x'$ and output dimensions $p, p'$, the kernel $\widetilde{k}$ for independent outputs with kernel $k$ each is defined as
\[\widetilde{k}\big((x, p), (x', p')\big) = \begin{cases} k(x, x') & \text{if } p = p', \\ 0 & \text{otherwise}. \end{cases}\]
Mathematically, it is equivalent to a matrix-valued kernel defined as
\[\widetilde{K}(x, x') = \mathrm{diag}\big(k(x, x'), \ldots, k(x, x')\big) \in \mathbb{R}^{m \times m},\]
where $m$ is the number of outputs.
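Multi-output kernels are evaluated on (input, output index) tuples; for the independent kernel this reproduces the base kernel within an output and gives zero across outputs:
julia> k = IndependentMOKernel(SqExponentialKernel());
julia> k((0.3, 1), (0.5, 1)) == SqExponentialKernel()(0.3, 0.5)
true
julia> k((0.3, 1), (0.5, 2)) == 0
true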
KernelFunctions.LatentFactorMOKernel — Type
LatentFactorMOKernel(g::AbstractVector{<:Kernel}, e::MOKernel, A::AbstractMatrix)
Kernel associated with the semiparametric latent factor model.
Definition
For inputs $x, x'$ and output dimensions $p_x, p_{x'}$, the kernel is defined as[STJ]
\[k\big((x, p_x), (x', p_{x'})\big) = \sum^{Q}_{q=1} A_{p_x q} g_q(x, x') A_{p_{x'} q} + e\big((x, p_x), (x', p_{x'})\big),\]
where $g_1, \ldots, g_Q$ are $Q$ kernels, one for each latent process, $e$ is a multi-output kernel for $m$ outputs, and $A$ is a matrix of weights for the kernels of size $m \times Q$.
KernelFunctions.IntrinsicCoregionMOKernel — Type
IntrinsicCoregionMOKernel(; kernel::Kernel, B::AbstractMatrix)
Kernel associated with the intrinsic coregionalization model.
Definition
For inputs $x, x'$ and output dimensions $p, p'$, the kernel is defined as[ARL]
\[k\big((x, p), (x', p'); B, \tilde{k}\big) = B_{p, p'} \tilde{k}\big(x, x'\big),\]
where $B$ is a positive semidefinite matrix of size $m \times m$, with $m$ being the number of outputs, and $\tilde{k}$ is a scalar-valued kernel shared by the latent processes.
KernelFunctions.LinearMixingModelKernel — Type
LinearMixingModelKernel(k::Kernel, H::AbstractMatrix)
LinearMixingModelKernel(Tk::AbstractVector{<:Kernel}, Th::AbstractMatrix)
Kernel associated with the linear mixing model, taking a vector of Q kernels and a Q × m mixing matrix H for a function with m outputs. Also accepts a single kernel k for use across all Q basis vectors.
Definition
For inputs $x, x'$ and output dimensions $p, p'$, the kernel is defined as[BPTHST]
\[k\big((x, p), (x', p')\big) = H_{:,p}^\top K(x, x') H_{:,p'},\]
where $K(x, x') = \mathrm{Diag}\big(k_1(x, x'), \ldots, k_Q(x, x')\big)$ with zero off-diagonal entries, and $k_1, \ldots, k_Q$ are $Q$ kernels, one for each latent process. $H_{:,p}$ is the $p$-th column ($p$-th output) of the mixing matrix $H \in \mathbb{R}^{Q \times m}$, whose $Q$ basis vectors span the $m$-dimensional output space of $f$.
- [RW] C. E. Rasmussen & C. K. I. Williams (2006). Gaussian Processes for Machine Learning.
- [CW] C. K. I. Williams (1998). Computation with Infinite Neural Networks.
- [DM] D. J. C. MacKay (1998). Introduction to Gaussian Processes.
- [SDH] Schober, Duvenaud & Hennig (2014). Probabilistic ODE Solvers with Runge-Kutta Means.
- [STJ] M. Seeger, Y. Teh & M. I. Jordan (2005). Semiparametric Latent Factor Models.
- [ARL] M. Álvarez, L. Rosasco & N. Lawrence (2012). Kernels for Vector-Valued Functions: a Review.
- [BPTHST] Wessel P. Bruinsma, Eric Perim, Will Tebbutt, J. Scott Hosking, Arno Solin, Richard E. Turner (2020). Scalable Exact Inference in Multi-Output Gaussian Processes.