Kernel Functions

Base Kernels

These are the basic kernels without any transformation of the data. They are the building blocks of KernelFunctions.

Constant Kernels

KernelFunctions.WhiteKernel (Type)
WhiteKernel()

White noise kernel.

Definition

For inputs $x, x'$, the white noise kernel is defined as

\[k(x, x') = \delta(x, x').\]

source
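
A quick numerical illustration of the definition (a minimal sketch; it assumes KernelFunctions is loaded and uses the fact that kernels can be evaluated by calling them on a pair of inputs):

julia> using KernelFunctions

julia> x, y = rand(3), rand(3);

julia> WhiteKernel()(x, x) == 1
true

julia> WhiteKernel()(x, y) == 0  # distinct inputs
true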

Cosine Kernel

KernelFunctions.CosineKernel (Type)
CosineKernel()

Cosine kernel.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the cosine kernel is defined as

\[k(x, x') = \cos(\pi \|x-x'\|_2).\]

source
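
As a sanity check of the definition above, one can evaluate the kernel and compare it with the closed form (a minimal sketch, assuming KernelFunctions and LinearAlgebra are available):

julia> using KernelFunctions, LinearAlgebra

julia> x, y = rand(3), rand(3);

julia> CosineKernel()(x, y) ≈ cos(π * norm(x - y))
true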

Exponential Kernels

Exponentiated Kernel

KernelFunctions.ExponentiatedKernel (Type)
ExponentiatedKernel()

Exponentiated kernel.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the exponentiated kernel is defined as

\[k(x, x') = \exp(x^\top x').\]

source
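
The definition can be checked directly against the dot product (sketch; assumes the packages are loaded):

julia> using KernelFunctions, LinearAlgebra

julia> x, y = rand(3), rand(3);

julia> ExponentiatedKernel()(x, y) ≈ exp(dot(x, y))
true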

Fractional Brownian Motion Kernel

KernelFunctions.FBMKernel (Type)
FBMKernel(; h::Real=0.5)

Fractional Brownian motion kernel with Hurst index h.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the fractional Brownian motion kernel with Hurst index $h \in [0,1]$ is defined as

\[k(x, x'; h) = \frac{\|x\|_2^{2h} + \|x'\|_2^{2h} - \|x - x'\|_2^{2h}}{2}.\]

source
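
A short check of the closed form for a non-default Hurst index (sketch; the right-hand side is just the definition above written out with norm from LinearAlgebra):

julia> using KernelFunctions, LinearAlgebra

julia> h = 0.3; x, y = rand(3), rand(3);

julia> FBMKernel(h=h)(x, y) ≈ (norm(x)^(2h) + norm(y)^(2h) - norm(x - y)^(2h)) / 2
true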

Gabor Kernel

KernelFunctions.GaborKernel (Type)
GaborKernel(; ell::Real=1.0, p::Real=1.0)

Gabor kernel with lengthscale ell and period p.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the Gabor kernel with lengthscale $l_i > 0$ and period $p_i > 0$ is defined as

\[k(x, x'; l, p) = \exp\bigg(- \sum_{i=1}^d \frac{(x_i - x'_i)^2}{l_i^2}\bigg) \cos\bigg(\pi\sum_{i=1}^d \frac{x_i - x'_i}{p_i}\bigg).\]

source

Matérn Kernels

KernelFunctions.MaternKernel (Type)
MaternKernel(; ν::Real=1.5)

Matérn kernel of order ν.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the Matérn kernel of order $\nu > 0$ is defined as

\[k(x,x';\nu) = \frac{2^{1-\nu}}{\Gamma(\nu)}\big(\sqrt{2\nu}\|x-x'\|_2\big)^\nu K_\nu\big(\sqrt{2\nu}\|x-x'\|_2\big),\]

where $\Gamma$ is the Gamma function and $K_{\nu}$ is the modified Bessel function of the second kind of order $\nu$.

A Gaussian process with a Matérn kernel is $\lceil \nu \rceil - 1$-times differentiable in the mean-square sense.

See also: Matern12Kernel, Matern32Kernel, Matern52Kernel

source
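
For half-integer orders the general Matérn kernel should agree with the specialised implementations below; a small sketch (assuming KernelFunctions is loaded):

julia> using KernelFunctions

julia> x, y = rand(3), rand(3);

julia> MaternKernel(ν=1.5)(x, y) ≈ Matern32Kernel()(x, y)
true

julia> MaternKernel(ν=2.5)(x, y) ≈ Matern52Kernel()(x, y)
true
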
KernelFunctions.Matern32Kernel (Type)
Matern32Kernel()

Matérn kernel of order $3/2$.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the Matérn kernel of order $3/2$ is given by

\[k(x, x') = \big(1 + \sqrt{3} \|x - x'\|_2 \big) \exp\big(- \sqrt{3}\|x - x'\|_2\big).\]

See also: MaternKernel

source
KernelFunctions.Matern52Kernel (Type)
Matern52Kernel()

Matérn kernel of order $5/2$.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the Matérn kernel of order $5/2$ is given by

\[k(x, x') = \bigg(1 + \sqrt{5} \|x - x'\|_2 + \frac{5}{3}\|x - x'\|_2^2\bigg) \exp\big(- \sqrt{5}\|x - x'\|_2\big).\]

See also: MaternKernel

source
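
Both half-integer specialisations can be checked against their closed forms (sketch; assumes KernelFunctions and LinearAlgebra are loaded):

julia> using KernelFunctions, LinearAlgebra

julia> x, y = rand(3), rand(3); d = norm(x - y);

julia> Matern32Kernel()(x, y) ≈ (1 + sqrt(3) * d) * exp(-sqrt(3) * d)
true

julia> Matern52Kernel()(x, y) ≈ (1 + sqrt(5) * d + 5 * d^2 / 3) * exp(-sqrt(5) * d)
true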

Neural Network Kernel

KernelFunctions.NeuralNetworkKernel (Type)
NeuralNetworkKernel()

Kernel of a Gaussian process obtained as the limit of a Bayesian neural network with a single hidden layer as the number of units goes to infinity.

Definition

Consider the single-layer Bayesian neural network $f \colon \mathbb{R}^d \to \mathbb{R}$ with $h$ hidden units defined by

\[f(x; b, v, u) = b + \sqrt{\frac{\pi}{2}} \sum_{i=1}^{h} v_i \mathrm{erf}\big(u_i^\top x\big),\]

where $\mathrm{erf}$ is the error function, and with prior distributions

\[\begin{aligned} b &\sim \mathcal{N}(0, \sigma_b^2),\\ v &\sim \mathcal{N}(0, \sigma_v^2 \mathrm{I}_{h}/h),\\ u_i &\sim \mathcal{N}(0, \mathrm{I}_{d}/2) \qquad (i = 1,\ldots,h). \end{aligned}\]

As $h \to \infty$, the neural network converges to the Gaussian process

\[g(\cdot) \sim \mathcal{GP}\big(0, \sigma_b^2 + \sigma_v^2 k(\cdot, \cdot)\big),\]

where the neural network kernel $k$ is given by

\[k(x, x') = \arcsin\left(\frac{x^\top x'}{\sqrt{\big(1 + \|x\|^2_2\big) \big(1 + \|x'\|_2^2\big)}}\right)\]

for inputs $x, x' \in \mathbb{R}^d$.[CW]

source
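
The arcsine form can be evaluated directly; a minimal sketch, assuming the kernel implements exactly the expression above:

julia> using KernelFunctions, LinearAlgebra

julia> x, y = rand(3), rand(3);

julia> NeuralNetworkKernel()(x, y) ≈ asin(dot(x, y) / sqrt((1 + sum(abs2, x)) * (1 + sum(abs2, y))))  # matches the definition above
true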

Periodic Kernel

KernelFunctions.PeriodicKernel (Type)
PeriodicKernel(; r::AbstractVector=ones(Float64, 1))

Periodic kernel with parameter r.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the periodic kernel with parameter $r_i > 0$ is defined[DM] as

\[k(x, x'; r) = \exp\bigg(- \frac{1}{2} \sum_{i=1}^d \bigg(\frac{\sin\big(\pi(x_i - x'_i)\big)}{r_i}\bigg)^2\bigg).\]

source
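
A quick check of the definition with per-dimension parameters r (sketch; sinpi.(x - y) computes sin(π(xᵢ - x'ᵢ)) elementwise):

julia> using KernelFunctions

julia> r = [0.5, 1.0]; x, y = rand(2), rand(2);

julia> PeriodicKernel(r=r)(x, y) ≈ exp(-sum(abs2, sinpi.(x - y) ./ r) / 2)  # definition written out
true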

Piecewise Polynomial Kernel

KernelFunctions.PiecewisePolynomialKernel (Type)
PiecewisePolynomialKernel(; degree::Int=0, dim::Int)
PiecewisePolynomialKernel{degree}(dim::Int)

Piecewise polynomial kernel of degree degree for inputs of dimension dim with support in the unit ball.

Definition

For inputs $x, x' \in \mathbb{R}^d$ of dimension $d$, the piecewise polynomial kernel of degree $v \in \{0,1,2,3\}$ is defined as

\[k(x, x'; v) = \max(1 - \|x - x'\|_2, 0)^{\alpha(v,d)} f_{v,d}(\|x - x'\|_2),\]

where $\alpha(v, d) = \lfloor \frac{d}{2}\rfloor + 2v + 1$ and $f_{v,d}$ are polynomials of degree $v$ given by

\[\begin{aligned} f_{0,d}(r) &= 1, \\ f_{1,d}(r) &= 1 + (j + 1) r, \\ f_{2,d}(r) &= 1 + (j + 2) r + \big((j^2 + 4j + 3) / 3\big) r^2, \\ f_{3,d}(r) &= 1 + (j + 3) r + \big((6 j^2 + 36j + 45) / 15\big) r^2 + \big((j^3 + 9 j^2 + 23j + 15) / 15\big) r^3, \end{aligned}\]

where $j = \lfloor \frac{d}{2}\rfloor + v + 1$.

The kernel is $2v$ times continuously differentiable and the corresponding Gaussian process is hence $v$ times mean-square differentiable.

source
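
The compact support is easy to see numerically (a sketch using the documented constructor keywords):

julia> using KernelFunctions

julia> k = PiecewisePolynomialKernel(degree=2, dim=3);

julia> x = rand(3);

julia> k(x, x) ≈ 1  # unit value at zero distance
true

julia> k(zeros(3), [1.5, 0.0, 0.0]) == 0  # zero outside the unit ball
true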

Polynomial Kernels

KernelFunctions.LinearKernel (Type)
LinearKernel(; c::Real=0.0)

Linear kernel with constant offset c.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the linear kernel with constant offset $c \geq 0$ is defined as

\[k(x, x'; c) = x^\top x' + c.\]

See also: PolynomialKernel

source
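
Sketch of a direct check against the dot product (assumes KernelFunctions and LinearAlgebra are loaded):

julia> using KernelFunctions, LinearAlgebra

julia> x, y = rand(3), rand(3);

julia> LinearKernel(c=0.5)(x, y) ≈ dot(x, y) + 0.5
true
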
KernelFunctions.PolynomialKernel (Type)
PolynomialKernel(; degree::Int=2, c::Real=0.0)

Polynomial kernel of degree degree with constant offset c.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the polynomial kernel of degree $\nu \in \mathbb{N}$ with constant offset $c \geq 0$ is defined as

\[k(x, x'; c, \nu) = (x^\top x' + c)^\nu.\]

See also: LinearKernel

source
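
And similarly for the polynomial kernel (sketch):

julia> using KernelFunctions, LinearAlgebra

julia> x, y = rand(3), rand(3);

julia> PolynomialKernel(degree=3, c=1.0)(x, y) ≈ (dot(x, y) + 1.0)^3
true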

Rational Quadratic Kernels

KernelFunctions.GammaRationalQuadraticKernel (Type)
GammaRationalQuadraticKernel(; α::Real=2.0, γ::Real=2.0)

γ-rational-quadratic kernel with shape parameters α and γ.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the γ-rational-quadratic kernel with shape parameters $\alpha > 0$ and $\gamma \in (0, 2]$ is defined as

\[k(x, x'; \alpha, \gamma) = \bigg(1 + \frac{\|x - x'\|_2^{\gamma}}{\alpha}\bigg)^{-\alpha}.\]

The GammaExponentialKernel is recovered in the limit as $\alpha \to \infty$.

See also: RationalQuadraticKernel

source

Spectral Mixture Kernels

KernelFunctions.spectral_mixture_kernel (Function)
spectral_mixture_kernel(
    h::Kernel=SqExponentialKernel(),
    αs::AbstractVector{<:Real},
    γs::AbstractMatrix{<:Real},
    ωs::AbstractMatrix{<:Real},
)

where αs are the weights with dimension (A,), γs is the covariance matrix with dimension (D, A), and ωs are the mean vectors with dimension (D, A). Here, D is the input dimension and A is the number of spectral components.

h is the kernel, which defaults to SqExponentialKernel if not specified.

Generalised Spectral Mixture kernel function. This family of functions is dense in the family of stationary real-valued kernels with respect to pointwise convergence.[1]

\[ κ(x, y) = αs' * (h(-(γs' * t)^2) .* cos(π * ωs' * t)), t = x - y\]

References:

[1] Generalized Spectral Kernels, by Yves-Laurent Kom Samo and Stephen J. Roberts.
[2] SM: Gaussian Process Kernels for Pattern Discovery and Extrapolation, ICML, 2013, by Andrew Gordon Wilson and Ryan Prescott Adams.
[3] Covariance kernels for fast automatic pattern discovery and extrapolation with Gaussian processes, Andrew Gordon Wilson, PhD Thesis, January 2014. http://www.cs.cmu.edu/~andrewgw/andrewgwthesis.pdf
[4] http://www.cs.cmu.edu/~andrewgw/pattern/
source
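
A construction sketch illustrating the expected argument shapes (the parameter values are arbitrary; D and A are just local variables used for clarity):

julia> using KernelFunctions

julia> D, A = 2, 3;  # input dimension and number of spectral components

julia> αs, γs, ωs = rand(A), rand(D, A), rand(D, A);

julia> k = spectral_mixture_kernel(SqExponentialKernel(), αs, γs, ωs);

julia> k(rand(D), rand(D)) isa Real
true
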
KernelFunctions.spectral_mixture_product_kernel (Function)
spectral_mixture_product_kernel(
    h::Kernel=SqExponentialKernel(),
    αs::AbstractMatrix{<:Real},
    γs::AbstractMatrix{<:Real},
    ωs::AbstractMatrix{<:Real},
)

where αs are the weights with dimension (D, A), γs is the covariance matrix with dimension (D, A), and ωs are the mean vectors with dimension (D, A). Here, D is the input dimension and A is the number of spectral components.

Spectral Mixture Product Kernel. With enough components A, the SMP kernel can model any product kernel to arbitrary precision, and it is flexible even with a small number of components.[1]

h is the kernel, which defaults to SqExponentialKernel if not specified.

\[ κ(x, y) = Πᵢ₌₁ᴰ Σ(αsᵢᵀ .* (h(-(γsᵢᵀ * tᵢ)²) .* cos(ωsᵢᵀ * tᵢ))), tᵢ = xᵢ - yᵢ\]

References:

[1] GPatt: Fast Multidimensional Pattern Extrapolation with GPs, arXiv:1310.5288, 2013, by Andrew Gordon Wilson, Elad Gilboa, Arye Nehorai and John P. Cunningham.
source
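
The product variant takes per-dimension weights; a construction sketch with arbitrary parameter values:

julia> using KernelFunctions

julia> D, A = 2, 3;

julia> αs, γs, ωs = rand(D, A), rand(D, A), rand(D, A);  # note: the weights are (D, A) here

julia> k = spectral_mixture_product_kernel(SqExponentialKernel(), αs, γs, ωs);

julia> k(rand(D), rand(D)) isa Real
true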

Wiener Kernel

KernelFunctions.WienerKernel (Type)
WienerKernel(; i::Int=0)
WienerKernel{i}()

The i-times integrated Wiener process kernel function.

Definition

For inputs $x, x' \in \mathbb{R}^d$, the $i$-times integrated Wiener process kernel with $i \in \{-1, 0, 1, 2, 3\}$ is defined[SDH] as

\[k_i(x, x') = \begin{cases} \delta(x, x') & \text{if } i=-1,\\ \min\big(\|x\|_2, \|x'\|_2\big) & \text{if } i=0,\\ a_{i1}^{-1} \min\big(\|x\|_2, \|x'\|_2\big)^{2i + 1} + a_{i2}^{-1} \|x - x'\|_2 r_i\big(\|x\|_2, \|x'\|_2\big) \min\big(\|x\|_2, \|x'\|_2\big)^{i + 1} & \text{otherwise}, \end{cases}\]

where the coefficients $a$ are given by

\[a = \begin{bmatrix} 3 & 2 \\ 20 & 12 \\ 252 & 720 \end{bmatrix}\]

and the functions $r_i$ are defined as

\[\begin{aligned} r_1(t, t') &= 1,\\ r_2(t, t') &= t + t' - \frac{\min(t, t')}{2},\\ r_3(t, t') &= 5 \max(t, t')^2 + 2 tt' + 3 \min(t, t')^2. \end{aligned}\]

The WhiteKernel is recovered for $i = -1$.

source
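
For the default i = 0 the definition reduces to the minimum of the input norms, which is easy to check (sketch):

julia> using KernelFunctions, LinearAlgebra

julia> x, y = rand(3), rand(3);

julia> WienerKernel()(x, y) ≈ min(norm(x), norm(y))  # i = 0 by default
true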

Composite Kernels

The modular design of KernelFunctions uses base kernels as building blocks for more complex kernels. There are a variety of composite kernels implemented, including those which transform the inputs to a wrapped kernel to implement length scales, scale the variance of a kernel, and sum or multiply collections of kernels together.

KernelFunctions.TransformedKernel (Type)
TransformedKernel(k::Kernel, t::Transform)

Kernel derived from k for which inputs are transformed via a Transform t.

It is preferred to create kernels with input transformations using transform instead of TransformedKernel directly, since transform allows optimized implementations for specific kernels and transformations.

Definition

For inputs $x, x'$, the transformed kernel $\widetilde{k}$ derived from kernel $k$ by input transformation $t$ is defined as

\[\widetilde{k}(x, x'; k, t) = k\big(t(x), t(x')\big).\]

source
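
A sketch showing the definition with a ScaleTransform, which simply rescales the inputs before they reach the wrapped kernel:

julia> using KernelFunctions

julia> k = SqExponentialKernel(); x, y = rand(3), rand(3);

julia> TransformedKernel(k, ScaleTransform(2.0))(x, y) ≈ k(2 .* x, 2 .* y)
true
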
KernelFunctions.ScaledKernel (Type)
ScaledKernel(k::Kernel, σ²::Real=1.0)

Scaled kernel derived from k by multiplication with variance σ².

Definition

For inputs $x, x'$, the scaled kernel $\widetilde{k}$ derived from kernel $k$ by multiplication with variance $\sigma^2 > 0$ is defined as

\[\widetilde{k}(x, x'; k, \sigma^2) = \sigma^2 k(x, x').\]

source
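
A one-line check of the definition (sketch):

julia> using KernelFunctions

julia> k = Matern32Kernel(); x, y = rand(3), rand(3);

julia> ScaledKernel(k, 4.0)(x, y) ≈ 4.0 * k(x, y)
true
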
KernelFunctions.KernelSum (Type)
KernelSum <: Kernel

Create a sum of kernels. One can also use the operator +.

There are various ways in which you can create a KernelSum:

The simplest way to specify a KernelSum would be to use the overloaded + operator. This is equivalent to creating a KernelSum by specifying the kernels as the arguments to the constructor.

julia> k1 = SqExponentialKernel(); k2 = LinearKernel(); X = rand(5);

julia> (k = k1 + k2) == KernelSum(k1, k2)
true

julia> kernelmatrix(k1 + k2, X) == kernelmatrix(k1, X) .+ kernelmatrix(k2, X)
true

julia> kernelmatrix(k, X) == kernelmatrix(k1 + k2, X)
true

You can also specify a KernelSum by providing a Tuple or a Vector of the kernels to be summed. We suggest using a Tuple when you have few components and a Vector when dealing with a large number of components.

julia> KernelSum((k1, k2)) == k1 + k2
true

julia> KernelSum([k1, k2]) == KernelSum((k1, k2)) == k1 + k2
true
source
KernelFunctions.KernelProduct (Type)
KernelProduct <: Kernel

Create a product of kernels. One can also use the overloaded operator *.

There are various ways in which you can create a KernelProduct:

The simplest way to specify a KernelProduct would be to use the overloaded * operator. This is equivalent to creating a KernelProduct by specifying the kernels as the arguments to the constructor.

julia> k1 = SqExponentialKernel(); k2 = LinearKernel(); X = rand(5);

julia> (k = k1 * k2) == KernelProduct(k1, k2)
true

julia> kernelmatrix(k1 * k2, X) == kernelmatrix(k1, X) .* kernelmatrix(k2, X)
true

julia> kernelmatrix(k, X) == kernelmatrix(k1 * k2, X)
true

You can also specify a KernelProduct by providing a Tuple or a Vector of the kernels to be multiplied. We suggest using a Tuple when you have few components and a Vector when dealing with a large number of components.

julia> KernelProduct((k1, k2)) == k1 * k2
true

julia> KernelProduct([k1, k2]) == KernelProduct((k1, k2)) == k1 * k2
true
source
KernelFunctions.KernelTensorProduct (Type)
KernelTensorProduct

Tensor product of kernels.

Definition

For inputs $x = (x_1, \ldots, x_n)$ and $x' = (x'_1, \ldots, x'_n)$, the tensor product of kernels $k_1, \ldots, k_n$ is defined as

\[k(x, x'; k_1, \ldots, k_n) = \Big(\bigotimes_{i=1}^n k_i\Big)(x, x') = \prod_{i=1}^n k_i(x_i, x'_i).\]

Construction

The simplest way to specify a KernelTensorProduct is to use the overloaded tensor operator or its alias ⊗ (can be typed by \otimes<tab>).

julia> k1 = SqExponentialKernel(); k2 = LinearKernel(); X = rand(5, 2);

julia> kernelmatrix(k1 ⊗ k2, RowVecs(X)) == kernelmatrix(k1, X[:, 1]) .* kernelmatrix(k2, X[:, 2])
true

You can also specify a KernelTensorProduct by providing kernels as individual arguments or as an iterable data structure such as a Tuple or a Vector. Using a tuple or individual arguments guarantees that KernelTensorProduct is concretely typed but might lead to large compilation times if the number of kernels is large.

julia> KernelTensorProduct(k1, k2) == k1 ⊗ k2
true

julia> KernelTensorProduct((k1, k2)) == k1 ⊗ k2
true

julia> KernelTensorProduct([k1, k2]) == k1 ⊗ k2
true
source

Multi-output Kernels

KernelFunctions.IndependentMOKernel (Type)
IndependentMOKernel(k::Kernel)

Kernel for multiple independent outputs with kernel k each.

Definition

For inputs $x, x'$ and output dimensions $p_x, p_{x'}$, the kernel $\widetilde{k}$ for independent outputs with kernel $k$ each is defined as

\[\widetilde{k}\big((x, p_x), (x', p_{x'})\big) = \begin{cases} k(x, x') & \text{if } p_x = p_{x'}, \\ 0 & \text{otherwise}. \end{cases}\]

Mathematically, it is equivalent to a matrix-valued kernel defined as

\[\widetilde{K}(x, x') = \mathrm{diag}\big(k(x, x'), \ldots, k(x, x')\big) \in \mathbb{R}^{m \times m},\]

where $m$ is the number of outputs.

source
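
Multi-output inputs are (input, output_index) pairs; a sketch of the two cases in the definition:

julia> using KernelFunctions

julia> k = IndependentMOKernel(SqExponentialKernel());

julia> x, y = rand(3), rand(3);

julia> k((x, 1), (y, 1)) ≈ SqExponentialKernel()(x, y)  # same output index
true

julia> k((x, 1), (y, 2)) == 0  # different output indices
true
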
KernelFunctions.LatentFactorMOKernel (Type)
LatentFactorMOKernel(g, e::MOKernel, A::AbstractMatrix)

Kernel associated with the semiparametric latent factor model.

Definition

For inputs $x, x'$ and output dimensions $p_x, p_{x'}$, the kernel is defined as[STJ]

\[k\big((x, p_x), (x', p_{x'})\big) = \sum^{Q}_{q=1} A_{p_x q} g_q(x, x') A_{p_{x'} q} + e\big((x, p_x), (x', p_{x'})\big),\]

where $g_1, \ldots, g_Q$ are $Q$ kernels, one for each latent process, $e$ is a multi-output kernel for $m$ outputs, and $A$ is a matrix of weights for the kernels of size $m \times Q$.

source
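
A construction sketch with Q = 2 latent kernels and m = 3 outputs (the choice of kernels, noise model, and weight matrix is arbitrary and only for illustration):

julia> using KernelFunctions

julia> g = [SqExponentialKernel(), Matern32Kernel()];  # Q = 2 latent kernels

julia> e = IndependentMOKernel(WhiteKernel());  # multi-output kernel for the noise term

julia> A = rand(3, 2);  # m × Q weight matrix

julia> k = LatentFactorMOKernel(g, e, A);

julia> k((rand(2), 1), (rand(2), 3)) isa Real
true
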
  • RW: C. E. Rasmussen & C. K. I. Williams (2006). Gaussian Processes for Machine Learning.
  • CW: C. K. I. Williams (1998). Computation with infinite neural networks.
  • DM: D. J. C. MacKay (1998). Introduction to Gaussian Processes.
  • SDH: Schober, Duvenaud & Hennig (2014). Probabilistic ODE Solvers with Runge-Kutta Means.
  • STJ: M. Seeger, Y. Teh & M. I. Jordan (2005). Semiparametric Latent Factor Models.