Volume 63, pp. 496-539, 2025.
Fast evaluation of additive kernels: feature arrangement, Fourier methods, and kernel derivatives
Theresa Wagner, Franziska Nestler, and Martin Stoll
Abstract
One of the main computational bottlenecks in kernel-based learning is coping with the large, and typically dense, kernel matrix. Techniques for fast approximation of the matrix-vector product with such kernel matrices typically deteriorate in performance when the feature vectors lie in higher-dimensional feature spaces. Here we present a technique based on the non-equispaced fast Fourier transform (NFFT), together with a rigorous error analysis, that allows fast and accurate approximation of the matrix-vector product with the kernel matrices arising from additive kernels built on low-dimensional groups of features. We show that this approach is also well suited for approximating the matrix that arises when the kernel is differentiated with respect to its hyperparameters, a task often encountered in the training phase of methods such as Gaussian processes, and we provide an error analysis for this case as well. In numerical experiments, we illustrate the performance of the additive kernel scheme with Fourier-based fast matrix-vector products for kernel ridge regression on a number of data sets.
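To fix ideas, the following is a minimal NumPy sketch of the structure the abstract describes: an additive kernel assembled from subkernels acting on small groups of features, its matrix-vector product, and the analogous product with the matrix obtained by differentiating the kernel with respect to a hyperparameter. Every concrete choice here (Gaussian subkernels, the group slices, the length-scale ell) is an illustrative assumption, not the authors' implementation; in the paper, the dense per-group products formed naively below are the quantities that the NFFT replaces with fast approximations.

```python
import numpy as np

def gaussian_kernel(Xg, ell):
    """Dense Gaussian kernel matrix exp(-||x-y||^2 / ell^2) on one feature group."""
    sq = np.sum(Xg**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xg @ Xg.T, 0.0)
    return np.exp(-d2 / ell**2)

def additive_kernel_matvec(X, groups, ell, v):
    """Compute K v with K = sum_j K_j, where K_j uses only the features in group j.

    Each K_j involves only a few features, which is the low-dimensional
    regime in which NFFT-based fast matvecs remain accurate.
    """
    out = np.zeros_like(v)
    for g in groups:
        out += gaussian_kernel(X[:, g], ell) @ v  # paper: NFFT-accelerated product
    return out

def additive_kernel_derivative_matvec(X, groups, ell, v):
    """Matvec with dK/d(ell): differentiating exp(-d^2/ell^2) with respect to ell
    introduces the extra factor 2 d^2 / ell^3, again a sum of low-dimensional pieces."""
    out = np.zeros_like(v)
    for g in groups:
        Xg = X[:, g]
        sq = np.sum(Xg**2, axis=1)
        d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xg @ Xg.T, 0.0)
        out += (np.exp(-d2 / ell**2) * (2.0 * d2 / ell**3)) @ v
    return out

# Usage: 500 points with 9 features split into three 3-dimensional groups.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 9))
groups = [slice(0, 3), slice(3, 6), slice(6, 9)]
v = rng.standard_normal(500)
Kv = additive_kernel_matvec(X, groups, ell=1.0, v=v)
dKv = additive_kernel_derivative_matvec(X, groups, ell=1.0, v=v)
```

The role of the feature grouping is visible in the sketch: each subkernel sees only a slice of the feature vector, so the high-dimensional problem is decomposed into a sum of low-dimensional matvecs, each of which is amenable to Fourier-based acceleration.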
DOI: 10.1553/etna_vol63s496
Key words
additive kernels, feature grouping, Fourier analysis, kernel derivatives, multiple kernel learning
AMS subject classifications
65T50, 65F99, 47B32