Kernel-based approximations of the Laplace-Beltrami operator, such as Diffusion Maps, can be employed to construct meaningful, low-dimensional representations of data sets by embedding them into the first few eigenfunctions of the operator. In this approach, it is common to use the entire data set simultaneously to form the kernel matrix. This is demanding with respect to computer memory and computation time when the number of data points is very large and the ambient space dimension is very high, to the point where employing kernel methods becomes entirely prohibitive.\par We discuss a technique to alleviate this challenge and make kernel-based approaches for manifold learning applicable to large data sets. We employ a Multi-Level Monte Carlo approach to obtain a numerical estimate of the number of randomly chosen points necessary to approximate the operator accurately. We show convergence of the algorithm in several controlled environments, and demonstrate that manifold learning with the estimate is faster than training an auto-encoder network on a real data set.
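The approach summarized above can be sketched as follows: form the Gaussian kernel matrix on a random subsample of the data, normalize it into a Markov matrix, and embed the subsample into the leading nontrivial eigenvectors. This is a minimal illustrative sketch, not the paper's implementation; the subsample size `m`, the bandwidth `epsilon`, and the toy data set are assumptions, with `m` standing in for the sample size the Multi-Level Monte Carlo estimate would provide.

```python
import numpy as np
from scipy.linalg import eigh

def diffusion_map_embedding(points, epsilon, n_components=2):
    """Embed points into the leading nontrivial eigenvectors of the
    diffusion-map kernel (a kernel approximation of Laplace-Beltrami)."""
    # Pairwise squared distances and Gaussian kernel matrix.
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / epsilon)
    # Row sums; the Markov matrix is P = D^{-1} K. We diagonalize the
    # symmetric conjugate A = D^{-1/2} K D^{-1/2}, which shares P's spectrum.
    d = K.sum(axis=1)
    A = K / np.sqrt(np.outer(d, d))
    vals, vecs = eigh(A)
    # The largest eigenvalue (= 1) carries the trivial constant mode;
    # keep the next n_components eigenvectors, mapped back to eigenvectors of P.
    order = np.argsort(vals)[::-1]
    psi = vecs[:, order] / np.sqrt(d)[:, None]
    return psi[:, 1:1 + n_components] * vals[order][1:1 + n_components]

rng = np.random.default_rng(0)
# Toy data (an assumption): a noisy circle in 3-D with n_total points, of which
# only a random subsample of size m enters the kernel matrix.
n_total, m = 5000, 300
theta = rng.uniform(0.0, 2.0 * np.pi, n_total)
data = np.stack(
    [np.cos(theta), np.sin(theta), 0.05 * rng.standard_normal(n_total)], axis=1
)
subsample = data[rng.choice(n_total, size=m, replace=False)]
embedding = diffusion_map_embedding(subsample, epsilon=0.5, n_components=2)
```

The kernel matrix here is m-by-m rather than n-by-n, which is the source of the memory and runtime savings the abstract describes; the paper's contribution is estimating how large `m` must be for the operator approximation to remain accurate.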