Manifold learning by spectral embedding is a technique for non-linear dimensionality reduction and clustering. Extracting the spectral properties of high-dimensional data makes it possible to embed the intrinsic manifold, on which the data are presumed to lie, into a lower-dimensional space. Roseland, a recently proposed spectral embedding algorithm by Chao Shen and Hau-Tieng Wu, aims to provide a scalable and robust approach to dimensionality reduction. The algorithm takes two sets as input, the data set and the landmark set, and returns an embedding of the data set in a desired lower dimension. To achieve this, a landmark-set affinity matrix is computed that represents the affinities between the points in the data set and the points in the landmark set. This matrix is then normalized, and the singular value decomposition of the normalized matrix is computed. From the singular vectors and singular values, the Roseland embedding for a given diffusion time is then obtained. At its core, the algorithm is similar to the Diffusion Maps algorithm; the main differences lie in the affinity matrix and in the method of spectral decomposition. In Diffusion Maps, the affinities are calculated directly between the data points themselves, without first "detouring" through a landmark set, and an eigendecomposition is performed instead of a singular value decomposition. If the number of landmarks is significantly smaller than the number of data points, Roseland fits the data set faster than Diffusion Maps. In this thesis, we describe an efficient implementation of the Roseland algorithm in the datafold package. We consider different approaches to constructing the landmark set when it is not provided, and we compare the results. Finally, we evaluate the efficiency of the novel algorithm by comparing its performance with that of Diffusion Maps.
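The steps outlined above can be illustrated with a minimal NumPy sketch. It is not the datafold implementation described in this thesis. The function name `roseland_embedding`, the Gaussian kernel with bandwidth `epsilon`, the degree normalization by the row sums of W Wᵀ, and the scaling by squared singular values raised to the diffusion time `t` are all assumptions made for illustration.

```python
import numpy as np


def roseland_embedding(X, landmarks, n_components=2, epsilon=1.0, t=1):
    """Illustrative landmark-based spectral embedding (hypothetical helper).

    X: (n, d) data set, landmarks: (m, d) landmark set.
    Requires n_components + 1 <= min(n, m).
    """
    # Landmark-set affinity matrix W (n x m); Gaussian kernel is an assumption.
    sq_dists = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / epsilon)

    # Degrees: row sums of W @ W.T, computed without forming the n x n matrix.
    d = W @ W.sum(axis=0)

    # Normalize the rows of W by the inverse square root of the degrees.
    W_norm = W / np.sqrt(d)[:, None]

    # Thin SVD of the normalized n x m matrix; s is sorted in descending order.
    U, s, _ = np.linalg.svd(W_norm, full_matrices=False)

    # Undo the symmetrizing normalization on the left singular vectors.
    U_bar = U / np.sqrt(d)[:, None]

    # Skip the first (trivial, constant) component and scale the remaining
    # ones by the squared singular values raised to the diffusion time
    # (assumed convention).
    idx = slice(1, n_components + 1)
    embedding = U_bar[:, idx] * s[idx] ** (2 * t)
    return embedding, s
```

Because the decomposition acts on an n × m matrix rather than an n × n kernel matrix, the cost of this sketch is driven by the number of landmarks m, which is consistent with the speed-up over Diffusion Maps when m is much smaller than n.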