Manifold learning (dimensionality reduction)

In machine learning, data are usually represented in a high-dimensional feature space. In practice, however, the data occupy only a limited region of that space, and working directly in the full space runs into the well-known curse of dimensionality. Manifold learning techniques, also known as dimensionality reduction, aim to find a mapping of the data from the high-dimensional feature space to a new space of lower dimension. Manifold learning methods estimate the geometry of the dataset locally, around each data point. This geometry is captured by the distances between each point and its nearest neighbors, and the mapping to the lower-dimensional space is chosen to preserve it as far as possible. This approach is used by non-parametric algorithms such as Isomap and locally linear embedding (LLE). Parametric algorithms have also been developed: for example, an autoencoder neural network is trained to find a compact representation (code) for each data point, such that the original representation of the point can be recovered from its code.

In the lower-dimensional space, it is easier to visualize the data and to identify the main sources of variability. The mapping to the lower-dimensional space can also be interpreted as a feature extraction step. These algorithms were first developed for unsupervised learning, but most have since been adapted to supervised and semi-supervised learning, in order to take advantage of labeled data for classification.
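
To make the idea of local geometry concrete, the following is a minimal sketch, in plain Python, of the first two stages of an Isomap-style method: building a nearest-neighbor graph and measuring distances along it. The function names (`knn_graph`, `geodesic_distances`), the toy semicircle data, and the choice `k=2` are illustrative assumptions, not taken from any particular implementation. The final embedding step (classical multidimensional scaling on the graph distances) is omitted.

```python
import math


def knn_graph(points, k):
    """Build a symmetric weighted graph linking each point to its k nearest
    neighbors; edge weights are Euclidean distances."""
    n = len(points)
    graph = [dict() for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph):
    """All-pairs shortest paths over the neighbor graph (Floyd-Warshall)."""
    n = len(graph)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j, d in graph[i].items():
            dist[i][j] = d
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Toy data (assumed for illustration): 50 points on a unit semicircle,
# i.e. a one-dimensional manifold embedded in a two-dimensional space.
n_points = 50
points = [
    (math.cos(math.pi * t / (n_points - 1)), math.sin(math.pi * t / (n_points - 1)))
    for t in range(n_points)
]

graph = knn_graph(points, k=2)
geo = geodesic_distances(graph)

# Straight-line distance between the two ends of the semicircle (the chord).
euclidean_end_to_end = math.dist(points[0], points[-1])
# Distance measured along the neighbor graph (approximates the arc length).
geodesic_end_to_end = geo[0][-1]

print(euclidean_end_to_end, geodesic_end_to_end)
```

In this toy case the straight-line distance between the two ends of the semicircle is 2, while the distance measured along the neighbor graph comes out close to π, the length of the arc. The graph distance therefore reflects the one-dimensional structure of the data, which is what a lower-dimensional mapping would try to preserve.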
