Dimensions
Computers can only work with numerical data, and the same is true for machine learning models: features must be represented as tensors and mapped into a high-dimensional space for processing.
For example, a 28x28 grayscale image is represented as a single 28x28 matrix of pixel intensities:
[[ 0 128 255 … 87 92 14]
[ 24 58 119 … 160 92 77]
[210 45 200 … 56 183 134]
…
[120 92 250 … 34 101 23]
[255 35 76 … 12 141 67]
[ 23 145 176 … 87 59 204]]
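As a quick sketch of this representation (using NumPy and random placeholder values rather than a real image), here is the matrix form and the flattening step many models expect:

```python
import numpy as np

# A 28x28 grayscale image: one intensity value (0-255) per pixel.
# Random placeholder values stand in for real pixel data.
image = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)
print(image.shape)  # (28, 28)

# Flattening turns the 2D grid into a single 784-dimensional vector,
# the input form a plain feed-forward network expects.
flat = image.reshape(-1)
print(flat.shape)   # (784,)
```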
A colored image typically needs a third dimension to store three values per pixel (i.e. one for each RGB channel):
[[[  0 128 255] [ 24  58 119] [210  45 200] …]
 [[160  92  77] [ 92  77 130] [ 56 183 134] …]
 [[255  35  76] …]
 …]
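A minimal NumPy sketch of the same idea, again with placeholder values:

```python
import numpy as np

# A 28x28 color image: three values per pixel, one per RGB channel.
color_image = np.random.randint(0, 256, size=(28, 28, 3), dtype=np.uint8)
print(color_image.shape)  # (28, 28, 3)

# Each channel is itself a full 28x28 matrix.
red, green, blue = color_image[..., 0], color_image[..., 1], color_image[..., 2]
print(red.shape)          # (28, 28)
```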
Manifold hypothesis
The manifold hypothesis suggests that, within a high-dimensional space, data tends to lie on a lower-dimensional manifold (a curved surface) that is sufficient to capture its patterns and relationships.
For example, in a 28x28 image of a letter, not all 784 dimensions are necessary to recognize the character. Despite variations such as handwriting style or letter case, a much smaller number of dimensions is typically enough to capture the essential features and recognize the underlying pattern.
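One way to make this concrete (a sketch using scikit-learn's swiss roll generator, not the letter example above) is a dataset whose points sit in 3D space but actually lie on a rolled-up 2D sheet:

```python
from sklearn.datasets import make_swiss_roll

# Points embedded in 3 ambient dimensions...
X, t = make_swiss_roll(n_samples=1000, noise=0.05)
print(X.shape)  # (1000, 3)

# ...but t parameterizes position along the roll, one of the two
# intrinsic coordinates of the underlying 2D manifold.
print(t.shape)  # (1000,)
```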
Curse of dimensionality
The curse of dimensionality refers to the challenges that high-dimensional spaces create for data analysis: data becomes sparse, distances between points lose their discriminative power, and models need far more samples to generalize, which increases the risk of overfitting.
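A small, self-contained experiment (using uniform random points, an assumption made purely for illustration) shows one symptom of this: as the number of dimensions grows, the nearest and farthest points become almost equally far away.

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    # 500 random points in a d-dimensional unit cube.
    points = rng.uniform(size=(500, d))
    # Distances from the first point to all the others.
    dists = np.linalg.norm(points - points[0], axis=1)[1:]
    print(f"d={d:5d}  min/max distance ratio: {dists.min() / dists.max():.3f}")
# The ratio approaches 1 in high dimensions: every point looks equally distant,
# so distance-based methods struggle to tell neighbors apart.
```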
Dimensionality reduction
Dimensionality reduction is the process of reducing the number of dimensions (i.e. features, variables) in a dataset while trying to preserve as much important information as possible.
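For instance, here is a minimal sketch using PCA from scikit-learn on its built-in 8x8 digits dataset (64 dimensions per image):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)  # X.shape == (1797, 64)

# Project the 64-dimensional images down to 10 dimensions.
pca = PCA(n_components=10).fit(X)
X_reduced = pca.transform(X)
print(X_reduced.shape)                      # (1797, 10)

# Despite the 6x reduction, these 10 components still retain
# well over half of the dataset's total variance.
print(pca.explained_variance_ratio_.sum())
```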