Random mapping

When data vectors are high-dimensional, it is often computationally infeasible to use data analysis or pattern recognition algorithms that repeatedly compute similarities or distances in the original data space. It is therefore necessary to reduce the dimensionality before, for example, clustering the data. Random mapping (RM) is a fast dimensionality reduction method categorized as a feature extraction method. RM consists of generating a random matrix and multiplying each original vector by it, which yields a vector of reduced dimensionality. In the context of text mining, it has been demonstrated that the document classification accuracy obtained after reducing the dimensionality with random mapping is almost as good as the original accuracy, provided the final dimensionality is sufficiently large (about 100 dimensions, down from about 6000). In fact, it can be shown that the inner product (similarity) between the mapped vectors closely follows the inner product of the original vectors.
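The mapping described above can be illustrated with a minimal Python sketch. It is not taken from the cited paper: the Gaussian entries, the normalization of each column of the random matrix to unit length, the toy data, and the names `make_random_matrix`, `random_map` and `cosine` are all assumptions made for illustration. The sketch maps two toy "document" vectors from 1000 to 400 dimensions and compares their cosine similarity before and after the mapping.

```python
import math
import random


def make_random_matrix(orig_dim, reduced_dim, seed=0):
    """Return the random matrix R as a list of its columns.

    There is one column per original dimension; each column has
    reduced_dim Gaussian entries and is scaled to unit length
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(orig_dim):
        col = [rng.gauss(0.0, 1.0) for _ in range(reduced_dim)]
        norm = math.sqrt(sum(v * v for v in col))
        columns.append([v / norm for v in col])
    return columns


def random_map(vector, columns):
    """Multiply R by `vector`, i.e. sum the columns of R weighted by the entries."""
    reduced = [0.0] * len(columns[0])
    for weight, col in zip(vector, columns):
        if weight:  # zero entries contribute nothing, so skip them
            for i, v in enumerate(col):
                reduced[i] += weight * v
    return reduced


def cosine(a, b):
    """Cosine similarity: inner product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy data (hypothetical): two related term-count vectors.
rng = random.Random(1)
orig_dim, reduced_dim = 1000, 400
doc_a = [rng.randint(0, 3) for _ in range(orig_dim)]
doc_b = [a + rng.randint(0, 2) for a in doc_a]

R = make_random_matrix(orig_dim, reduced_dim)
sim_original = cosine(doc_a, doc_b)
sim_mapped = cosine(random_map(doc_a, R), random_map(doc_b, R))
print(f"original: {sim_original:.3f}  mapped: {sim_mapped:.3f}")
```

Under these assumptions the two printed similarities should be close, and the agreement generally improves as `reduced_dim` grows. The same matrix `R` must be reused for every vector; mapping different vectors with different random matrices would not preserve their similarities.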

References

Kaski, S. (1998). Dimensionality reduction by random mapping: fast similarity computation for clustering. Proceedings of the 1998 IEEE International Joint Conference on Neural Networks, pp. 413–418. doi:10.1109/IJCNN.1998.682302

This article is issued from Wikipedia - version of the 11/23/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.
