Papers
Topics
Authors
Recent
Search
2000 character limit reached

Trustworthy Dimensionality Reduction

Published 9 May 2024 in stat.ME | (2405.05868v1)

Abstract: Different unsupervised models for dimensionality reduction like PCA, LLE, Shannon's mapping, tSNE, UMAP, etc. work on different principles, hence, they are difficult to compare on the same ground. Although they are usually good for visualisation purposes, they can produce spurious patterns that are not present in the original data, losing its trustability (or credibility). On the other hand, information about some response variable (or knowledge of class labels) allows us to do supervised dimensionality reduction such as SIR, SAVE, etc. which work to reduce the data dimension without hampering its ability to explain the particular response at hand. Therefore, the reduced dataset cannot be used to further analyze its relationship with some other kind of responses, i.e., it loses its generalizability. To make a better dimensionality reduction algorithm with a better balance between these two, we shall formally describe the mathematical model used by dimensionality reduction algorithms and provide two indices to measure these intuitive concepts such as trustability and generalizability. Then, we propose a Localized Skeletonization and Dimensionality Reduction (LSDR) algorithm which approximately achieves optimality in both these indices to some extent. The proposed algorithm has been compared with state-of-the-art algorithms such as tSNE and UMAP and is found to be better overall in preserving global structure while retaining useful local information as well. We also propose some of the possible extensions of LSDR which could make this algorithm universally applicable for various types of data similar to tSNE and UMAP.

Citations (1)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

Collections

Sign up for free to add this paper to one or more collections.