Evaluating Dimension Reduction Techniques for Linear and Nonlinear Data Structures
Tuesday, Aug 5: 8:50 AM - 9:05 AM
1496
Contributed Papers
Music City Center
Dimension reduction techniques play a significant role in analyzing high-dimensional data, especially in fields like radiomics, where extracting meaningful patterns from complex datasets is essential. This study evaluates how well Principal Component Analysis (PCA), Isomap, and t-Distributed Stochastic Neighbor Embedding (t-SNE) preserve data structure, using the average silhouette score as the performance measure. Through extensive simulations, we compare these methods across datasets with varying sample sizes (n = 100, 200, 300, 400, 500), noise levels (σ² = 0.25, 0.5, 0.75, 1, 1.5, 2), and feature counts (p = 20, 50, 100, 200, 300, 400). Our findings indicate that for datasets with an underlying linear structure, PCA best maintains cluster integrity, achieving the highest average silhouette score. Conversely, for nonlinear data structures, Isomap and t-SNE outperform PCA in preserving meaningful relationships.
One important application of these findings is in radiomics, where high-dimensional imaging data are used to extract quantitative biomarkers for cancer diagnosis and prognosis.
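For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how an average silhouette score can be computed for a labeled set of low-dimensional points. It is not the authors' code: the function name `average_silhouette`, the choice of Euclidean distance, the convention that singleton clusters contribute 0, and the toy coordinates are all assumptions made for illustration.

```python
import math


def average_silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Illustrative helper, not taken from the study. For each point i:
      a = mean distance to the other points in its own cluster
      b = smallest mean distance to the points of any other cluster
      s = (b - a) / max(a, b)
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # assumed convention: a singleton cluster contributes 0
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        if denom > 0:  # all-identical points would give 0/0; treat as 0
            total += (b - a) / denom
    return total / len(points)


# Toy 2-D embedding (hypothetical coordinates): two tight, well-separated groups.
embedding = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
well_separated = average_silhouette(embedding, [0, 0, 1, 1])
mixed_up = average_silhouette(embedding, [0, 1, 0, 1])
```

In this toy case, labels that match the two groups give a high positive score, while labels that cut across the groups give a negative one. In a comparison like the one described above, the same score would be computed on the embedding produced by each method, with higher values indicating better-preserved cluster structure.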
Dimension Reduction Techniques
Linear and Nonlinear Data Structures
Radiomics
Principal Component Analysis (PCA)
Isomap
t-Distributed Stochastic Neighbor Embedding (t-SNE)
Main Sponsor
Section on Statistical Computing