39: A stochastic approach to k-nearest neighbors search using a fixed radius method
Tuesday, Aug 5: 10:30 AM - 12:20 PM
0995
Contributed Posters
Music City Center
This study aims to optimize the k-nearest neighbors search (kNN search) by reducing the computational burden of the Brute-force method while providing the same solution. Our method leverages data structures and probabilistic assumptions to enhance the scalability of the search. By focusing on the Training set, we define a sample space that limits the k-nearest neighbors search to a smaller space. For each observation in the Query set a fixed radius search is employed, with the radius stochastically linked to the desired number of neighbors. This approach allows us to find the k-nearest neighbors using only a fraction of the entire Training set. Through simulations and a theoretical complexity analysis, we demonstrate that our method outperforms the Brute-force approach, particularly when the Training and Query set sample sizes are large. In addition, a benchmark on an Alzheimer's disease data set further demonstrated this, showing a 62.5-fold improvement in CPU time. Overall, our stochastic approach significantly reduces the computational load of kNN search while maintaining accuracy, making it a viable alternative to traditional methods for large datasets.
kNN search
Computational efficiency
fixed-radius search
Branch and bound
Machine learning
Main Sponsor
Section on Statistical Learning and Data Science
You have unsaved changes.