Abstract
In response to the urgent need for learning tools tuned to big data analytics, the present paper introduces a feature selection approach to efficient clustering of high-dimensional vectors. The resultant method leverages random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to yield novel dimensionality reduction schemes. The advocated random sampling and consensus K-means (RSC-Kmeans) algorithm can operate in either batch or sequential mode, with the latter incurring a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
Original language | English (US) |
---|---|
Title of host publication | 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 307-311 |
Number of pages | 5 |
ISBN (Electronic) | 9781479970889 |
State | Published - Feb 5 2014 |
Event | 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014 - Atlanta, United States. Duration: Dec 3 2014 → Dec 5 2014 |
Bibliographical note
Publisher Copyright: © 2014 IEEE.
Keywords
- Clustering
- Feature selection
- High-dimensional data
- K-means
- Random sampling and consensus