Clustering high-dimensional data via random sampling and consensus

Panagiotis A. Traganitis; Konstantinos Slavakis; Georgios B. Giannakis

doi:10.1109/GlobalSIP.2014.7032128

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

In response to the urgent need for learning tools tuned to big data analytics, the present paper introduces a feature selection approach to efficient clustering of high-dimensional vectors. The resultant method leverages random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to yield novel dimensionality reduction schemes. The advocated random sampling and consensus K-means (RSC-Kmeans) algorithm can operate in either batch or sequential mode, with the latter affording a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
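The abstract does not spell out the steps of RSC-Kmeans, so the following is only an illustrative sketch of the general idea it describes, not the authors' algorithm. It assumes a RANSAC-style loop: draw a random subset of `d` features, run plain Lloyd's K-means on that subset, score the resulting labels by the within-cluster sum of squares over all features (a stand-in for the consensus step), and keep the best draw. The function names, the scoring rule, and the parameters `d` and `trials` are all hypothetical choices made for this example.

```python
import random


def sq_dist(a, b, dims):
    """Squared Euclidean distance restricted to the feature indices in dims."""
    return sum((a[j] - b[j]) ** 2 for j in dims)


def kmeans_labels(X, k, dims, rng, iters=20):
    """Lloyd's K-means that measures distances only on the features in dims."""
    centers = rng.sample(X, k)  # initialize with k distinct data points
    labels = [0] * len(X)
    for _ in range(iters):
        # Assignment step: nearest center in the reduced feature space.
        labels = [
            min(range(k), key=lambda c: sq_dist(x, centers[c], dims)) for x in X
        ]
        # Update step: recompute each center as the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def full_cost(X, labels, k):
    """Within-cluster sum of squares over all features (assumed consensus score)."""
    all_dims = range(len(X[0]))
    total = 0.0
    for c in range(k):
        members = [x for x, lab in zip(X, labels) if lab == c]
        if not members:
            continue
        center = [sum(col) / len(members) for col in zip(*members)]
        total += sum(sq_dist(x, center, all_dims) for x in members)
    return total


def rsc_kmeans_sketch(X, k, d, trials=10, seed=0):
    """Hypothetical batch loop: sample d features, cluster, keep the best draw."""
    rng = random.Random(seed)
    n_features = len(X[0])
    best_labels, best_cost = None, float("inf")
    for _ in range(trials):
        dims = rng.sample(range(n_features), d)  # random feature subset
        labels = kmeans_labels(X, k, dims, rng)
        cost = full_cost(X, labels, k)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels
```

Calling `rsc_kmeans_sketch(X, k=2, d=5)` on a list of equal-length feature vectors returns one cluster label per vector. Each K-means run touches only `d` of the features per distance computation, which is where a saving over full-dimensional clustering would come from in this sketch. A sequential variant could stop as soon as a draw scores below a chosen threshold, but that stopping rule is also an assumption, not something the abstract specifies.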

Original language	English (US)
Title of host publication	2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	307-311
Number of pages	5
ISBN (Electronic)	9781479970889
DOIs	https://doi.org/10.1109/GlobalSIP.2014.7032128
State	Published - Feb 5 2014
Event	2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014 - Atlanta, United States
Duration	Dec 3 2014 → Dec 5 2014

Publication series

Name	2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014

Other

Other	2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014
Country/Territory	United States
City	Atlanta
Period	12/3/14 → 12/5/14

Bibliographical note

Publisher Copyright: © 2014 IEEE.

Keywords

Clustering
Feature selection
High-dimensional data
K-means
Random sampling and consensus

Cite this

Traganitis, P. A., Slavakis, K., & Giannakis, G. B. (2014). Clustering high-dimensional data via random sampling and consensus. In 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014 (pp. 307-311). Article 7032128 (2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/GlobalSIP.2014.7032128

Clustering high-dimensional data via random sampling and consensus. / Traganitis, Panagiotis A.; Slavakis, Konstantinos; Giannakis, Georgios B.
2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014. Institute of Electrical and Electronics Engineers Inc., 2014. p. 307-311 7032128 (2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014).

Traganitis, PA, Slavakis, K & Giannakis, GB 2014, Clustering high-dimensional data via random sampling and consensus. in 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014, 7032128, 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014, Institute of Electrical and Electronics Engineers Inc., pp. 307-311, 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014, Atlanta, United States, 12/3/14. https://doi.org/10.1109/GlobalSIP.2014.7032128

Traganitis PA, Slavakis K, Giannakis GB. Clustering high-dimensional data via random sampling and consensus. In 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014. Institute of Electrical and Electronics Engineers Inc. 2014. p. 307-311. 7032128. (2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014). doi: 10.1109/GlobalSIP.2014.7032128

Traganitis, Panagiotis A. ; Slavakis, Konstantinos ; Giannakis, Georgios B. / Clustering high-dimensional data via random sampling and consensus. 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 307-311 (2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014).

@inproceedings{549933b2e87848aaa8aae15a850ee6b5,
  title = "Clustering high-dimensional data via random sampling and consensus",
  abstract = "In response to the urgent need for learning tools tuned to big data analytics, the present paper introduces a feature selection approach to efficient clustering of high-dimensional vectors. The resultant method leverages random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to yield novel dimensionality reduction schemes. The advocated random sampling and consensus K-means (RSC-Kmeans) algorithm can operate in either batch or sequential mode, with the latter affording a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.",
  keywords = "Clustering, Feature selection, High-dimensional data, K-means, Random sampling and consensus",
  author = "Traganitis, {Panagiotis A.} and Slavakis, Konstantinos and Giannakis, {Georgios B.}",
  note = "Publisher Copyright: {\textcopyright} 2014 IEEE.; 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014 ; Conference date: 03-12-2014 Through 05-12-2014",
  year = "2014",
  month = feb,
  day = "5",
  doi = "10.1109/GlobalSIP.2014.7032128",
  language = "English (US)",
  series = "2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
  pages = "307--311",
  booktitle = "2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014",
}

TY - GEN
T1 - Clustering high-dimensional data via random sampling and consensus
AU - Traganitis, Panagiotis A.
AU - Slavakis, Konstantinos
AU - Giannakis, Georgios B.
PY - 2014/2/5
Y1 - 2014/2/5
AB - In response to the urgent need for learning tools tuned to big data analytics, the present paper introduces a feature selection approach to efficient clustering of high-dimensional vectors. The resultant method leverages random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to yield novel dimensionality reduction schemes. The advocated random sampling and consensus K-means (RSC-Kmeans) algorithm can operate in either batch or sequential mode, with the latter affording a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
KW - Clustering
KW - Feature selection
KW - High-dimensional data
KW - K-means
KW - Random sampling and consensus
UR - http://www.scopus.com/inward/record.url?scp=84949929328&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84949929328&partnerID=8YFLogxK
U2 - 10.1109/GlobalSIP.2014.7032128
DO - 10.1109/GlobalSIP.2014.7032128
M3 - Conference contribution
AN - SCOPUS:84949929328
T3 - 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014
SP - 307
EP - 311
BT - 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014
Y2 - 3 December 2014 through 5 December 2014
ER -
