TY - GEN
T1 - CG-Hadoop
T2 - 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2013
AU - Eldawy, Ahmed
AU - Li, Yuan
AU - Mokbel, Mohamed F
AU - Janardan, Ravi
PY - 2013
Y1 - 2013
N2 - Hadoop, employing the MapReduce programming paradigm, has been widely accepted as the standard framework for analyzing big data in distributed environments. Unfortunately, this rich framework has not been fully exploited for processing large-scale computational geometry operations. This paper introduces CG-Hadoop, a suite of scalable and efficient MapReduce algorithms for various fundamental computational geometry problems, namely polygon union, skyline, convex hull, farthest pair, and closest pair, which are key components of other geometric algorithms. For each computational geometry operation, CG-Hadoop has two versions: one for the Apache Hadoop system and one for the SpatialHadoop system, a Hadoop-based system that is better suited to spatial operations. These proposed algorithms form the nucleus of a comprehensive MapReduce library of computational geometry operations. Extensive experimental results on a cluster of 25 machines with datasets of up to 128 GB show that CG-Hadoop achieves up to 29x and 260x better performance than traditional algorithms when using the Hadoop and SpatialHadoop systems, respectively.
AB - Hadoop, employing the MapReduce programming paradigm, has been widely accepted as the standard framework for analyzing big data in distributed environments. Unfortunately, this rich framework has not been fully exploited for processing large-scale computational geometry operations. This paper introduces CG-Hadoop, a suite of scalable and efficient MapReduce algorithms for various fundamental computational geometry problems, namely polygon union, skyline, convex hull, farthest pair, and closest pair, which are key components of other geometric algorithms. For each computational geometry operation, CG-Hadoop has two versions: one for the Apache Hadoop system and one for the SpatialHadoop system, a Hadoop-based system that is better suited to spatial operations. These proposed algorithms form the nucleus of a comprehensive MapReduce library of computational geometry operations. Extensive experimental results on a cluster of 25 machines with datasets of up to 128 GB show that CG-Hadoop achieves up to 29x and 260x better performance than traditional algorithms when using the Hadoop and SpatialHadoop systems, respectively.
KW - Hadoop
KW - MapReduce
KW - geometric algorithms
UR - http://www.scopus.com/inward/record.url?scp=84893518674&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893518674&partnerID=8YFLogxK
U2 - 10.1145/2525314.2525349
DO - 10.1145/2525314.2525349
M3 - Conference contribution
AN - SCOPUS:84893518674
SN - 9781450325219
T3 - GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems
SP - 284
EP - 293
BT - 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2013
Y2 - 5 November 2013 through 8 November 2013
ER -