Wide-area streaming analytics: Distributing the data cube

Benjamin Heintz, Abhishek Chandra, Ramesh K. Sitaraman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

To date, much research in data-intensive computing has focused on batch computation. Increasingly, however, it is necessary to derive knowledge from big data streams. As a motivating example, consider a content delivery network (CDN) such as Akamai [4], comprising thousands of servers in hundreds of globally distributed locations. Each of these servers produces a stream of log data, recording, for example, every user it serves, the video streams each user accesses, when those users play and pause streams, and more. Each server also records network- and system-level data such as TCP connection statistics. In aggregate, the servers produce billions of lines of log data from over a thousand locations daily.
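The data cube of the title refers to aggregating such log streams along every combination of dimensions (e.g., server location and content item), so that roll-ups at any granularity can be answered. A minimal sketch of that idea follows; the dimension names, sample records, and function names here are illustrative assumptions, not taken from the paper:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical CDN log records: (location, content, bytes_served).
LOGS = [
    ("Boston", "video_a", 120),
    ("Boston", "video_b", 80),
    ("Paris",  "video_a", 200),
]

DIMENSIONS = ("location", "content")

def data_cube(records):
    """Sum bytes_served over every subset of the dimensions --
    the group-by lattice that forms the data cube.
    '*' marks a rolled-up (aggregated-away) dimension."""
    cube = defaultdict(int)
    for loc, content, nbytes in records:
        values = {"location": loc, "content": content}
        # Each record contributes to one cell in every cuboid.
        for r in range(len(DIMENSIONS) + 1):
            for kept in combinations(DIMENSIONS, r):
                key = tuple(values[d] if d in kept else "*" for d in DIMENSIONS)
                cube[key] += nbytes
    return dict(cube)

cube = data_cube(LOGS)
print(cube[("*", "*")])        # grand total: 400
print(cube[("Boston", "*")])   # roll-up by location: 200
print(cube[("*", "video_a")])  # roll-up by content: 320
```

In the wide-area setting the paper targets, the interesting question is where in the network each of these partial aggregates is computed; this sketch only shows the cube's logical structure on a single node.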

Original language: English (US)
Title of host publication: Proceedings of the 4th Annual Symposium on Cloud Computing, SoCC 2013
Publisher: Association for Computing Machinery
ISBN (Print): 9781450324281
DOIs
State: Published - 2013
Event: 4th Annual Symposium on Cloud Computing, SoCC 2013 - Santa Clara, CA, United States
Duration: Oct 1 2013 - Oct 3 2013

Publication series

Name: Proceedings of the 4th Annual Symposium on Cloud Computing, SoCC 2013

Other

Other: 4th Annual Symposium on Cloud Computing, SoCC 2013
Country/Territory: United States
City: Santa Clara, CA
Period: 10/1/13 - 10/3/13
