Abstract
Large-scale real-time analytics services continuously collect and analyze data from end-user applications and devices distributed around the globe. Such analytics requires data to be transferred over the wide-area network (WAN) to data centers (DCs) capable of processing the data. Since WAN bandwidth is expensive and scarce, it is beneficial to reduce WAN traffic by partially aggregating the data closer to end-users. We propose aggregation networks for performing aggregation on a geo-distributed edge-cloud infrastructure consisting of edge servers, transit DCs, and destination DCs. We identify a rich set of research questions aimed at reducing the traffic costs in an aggregation network. We present an optimization formulation for answering these questions in a principled manner, and use insights from the optimization solutions to propose an efficient, near-optimal practical heuristic. We implement the heuristic in AggNet, built on top of Apache Flink. We evaluate our approach using a geo-distributed deployment on Amazon EC2 as well as a WAN-emulated local testbed. Our evaluation using real-world traces from Twitter and Akamai shows that our approach achieves a 47% to 83% reduction in traffic cost over existing baselines without any compromise in timeliness.
Original language | English (US) |
---|---|
Title of host publication | 6th ACM/IEEE Symposium on Edge Computing, SEC 2021 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 297-311 |
Number of pages | 15 |
ISBN (Electronic) | 9781450383905 |
State | Published - 2021 |
Event | 6th ACM/IEEE Symposium on Edge Computing, SEC 2021 - San Jose, United States. Duration: Dec 14 2021 → Dec 17 2021 |
Publication series
Name | 6th ACM/IEEE Symposium on Edge Computing, SEC 2021 |
---|---|
Conference
Conference | 6th ACM/IEEE Symposium on Edge Computing, SEC 2021 |
---|---|
Country/Territory | United States |
City | San Jose |
Period | 12/14/21 → 12/17/21 |
Bibliographical note
Publisher Copyright: © 2021 ACM.
Keywords
- Cloud
- Edge
- Geo-distributed systems
- Stream processing