A first look at inter-data center traffic characteristics via Yahoo! datasets

Yingying Chen, Sourabh Jain, Vijay Kumar Adhikari, Zhi-Li Zhang, Kuai Xu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

183 Scopus citations

Abstract

Effectively managing multiple data centers and their traffic dynamics poses many challenges to their operators, as little is known about the characteristics of inter-data center (D2D) traffic. In this paper we present a first study of D2D traffic characteristics using anonymized NetFlow datasets collected at the border routers of five major Yahoo! data centers. Our contributions are two-fold: i) we develop novel heuristics to infer the Yahoo! IP addresses and determine their locations from the anonymized NetFlow datasets, and ii) we analyze both D2D and client traffic characteristics and the correlations between these two types of traffic. Our study reveals that Yahoo! deploys its data centers hierarchically, with several satellite data centers distributed in other countries and backbone data centers distributed across US locations. For the Yahoo! US data centers, we separate client-triggered D2D traffic and background D2D traffic from the aggregate D2D traffic using port-based correlation, and study their respective characteristics. Our findings shed light on the interplay of multiple data centers and their traffic dynamics within a large content provider, and provide insights to data center designers and operators as well as researchers.

Original language: English (US)
Title of host publication: 2011 Proceedings IEEE INFOCOM
Pages: 1620-1628
Number of pages: 9
State: Published - 2011
Event: IEEE INFOCOM 2011 - Shanghai, China
Duration: Apr 10 2011 → Apr 15 2011

Publication series

Name: Proceedings - IEEE INFOCOM
ISSN (Print): 0743-166X

Other

Other: IEEE INFOCOM 2011
Country/Territory: China
City: Shanghai
Period: 4/10/11 → 4/15/11

Keywords

  • Anonymization
  • Content provider
  • Inter-data center
  • NetFlow
