Passive network performance estimation for large-scale, data-intensive computing

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Distributed computing applications increasingly rely on distributed data sources. However, the unpredictable cost of data access in large-scale computing infrastructures can lead to severe performance bottlenecks. Predictable data access is thus essential to accommodate the growing set of large-scale, data-intensive computing applications, and accurate estimation of network performance is crucial to meeting their performance goals. Passive estimation based on past measurements is attractive because its overhead is small compared to explicit probing. In this paper, we take a passive approach to network performance estimation. Our approach differs from existing passive techniques, which rely either on past direct measurements between pairs of nodes or on topological similarities. Instead, we exploit secondhand measurements collected by other nodes, without any topological restrictions. We present Overlay Passive Estimation of Network performance (OPEN), a scalable framework that provides end-to-end network performance estimation based on secondhand measurements, and discuss how OPEN achieves cost-effective estimation in a large-scale infrastructure. Our extensive experimental results show that OPEN's estimates are applicable to the replica and resource selection commonly used in distributed computing.

Original language: English (US)
Article number: 5629337
Pages (from-to): 1365-1373
Number of pages: 9
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 22
Issue number: 8
State: Published - 2011

Bibliographical note

Funding Information:
The authors are grateful to the anonymous reviewers for their constructive comments. This work was supported in part by US National Science Foundation grants CNS-0643505 and IIS-0916425. Appendices with additional details and extended experimental results are available on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPDS.2010.201, alongside the electronic version of the paper.

Keywords

  • Network performance estimation
  • data-intensive computing
  • replica selection
  • resource selection
  • secondhand estimation
