Dynamically negotiating capacity between on-demand and batch clusters

Feng Liu, Kate Keahey, Pierre Riteau, Jon B Weissman

Research output: Chapter in Book/Report/Conference proceedingConference contribution

15 Scopus citations

Abstract

In the era of rapid experimental expansion data analysis needs are rapidly outpacing the capabilities of small institutional clusters and looking to integrate HPC resources into their workflow. We propose one way of reconciling on-demand needs of experimental analytics with the batch managed HPC resources within a system that dynamically moves nodes between an on-demand cluster configured with cloud technology (OpenStack) and a traditional HPC cluster managed by a batch scheduler (Torque). We evaluate this system experimentally both in the context of real-life traces representing two years of a specific institutional need, and via experiments in the context of synthetic traces that capture generalized characteristics of potential batch and on-demand workloads. Our results for the real-life scenario show that our approach could reduce the current investment in on-demand infrastructure by 82% while at the same time improving the mean batch wait time almost by an order of magnitude (8x).

Original languageEnglish (US)
Title of host publicationProceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages493-503
Number of pages11
ISBN (Electronic)9781538683842
DOIs
StatePublished - Jul 2 2018
Event2018 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018 - Dallas, United States
Duration: Nov 11 2018Nov 16 2018

Publication series

NameProceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018

Conference

Conference2018 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
Country/TerritoryUnited States
CityDallas
Period11/11/1811/16/18

Bibliographical note

Publisher Copyright:
© 2018 IEEE.

Keywords

  • Computers and information processing
  • Distributed computing
  • Grid computing
  • Metacomputing

Fingerprint

Dive into the research topics of 'Dynamically negotiating capacity between on-demand and batch clusters'. Together they form a unique fingerprint.

Cite this