ViDeDup: An application-aware framework for video de-duplication

Atul Katiyar, Jon Weissman

Research output: Contribution to conference › Paper › peer-review

32 Scopus citations

Abstract

Key to the compression capability of a data deduplication system is its definition of redundancy. Traditionally, two data items are considered redundant if their underlying bit-streams are identical. However, this notion of redundancy is too strict for many applications. For example, for a video storage platform, two videos encoded in different formats would be unique at the system level but redundant at the content level. Intuitively, introducing application-level intelligence in redundancy detection can yield improved data compression. We propose ViDeDup (Video De-Duplication), a novel framework for video de-duplication based on an application-level view of redundancy. The framework goes beyond duplicate-data detection to similarity detection, thereby providing application-level knobs for defining an acceptable level of noise during replica detection. Our results show that, by trading CPU for storage, a 45% reduction in storage space could be achieved, compared with the 8% yielded by system-level de-duplication, for a dataset collected from video-sharing sites on the Web. We also present a tradeoff analysis of the system's tunable parameters, to tune the system optimally for performance, compression, and quality.
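The distinction between system-level and content-level redundancy can be illustrated with a minimal sketch. This is not the ViDeDup algorithm, which the abstract does not describe. The toy video representation (lists of pixel intensities per frame), the mean-brightness signature, and the `tolerance` and `threshold` parameters are all assumptions introduced here for illustration.

```python
import hashlib

def byte_fingerprint(data: bytes) -> str:
    """System-level view: two items are redundant only if their bytes match."""
    return hashlib.sha256(data).hexdigest()

def content_signature(frames):
    """Hypothetical content-level signature: mean brightness of each frame."""
    return [sum(frame) / len(frame) for frame in frames]

def similarity(sig_a, sig_b, tolerance=2.0):
    """Fraction of aligned frames whose signatures differ by at most `tolerance`."""
    longest = max(len(sig_a), len(sig_b))
    if longest == 0:
        return 0.0
    matches = sum(1 for a, b in zip(sig_a, sig_b) if abs(a - b) <= tolerance)
    return matches / longest

def is_redundant(frames_a, frames_b, threshold=0.9):
    """Application-level knob: `threshold` sets how much noise is acceptable."""
    score = similarity(content_signature(frames_a), content_signature(frames_b))
    return score >= threshold

def to_bytes(frames):
    """Flatten the toy frames into a byte string, standing in for a file."""
    return bytes(pixel for frame in frames for pixel in frame)

# Toy data (assumed): the same content re-encoded with slight noise, and an unrelated clip.
original = [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]]
reencoded = [[11, 19, 31, 40], [50, 61, 69, 80], [89, 100, 111, 121]]
different = [[200, 210, 220, 230], [5, 5, 5, 5], [128, 128, 128, 128]]

same_bytes = byte_fingerprint(to_bytes(original)) == byte_fingerprint(to_bytes(reencoded))
same_content = is_redundant(original, reencoded)
unrelated_match = is_redundant(original, different)
```

Here the byte-level fingerprints of `original` and `reencoded` differ, so a system-level deduplicator would store both, while the content-level check treats them as replicas. Raising `threshold` or lowering `tolerance` makes the check stricter. Computing and comparing signatures is the extra CPU work that buys the storage saving.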

Original language: English (US)
State: Published - 2011
Event: 3rd USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2011 - Portland, United States
Duration: Jun 14 2011 → …

Conference

Conference: 3rd USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2011
Country/Territory: United States
City: Portland
Period: 6/14/11 → …

Bibliographical note

Publisher Copyright:
© USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2011. All rights reserved.
