High-Performance Big Data Management Across Cloud Data Centers

December 10, 2014

PhD defense of Radu-Marius Tudoran (ENS Rennes - IRISA / KerData).
Computer Science

The easily accessible computation power offered by cloud infrastructures, coupled with the Big Data revolution, is expanding the scale and speed at which data analysis is performed. Cloud resources for computation and storage are spread among globally distributed data centers. Enabling fast data transfers in such scenarios becomes particularly important for scientific applications for which moving the processing close to the data is expensive or not feasible. The key goals of this thesis are to analyze how clouds can become Big Data-friendly and to identify the best options for providing data-oriented cloud services that address application needs. In this talk, we present our contributions to high-performance data management for applications running across multiple cloud data centers. We start by focusing on the scalability aspects of single-site processing and show how the MapReduce model can be extended across multiple sites. Next, we present a transfer service architecture that enables configurable cost-performance optimizations for inter-site transfers. This transfer scheme is then leveraged in the context of real-time streaming across cloud data centers. Finally, we investigate the viability of offering this data movement solution as a cloud-provided service, following a Transfer-as-a-Service paradigm based on a flexible pricing scheme.

Date of update February 17, 2015