Enabling cloud bursting for life sciences within Galaxy.

Afgan E, Coraor N, Chilton J, Baker D, Taylor J.
Concurrency and Computation: Practice and Experience (CCPE). June 2015; 27(16):4330-4343

Fueled by a radical increase in the capacity to generate data over the past decade, biomedical research has become constrained by the ability to analyze that data. Galaxy, a Web-based, open-source data integration and analysis platform for life science research, has been democratizing access to data analysis tools. However, the scale of the data and the scope of the tools required have proven to be a significant challenge for any monolithic deployment of the Galaxy application. We have found that a distributed and federated approach to utilizing compute and storage resources is necessary. This paper describes ongoing efforts to create a ubiquitous platform capable of simultaneously utilizing dedicated and on-demand cloud resources. Specifically, it details the requirements, the process, and an implementation of a cloud-bursting system.