PhenoMeNal: processing and analysis of metabolomics data in the cloud
Peters, K; Bradbury, J; Bergmann, S; Capuccini, M; Cascante, M; De Atauri, P; Ebbels, T; Foguet, C; Glen, R; Gonzalez-Beltran, A; Guenther, U; Handakas, E; Hankemeier, T; Herman, S; Haug, K; Holub, P; Izzo, M; Jacob, D; Johnson, D; Jourdan, F; Kale, N; Karaman, I; Khalili, B; Emami Khoonsari, P; Kultima, K; Lampa, S; Larsson, A; Ludwig, C; Moreno, P; Neumann, S; Novella, JA; O'Donovan, C; Pearce, JTM; Peluso, A; Pireddu, L; Piras, ME; Reed, MAC; Rocca-Serra, P; Roger, P; Rosato, A; Rueedi, R; Ruttkies, C; Sadawi, N; Salek, R; Sansone, S-A; Selivanov, V; Spjuth, O; Schober, D; Thévenot, E; Tomasoni, M; Van Rijswijk, M; Van Vliet, M; Viant, M; Weber, R; Zanetti, G; Steinbeck, C
Background: Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism's metabolism. The research field is dynamic and expanding, with applications across biomedical, biotechnological and many other applied biological domains. Its computationally intensive nature has driven requirements for open data formats, data repositories and data analysis tools. However, the rapid progress has resulted in a mosaic of independent, and sometimes incompatible, analysis methods that are difficult to connect into a useful and complete data analysis solution.
Findings: The PhenoMeNal (Phenome and Metabolome aNalysis) e-infrastructure provides a complete, workflow-oriented, interoperable metabolomics data analysis solution for a modern infrastructure-as-a-service (IaaS) cloud platform. PhenoMeNal seamlessly integrates a wide array of existing open-source tools, which are tested and packaged as Docker containers through the project's continuous integration process and deployed using the Kubernetes orchestration framework. It also provides a number of standardized, automated and published analysis workflows through the Galaxy, Jupyter, Luigi and Pachyderm user interfaces.
Conclusions: PhenoMeNal constitutes a keystone solution in cloud infrastructures available for metabolomics. It provides scientists with a ready-to-use, workflow-driven, reproducible and shareable data analysis platform that harmonizes software installation and configuration through user-friendly web interfaces. The deployed cloud environments can be dynamically scaled to enable large-scale analyses, which are interfaced through standard data formats, are versioned, and have been tested for reproducibility and interoperability. The flexible implementation of PhenoMeNal allows easy adaptation of the infrastructure to other application areas and 'omics research domains.