EMBL Use Case
Next-generation DNA sequencing technologies have had a huge impact on how biological and medical research is performed today. They have made large-scale sequencing affordable and revolutionized the sequencing of complete genomes. Scientists today can easily generate huge amounts of high-quality sequence data within a few days. The analysis and management of these vast data sets, however, require high-performance computing and fast data storage infrastructures as well as bioinformatics expertise, which often pose a challenge for many labs.
The European Molecular Biology Laboratory (EMBL) is developing a portal for cloud-supported analysis of large and complex genomes. This will facilitate genomic assembly and annotation, allowing a deeper insight into evolution and biodiversity across a range of organisms.
“The quantities of genomic sequence data are vast and the need for high performance computing infrastructures and bioinformatics expertise to analyse these data poses a challenge for many laboratories. EMBL’s novel cloud-based whole-genome-assembly and annotation pipeline involves expertise from the Genomics Core facility in Germany, EMBL’s European Bioinformatics Institute, and EMBL Heidelberg's IT Services. It will allow scientists, at EMBL and around the world, to overcome these hurdles and provide the right infrastructure on demand,” said Rupert Lueck, Head of IT Services at EMBL.
What we want to achieve
- Open up new possibilities for scientists to perform large-scale genomic analysis without making large capital investments in computing infrastructure, thereby making de novo assembly and genome annotation affordable to many more laboratories,
- Provide a leading bioinformatics pipeline to perform fast and on-demand genomic data analysis,
- Provide a basis for future extension of genomic research using cloud computing infrastructures.
How will the project assist the scientific community?
The long-term objective is to make large-scale genomic data analysis widely and economically available to the scientific community by removing the infrastructure hurdles that stand in the way of these projects. Building on this platform, EMBL also looks to expand Helix Nebula – the Science Cloud into its other research domains. High-throughput microscopy, for example, is another widely used technology that involves the generation and computational analysis of big data. More generally, EMBL wants to explore how its scientists, facing large-scale data analysis challenges, can benefit from cloud computing. Beyond basic research, EMBL’s large-scale genomic analysis platform could attract interest from, and might be adapted to, research in the medical field or the pharmaceutical and agricultural industries. The cloud computing model provides the power and scalability to make this service available to the widest possible user base.