Cloud Center Tübingen

A cloud solution for bioinformaticians is provided through the university compute center in Tübingen. It includes computing infrastructure, various workflow solutions for constructing pipelines, and tools with a particular focus on the analysis of high-throughput data (genomics, proteomics, metabolomics, etc.). The cloud environment provides the following services:

  • Infrastructure as a Service (IaaS): The cloud infrastructure can be accessed via various virtualization technologies such as virtual machines, Docker containers or Singularity containers. Users will be able to use computing resources via the UNICORE middleware to run resource-intensive data analysis and simulation algorithms. Furthermore, users can set up web services in their virtualization environment and use the available resources for their applications.

  • Platform as a Service (PaaS): We provide direct access to established pipelines for high-throughput data analysis as well as to workflow environments (Galaxy, KNIME, UNICORE) to execute these workflows in a cloud environment. A particular focus of the Tübingen site is on workflows for the analysis of mass spectrometric data and for multi-omics data analysis. Users will be able to use the existing frameworks and workflows and customize them further on their own to achieve complete and reproducible data analysis workflows.

  • Software as a Service (SaaS): We provide a broad range of standard bioinformatics tools that can be applied without any further development. The preinstalled software will be provided as virtual machine images, Docker containers, Singularity containers, and UNICORE or Galaxy workflows. The wealth of tools developed within the Center for Integrative Bioinformatics (CIBI) is fully available and supported on the Tübingen site. These tools cover high-throughput data analysis (genomics, metagenomics, proteomics, metaproteomics, transcriptomics, metabolomics; e.g., SeqAn, OpenMS, MetFrag) as well as image analysis (FIJI). The Tübingen cloud site also focuses on the reproducibility of research data and their virtual research environments, as Tübingen is a project partner of the CiTAR project.

  • Data as a Service (DaaS): Access to data pools of important large-scale datasets is a highly valuable resource for bioinformatics users of all expertise levels. These data pools serve for the development, testing and benchmarking of bioinformatics methods and of the cloud environment. Depending on their needs, de.NBI cloud users are granted authorized access to big data sets, such as public reference data sets. The Tübingen site focuses on proteomics, metabolomics and multi-omics data (e.g., PRIDE, CPTAC, TCGA, ICGC).

Figure: Tübingen cloud

One major aim of the de.NBI cloud site in Tübingen is to provide software covering different fields of research, such as mass spectrometry analysis and NGS analysis pipelines, but also molecular docking via BALL integrated into Galaxy workflows (ballaxy).

The de.NBI cloud infrastructure in Tübingen comprises more than 1650 compute cores, 15 TByte RAM, 17 TByte SSD storage and 180 TByte storage capacity.

Cloud Center Freiburg

The Freiburg cloud solution for the German Network for Bioinformatics Infrastructure is provided through the university compute center in Freiburg. It includes computing infrastructure, a local mirror of reference genomes and indices, software stacks for all bioinformatics research areas offered as Conda packages or containers, and a Galaxy server that combines all of this in a user-friendly way for accessible and reproducible research. The cloud environment provides the following services:

  • Infrastructure as a Service (IaaS): The cloud infrastructure can be accessed via various virtualization technologies such as pre-built or personalised virtual machines and Docker or Singularity containers. Users will be able to use computing resources via HTCondor, Moab or the Galaxy API to run resource-intensive data analyses or simulations. Furthermore, users can set up web services in their virtualization environment and use the available resources for their applications.

  • Platform as a Service (PaaS): The Galaxy workflow system will be the central gateway to complex frameworks and analysis pipelines, ranging from cheminformatics, metabolomics and genomics to imaging. Users will be able to use existing, community-curated workflows and customize them further on their own.

  • Software as a Service (SaaS): The Freiburg node is heavily involved in the BioConda and BioContainers initiatives and therefore offers software suites out of the box for its users. The Freiburg cloud site also focuses on the reproducibility of research data and their virtual research environments, as Freiburg is a project partner of the CiTAR and ViCe projects.
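As a rough illustration of the Galaxy API access mentioned under IaaS above, the following Python sketch builds a request that lists the workflows visible to a user. It is a minimal sketch under stated assumptions, not the Freiburg site's own client: the server URL and API key are placeholders, the helper names are hypothetical, and it assumes a Galaxy server that exposes the usual `/api/workflows` endpoint and accepts the key in an `x-api-key` header.

```python
import json
import urllib.request
from typing import List
from urllib.parse import urljoin


def build_workflow_list_request(galaxy_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the user's workflows.

    galaxy_url and api_key are supplied by the caller; nothing here is
    specific to a particular de.NBI site.
    """
    # A trailing slash makes urljoin keep any path prefix of the server URL.
    base = galaxy_url if galaxy_url.endswith("/") else galaxy_url + "/"
    url = urljoin(base, "api/workflows")
    # Assumption: the server accepts the API key in the x-api-key header.
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


def list_workflow_names(galaxy_url: str, api_key: str) -> List[str]:
    """Send the request and return the workflow names from the JSON reply."""
    request = build_workflow_list_request(galaxy_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        workflows = json.load(response)
    return [workflow["name"] for workflow in workflows]


if __name__ == "__main__":
    # Placeholder values; replace them with a real server and key before use.
    req = build_workflow_list_request("https://galaxy.example.org", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the URL and header handling easy to check without network access. For real projects, a dedicated Galaxy client library is likely the more convenient choice.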

One major aim of the de.NBI cloud site in Freiburg is to provide software, reference data, training material and computational resources to bridge the gaps between different scientific fields. Our aim is to empower a broad community of users to carry out their own research.

The de.NBI cloud infrastructure in Freiburg comprises more than 1800 compute cores, 10 TByte RAM and 300 TByte storage capacity.

Cloud Center Bielefeld

A complete cloud solution for bioinformaticians will be available at the Center for Biotechnology (CeBiTec) at Bielefeld University, including computing and storage infrastructure (IaaS), a framework for simplified access to the cloud infrastructure (PaaS) and ready-to-use bioinformatics workflow solutions (SaaS) in the field of metagenomics. The cloud environment provides the following services:

  • Infrastructure as a Service (IaaS): Highly advanced users are in full command of the tools to build workflows and pipelines that require high-performance infrastructure for data analysis. Users access the infrastructure via the virtualization system, for example through virtual machines or Docker containers.

  • Platform as a Service (PaaS): We provide BiBiGrid as a tool for easy HPC cluster setup inside a cloud environment. It simplifies the often complex configuration and setup of typical software stacks to a single command line call.

  • Software as a Service (SaaS): One major goal of the de.NBI cloud site in Bielefeld is to make popular bioinformatics workflows available to cloud users.

Figure: BiBiGrid overview

Popular bioinformatics tools and pipelines will be made available as pre-configured VMs or Docker containers. The focus of the Bielefeld Service Center is microbial genomics and metagenomics, as well as postgenomics applications. Within the de.NBI cloud, data sets and preconfigured pipelines for analyses in these research areas will be provided to a wide audience. In the area of metagenomics, this includes 16S rRNA-based as well as WGS-based analysis workflows, including large-scale assemblies and functional annotation. For these special applications, VMs with up to 2 TB of RAM can be provided.

A further focus is the analysis of Oxford Nanopore sequence data and postgenomics applications, where we provide, for example, resources for the MetaProteomeAnalyzer Service (MetaProtServ).

Besides general-purpose resources, the de.NBI cloud infrastructure at Bielefeld University comprises high-memory instances with local disk space (up to 2 TB RAM and 5 TB local disk space) for applications such as metagenomic assemblies. For hardware-accelerated machine learning applications, special VMs with access to GPUs are available.

Cloud Center Heidelberg

The Heidelberg Center for Human Bioinformatics (HD-HuB) provides a cloud solution for life scientists, including rich data sets and standardized pipelines as well as bare compute infrastructure, in order to meet the wide range of user requirements across the biological research community.

Together the HD-HuB partner institutions have experience in working with a broad range of biological datasets, developing state-of-the-art bioinformatics methods and infrastructure for their analysis. Our particularly strong expertise in processing large-scale human genomic and imaging data (e.g. PCAWG, ICGC) influences the Heidelberg de.NBI cloud's major focus on data and infrastructure. We aim at the standardization of analysis approaches to improve comparability of data sets across institutions, to enable integrative analyses and meta-analyses. We are committed to the FAIR principles of data sharing.

The major goal of the de.NBI cloud site in Heidelberg is to make state-of-the-art tools for analyzing human genomics, metagenomics, and microscopy image data widely accessible.

Services are provided in all of the "classic" cloud areas:

  • Infrastructure as a Service (IaaS): Users are in full control of their virtual infrastructure. Resources are provided not only in the form of compute via virtual machines (VMs) but also include networks, firewalls and storage. In addition, we offer the community specialized hardware (e.g., GPUs, FPGAs and high-memory nodes) for state-of-the-art, large-scale parallel scientific computing (e.g., matrix factorization, deep learning). This gives users the flexibility they need to match hardware to their requirements and to develop workflows and pipelines efficiently while optimizing for performance.

  • Platform as a Service (PaaS): Users are provided with a "platform" that allows them to accomplish their scientific tasks. Platforms are usually provided in the form of workflow frameworks such as Galaxy (https://github.com/BMCV/galaxy-image-analysis), Roddy, Butler or the ICGC pipelines, but are not limited to these. Butler features multi-cloud deployment, configuration management, comprehensive monitoring and anomaly detection, as well as pre-built pipelines for genome alignment, germline and somatic variant calling, and R script execution. We also aim to provide several VM images, each with a curated set of software focused on a specific area of bioinformatics, so that life science researchers can get started quickly.

  • Software as a Service (SaaS): Users are provided with fully established software packages or workflow systems that can be applied directly to their problems and data. This is ideally suited for exploratory analyses of genomics, metagenomics and imaging data, and beyond. Additionally, we provide customized software and workflows as part of sharing our expertise with cooperation partners within the de.NBI community. SaaS is also a great resource for novice bioinformaticians and an aid to teaching.

  • Data as a Service (DaaS): Users are provided with access to large data pools that are essential to life science research. These are a highly valuable resource but often cannot be transferred to or stored at the users' home organization because of the size of the data or the access regulations that apply to them. By enabling access to these data, the de.NBI cloud will drive the development of new bioinformatics methods and facilitate benchmarking of analysis techniques on previously inaccessible datasets. Depending on their needs, users can be granted authorized access to restricted data sets, such as reference data sets from the EBI-ENA, or to common data sources such as dbGaP and GENCODE. We have begun to participate in the new ICGC cloud computing project for collaborative research. The raw and interpreted data processed from the sequences of tumours and matching normal tissues will be provided by the Heidelberg de.NBI cloud in order to generate analytic results and to develop new tools.

The de.NBI cloud infrastructure in Heidelberg comprises 4096 cores, 32 TByte RAM, 8 TByte RAM in high-memory nodes, 100 TByte SSD and 2.5 PByte storage capacity. FPGAs ensure rapid processing of genomic alignments. GPUs are available for deep learning approaches on massive amounts of labeled data.

de.NBI Cloud

In the life sciences today, handling, analyzing and storing enormous amounts of data is a challenging issue. For example, new sequencing and imaging technologies generate large-scale genomic and image data. Hence, an appropriate IT infrastructure is crucial to perform analyses with such large datasets and to ensure secure data access and storage. In addition, it is difficult to directly compare result data that have been processed at different sites because workflows lack standardization. The de.NBI cloud is an excellent solution to enable integrative analyses for the entire life sciences community in Germany and the efficient use of data in research and application.

To a large extent, de.NBI will close the gap in computational resources for researchers in Germany. A joint de.NBI cloud concept and infrastructure reduces overall infrastructure and operational costs.

Federation of the de.NBI Cloud

The de.NBI cloud is a fully academic cloud, free of charge for academic users, where academic cloud centers provide storage and computing resources for locally stored data. All layers, i.e. hardware and personnel resources, IT administration and operation, as well as deployment of operating systems, frameworks and workflows are provided by the five local cloud centers.

The de.NBI cloud operates three major service levels: (1) Infrastructure as a Service (IaaS), suited for power users who want full control of the computing environment; (2) Platform as a Service (PaaS), the provision of a fully operational infrastructure and frameworks for the deployment of workflows; and (3) Software as a Service (SaaS), access to preconfigured, state-of-the-art pipelines and analysis tools. Through a cloud federation concept, all five de.NBI sites are integrated into a single cloud platform. Users will be guided to the envisaged service and the suitable cloud via the central de.NBI cloud portal. The system is based on single sign-on (SSO) and an Authentication and Authorization Infrastructure (AAI).

Figure: federated cloud

The de.NBI cloud project started in 2016 and is a collaboration between the universities of Bielefeld, Freiburg, Gießen, Heidelberg and Tübingen (see figure below). Close cooperation with the ELIXIR cloud helps the project reach a high degree of maturity as quickly as possible and ensures sustainability in the international context. The de.NBI cloud provides the life science community in Germany with an appropriate analytics infrastructure for bioinformatics, consisting of computing power and storage capacity as well as flexible workflows and analysis tools. The de.NBI cloud places high demands on IT security concepts and user access rules. It will comprise more than 15,000 compute cores and 5 PB of storage capacity.

Contact de.NBI Cloud

  • Gießen University

  • Freiburg University

  • Tübingen University

  • Heidelberg University

  • Bielefeld University