HD-HuB Services

mOTUs

Metagenomic operational taxonomic units (mOTUs) allow for the quantification of known (sequenced) and unknown microorganisms at species-level resolution from shotgun sequencing data. The method clusters single-copy phylogenetic marker gene sequences from metagenomes and reference genomes into mOTUs to profile their abundances in shotgun metagenomic samples. Find this service in BioTools here.
Funding: Development and maintenance partially funded by de.NBI.

Website

specI

SpecI is a species identification tool using genomic sequences to delineate prokaryotic species. It facilitates fast, accurate and automated taxonomic assignments of newly sequenced genomes based on comparisons of 40 universal, single-copy phylogenetic marker genes extracted from a comprehensive database of sequenced prokaryotic genomes.
Funding: Development and maintenance partially funded by de.NBI.

Website

MOCAT

MOCAT is a modular and scalable software pipeline for analyzing shotgun metagenomics datasets generated with Illumina technology. Starting from raw fastQ files, it can quality-filter and remove contaminants from them, assemble metagenomic reads into contigs, predict prokaryotic genes on these, identify phylogenetic marker genes and generate taxonomic abundance profiles by mapping reads to these marker genes.
Funding: Neither development nor maintenance funded by de.NBI.

Website

Enterotyping

Enterotypes are densely populated regions in a high-dimensional space of microbiome community composition, by which human individuals can be stratified (Arumugam, Raes et al. Nature 2011). Computational methods to detect and characterise enterotypes in any dataset, either to reproduce previous reports or determine enterotypes in new studies, are provided and explained.
Funding: Development and maintenance partially funded by de.NBI.

Website

SIAMCAT

SIAMCAT is a modular framework for the statistical inference of associations between microbial communities and host phenotypes, such as disease states in clinical case-control studies. SIAMCAT is based on LASSO models, which offer distinctive advantages for model interpretation and microbial biomarker selection and avoid overfitting issues that can arise in naive combinations of feature selection and cross-validation.
Funding: Development and maintenance partially funded by de.NBI.

Website

eggNOG

eggNOG is a database of nested orthologous gene groups (NOGs) inferred using unsupervised clustering applied to >2,000 complete genomes, followed by comprehensive characterization and analysis of the resulting gene families. eggNOG provides orthologous group assignments at >100 different taxonomic levels as well as multiple sequence alignments, maximum-likelihood trees and broad functional annotations for each group, accessible via a web interface or through bulk download.
Funding: Neither development nor maintenance funded by de.NBI.

Website

iTOL

Interactive Tree Of Life (iTOL) is an online tool for the display and manipulation of phylogenetic trees. It provides a large variety of tree layouts, drawing and annotation features including circular tree layout, which is well-suited particularly for mid-sized trees (up to several thousand leaves). Tree displays can be exported in several graphical formats, both bitmap and vector based.
Funding: Neither development nor maintenance funded by de.NBI.

Website

iPATH

iPath is a web-based tool for the visualization and analysis of cellular pathways. Based on current annotations (such as KEGG), it provides pathway maps for primary cellular metabolism as well as for some additional secondary metabolite synthesis and regulatory pathways. Users can map their own data onto these pathway maps. Due to its navigation and customization functions, iPATH thus allows users to easily explore and analyze the functional and metabolic capabilities of their (meta-)genomic data sets.
Funding: Neither development nor maintenance funded by de.NBI.

Website

STRING

STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) is a database of known and predicted protein interactions in the form of direct physical or indirect functional associations, which currently cover 9,643,763 proteins from 2,031 organisms. Interaction data are derived from high-throughput experiments, genomic context, (conserved) coexpression, and the scientific literature. STRING quantitatively integrates interaction data from these sources and transfers information between organisms where applicable.
Funding: Neither development nor maintenance funded by de.NBI.

Website

STITCH

STITCH is a web-based resource to explore known and predicted interactions of chemicals and proteins. Chemicals are linked to other chemicals and proteins by evidence derived from experiments, databases and the literature. STITCH contains interactions between 300,000 small molecules and 2.6 million proteins from 1133 organisms. Find this service in BioTools here.
Funding: Neither development nor maintenance funded by de.NBI.

Website

SMART

SMART (Simple Modular Architecture Research Tool) is a web resource providing simple identification and extensive annotation of protein domains via sequence homology searches. It contains manually curated models for more than 1,200 protein domains. In its ‘Genomic’ mode, it annotates proteins from completely sequenced genomes of 2,031 species as a basis for their functional annotation. It provides flexible tools to visually explore protein domain architectures across sequences and organisms. Find this service in BioTools here.
Funding: Neither development nor maintenance funded by de.NBI.

Website

SIDER

SIDER is a web-based resource that contains information on marketed medicines and their recorded adverse drug reactions. This information is extracted from public documents and package inserts. SIDER makes available side effect frequency, drug and side effect classifications as well as links to further information, for example drug–target relations. It currently covers associations between 5,868 side effects and 1,430 drugs.
Funding: Neither development nor maintenance funded by de.NBI.

Website
Contact

CART

CART is a chemical annotation retrieval toolkit providing powerful text- and structure-based matching tools to map user-supplied input lists of chemicals into a comprehensive chemical space. This unified reference space facilitates annotation retrieval and enrichment analysis similar to gene ontology analysis. CART integrates data from a number of resources on chemical bioactivities including molecular targets, metabolization, therapeutic effects, side effects and toxicity. CART is available as a Galaxy web service and as a standalone command-line tool.
Funding: Neither development nor maintenance funded by de.NBI.

Website

DELLY

DELLY is a workflow for the discovery of germline and somatic structural variants. Delly is an integrated structural variant prediction method that can detect deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends and split-reads to sensitively and accurately delineate genomic rearrangements throughout the genome, and includes a genotyping functionality for discriminating heterozygous from homozygous variants.
Funding: Neither development nor maintenance funded by de.NBI; funded by ICGC Pedbrain (BMBF - 01KU1201C and DKH 109252), MMML MYC SYS (BMBF 0316166F), 1000 genomes (NIH/NHGRI 1U41HG007497-01).

Website

DESeq2 & DEXSeq

DESeq2 is an R/Bioconductor package for differential gene expression analysis based on the negative binomial distribution. It estimates the variance-mean dependence in count data from high-throughput sequencing assays and tests for differential expression using a model based on the negative binomial distribution. DEXSeq extends DESeq2 for the analysis of alternative exon usage. Find this service in BioTools here.
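
The variance-mean dependence mentioned above can be illustrated with the usual negative binomial parameterization, in which a count with mean μ and dispersion α has variance μ + αμ². The sketch below is plain Python rather than the package's R code; the function name `nb_variance` and the example dispersion value are illustrative assumptions, not part of DESeq2.

```python
def nb_variance(mean: float, dispersion: float) -> float:
    """Variance of a negative binomial count: mean + dispersion * mean**2.

    Hypothetical helper for illustration; not a DESeq2 function.
    """
    if mean < 0 or dispersion < 0:
        raise ValueError("mean and dispersion must be non-negative")
    return mean + dispersion * mean ** 2


if __name__ == "__main__":
    # With dispersion 0 the variance equals the mean (the Poisson limit);
    # with dispersion > 0 the variance grows quadratically with the mean.
    for mu in (10.0, 100.0, 1000.0):
        print(mu, nb_variance(mu, 0.0), nb_variance(mu, 0.1))
```

The quadratic term is why highly expressed genes show more spread between replicates than a Poisson model would predict.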
Funding:

Website

EBImage & RBioFormats

R/Bioconductor packages that provide general purpose functionality for the reading, writing, processing and analysis of images. In the context of high-throughput microscopy based cellular assays, EBImage offers tools to transform the images, segment cells and extract quantitative cellular descriptors. Find this service in BioTools here.
Funding:

Website

TPP

Analyze thermal proteome profiling (TPP) experiments with varying temperatures (TR) or compound concentrations (CCR).
Funding:

Website

Workflows and recipes

With HD-HuB resources we plan to extend the offered workflows for typical analyses of human genome and multi-omic data, and to provide short atomic ‘recipes’ for frequently used tasks.
Funding:

Website

OTP

OTP ("One Touch Pipeline") is a comprehensive framework for NGS project organization and processing. The application provides support in all steps of this process, including data transfer from temporary to final storage, execution of data quality monitoring programs, alignment of reads to the reference genome and variant calling. It allows full automation, extended project administration, and full processing control for operators.
Funding:

Website

roddy

Roddy is a framework for large-scale NGS processing.


Cloud/HPC

IT Infrastructure for de.NBI users based on ICGC and TCGA PanCancer technology


various NGS Pipelines

Quality assessment programs for NGS, variant calling pipelines, PanCancer pipelines


Patient Searchtool

Database for patient specifications presenting minimal study datasets


GenomeRNAi

The GenomeRNAi database makes available RNAi phenotype data extracted from the literature, or submitted by data producers directly, for human and Drosophila. It also provides RNAi reagent information, along with an assessment as to their efficiency and specificity. Find this service in BioTools here.

Website

KNIME-Cellular phenotyping of microscope image data

This KNIME framework is a data mining platform including manifold libraries for image processing (based on Fiji/ImageJ) and data exploration. The platform provides access to workflows and pipelines for (large-scale) automated phenotype analysis.


PanCancer alignment workflow

The alignment workflow produces aligned BAM files from FASTQ files. The user can choose between different versions of bwa. The default settings of the workflow follow the ICGC PanCancer workflow. The pipeline also produces a rich set of quality values as well as coverage plots, for each genome individually and, if a tumor-normal pair is processed, for the pair combined.


ACESeq

ACEseq is a method to detect somatic copy number variations from matching tumor/control WGS data pairs. In addition to total copy numbers it provides allele-specific copy numbers as well as ploidy and tumor cell content estimates for the tumor sample.


SNV calling pipeline

The SNV pipeline creates a set of high-confidence somatic SNVs. The pipeline takes a tumor and a matched control BAM file as input and returns a highly annotated VCF file. In addition, the workflow produces plots that help to interpret both the sample quality (e.g. whether the sample is contaminated) and the biology (e.g. whether the sample shows kataegis).
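
Kataegis refers to localized clusters of mutations, so one common way to look for it is to inspect the distance between consecutive SNVs on each chromosome. The following is a minimal sketch of that idea only; the function name, the input format of (chromosome, position) tuples and the example data are assumptions for illustration and do not describe how the pipeline itself produces its plots.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def intermutation_distances(snvs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Return, per chromosome, the distances between consecutive SNV positions.

    Hypothetical helper: `snvs` is any iterable of (chromosome, position) pairs.
    """
    by_chrom: Dict[str, List[int]] = defaultdict(list)
    for chrom, pos in snvs:
        by_chrom[chrom].append(pos)

    distances: Dict[str, List[int]] = {}
    for chrom, positions in by_chrom.items():
        positions.sort()
        # Pair each position with its successor and take the difference.
        distances[chrom] = [b - a for a, b in zip(positions, positions[1:])]
    return distances


if __name__ == "__main__":
    calls = [("chr1", 150), ("chr1", 100), ("chr1", 1150), ("chr2", 5)]
    print(intermutation_distances(calls))
```

Runs of unusually short distances on one chromosome are the pattern one would then examine more closely.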


IONiseR

IONiseR provides tools for the quality assessment of Oxford Nanopore MinION data. It extracts summary statistics from a set of fast5 files and can be used either before or after base calling. In addition to standard summaries of the read-types produced, it provides a number of plots for visualising metrics relative to experiment run time or spatially over the surface of a flowcell. Find this service in BioTools here.


Indel calling pipeline

The indel pipeline detects high-confidence indels (1-20 bp). The results are presented in an extensively annotated VCF file. The input of the pipeline consists of the aligned tumor and control BAM files. To enable fast quality assessment of functionally relevant indels, screenshots are taken of exonic high-confidence indels.
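
As a rough illustration of the 1-20 bp size range, the length of a simple indel can be read off a VCF record as the difference between the lengths of the ALT and REF alleles. The helper names below are hypothetical, and the sketch assumes a single ALT allele per record; it is not the pipeline's own filtering logic.

```python
def indel_length(ref: str, alt: str) -> int:
    """Signed indel length: positive for insertions, negative for deletions.

    Returns 0 when REF and ALT have the same length (e.g. an SNV).
    """
    return len(alt) - len(ref)


def in_size_range(ref: str, alt: str, min_len: int = 1, max_len: int = 20) -> bool:
    """True if the absolute indel length lies within [min_len, max_len]."""
    return min_len <= abs(indel_length(ref, alt)) <= max_len


if __name__ == "__main__":
    print(indel_length("A", "ATG"))            # 2 bp insertion
    print(indel_length("ATG", "A"))            # 2 bp deletion
    print(in_size_range("A", "A" + "T" * 21))  # 21 bp insertion, out of range
```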


RNA-seq end-to-end workflow

An end-to-end gene-level RNA-seq differential expression workflow using Bioconductor packages. Starting from FASTQ files, reads are aligned to the reference genome, and a count matrix that tallies the number of RNA-seq reads/fragments within each gene for each sample is prepared. The workflow then covers exploratory data analysis (EDA) for quality assessment and exploration of the relationship between samples, differential gene expression analysis, and visual exploration of the results.


BUTLER

Framework for the cloud orchestration of genomics workflows.

Website

BiocWorkflowTools

This Bioconductor authoring tool links the journal’s manuscript management system with the Bioconductor nightly build system, thus enabling “continuous integration”, automated quality control and guaranteed workflow functionality even as underlying resources and tools evolve. A package and tutorial publication are underway, and we have actively helped several authors from the community with workflow authoring.

Website

biomaRt

Provides an R interface to BioMart services. In particular this is the most widely used programmatic access route to query EMBL-EBI’s Ensembl database.

Website

CESAM

Enables the discovery of oncogene activation events mediated by enhancer hijacking, via pan-cancer analyses.

Website

GenomeCRISPR

GenomeCRISPR is a database for high-throughput screening experiments performed by using the CRISPR/Cas9 system. A dynamic web interface guides users through the process of finding information about published CRISPR screens. The database holds detailed data about observed hits and phenotypes. Moreover, it provides knowledge about performance of individual single guide RNAs (sgRNAs) used under various experimental conditions.

Website

YAPSA

This package provides functions and routines for a supervised analysis of somatic signatures. In particular, functions to perform a signature analysis with known signatures (LCD = linear combination decomposition) and a signature analysis on stratified mutational catalogue (SMC = stratify mutational catalogue) are provided.
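
The idea behind a decomposition with known signatures can be sketched as a non-negative least-squares fit: the signatures are held fixed and only their non-negative exposures are estimated so that the weighted sum approximates the observed mutational catalogue. The code below is a minimal pure-Python illustration that uses multiplicative updates as a stand-in solver; it is not YAPSA's implementation, and the function name, the two toy signatures and the four-channel catalogue are all made up for the example.

```python
from typing import List


def fit_exposures(signatures: List[List[float]], catalogue: List[float],
                  iterations: int = 500) -> List[float]:
    """Estimate non-negative exposures h so that sum_k h[k] * signatures[k] ~ catalogue.

    signatures[k][c] is the weight of mutation channel c in signature k;
    catalogue[c] is the observed count in channel c. Hypothetical helper.
    """
    n_sig, n_chan = len(signatures), len(catalogue)
    # Gram matrix S S^T and projection S v are constant across iterations.
    gram = [[sum(signatures[i][c] * signatures[j][c] for c in range(n_chan))
             for j in range(n_sig)] for i in range(n_sig)]
    proj = [sum(signatures[i][c] * catalogue[c] for c in range(n_chan))
            for i in range(n_sig)]

    exposures = [1.0] * n_sig  # strictly positive starting point
    for _ in range(iterations):
        denom = [sum(gram[i][j] * exposures[j] for j in range(n_sig))
                 for i in range(n_sig)]
        # Multiplicative update keeps every exposure non-negative.
        exposures = [e * p / d if d > 0 else 0.0
                     for e, p, d in zip(exposures, proj, denom)]
    return exposures


if __name__ == "__main__":
    sigs = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
    counts = [22.0, 4.0, 4.0, 10.0]  # equals 30 * sigs[0] + 10 * sigs[1]
    print(fit_exposures(sigs, counts))
```

Keeping the signatures fixed is what makes the analysis supervised: only the exposures are fitted, so results stay comparable across samples.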

Website

BioQAnalyzer

Processing and primary analysis of data obtained in clonal bisulfite sequencing experiments.

Website

BioQAnalyzer HT

Processing and primary analysis of data obtained in high-throughput targeted bisulfite sequencing experiments.

Website

BioQAnalyzer HiMod

Processing and primary analysis of data obtained in standard targeted bisulfite sequencing experiments, including Tet-assisted (TAB-Seq) and chemical modification-based bisulfite sequencing (Ox-BS) approaches.

Website

EpiExplorer

Integration of multiple epigenetic and genetic annotations and making them explorable via an interactive interface.

Website

RnBeads

RnBeads is an R package for comprehensive analysis of DNA methylation data obtained by Infinium microarrays and bisulfite sequencing protocols. It implements a series of QC steps and outputs several levels of analysis as an annotated, readable hypertext report. RnBeads is the standard preparation tool for DNA methylation data obtained by DEEP (the German Epigenome Programme).

Website

DeepBlue Epigenomic Data Server

The DeepBlue Epigenomic Data Server provides a data access hub for large collections of epigenomic data. It organizes the data using controlled vocabularies and ontologies. The data is stored on our server, where users can access it programmatically or through our web interface.

Website