Georg Zeller (HD-HuB), Tom Hancocks, Rob Finn, Lorna Richardson, Varsha Kale, Chris Quince, Josephine Burgin, Alexandre Almeida, Martin Beracochea, Sebastien Raguideau

Monday 2 - Friday 6 November 2020


This course will cover the metagenomics data analysis workflow, starting from newly generated sequence data. Participants will explore the use of publicly available resources and tools to manage, share, analyse and interpret metagenomics data. The content will include issues of data quality control and how to submit to public repositories. While sessions will detail marker-gene and whole-genome shotgun (WGS) approaches, the primary focus will be on assembly-based approaches. Discussions will also explore considerations when assembling genome data, the analysis that can be carried out by MGnify on such datasets, and what downstream analysis options and tools are available.

This course is aimed at life scientists who are working in the field of metagenomics, in the early stages of their data analysis, and who may already have some prior experience in using bioinformatics in their research.

Further details on the programme are provided here:

Learning goals:
After this course participants should be able to:
    - Conduct appropriate quality control and decontamination of metagenomic data and run simple assembly pipelines on short-read data
    - Utilise public datasets and resources to identify relevant data for analysis
    - Apply appropriate tools in the analysis of metagenomic data
    - Submit metagenomics data to online repositories for sharing and future analysis
    - Apply relevant knowledge in strain resolution and comparative metagenome analysis to their own research

Some practical sessions in the course require a basic understanding of the Unix command line and the R statistics package. If you are not already familiar with these then please ensure that you complete these free tutorials before you attend the course:
    - Basic introduction to the Unix environment:
    - Basic R concept tutorials:

metagenomics, bioinformatics, data analysis, R, Unix

- Microbiome data types: Amplicon approaches (ribosomal RNA), whole-genome shotgun (WGS) approaches, assembly and metagenome-assembled genomes (MAGs)
- Data analysis: MGnify, HMMER, InterPro, Gene Ontology (GO), FastQC, SIAMCAT, and pathway analyses
- Data standards and submission:
     - European Nucleotide Archive (ENA)
     - Genomic Standards Consortium (GSC)
     - Sequence Read Archive (SRA)
     - Webin
- Metagenomics data analysis workflow

Georg Zeller