Renato Alves (HD-HuB)
19-10-2020 - 23-10-2020
Computation is an integral part of today's research as data has grown too large or too complex to be analysed by hand. An ever-growing fraction of science is performed computationally and many wet-lab biologists spend part of their time on the computer. Many scientists struggle with this aspect of research as they have not been properly trained in the necessary set of skills. The result is that too much time is spent using inefficient tools when progress could be faster. This course provides training in several key tools, with a focus on good development practices that encourage efficient and reproducible research computing.
Topics covered include:
- Introduction to Python scripting
- Introduction to the Unix shell and usage of cluster resources
- Version control with Git and GitHub
- Analysis pipeline management
- Scientific Python & working with biological data
- Literate programming with Jupyter notebooks
This course aims to teach software writing skills and best practices to researchers in biology who wish to analyse data, and to introduce a toolset that can help them in their work. The goal is to enable them to be more productive and to make their science better and more reproducible.
This is a course for researchers in the life sciences who use computers for their analyses, even if not full time. The target student will be familiar with some command-line or programmatic computer usage and will want to become more confident using these tools efficiently and reproducibly. A target student will have written a for loop in some language before, but will not know what Git is (or at least will not be very comfortable using it).
Programming; Command Line; Version Control; Bioinformatics; Data Analysis; Cluster Computing
Python; Bash; Unix/Linux; Git; GitHub; Snakemake; Biopython; Pandas; NumPy; SciPy; Matplotlib
Course Homepage: https://www.embl.de/training/events/2020/SWC20-01/index.html