Resources to help you on your way to learning Python for biology

After 5 years as a wet-lab biologist with very little programming knowledge (zero Python, a little C++), my first task on joining the Computational Genomics Analysis and Training programme (CGAT) was to develop my Python programming skills. However, knowing where to start was the real problem.

My first port of call was to buy the ‘Python for Biologists’ books, which are amazing introductions to the basic use of Python in biology. However, I quickly realised that even these easy-to-understand books were far too advanced for me at the time, as I hadn’t even grasped how to use a for loop yet!

My lack of knowledge of the basics of Python led me to the Coursera Python course. It introduces the basic principles and then explores some of the more advanced aspects of Python, which at the time felt far too complicated for what I needed. However, I persisted and completed the course, and it allowed me to begin my new life as a computational biologist.

It was only after completing the Coursera series that I discovered Codecademy. If I had discovered it first, I think my road to becoming a Python programmer would have been much simpler, as the interactive sessions used to teach Python are really intuitive. Moreover, it covers the basic principles clearly and concisely.

I think the most significant issue when embarking on learning a programming language wasn’t actually getting access to material; it was deciding where to start. Therefore, anyone embarking on learning Python for biology-related purposes should, in my view, go through these sources in order:

  1. Codecademy – this is a great free resource and introduces the principles of Python perfectly.
  2. ‘Python for Biologists’ – this is an excellent introduction to building Python code and then applying it to simple biological problems. However, don’t expect too much from this book; it won’t give you solutions to complicated research questions.
  3. Practical Computing for Biologists – another great resource that shows beginners how to use Python to answer simple scientific questions.
  4. Coursera (Python programming) – this is a great course to begin with, but it goes into some more advanced topics that I didn’t need at the time. I would consider doing this course and then stopping when you either get bored or find that it isn’t really helping you anymore. After a year of using Python I re-enrolled in this course and found the more advanced tutorials so much more informative.

All in all, it took me a month to get a good grasp of Python (I have no idea whether this is quick or slow) and about another month to start using the language at a level advanced enough to be useful for my work.

Adam Cribbs (www.acribbs.co.uk)

Learning a programming language for the first time

Having been at CGAT for 6 months, I thought it would be a good opportunity to share the problems I have encountered while learning a new programming language.

One of the first tasks a CGAT fellow is required to do is to learn how to program. My computational background was extremely limited, with a very basic knowledge of C++ that I learned over 6 weeks in the school summer holidays when I was about 14 (I know, I was a sad child). Therefore, on my first day I asked others in the CGAT office which programming language I should learn first. The overwhelming response was to focus on Python. However, others suggested that I also learn R, as a lot of downstream data analysis is performed using this language. Unable to decide which language to focus on first, I googled the question and came across an interesting blog post setting out the pros and cons of each language. Although this was very informative, it didn’t really give me the answer I was looking for, so I decided to embark on learning both, dedicating one day to Python and the next to R. With hindsight this wasn’t a very good decision, as I found myself getting confused between the different syntaxes. As a consequence I decided to drop learning R for the time being, not because I thought Python would be more useful in the short term (which it actually has turned out to be) but because online learning resources for Python are much more readily available, some of which I have listed below:

Python resources that I found extremely useful when first starting out:

It took me around a month to understand the basics of Python and start writing functional programs. I’m not sure whether this is quick or slow, but it felt like a good pace for me. I think I was a little naive before I joined CGAT, because I thought that once you had a basic understanding of the syntax you could simply go about writing a program to fulfil any need. This view soon changed after two or three months, when I was given my first project: to find and annotate CRISPR gRNA sites across the genome. This has been a fantastic first project that has allowed me to develop the skills required to write and develop CGAT code. Although the project has progressed at a good pace, the more I have delved into writing Python code, the more I have realised how much I still have to learn to become proficient at writing code that is not only functional but also succinct and computationally efficient.
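To give a flavour of what a task like this looks like in Python, here is a minimal sketch. It is not the CGAT code, and it rests on assumptions that are not stated above: that a candidate site is a 20-nucleotide protospacer followed by an NGG PAM (the usual SpCas9 convention), and that only the forward strand is searched. The names find_grna_sites and toy_sequence are made up for illustration.

```python
import re

# Assumption: SpCas9-style target, i.e. a 20 nt protospacer followed by an NGG PAM.
# The lookahead (?=...) lets the search report overlapping candidate sites.
SITE_PATTERN = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_grna_sites(sequence):
    """Yield (start, protospacer, pam) for each forward-strand candidate site."""
    sequence = sequence.upper()
    for match in SITE_PATTERN.finditer(sequence):
        yield match.start(), match.group(1), match.group(2)


# A made-up toy sequence, not real genomic data.
toy_sequence = "TTGACCTGAAGCTAGCTAGGATCCGATTGGCATAGG"
for start, protospacer, pam in find_grna_sites(toy_sequence):
    print(start, protospacer, pam)
```

A real pipeline would also need to scan the reverse strand, handle ambiguous bases, and stream through whole chromosomes rather than a single string, which is exactly where succinct and efficient code starts to matter.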

Following my intensive month of learning Python, I decided to focus my attention on learning the basics of R. R is a fantastic tool and the go-to language for data analysis and plot generation. It has been around for many years, and one of its major bonuses is the availability of packages that you can download and use to make specialist data analysis easier. The problem I encountered early on, though, was the lack of good online training material: none of it seemed to give me the right level of knowledge to perform the analysis I wanted (although I list some of the best online courses that I found helpful below). Therefore, I have had to look at offline courses to supplement my training. One set of courses that I found useful as a basic introduction is offered by the Oxford University IT department; it gives a really good introduction to the basics of R through lectures and practical-based learning. Since CGAT fellows are allowed to spend £2000 a year on outside training, I have used some of this money to attend a 5-day Bioconductor R course at Newcastle University to try to improve my skills in this area.

R courses that I found useful: